PythonHub
2.4K subscribers
2.35K photos
49.1K links
News & links about Python programming.
https://pythonhub.dev/
Minions

Minions is a communication protocol that enables small on-device models to collaborate with frontier models in the cloud. By only reading long contexts locally, we can reduce cloud costs with minimal or no quality degradation.

https://github.com/HazyResearch/minions
Craw4LLM

Craw4LLM is an efficient web crawling method that prioritizes webpages by their potential influence on LLM pretraining, replacing traditional priorities based on graph connectivity. By crawling only 21% of URLs, it achieves the same downstream performance as previous methods, significantly reducing data waste and website burden.

https://github.com/cxcscmu/Craw4LLM
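
The general idea of a score-prioritized crawl frontier can be sketched in a few lines. This is only an illustration, not the Craw4LLM implementation: `crawl_by_score`, `fetch_links`, `score`, and `budget` are hypothetical names, and `score` stands in for whatever estimate of pretraining influence a real crawler would use.

```python
import heapq
from typing import Callable, Iterable, List


def crawl_by_score(
    seeds: Iterable[str],
    fetch_links: Callable[[str], Iterable[str]],  # hypothetical: returns outlinks of a URL
    score: Callable[[str], float],                # hypothetical: higher means more useful
    budget: int,
) -> List[str]:
    """Visit up to `budget` URLs, always expanding the highest-scoring one next."""
    seen = set(seeds)
    # heapq is a min-heap, so scores are negated to pop the best URL first.
    frontier = [(-score(url), url) for url in seen]
    heapq.heapify(frontier)

    visited: List[str] = []
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score(link), link))
    return visited
```

With a small budget, low-scoring pages are never fetched at all, which is where the savings in crawled URLs would come from in a scheme like this.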
AI Engineering Goes Visual: Web Scraping & Data Prep with PyFlyde

In part one of a two-part tutorial, we explore how to use Flyde, a visual programming tool, to build a web scraper that feeds into a Retrieval-Augmented Generation (RAG) system. We will cover the process of scraping web content and storing it locally, setting the stage for more advanced AI engineering tasks.

https://blog.kodigy.com/post/visual-ai-engineering-with-pyflyde-pt1-scraper/
NotaGen

NotaGen is a symbolic music generation model leveraging Large Language Models (LLMs) through pre-training on 1.6M musical pieces, fine-tuning on classical compositions, and reinforcement learning using a novel CLaMP-DPO method.

https://github.com/ElectricAlexis/NotaGen
How I Automated My Podcast Transcript Production With Local AI

The author automated podcast transcription using roboscribe, a Python tool that combines WhisperX for diarized transcription and a local Large Language Model (LLM) for cleaning up the transcript, significantly improving readability. By leveraging local AI models, the author maintains control and optimizes the transcription process on their own hardware, achieving high-quality results in ...

https://den.dev/blog/how-i-automated-podcast-transcription-with-local-ai/