PythonHub
News & links about Python programming.
https://pythonhub.dev/
Speeding up PyTorch inference by 87% on Apple devices with AI-generated Metal kernels

The post describes how AI models can automatically generate optimized Metal GPU kernels that speed up PyTorch inference on Apple devices by an average of 87% across 215 modules, with some kernels running hundreds of times faster than baseline. Using an agentic swarm approach and adding context like CUDA references and profiling data, the system outperforms standalone models, making kerne...

https://gimletlabs.ai/blog/ai-generated-metal-kernels
PageIndex

PageIndex is a reasoning-based RAG system that simulates how human experts navigate and extract knowledge from long documents through tree search, enabling LLMs to think and reason their way to the most relevant document sections.

https://github.com/VectifyAI/PageIndex
How I write Django views

The author advocates using Django's base View class over generic class-based or function-based views for simplicity and flexibility in handling HTTP requests. By avoiding complex mixins and leveraging straightforward helper methods, developers can write clearer, more maintainable view code with minimal cognitive overhead.

https://www.loopwerk.io/articles/2025/django-views/
Zuban

Zuban is a high-performance Python language server and type checker implemented in Rust by the author of Jedi. Zuban is 20–200× faster than Mypy, while using roughly half the memory and CPU compared to Ty and Pyrefly. It offers both a Pyright-like mode and a Mypy-compatible mode that behaves just like Mypy, supporting the same config files, command-line flags, and error messages.

https://github.com/zubanls/zuban
Polars GPU Execution. (70% speed up)

Polars' new GPU engine, powered by NVIDIA RAPIDS cuDF, accelerates data processing by up to 70% compared to CPU-based execution, enabling faster handling of large datasets. The beta release supports common operations, using GPU parallelism for significant performance gains in data analytics workflows.

https://dataengineeringcentral.substack.com/p/polars-gpu-execution-70-speed-up
vLLM with torch.compile: Efficient LLM inference on PyTorch

Learn how to optimize PyTorch code with minimal effort using torch.compile, a just-in-time compiler that generates optimized kernels automatically.

https://developers.redhat.com/articles/2025/09/03/vllm-torchcompile-efficient-llm-inference-pytorch
sync-with-uv

The sync-with-uv package automates version synchronization between uv.lock and .pre-commit-config.yaml, ensuring consistent dependency management for tools like black, ruff, and mypy. It integrates as a pre-commit hook, streamlining workflows by aligning versions from a single source while leaving unspecified tools unchanged.

https://github.com/tsvikas/sync-with-uv
PydanticAI: the AI Agent Framework Winner

The video showcases how to use Pydantic AI to build Python applications with AI-powered agents that provide validated, structured outputs by integrating large language models like GPT-5. It demonstrates a healthcare triage assistant that personalizes responses using domain data, dependencies, and customizable prompts, enabling robust, real-world AI integration beyond simple chatbots.

https://www.youtube.com/watch?v=-WB0T0XmDrY