Speeding up PyTorch inference by 87% on Apple devices with AI-generated Metal kernels
The post describes how AI models can automatically generate optimized Metal GPU kernels that speed up PyTorch inference on Apple devices by an average of 87% across 215 modules, with some kernels running hundreds of times faster than baseline. Using an agentic swarm approach and adding context like CUDA references and profiling data, the system outperforms standalone models, making kerne...
https://gimletlabs.ai/blog/ai-generated-metal-kernels
Gimlet Blog
A blog about research on high performance AI systems.
PageIndex
PageIndex is a reasoning-based RAG system that simulates how human experts navigate and extract knowledge from long documents through tree search, enabling LLMs to think and reason their way to the most relevant document sections.
https://github.com/VectifyAI/PageIndex
GitHub
GitHub - VectifyAI/PageIndex: 📄🧠 PageIndex: Document Index for Reasoning-based RAG
📄🧠 PageIndex: Document Index for Reasoning-based RAG - VectifyAI/PageIndex
Niche Python tools, libraries and features - what's your favourite?
https://www.reddit.com/r/Python/comments/1n7r4xb/niche_python_tools_libraries_and_features_whats/
Reddit
From the Python community on Reddit
Explore this post and more from the Python community
How I write Django views
The author advocates using Django's base View class over generic class-based or function-based views for simplicity and flexibility in handling HTTP requests. By avoiding complex mixins and leveraging straightforward helper methods, developers can write clearer, more maintainable view code with minimal cognitive overhead.
https://www.loopwerk.io/articles/2025/django-views/
Loopwerk
How I write Django views
Why I only use Django's base View class instead of generic class-based views or function-based views.
toolfront
Simple data retrieval for AI with unmatched control, precision, and speed.
https://github.com/kruskal-labs/toolfront
GitHub
GitHub - kruskal-labs/toolfront: Data retrieval for AI agents
Data retrieval for AI agents. Contribute to kruskal-labs/toolfront development by creating an account on GitHub.
Sharing a mutable reference between Rust and Python
https://blog.lilyf.org/posts/python-mutable-reference/
Lily's Blog
Sharing a mutable reference with Python
Background
As part of my ongoing project to reimplement Django’s templating language in Rust, I have been adding support for custom template tags.
Simple tags
The simplest custom tag will look something like:
# time_tags.py
from datetime import datetime…
Youtu-agent
A simple yet powerful agent framework that delivers with open-source models.
https://github.com/Tencent/Youtu-agent
GitHub
GitHub - TencentCloudADP/youtu-agent: A simple yet powerful agent framework that delivers with open-source models
A simple yet powerful agent framework that delivers with open-source models - TencentCloudADP/youtu-agent
Zuban
Zuban is a high-performance Python language server and type checker implemented in Rust by the author of Jedi. Zuban is 20–200× faster than Mypy, while using roughly half the memory and CPU of Ty and Pyrefly. It offers both a Pyright-like mode and a Mypy-compatible mode that behaves just like Mypy, supporting the same config files, command-line flags, and error messages.
https://github.com/zubanls/zuban
GitHub
GitHub - zubanls/zuban: Zuban Language Server Issue Tracker
Zuban Language Server Issue Tracker. Contribute to zubanls/zuban development by creating an account on GitHub.
WhisperLiveKit
Real-time & local speech-to-text, translation, and speaker diarization. With server & web UI.
https://github.com/QuentinFuxa/WhisperLiveKit
GitHub
GitHub - QuentinFuxa/WhisperLiveKit: Real-time & local speech-to-text, translation, and speaker diarization. With server & web…
Real-time & local speech-to-text, translation, and speaker diarization. With server & web UI. - QuentinFuxa/WhisperLiveKit
yesglot
LLM-powered Django translations. Just call me "python manage.py translatemessages"
https://github.com/efe/yesglot
GitHub
GitHub - efe/yesglot: LLM-powered Django translations ✨ Just call me "python manage.py translatemessages"
LLM-powered Django translations ✨ Just call me "python manage.py translatemessages" - efe/yesglot
Polars GPU Execution. (70% speed up)
Polars' new GPU engine, powered by NVIDIA RAPIDS cuDF, speeds up data processing by up to 70% compared with CPU-based execution, enabling faster handling of large datasets. The beta release supports common operations and uses GPU parallelism for significant performance gains in data analytics workflows.
https://dataengineeringcentral.substack.com/p/polars-gpu-execution-70-speed-up
Substack
Polars GPU Execution. (70% speed up)
to the moon
vLLM with torch.compile: Efficient LLM inference on PyTorch
Learn how to optimize PyTorch code with minimal effort using torch.compile, a just-in-time compiler that generates optimized kernels automatically.
https://developers.redhat.com/articles/2025/09/03/vllm-torchcompile-efficient-llm-inference-pytorch
Red Hat Developer
vLLM with torch.compile: Efficient LLM inference on PyTorch | Red Hat Developer
Note: This post was originally published on the vLLM blog.
sync-with-uv
The sync-with-uv package automates version synchronization between uv.lock and .pre-commit-config.yaml, ensuring consistent dependency management for tools like black, ruff, and mypy. It integrates as a pre-commit hook, streamlining workflows by aligning versions from a single source while leaving unspecified tools unchanged.
https://github.com/tsvikas/sync-with-uv
GitHub
GitHub - tsvikas/sync-with-uv: Sync .pre-commit-config.yaml from uv.lock
Sync .pre-commit-config.yaml from uv.lock. Contribute to tsvikas/sync-with-uv development by creating an account on GitHub.
PydanticAI: the AI Agent Framework Winner
The video showcases how to use Pydantic AI to build Python applications with AI-powered agents that provide validated, structured outputs by integrating large language models like GPT-5. It demonstrates a healthcare triage assistant that personalizes responses using domain data, dependencies, and customizable prompts, enabling robust, real-world AI integration beyond simple chatbots.
https://www.youtube.com/watch?v=-WB0T0XmDrY
YouTube
PydanticAI: the AI Agent Framework Winner
Pydantic AI lets you integrate large language models like GPT-5 directly into your Python applications.
In this video,…
Kronos
A Foundation Model for the Language of Financial Markets.
https://github.com/shiyu-coder/Kronos
GitHub
GitHub - shiyu-coder/Kronos: Kronos: A Foundation Model for the Language of Financial Markets
Kronos: A Foundation Model for the Language of Financial Markets - shiyu-coder/Kronos
Ducky
An open-source, all-in-one desktop application for network engineers, students, and enthusiasts.
https://github.com/thecmdguy/Ducky
GitHub
GitHub - thecmdguy/Ducky: Ducky is a powerful, open-source, all-in-one desktop application built with Python and PySide6. It is…
Ducky is a powerful, open-source, all-in-one desktop application built with Python and PySide6. It is designed to be the perfect companion for network engineers, students, and tech enthusiasts, com...