PythonHub
2.43K subscribers
2.35K photos
49.3K links
News & links about Python programming.
https://pythonhub.dev/
Download Telegram
TensorRT-Model-Optimizer

A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.

https://github.com/NVIDIA/TensorRT-Model-Optimizer
Jaxformer Scaling Modern Transformers

This is a zero-to-one guide on scaling modern transformers with n-dimensional parallelism. Transformers have driven much of the deep learning revolution, yet no practical guide reflects SOTA architectures and the complexities of large-scale language modelling. While excellent resources such as DeepMind’s How to Scale Your Model and HuggingFace’s Ultra Scale Playbook exist, a gap remains ...

https://jaxformer.com/
Mini-o3

Scaling Up Reasoning Patterns and Interaction Turns for Visual Search.

https://mini-o3.github.io/
Sphinx Docs Instantly in Your Browser (MyST Markdown + reStructuredText)

Edit and preview reStructuredText or MyST Markdown instantly in a Sphinx running in a browser. Runs entirely in Python using WebAssembly, so it’s private, fast, and ideal for learning markup.

https://snippets.documatt.com
Just for fun: animating a mosaic of 90s GIFs

The post describes an experiment in animating a mosaic of vintage 90s GIFs collected from the GeoCities archive, using HTML Canvas for random, lively playback. It celebrates the playful aesthetics of early web graphics and highlights the technical and nostalgic joy of reintroducing these classic GIFs into a modern browser setting.

https://alexplescan.com/posts/2025/09/15/gifs/
Tiny LLM - LLM Serving in a Week

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

https://skyzh.github.io/tiny-llm/
Context Engineering - Short-Term Memory Management with Sessions from OpenAI Agents SDK

The guide demonstrates how to use the OpenAI Agents SDK’s Session object to manage short-term memory in AI agents, enabling context trimming and compression for efficient, coherent, and cost-effective multi-turn conversations. Effective session memory ensures agents maintain relevant history across turns while reducing noise, latency, and error risk in longer interactions.

https://cookbook.openai.com/examples/agents_sdk/session_memory
Python Tutorial: Build an AI-assisted Reddit Scraping Pipeline

The video provides an in-depth, hands-on tutorial for building a resilient, AI-assisted Reddit scraping pipeline in Python, covering everything from Jupyter prototyping and LangChain agents to a Django-based background worker architecture. It teaches viewers to automate web scraping, integrate Google’s Gemini LLM for query refinement, and store structured results in PostgreSQL, suitable ...

https://www.youtube.com/watch?v=XI-iP-qk_Vk