PythonHub
2.43K subscribers
2.35K photos
49.3K links
News & links about Python programming.
https://pythonhub.dev/
Download Telegram
Mini-o3

Scaling Up Reasoning Patterns and Interaction Turns for Visual Search.

https://mini-o3.github.io/
Sphinx Docs Instantly in Your Browser (MyST Markdown + reStructuredText)

Edit and preview reStructuredText or MyST Markdown instantly in a Sphinx running in a browser. Runs entirely in Python using WebAssembly, so it’s private, fast, and ideal for learning markup.

https://snippets.documatt.com
Just for fun: animating a mosaic of 90s GIFs

The post describes an experiment in animating a mosaic of vintage 90s GIFs collected from the GeoCities archive, using HTML Canvas for random, lively playback. It celebrates the playful aesthetics of early web graphics and highlights the technical and nostalgic joy of reintroducing these classic GIFs into a modern browser setting.

https://alexplescan.com/posts/2025/09/15/gifs/
Tiny LLM - LLM Serving in a Week

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

https://skyzh.github.io/tiny-llm/
Context Engineering - Short-Term Memory Management with Sessions from OpenAI Agents SDK

The guide demonstrates how to use the OpenAI Agents SDK’s Session object to manage short-term memory in AI agents, enabling context trimming and compression for efficient, coherent, and cost-effective multi-turn conversations. Effective session memory ensures agents maintain relevant history across turns while reducing noise, latency, and error risk in longer interactions.

https://cookbook.openai.com/examples/agents_sdk/session_memory
Python Tutorial: Build an AI-assisted Reddit Scraping Pipeline

The video provides an in-depth, hands-on tutorial for building a resilient, AI-assisted Reddit scraping pipeline in Python, covering everything from Jupyter prototyping and LangChain agents to a Django-based background worker architecture. It teaches viewers to automate web scraping, integrate Google’s Gemini LLM for query refinement, and store structured results in PostgreSQL, suitable ...

https://www.youtube.com/watch?v=XI-iP-qk_Vk
Defeating Nondeterminism in LLM Inference

LLM inference is often nondeterministic even with temperature set to zero, primarily due to batch-size-dependent kernel behaviors that change results based on server load rather than randomness or floating-point issues. The solution is to use batch-invariant kernels, ensuring reproducible outputs even in high-concurrency environments, which is now possible but may come with some efficien...

https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference
riffq

A toolkit for building PostgreSQL wire-compatible databases in Python, powered by Rust for performance and concurrency.

https://github.com/ybrs/riffq
Tricks from OpenAI gpt-oss YOU ?? can use with transformers

The post details major upgrades that allow models like OpenAI’s GPT-OSS to run, fine-tune, and scale efficiently, including zero-build kernels, 4-bit MXFP4 quantization, tensor and expert parallelism, dynamic layerwise caching, and continuous batching. These improvements cut memory usage, boost speed, and enable larger models to run on affordable hardware, making cutting-edge techniques ...

https://huggingface.co/blog/faster-transformers