Tiny LLM - LLM Serving in a Week
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
https://skyzh.github.io/tiny-llm/
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
https://skyzh.github.io/tiny-llm/
ApeRAG
Production-ready GraphRAG with multi-modal indexing, AI agents, MCP support, and scalable K8s deployment
https://github.com/apecloud/ApeRAG
Production-ready GraphRAG with multi-modal indexing, AI agents, MCP support, and scalable K8s deployment
https://github.com/apecloud/ApeRAG
GitHub
GitHub - apecloud/ApeRAG: ApeRAG: Production-ready GraphRAG with multi-modal indexing, AI agents, MCP support, and scalable K8s…
ApeRAG: Production-ready GraphRAG with multi-modal indexing, AI agents, MCP support, and scalable K8s deployment - apecloud/ApeRAG
Nallely – A Python signals/MIDI processing system inspired by Smalltalk
https://dr-schlange.github.io/nallely-midi/
https://dr-schlange.github.io/nallely-midi/
Nallely MIDI
Nallely MIDI · Nallely MIDI
Nallely is an experimental organic system for advanced MIDI patching, live coding, generative music, and multimodal art, built for hacker/musicians, developed in Python, inspired by Smalltalk and Systems as Living Things
Context Engineering - Short-Term Memory Management with Sessions from OpenAI Agents SDK
The guide demonstrates how to use the OpenAI Agents SDK’s Session object to manage short-term memory in AI agents, enabling context trimming and compression for efficient, coherent, and cost-effective multi-turn conversations. Effective session memory ensures agents maintain relevant history across turns while reducing noise, latency, and error risk in longer interactions.
https://cookbook.openai.com/examples/agents_sdk/session_memory
The guide demonstrates how to use the OpenAI Agents SDK’s Session object to manage short-term memory in AI agents, enabling context trimming and compression for efficient, coherent, and cost-effective multi-turn conversations. Effective session memory ensures agents maintain relevant history across turns while reducing noise, latency, and error risk in longer interactions.
https://cookbook.openai.com/examples/agents_sdk/session_memory
Openai
Context Engineering - Short-Term Memory Management with Sessions from OpenAI Agents SDK | OpenAI Cookbook
AI agents often operate in long-running, multi-turn interactions, where keeping the right balance of context is critical. If too much is...
Semlib
Build data processing and data analysis pipelines that leverage the power of LLMs.
https://github.com/anishathalye/semlib
Build data processing and data analysis pipelines that leverage the power of LLMs.
https://github.com/anishathalye/semlib
GitHub
GitHub - anishathalye/semlib: Build data processing and data analysis pipelines that leverage the power of LLMs 🧠
Build data processing and data analysis pipelines that leverage the power of LLMs 🧠 - anishathalye/semlib
Python Tutorial: Build an AI-assisted Reddit Scraping Pipeline
The video provides an in-depth, hands-on tutorial for building a resilient, AI-assisted Reddit scraping pipeline in Python, covering everything from Jupyter prototyping and LangChain agents to a Django-based background worker architecture. It teaches viewers to automate web scraping, integrate Google’s Gemini LLM for query refinement, and store structured results in PostgreSQL, suitable ...
https://www.youtube.com/watch?v=XI-iP-qk_Vk
The video provides an in-depth, hands-on tutorial for building a resilient, AI-assisted Reddit scraping pipeline in Python, covering everything from Jupyter prototyping and LangChain agents to a Django-based background worker architecture. It teaches viewers to automate web scraping, integrate Google’s Gemini LLM for query refinement, and store structured results in PostgreSQL, suitable ...
https://www.youtube.com/watch?v=XI-iP-qk_Vk
YouTube
Python Tutorial: Build an AI-assisted Reddit Scraping Pipeline
🚀 Sign up for Bright Data right now: https://brdta.com/cfe
Automatically find and track topics you care about across Reddit posts. From camping to the latest in AI news, this course will show you how to build a powerful and resilient system in Python.
…
Automatically find and track topics you care about across Reddit posts. From camping to the latest in AI news, this course will show you how to build a powerful and resilient system in Python.
…
Defeating Nondeterminism in LLM Inference
LLM inference is often nondeterministic even with temperature set to zero, primarily due to batch-size-dependent kernel behaviors that change results based on server load rather than randomness or floating-point issues. The solution is to use batch-invariant kernels, ensuring reproducible outputs even in high-concurrency environments, which is now possible but may come with some efficien...
https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference
LLM inference is often nondeterministic even with temperature set to zero, primarily due to batch-size-dependent kernel behaviors that change results based on server load rather than randomness or floating-point issues. The solution is to use batch-invariant kernels, ensuring reproducible outputs even in high-concurrency environments, which is now possible but may come with some efficien...
https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference
Thinking Machines Lab
Defeating Nondeterminism in LLM Inference
Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language models.
For example, you might observe that asking ChatGPT the same question multiple times provides different results.…
For example, you might observe that asking ChatGPT the same question multiple times provides different results.…
ROMA
A meta-agent framework to build high-performance multi-agent systems.
https://github.com/sentient-agi/ROMA
A meta-agent framework to build high-performance multi-agent systems.
https://github.com/sentient-agi/ROMA
GitHub
GitHub - sentient-agi/ROMA: Recursive-Open-Meta-Agent v0.1 (Beta). A meta-agent framework to build high-performance multi-agent…
Recursive-Open-Meta-Agent v0.1 (Beta). A meta-agent framework to build high-performance multi-agent systems. - sentient-agi/ROMA
Alibaba-NLP / DeepResearch
Tongyi Deep Research, the Leading Open-source Deep Research Agent
https://github.com/Alibaba-NLP/DeepResearch
Tongyi Deep Research, the Leading Open-source Deep Research Agent
https://github.com/Alibaba-NLP/DeepResearch
GitHub
GitHub - Alibaba-NLP/DeepResearch: Tongyi Deep Research, the Leading Open-source Deep Research Agent
Tongyi Deep Research, the Leading Open-source Deep Research Agent - Alibaba-NLP/DeepResearch
PyCon Australia 2025
PyCon Australia 2025 talks videos are available now.
https://www.youtube.com/playlist?list=PLs4CJRBY5F1LRkAAUwbqHNGPBlxDkrz-3
PyCon Australia 2025 talks videos are available now.
https://www.youtube.com/playlist?list=PLs4CJRBY5F1LRkAAUwbqHNGPBlxDkrz-3
YouTube
PyCon Australia 2025
Share your videos with friends, family, and the world
👍1
riffq
A toolkit for building PostgreSQL wire-compatible databases in Python, powered by Rust for performance and concurrency.
https://github.com/ybrs/riffq
A toolkit for building PostgreSQL wire-compatible databases in Python, powered by Rust for performance and concurrency.
https://github.com/ybrs/riffq
GitHub
GitHub - ybrs/riffq
Contribute to ybrs/riffq development by creating an account on GitHub.
Tricks from OpenAI gpt-oss YOU ?? can use with transformers
The post details major upgrades that allow models like OpenAI’s GPT-OSS to run, fine-tune, and scale efficiently, including zero-build kernels, 4-bit MXFP4 quantization, tensor and expert parallelism, dynamic layerwise caching, and continuous batching. These improvements cut memory usage, boost speed, and enable larger models to run on affordable hardware, making cutting-edge techniques ...
https://huggingface.co/blog/faster-transformers
The post details major upgrades that allow models like OpenAI’s GPT-OSS to run, fine-tune, and scale efficiently, including zero-build kernels, 4-bit MXFP4 quantization, tensor and expert parallelism, dynamic layerwise caching, and continuous batching. These improvements cut memory usage, boost speed, and enable larger models to run on affordable hardware, making cutting-edge techniques ...
https://huggingface.co/blog/faster-transformers
huggingface.co
Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
VeritasGraph
Enterprise-Grade Graph RAG for Secure, On-Premise AI with Verifiable Attribution.
https://github.com/bibinprathap/VeritasGraph
Enterprise-Grade Graph RAG for Secure, On-Premise AI with Verifiable Attribution.
https://github.com/bibinprathap/VeritasGraph
GitHub
GitHub - bibinprathap/VeritasGraph: VeritasGraph: Enterprise-Grade Graph RAG for Secure, On-Premise AI with Verifiable Attribution
VeritasGraph: Enterprise-Grade Graph RAG for Secure, On-Premise AI with Verifiable Attribution - bibinprathap/VeritasGraph
Avoid Messy Code: Design Patterns for AI Agents in Python
The video demonstrates how to keep Python code for AI agents clean and maintainable by applying design patterns like Chain of Responsibility (for modular pipelines), Observer (for agent logging and context), and Strategy (for pluggable agent personalities). These patterns help decompose logic, improve scalability, and ensure testability for complex AI workflows.
https://www.youtube.com/watch?v=8_liatgLkLc
The video demonstrates how to keep Python code for AI agents clean and maintainable by applying design patterns like Chain of Responsibility (for modular pipelines), Observer (for agent logging and context), and Strategy (for pluggable agent personalities). These patterns help decompose logic, improve scalability, and ensure testability for complex AI workflows.
https://www.youtube.com/watch?v=8_liatgLkLc
YouTube
Avoid Messy Code: Design Patterns for AI Agents in Python
Check out https://www.squarespace.com/arjancodes to save 10% off your first purchase of a website or domain using code ARJANCODES.
If you’re building AI agents in Python and your code is starting to get messy, this video shows you how to use proven design…
If you’re building AI agents in Python and your code is starting to get messy, this video shows you how to use proven design…
Hyperparameter Tuning Tips that 99% of Data Scientists Overlook
This video shows how to tune XGBoost models with Optuna while maximizing speed using XGBoost 3.0’s GPU acceleration for 5–15x faster training. He explains why cross-validation is crucial, recommends smart tuning practices, and demonstrates how Optuna’s visualizations help identify impactful hyperparameters in real-world tabular data workflows.
https://www.youtube.com/watch?v=D9xPjkOwpNk
This video shows how to tune XGBoost models with Optuna while maximizing speed using XGBoost 3.0’s GPU acceleration for 5–15x faster training. He explains why cross-validation is crucial, recommends smart tuning practices, and demonstrates how Optuna’s visualizations help identify impactful hyperparameters in real-world tabular data workflows.
https://www.youtube.com/watch?v=D9xPjkOwpNk
YouTube
Hyperparameter Tuning Tips that 99% of Data Scientists Overlook
In this video you will learn about hyperparameter tuning for XGBoost models using optuna. We also will leverage XGBoost 3.0's GPU support for 5-15x speedup (no code changes required) while training the models.
Kaggle notebook here: https://bit.ly/RobMul…
Kaggle notebook here: https://bit.ly/RobMul…
MathFlow
Likerequestsfor mathematical computing, making complex math feel simple.
https://github.com/cybergeek1943/MathFlow
Likerequestsfor mathematical computing, making complex math feel simple.
https://github.com/cybergeek1943/MathFlow
GitHub
GitHub - cybergeek1943/MathFlow: Like `requests` for mathematical computing, making complex math feel simple.
Like `requests` for mathematical computing, making complex math feel simple. - cybergeek1943/MathFlow
Today I learned that Python doesn't care about how many spaces you indent as long as it's consistent
https://www.reddit.com/r/Python/comments/1nkidxq/today_i_learned_that_python_doesnt_care_about_how/
https://www.reddit.com/r/Python/comments/1nkidxq/today_i_learned_that_python_doesnt_care_about_how/
Reddit
From the Python community on Reddit
Explore this post and more from the Python community
numethods
A lightweight, from-scratch, object-oriented Python package implementing classic numerical methods.
https://github.com/denizd1/numethods
A lightweight, from-scratch, object-oriented Python package implementing classic numerical methods.
https://github.com/denizd1/numethods
GitHub
GitHub - denizd1/numethods: A lightweight, from-scratch, object-oriented Python package implementing classic numerical methods.
A lightweight, from-scratch, object-oriented Python package implementing classic numerical methods. - denizd1/numethods