PythonHub

How Well Do New Python Type Checkers Conform? A Deep Dive into Ty, Pyrefly, and Zuban

The Python type checking landscape in 2025 includes three new Rust-based tools: Astral's ty, Meta's pyrefly, and Zuban. Ty emphasizes gradual adoption with fewer false positives, pyrefly focuses on aggressive inference to catch more issues early, and Zuban aims for seamless mypy compatibility. Conformance tests reveal differences among them, but all three show promise for real-world Python development.

https://sinon.github.io/future-python-type-checkers/

Rob's Blog | Python • Rust • Ramblings?

A comparison of three new Rust-based Python type checkers through the lens of typing spec conformance: Astral's ty, Meta's pyrefly, and David Halter's zuban

PythonHub

Cloud-Native Pipelines for Scientific Data Processing with Prefect and Dask

This article explains how to build scalable, cloud-native scientific data processing pipelines using Prefect for workflow orchestration and Dask for parallel computation. It covers cloud-optimized formats (like Zarr), integration with tools like xarray and echopype, and demonstrates end-to-end ETL pipelines that load, process, and store multidimensional data directly in the cloud.

https://oceanstream.io/cloud-native-data-processing-pipelines-with-prefect-and-dask/

OceanStream

An extended tutorial on the open-source libraries that we use to build the OceanStream cloud‑native data processing stack used to ingest data from sonar instruments and other marine sensors.

PythonHub

LLM-Deflate: Extracting LLMs Into Datasets

LLM-Deflate is a technique for systematically extracting structured datasets from trained large language models by probing their internal knowledge with hierarchical topic exploration and prompt engineering. This reverse-compression process enables model analysis, knowledge transfer, training data augmentation, and debugging, potentially making knowledge extraction a standard tool as inf...

https://www.scalarlm.com/blog/llm-deflate-extracting-llms-into-datasets

ScalarLM

Large Language Models compress massive amounts of training data into their parameters. This compression is lossy but highly effective—billions of parameters can encode the essential patterns from terabytes of text. However, what’s less obvious is that this…

PythonHub

The Kaggle Grandmasters Playbook: 7 Battle-Tested Modeling Techniques for Tabular Data

The Kaggle Grandmasters Playbook presents seven proven techniques for tabular data modeling, emphasizing fast experimentation and careful validation powered by GPU acceleration to handle large-scale data effectively. Key strategies include advanced exploratory data analysis, building diverse baselines, extensive feature engineering, ensembling with hill climbing and stacking, pseudo-labe...

https://developer.nvidia.com/blog/the-kaggle-grandmasters-playbook-7-battle-tested-modeling-techniques-for-tabular-data/

NVIDIA Technical Blog

Over hundreds of Kaggle competitions, we’ve refined a playbook that consistently lands us near the top of the leaderboard—no matter if we’re working with millions of rows, missing values…

PythonHub

How to Build Advanced AI Agents – Course for Beginners (LiveKit, Exa, LangChain)

The video teaches beginners how to build advanced AI agents, such as voice sales agents, research assistants, and multi-agent workflows, using LiveKit, Exa, LangChain, and Cerebras. It provides step-by-step guidance, hands-on code, and free API credits to help developers quickly create real-world AI applications.

https://www.youtube.com/watch?v=B0TJC4lmzEM

YouTube

Learn how to build real-world AI apps in this 3-part workshop series. You'll learn to build voice agents, deep research tools, multi-agent workflows, and more.…

PythonHub

Python Singleton Pattern: Smarter Than You Think?

This video analyzes the strengths and weaknesses of the singleton pattern in Python, explaining why global state is risky but controlled instantiation can be valuable in certain cases. It recommends module-level singletons and thread safety measures, while cautioning against tight coupling and testing pitfalls with traditional singleton implementations.
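
As a rough illustration of the "controlled instantiation plus thread safety" idea in the summary above, here is a minimal sketch using only the standard library. It is not taken from the video: the `Settings` class, its `instance()` accessor, and the `collect_instances` helper are hypothetical names invented for this example, and double-checked locking is just one common way to guard lazy creation.

```python
import threading


class Settings:
    """Hypothetical shared configuration object (illustrative name only)."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self):
        self.values = {}

    @classmethod
    def instance(cls):
        # Fast path: skip the lock once the instance already exists.
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock so only one thread ever constructs it.
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance


def collect_instances(n_threads=8):
    """Call Settings.instance() from several threads and return what each one got."""
    results = []

    def worker():
        results.append(Settings.instance())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `collect_instances()` should return a list whose entries are all the same object. A module-level alternative is simply to create one `Settings()` in a module and import it wherever it is needed, since Python caches an imported module after its first import.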

https://www.youtube.com/watch?v=p_UQ7tzUFLo

YouTube

The Real Reason the Singleton Pattern Exists

Singletons are often criticized for introducing global state and making code harder to test—but there’s more to the story. In this video, we explore the real problems with…

PythonHub

LLMs from Scratch – Practical Engineering from Base Model to PPO RLHF

This video provides a hands-on guide to building a large language model entirely from scratch in PyTorch, covering every step from core transformer design to advanced alignment with RLHF. By the end, viewers gain practical experience in implementing, training, scaling, and aligning their own custom LLMs.

https://www.youtube.com/watch?v=p3sij8QzONQ

YouTube

Learn to build a complete large language model from scratch using only pure PyTorch. This course takes you through the entire lifecycle, from foundational concepts to advanced alignment techniques. By the end, you'll have the deep, hands-on experience needed…

PythonHub

Unlocking Performance in Python's Free-Threaded Future: GC Optimizations

A description of the performance optimizations made to the free-threaded garbage collector for Python 3.14.

https://labs.quansight.org/blog/free-threaded-gc-3-14

labs.quansight.org

PythonHub

Air

The new web framework that breathes fresh air into Python web development. Built with FastAPI, Starlette, and Pydantic.

https://github.com/feldroy/air

GitHub

GitHub - feldroy/air: The new Python web framework by the authors of Two Scoops of Django

PythonHub

Python Hub Weekly Digest for 2025-10-05

https://pythonhub.dev/digest/2025-10-05/

pythonhub.dev

PythonHub

onyx-dot-app / onyx

Open Source AI Platform - AI Chat with advanced features that works with every LLM

https://github.com/onyx-dot-app/onyx

GitHub

GitHub - onyx-dot-app/onyx: Open Source AI Platform - AI Chat with advanced features that works with every LLM

PythonHub

Helium

Private, fast, and honest web browser.

https://github.com/imputnet/helium

GitHub

GitHub - imputnet/helium: Private, fast, and honest web browser

PythonHub

Pyscn – Python code quality analyzer for vibe coders

https://github.com/ludo-technologies/pyscn

GitHub

GitHub - ludo-technologies/pyscn: An Intelligent Python Code Quality Analyzer

PythonHub

memvid

Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.

https://github.com/Olow304/memvid

GitHub

GitHub - Olow304/memvid: Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic…

PythonHub

DuckDB vs Polars. Wait. DuckDB and Polars.

The article emphasizes that DuckDB and Polars are not direct competitors but complementary tools in the Modern Data Stack, with each excelling in different contexts: DuckDB is best for SQL-heavy analytics and embedding as a query engine, while Polars suits end-to-end ETL pipelines and DataFrame-centric workflows. The choice depends on your problem context, team comfort, and use case rath...

https://www.confessionsofadataguy.com/duckdb-vs-polars-wait-duckdb-and-polars/

Confessions of a Data Guy

DuckDB vs Polars. Wait. DuckDB and Polars. - Confessions of a Data Guy

So, the classic newbie question. DuckDB vs Polars, which one should you pick? This is an interesting question, and actually drives a lot of search traffic to this website on which you find yourself wasting time. I thank you for that. This is probably the…

PythonHub

PyOCI – Publish and install private Python packages using OCI/Docker registries

https://github.com/AllexVeldman/pyoci

GitHub

GitHub - AllexVeldman/pyoci: Publish and install private python packages using OCI/docker registries.

PythonHub