PythonHub
News & links about Python programming.
https://pythonhub.dev/
Crawling Pages with Infinite Scroll using Scrapy and Playwright

This post provides a detailed guide on how to scrape infinite scroll websites using Scrapy and Playwright in Python. It covers the setup process, explains how to implement a custom downloader middleware to handle JavaScript rendering, and demonstrates how to extract data from dynamically loaded content, offering a practical solution for web scraping challenges posed by modern web applica...

https://www.xiegerts.com/post/infinite-scroll-scrapy-playwright/
FlexAttention: The Flexibility of PyTorch with the Performance of FlashAttention

https://pytorch.org/blog/flexattention/
Some more batteries to do stuff with Mapping related data structures

This library provides utility functions for manipulating and transforming Mapping-like data structures, as well as structures that contain them.

https://github.com/erivlis/mappingtools
CSVs Are Kinda Bad. DSVs Are Kinda Good.

The article argues that CSVs (Comma-Separated Values) are problematic due to various edge cases involving delimiters, quotes, and newlines, and proposes using Delimiter-Separated Values (DSV) with ASCII control characters as a more robust alternative. It demonstrates how DSVs can handle complex data without escaping or quoting issues, but acknowledges that the lack of widespread tool sup...

https://matthodges.com/posts/2024-08-12-csv-bad-dsv-good/
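To make the DSV idea concrete, here is a minimal sketch that is not taken from the linked article. It assumes the ASCII unit separator (0x1F) between fields and the record separator (0x1E) between records; the helper names `dumps` and `loads` are hypothetical.

```python
# ASCII control characters reserved for delimiting data.
UNIT_SEP = "\x1f"    # separates fields within a record
RECORD_SEP = "\x1e"  # separates records


def dumps(rows):
    """Serialize a list of rows (lists of strings) into one DSV string."""
    for row in rows:
        for field in row:
            # No escaping scheme exists here, so the delimiters themselves
            # must never appear inside a field.
            if UNIT_SEP in field or RECORD_SEP in field:
                raise ValueError("field contains a delimiter character")
    return RECORD_SEP.join(UNIT_SEP.join(row) for row in rows)


def loads(text):
    """Parse a DSV string back into a list of rows."""
    if not text:
        # An empty string is treated as "no records".
        return []
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)]


rows = [
    ["name", "note"],
    ["Ada", 'said "hi", then left\nfor lunch'],  # commas, quotes, newline
]
assert loads(dumps(rows)) == rows
```

Commas, quotes, and newlines pass through untouched because none of them is a delimiter. The trade-off the summary mentions still applies: most editors and spreadsheet tools will not display or produce these control characters.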
From Boring Object-Oriented to INSANE Functional Code

This video demonstrates that there's a place for both object-oriented and functional code. In Python, these two approaches can be combined effectively, allowing you to leverage the strengths of each for the best results.

https://www.youtube.com/watch?v=DvdZv_DD0DY
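As a rough illustration of mixing the two styles (not code from the video), the sketch below keeps data in an immutable dataclass and expresses behaviour as small composable functions. The `Order` class and the `apply_discount`, `apply_tax`, and `pipeline` helpers are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable


@dataclass(frozen=True)
class Order:
    """Object-oriented part: a small immutable value object."""
    subtotal: float
    discount: float = 0.0
    tax: float = 0.0

    @property
    def total(self) -> float:
        return self.subtotal - self.discount + self.tax


# Functional part: each step is a pure function from Order to Order.
Step = Callable[[Order], Order]


def apply_discount(rate: float) -> Step:
    # Returns a new Order; the original is never mutated.
    return lambda order: replace(order, discount=order.subtotal * rate)


def apply_tax(rate: float) -> Step:
    # Tax is computed on the discounted amount.
    return lambda order: replace(
        order, tax=(order.subtotal - order.discount) * rate
    )


def pipeline(*steps: Step) -> Step:
    # Compose the steps left to right into a single function.
    return lambda order: reduce(lambda acc, step: step(acc), steps, order)


checkout = pipeline(apply_discount(0.10), apply_tax(0.20))
result = checkout(Order(subtotal=100.0))
```

The class holds the data and its derived `total`, while the order of business rules lives in the `pipeline` call, so steps can be reordered or swapped without touching `Order`.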
Cloudflare R2 x Django. Static Files. User uploads, css, images, js and more. Production-ready.

The video covers setting up and managing Django files, including static and user-uploaded files, using Cloudflare's R2 object storage. It emphasizes best practices for configuring environment variables, securing API keys, and managing static and media files in Django with advanced validation and customization options.

https://www.youtube.com/watch?v=VU3MAN1gs1s
Gemma for Streaming ML with Dataflow

The article demonstrates how to integrate Google's Gemma 2 language model into a Dataflow pipeline for real-time sentiment analysis and response generation in customer support chats. It provides a practical example of using Gemma to process streaming data, including code snippets for creating prompts, running inference, and handling model outputs within a scalable data processing framework.

https://developers.googleblog.com/en/gemma-for-streaming-ml-with-dataflow/