Tech Stack at Apify
Frontend: React.js, styled-components, Storybook, Cypress
Backend: TypeScript/Node.js, Next.js, Nest.js, Docusaurus, Jest
Infra: AWS, Kubernetes, Helm, MongoDB, Redis, DynamoDB, S3, GitHub Actions
Monitoring: New Relic, LogDNA, Sentry, PagerDuty
Tools: GitHub, ZenHub, Notion, GSuite
AI Tools: Langchain, LlamaIndex, Pinecone, OpenAI API, Web agents
In XML, which of the following is a valid format?
Anonymous Quiz
25%
<element value="attribute">
52%
<element attribute="value">
9%
<element attribute=value>
14%
<element attribute: value>
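The four options can be checked mechanically. A minimal sketch using Python's standard `xml.etree.ElementTree` parser, assuming each option is completed with a closing `</element>` tag (my addition, so that it forms a whole document). A parser only checks syntax, so it accepts both quoted forms; the first option differs from the second only in how the name and the value are labelled.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The four start tags from the quiz, in the order they were listed.
candidates = [
    '<element value="attribute">',
    '<element attribute="value">',
    '<element attribute=value>',    # attribute value is not quoted
    '<element attribute: value>',   # no "=" between name and value
]

# Close each tag so the parser sees a complete document.
results = {tag: is_well_formed(tag + "</element>") for tag in candidates}

for tag, ok in results.items():
    print(f"{tag!r}: {'well-formed' if ok else 'rejected'}")
```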
What is the primary purpose of the TCP/IP protocol suite in software development?
Anonymous Quiz
76%
To provide a standardized way for devices to communicate over the internet
20%
To manage user authentication and authorization
2%
To handle database transactions
2%
To optimize website performance
GitHub repo: google-play-scraper
Google Play scraper for Python, inspired by facundoolano/google-play-scraper
https://github.com/JoMingyu/google-play-scraper
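A minimal usage sketch, assuming the package is installed with `pip install google-play-scraper` and that the details dict contains `title`, `score` and `installs` keys. The helper names `summarize` and `fetch_summary` are hypothetical, not part of the library.

```python
from typing import Any, Dict

def summarize(details: Dict[str, Any]) -> str:
    """Format a few fields of an app-details dict into one line."""
    title = details.get("title", "?")
    score = details.get("score")
    installs = details.get("installs", "?")
    rating = f"{score:.1f}" if isinstance(score, (int, float)) else "n/a"
    return f"{title} | rating {rating} | installs {installs}"

def fetch_summary(app_id: str, lang: str = "en", country: str = "us") -> str:
    """Fetch details for one app from Google Play and summarize them."""
    # Imported lazily so summarize() can be used without the package installed.
    from google_play_scraper import app
    return summarize(app(app_id, lang=lang, country=country))
```

Calling `fetch_summary("com.nianticlabs.pokemongo")` performs a live request to Google Play, so the result depends on network access and on the store's current page layout.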
What is the primary advantage of using Playwright over Selenium?
Anonymous Quiz
33%
Playwright is slower than Selenium
6%
Supports only one browser
52%
Can handle multiple browser contexts in a single test
9%
Does not support headless mode
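The most-voted option refers to browser contexts. A minimal sketch of that feature with Playwright's sync API for Python: one browser process, one isolated context (separate cookies and storage) per simulated user. The function name `open_isolated_sessions`, the user names and the URL are placeholders of mine.

```python
from typing import Dict, Iterable

def open_isolated_sessions(browser, users: Iterable[str], url: str) -> Dict[str, str]:
    """Open one browser context per user and return the page title each one sees."""
    titles: Dict[str, str] = {}
    for user in users:
        # Each context has its own cookies, cache and local storage.
        context = browser.new_context()
        try:
            page = context.new_page()
            page.goto(url)
            titles[user] = page.title()
        finally:
            context.close()
    return titles

def main() -> None:
    # Imported lazily; requires `pip install playwright` and `playwright install`.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            print(open_isolated_sessions(browser, ["alice", "bob"], "https://example.com"))
        finally:
            browser.close()
```

Run `main()` to try it against a real browser. Both sessions share a single Chromium process, which is cheaper than starting one browser per user.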
GitHub repo: Scrapegraph-ai
Python scraper based on AI
https://github.com/ScrapeGraphAI/Scrapegraph-ai
GitHub repo: Google-Maps-Scraper
Google Maps scraper with a GUI
https://github.com/Zubdata/Google-Maps-Scraper
Stream the data (JSON) and handle auto-generated authentication using browser automation
https://youtu.be/g0EgwfQJew8
YouTube
Should I have used this Web Scraping Technique?
Collecting business leads by scraping Google Maps with Python and Selenium:
Medium
Business data scraping made simple with Python and Google Maps
A deep dive into a simple Python tool for extracting detailed business information from Google Maps.
JA3 is a way to fingerprint a client's TLS configuration. The fingerprint covers the list of cipher suites the client supports, the extensions it sends, and other details of its TLS handshake.
This fingerprint can be used to recognize the browser or device making the connection, even though the connection itself is encrypted.
Fingerprinting is a common method for detecting automation bots and crawlers.
How can JA3 fingerprints be impersonated?
Medium
Impersonating JA3 Fingerprints
Researchers: Max Harley, Matthew Rinaldi
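To make the fingerprint concrete, here is a minimal sketch of how a JA3 hash is built once the ClientHello fields are known. It follows the published JA3 recipe as I understand it: five comma-separated fields (TLS version, cipher suites, extensions, elliptic curves, point formats), values written in decimal and joined with dashes, GREASE placeholder values dropped, then MD5. Parsing the ClientHello itself is out of scope, and the function names are my own.

```python
import hashlib
from typing import Iterable, Sequence

def _is_grease(value: int) -> bool:
    """GREASE placeholders look like 0x0A0A, 0x1A1A, ... 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)

def _join(values: Iterable[int]) -> str:
    """Dash-join decimal values, skipping GREASE placeholders."""
    return "-".join(str(v) for v in values if not _is_grease(v))

def ja3_string(version: int, ciphers: Sequence[int], extensions: Sequence[int],
               curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """Build the raw JA3 string from already-parsed ClientHello fields."""
    return ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        "-".join(str(v) for v in point_formats),
    ])

def ja3_hash(version: int, ciphers: Sequence[int], extensions: Sequence[int],
             curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """MD5 hex digest of the JA3 string, i.e. the fingerprint itself."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()
```

Because the hash depends on the order and content of these lists, an HTTP client that offers cipher suites in a different order than a real browser gets a different fingerprint, regardless of its User-Agent header. Impersonation means reproducing the browser's lists exactly.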
Camoufox is also fully compatible with the Playwright API, so the code will be similar to any Playwright code that you already have, with only a change in the way the browser is initialized.
An article from ScrapingBee:
Scrapingbee
How to Scrape With Camoufox to Bypass Antibot Technology | ScrapingBee
Discover how Camoufox, a cutting-edge web scraping technology, bypasses antibot measure to scrape any website. Learn how to scrape undetected today!
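A minimal sketch of the "only the initialization changes" point, assuming the `camoufox` and `playwright` Python packages and their sync APIs. The page logic in `scrape_title` is ordinary Playwright code; the two launcher functions differ only in how the browser object is created. All three function names are hypothetical.

```python
def scrape_title(browser, url: str) -> str:
    """Plain Playwright-style page logic; it does not care who launched the browser."""
    page = browser.new_page()
    try:
        page.goto(url)
        return page.title()
    finally:
        page.close()

def title_with_playwright(url: str) -> str:
    # Stock Playwright: launch Firefox through the Playwright driver.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(headless=True)
        try:
            return scrape_title(browser, url)
        finally:
            browser.close()

def title_with_camoufox(url: str) -> str:
    # Camoufox: same page logic, different way of obtaining the browser.
    from camoufox.sync_api import Camoufox

    with Camoufox(headless=True) as browser:
        return scrape_title(browser, url)
```

Keeping the page logic in a function that takes the browser as an argument makes it easy to switch launchers without touching the scraping code.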
Cloudflare wants bot crawlers to be charged
https://blog.cloudflare.com/introducing-pay-per-crawl/
The Cloudflare Blog
Introducing pay per crawl: Enabling content owners to charge AI crawlers for access
Pay per crawl is a new feature to allow content creators to charge AI crawlers for access to their content.
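A rough sketch of what this could look like from the crawler's side. It assumes, from my reading of the linked announcement, that an unpaid request gets an HTTP 402 Payment Required response carrying a `crawler-price` header in the form `USD 0.01`; check the header name and format against the Cloudflare post before relying on them. The function names, return values and budget logic are hypothetical.

```python
from decimal import Decimal, InvalidOperation
from typing import Mapping, Optional

def parse_price(header_value: str) -> Optional[Decimal]:
    """Parse a value like 'USD 0.01'; return None if it does not match."""
    parts = header_value.strip().split()
    if len(parts) != 2 or parts[0].upper() != "USD":
        return None
    try:
        return Decimal(parts[1])
    except InvalidOperation:
        return None

def decide(status: int, headers: Mapping[str, str], budget_usd: Decimal) -> str:
    """Decide what a crawler does next with a response."""
    if status != 402:
        return "no-payment-required"
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    price = parse_price(lowered.get("crawler-price", ""))  # assumed header name
    if price is None:
        return "skip"
    return "retry-with-payment" if price <= budget_usd else "skip"
```

For example, `decide(402, {"crawler-price": "USD 0.01"}, Decimal("0.05"))` returns `"retry-with-payment"`. How the retry then signals payment intent is described in the announcement and is not modelled here.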
Cloudflare's investigation of Perplexity's crawler bot
The Cloudflare Blog
Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives
Perplexity is repeatedly modifying their user agent and changing IPs and ASNs to hide their crawling activity, in direct conflict with explicit no-crawl preferences expressed by websites.
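For context, a no-crawl directive is usually a robots.txt rule keyed on the crawler's declared user agent. A minimal sketch with Python's standard `urllib.robotparser`, using an illustrative robots.txt of my own that blocks a user agent named `PerplexityBot` (assumed here to be the crawler's declared name) and allows everyone else:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block one declared crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

declared = is_allowed(ROBOTS_TXT, "PerplexityBot", "https://example.com/page")
generic = is_allowed(ROBOTS_TXT, "Mozilla/5.0 (Macintosh) Chrome/124.0", "https://example.com/page")
print(f"declared crawler allowed: {declared}")
print(f"generic browser user agent allowed: {generic}")
```

The rule only matches the name a client sends. The same request with a generic browser user agent falls under the `*` rule, which is why an undeclared crawler bypasses the directive and why robots.txt depends on crawlers identifying themselves honestly.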
Perplexity's response to Cloudflare's claims: "automated crawling and user-driven fetching are different!"
https://x.com/perplexity_ai/status/1952531537385456019