Data Science Jupyter Notebooks
Explore the world of Data Science through Jupyter Notebooks—insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "myDynamicElement"))
)

• Get the page source after JavaScript has executed.
dynamic_html = driver.page_source

• Close the browser window.
driver.quit()


VII. Common Tasks & Best Practices

• Handle pagination by finding the "Next" link.
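next_link = soup.find('a', string='Next')  # 'string' replaces the deprecated 'text' argument
next_page_url = next_link['href'] if next_link else None  # None on the last page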

• Save data to a CSV file.
import csv
with open('data.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['Title', 'Link'])
    for title, url in rows:  # rows: (title, url) pairs collected earlier
        writer.writerow([title, url])

• Save data to CSV using pandas.
import pandas as pd
df = pd.DataFrame(data, columns=['Title', 'Link'])
df.to_csv('data.csv', index=False)

• Use a proxy with requests.
proxies = {'http': 'http://10.10.1.10:3128', 'https': 'http://10.10.1.10:1080'}
requests.get('http://example.com', proxies=proxies)

• Pause between requests to be polite.
import time
time.sleep(2) # Pause for 2 seconds

• Handle JSON data from an API.
json_response = requests.get('https://api.example.com/data').json()

• Download a file (like an image).
img_url = 'http://example.com/image.jpg'
img_data = requests.get(img_url).content
with open('image.jpg', 'wb') as handler:
    handler.write(img_data)

• Parse a sitemap.xml to find all URLs.
# Get the sitemap.xml file and parse it like any other XML/HTML to extract <loc> tags.
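A minimal sketch of the sitemap step, assuming the file follows the standard sitemaps.org schema. It uses the standard library's xml.etree.ElementTree rather than BeautifulSoup; the function name extract_sitemap_urls and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol; <loc> lives inside it.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(xml_text):
    """Return the text of every <loc> element found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample. In practice the XML would come from something like
# requests.get('https://example.com/sitemap.xml').content
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

urls = extract_sitemap_urls(sample)
print(urls)
```

The same function also works on a sitemap index file, since those list child sitemaps in <loc> tags too; each returned URL would then need to be fetched and parsed in turn.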


VIII. Advanced Frameworks (Scrapy)

• Create a Scrapy spider (conceptual command).
scrapy genspider example example.com

• Define a parse method to process the response.
# In your spider class:
def parse(self, response):
    # parsing logic here
    pass

• Extract data using Scrapy's CSS selectors.
titles = response.css('h1::text').getall()

• Extract data using Scrapy's XPath selectors.
links = response.xpath('//a/@href').getall()

• Yield a dictionary of scraped data.
yield {'title': response.css('title::text').get()}

• Follow a link to parse the next page.
next_page = response.css('li.next a::attr(href)').get()
if next_page is not None:
    yield response.follow(next_page, callback=self.parse)

• Run a spider from the command line.
scrapy crawl example -o output.json

• Pass arguments to a spider.
scrapy crawl example -a category=books

• Create a Scrapy Item for structured data.
import scrapy
class ProductItem(scrapy.Item):
    name = scrapy.Field()
    price = scrapy.Field()

• Use an Item Loader to populate Items.
from scrapy.loader import ItemLoader
loader = ItemLoader(item=ProductItem(), response=response)
loader.add_css('name', 'h1.product-name::text')
item = loader.load_item()  # build the populated ProductItem


#Python #WebScraping #BeautifulSoup #Selenium #Requests

━━━━━━━━━━━━━━━
By: @DataScienceN
🔥 Trending Repository: localstack

📝 Description: 💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline

🔗 Repository URL: https://github.com/localstack/localstack

🌐 Website: https://localstack.cloud

📖 Readme: https://github.com/localstack/localstack#readme

📊 Statistics:
🌟 Stars: 61.1K stars
👀 Watchers: 514
🍴 Forks: 4.3K forks

💻 Programming Languages: Python - Shell - Makefile - ANTLR - JavaScript - Java

🏷️ Related Topics:
#python #testing #aws #cloud #continuous_integration #developer_tools #localstack


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: TrendRadar

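📝 Description: 🎯 Say goodbye to information overload: AI helps you make sense of trending news, with simple public-opinion monitoring and analysis. Multi-platform hot-topic aggregation plus an MCP-based AI analysis tool. Monitors 35 platforms (Douyin, Zhihu, Bilibili, Wallstreetcn, Cailian Press, and others), with smart filtering, automatic push, and AI conversational analysis (dig into the news in natural language with 13 tools such as trend tracking, sentiment analysis, and similarity search). Supports push via WeCom, Feishu, DingTalk, Telegram, email, and ntfy; 30-second web deployment, 1-minute mobile notifications, no programming required. Supports Docker deployment. Let algorithms work for you and use AI to understand what is trending.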

🔗 Repository URL: https://github.com/sansan0/TrendRadar

🌐 Website: https://github.com/sansan0

📖 Readme: https://github.com/sansan0/TrendRadar#readme

📊 Statistics:
🌟 Stars: 6K stars
👀 Watchers: 21
🍴 Forks: 4.5K forks

💻 Programming Languages: Python - HTML - Batchfile - Shell - Dockerfile

🏷️ Related Topics:
#python #docker #mail #news #telegram_bot #mcp #data_analysis #trending_topics #wechat_robot #dingtalk_robot #ntfy #hot_news #feishu_robot #mcp_server


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: LEANN

📝 Description: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

🔗 Repository URL: https://github.com/yichuan-w/LEANN

📖 Readme: https://github.com/yichuan-w/LEANN#readme

📊 Statistics:
🌟 Stars: 3.9K stars
👀 Watchers: 34
🍴 Forks: 403 forks

💻 Programming Languages: Python

🏷️ Related Topics:
#python #privacy #ai #offline_first #localstorage #vectors #faiss #rag #vector_search #vector_database #llm #langchain #llama_index #retrieval_augmented_generation #ollama #gpt_oss


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: PythonRobotics

📝 Description: Python sample codes and textbook for robotics algorithms.

🔗 Repository URL: https://github.com/AtsushiSakai/PythonRobotics

🌐 Website: https://atsushisakai.github.io/PythonRobotics/

📖 Readme: https://github.com/AtsushiSakai/PythonRobotics#readme

📊 Statistics:
🌟 Stars: 26.3K stars
👀 Watchers: 509
🍴 Forks: 7K forks

💻 Programming Languages: Python

🏷️ Related Topics:
#python #algorithm #control #robot #localization #robotics #mapping #animation #path_planning #slam #autonomous_driving #autonomous_vehicles #ekf #hacktoberfest #cvxpy #autonomous_navigation


==================================
🧠 By: https://t.me/DataScienceM
Error Handling: Always wrap dispatch logic in try-except blocks to gracefully handle network issues, authentication failures, or incorrect receiver addresses.
Security: Never hardcode credentials directly in scripts. Use environment variables (os.environ.get()) or a secure configuration management system. Ensure starttls() is called for encrypted communication.
Rate Limits: SMTP servers impose limits on the number of messages one can send per hour or day. Implement pauses (time.sleep()) between dispatches to respect these limits and avoid being flagged as a spammer.
Opt-Outs: For promotional dispatches, ensure compliance with regulations (like GDPR, CAN-SPAM) by including clear unsubscribe options.
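The four practices above can be sketched together with the standard library. This is an illustration under stated assumptions, not the original post's code: the helper names build_message and send_all, the environment variable names SMTP_USER and SMTP_PASSWORD, port 587, and the example addresses are all hypothetical.

```python
import os
import smtplib
import time
from email.message import EmailMessage


def build_message(sender, recipient, subject, body, unsubscribe_url=None):
    """Build one email; optionally add an opt-out link (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    if unsubscribe_url:
        # Opt-outs: standard List-Unsubscribe header plus a visible link.
        msg["List-Unsubscribe"] = f"<{unsubscribe_url}>"
        body += f"\n\nUnsubscribe: {unsubscribe_url}"
    msg.set_content(body)
    return msg


def send_all(messages, host, port=587, pause_seconds=2.0):
    """Send messages over one connection; return recipients that were refused."""
    # Security: credentials come from the environment (assumed variable names).
    user = os.environ.get("SMTP_USER")
    password = os.environ.get("SMTP_PASSWORD")
    if not user or not password:
        raise RuntimeError("Set SMTP_USER and SMTP_PASSWORD first")

    failed = []
    try:
        with smtplib.SMTP(host, port, timeout=30) as server:
            server.starttls()  # encrypt the session before logging in
            server.login(user, password)
            for msg in messages:
                try:
                    server.send_message(msg)
                except smtplib.SMTPRecipientsRefused:
                    failed.append(msg["To"])  # bad receiver address
                time.sleep(pause_seconds)  # rate limits: pause between sends
    except (smtplib.SMTPException, OSError) as exc:
        # Authentication failures and network problems end up here;
        # messages not yet sent are simply skipped.
        print(f"Sending aborted: {exc}")
    return failed


# Building a message touches no network, so it is safe to run as-is.
example = build_message(
    "news@example.com",
    "reader@example.com",
    "Monthly update",
    "Hello from the newsletter.",
    unsubscribe_url="https://example.com/unsubscribe",
)
```

Calling send_all([example], "smtp.example.com") would attempt a real connection, so point it at your own server. The pause length is a guess; check your provider's documented sending limits.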

Concluding Thoughts

Automating email dispatch lets you scale communication with little manual effort. Using Python's built-in capabilities, anyone can build a flexible system for sending anything from routine updates to large promotional campaigns.

#python #automation #email #smtplib #emailautomation #programming #scripting #communication #developer #efficiency

━━━━━━━━━━━━━━━
By: @DataScienceN
🔥 Trending Repository: Memori

📝 Description: Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems

🔗 Repository URL: https://github.com/GibsonAI/Memori

🌐 Website: https://memorilabs.ai

📖 Readme: https://github.com/GibsonAI/Memori#readme

📊 Statistics:
🌟 Stars: 2.3K stars
👀 Watchers: 18
🍴 Forks: 216 forks

💻 Programming Languages: Python - PLpgSQL

🏷️ Related Topics:
#python #agent #awesome #state_management #ai #memory #memory_management #hacktoberfest #long_short_term_memory #rag #llm #memori_ai #hacktoberfest2025 #chatgpt #aiagent


==================================
🧠 By: https://t.me/DataScienceM
🚀 AutoPilot — a free automation suite that replaces a dozen services at once

If you love tools that save time, money, and nerves — this is 100% for you.

This is an open-source panel on Python + Streamlit, packed with a whole arsenal of useful automations.

You open it — and it’s like gaining a superpower: doing everything faster.

What it can do:
🖼 Background Remover — removes photo backgrounds in a second.
🧾 QR Generator — creates QR codes for anything.
💻 Fake Data Generator — generates realistic test data.
🎧 Audiobook Converter — turns PDFs into audiobooks.
📥 YouTube Downloader — downloads video and audio.
💬 Bulk Email Sender — mass email sending.
📸 Image Downloader — searches and downloads images by keywords.
📝 Article Summarizer — creates well-written concise summaries.
📊 Resource Monitor — monitors your system resources.
🔍 Code Analyzer — checks code with Pylint and Flake8.
🧹 Clipboard Manager — stores clipboard history.
🔗 Link Checker — checks which links are alive.
📷 Image Editor — a mini-Photoshop: crop, blur, resize, watermark, formatting, and lots of effects.
🗞 News Reader — reads out current news.

And that’s just part of the list.

Why do you need this?
🟢 a ready set of utilities for developers, marketers, designers, or SMM;
🟢 huge time savings;
🟢 local, free, and without limits;
🟢 can be integrated into your projects, bots, or workflow.

⚡️ How to run (quickly)

git clone https://github.com/Ai-Quill/automated.git
cd automated
pip install -r requirements.txt
streamlit run app.py


🖥Open in your browser: http://localhost:8501

And enjoy the panel where all tools are just one click away.

♎️ GitHub/Instructions: https://github.com/Ai-Quill/automated

#python #soft #github

https://t.me/DataScienceN 🌟
If you love automating everything, this is for you

AutoPilot is an open-source panel built with #Python + #Streamlit, packed with a whole arsenal of useful automations.

Open it, and you have tools at your fingertips like background removal for photos, QR code generation, YouTube downloading, fake data creation, audiobooks, email sending, code analysis, image editing, and even a news reader.

One window instead of a dozen services. 🙂
https://github.com/Ai-Quill/automated


👉 https://t.me/DataScienceN
🔥 Trending Repository: Memori

📝 Description: Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems

🔗 Repository URL: https://github.com/MemoriLabs/Memori

🌐 Website: https://memorilabs.ai

📖 Readme: https://github.com/MemoriLabs/Memori#readme

📊 Statistics:
🌟 Stars: 8.8K stars
👀 Watchers: 46
🍴 Forks: 629 forks

💻 Programming Languages: Python - PLpgSQL

🏷️ Related Topics:
#python #agent #awesome #state_management #ai #memory #memory_management #hacktoberfest #long_short_term_memory #rag #llm #memori_ai #hacktoberfest2025 #chatgpt #aiagent


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: ML-For-Beginners

📝 Description: 12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

🔗 Repository URL: https://github.com/microsoft/ML-For-Beginners

📖 Readme: https://github.com/microsoft/ML-For-Beginners#readme

📊 Statistics:
🌟 Stars: 79.6K stars
👀 Watchers: 1.1k
🍴 Forks: 18.5K forks

💻 Programming Languages: Jupyter Notebook - HTML - Python - Vue - JavaScript - Dockerfile

🏷️ Related Topics:
#python #education #data_science #machine_learning #r #scikit_learn #machine_learning_algorithms #ml #machinelearning #machinelearning_python #scikit_learn_python #microsoft_for_beginners


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: Resume-Matcher

📝 Description: Improve your resumes with Resume Matcher. Get insights, keyword suggestions and tune your resumes to job descriptions.

🔗 Repository URL: https://github.com/srbhr/Resume-Matcher

🌐 Website: https://resumematcher.fyi/

📖 Readme: https://github.com/srbhr/Resume-Matcher#readme

📊 Statistics:
🌟 Stars: 24.1K stars
👀 Watchers: 85
🍴 Forks: 4.5K forks

💻 Programming Languages: Python - TypeScript - PowerShell - Shell - CSS - JavaScript - Makefile

🏷️ Related Topics:
#python #resume #machine_learning #natural_language_processing #typescript #nextjs #text_similarity #word_embeddings #ats #resume_parser #hacktoberfest #resume_builder #applicant_tracking_system #vector_search


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: PentestGPT

📝 Description: A GPT-empowered penetration testing tool

🔗 Repository URL: https://github.com/GreyDGL/PentestGPT

📖 Readme: https://github.com/GreyDGL/PentestGPT#readme

📊 Statistics:
🌟 Stars: 9.4K stars
👀 Watchers: 163
🍴 Forks: 1.4K forks

💻 Programming Languages: Python - HTML - Shell - Makefile - Dockerfile

🏷️ Related Topics:
#python #penetration_testing #large_language_models #llm


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: cocoindex

📝 Description: Data transformation framework for AI. Ultra performant, with incremental processing. 🌟 Star if you like it!

🔗 Repository URL: https://github.com/cocoindex-io/cocoindex

🌐 Website: https://cocoindex.io

📖 Readme: https://github.com/cocoindex-io/cocoindex#readme

📊 Statistics:
🌟 Stars: 4.2K stars
👀 Watchers: 30
🍴 Forks: 343 forks

💻 Programming Languages: Rust - Python - Handlebars

🏷️ Related Topics:
#python #rust #data #real_time #ai #pipeline #etl #indexing #data_engineering #knowledge_graph #help_wanted #data_processing #semantic_search #hacktoberfest #change_data_capture #data_infrastructure #data_indexing #rag #llm #context_engineering


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: tensorflow

📝 Description: An Open Source Machine Learning Framework for Everyone

🔗 Repository URL: https://github.com/tensorflow/tensorflow

🌐 Website: https://tensorflow.org

📖 Readme: https://github.com/tensorflow/tensorflow#readme

📊 Statistics:
🌟 Stars: 193K stars
👀 Watchers: 7.5k
🍴 Forks: 75.2K forks

💻 Programming Languages: C++ - Python - MLIR - HTML - Starlark - Go

🏷️ Related Topics:
#python #machine_learning #deep_neural_networks #deep_learning #neural_network #tensorflow #ml #distributed


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: rendercv

📝 Description: Typst-based CV/resume generator for academics and engineers

🔗 Repository URL: https://github.com/rendercv/rendercv

🌐 Website: https://docs.rendercv.com

📖 Readme: https://github.com/rendercv/rendercv#readme

📊 Statistics:
🌟 Stars: 4.3K stars
👀 Watchers: 9
🍴 Forks: 356 forks

💻 Programming Languages: Python - Typst

🏷️ Related Topics:
#python #resume_template #resume #cv #cv_generator #cv_template #resume_builder #resume_generator #cv_builder #typst


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: core

📝 Description: 🏡 Open source home automation that puts local control and privacy first.

🔗 Repository URL: https://github.com/home-assistant/core

🌐 Website: https://www.home-assistant.io

📖 Readme: https://github.com/home-assistant/core#readme

📊 Statistics:
🌟 Stars: 83.3K stars
👀 Watchers: 1.3k
🍴 Forks: 36.2K forks

💻 Programming Languages: Python

🏷️ Related Topics:
#python #home_automation #mqtt #raspberry_pi #iot #internet_of_things #asyncio #hacktoberfest


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: langextract

📝 Description: A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

🔗 Repository URL: https://github.com/google/langextract

🌐 Website: https://pypi.org/project/langextract/

📖 Readme: https://github.com/google/langextract#readme

📊 Statistics:
🌟 Stars: 18K stars
👀 Watchers: 94
🍴 Forks: 1.3K forks

💻 Programming Languages: Python

🏷️ Related Topics:
#python #nlp #gemini #structured_data #gemini_api #information_extration #large_language_models #llm #gemini_pro #gemini_ai #gemini_flash


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: plane

📝 Description: 🔥 🔥 🔥 Open Source JIRA, Linear, Monday, and Asana Alternative. Plane helps you track your issues, epics, and cycles the easiest way on the planet.

🔗 Repository URL: https://github.com/makeplane/plane

🌐 Website: http://plane.so

📖 Readme: https://github.com/makeplane/plane#readme

📊 Statistics:
🌟 Stars: 40.9K stars
👀 Watchers: 141
🍴 Forks: 3K forks

💻 Programming Languages: TypeScript - Python - HTML - CSS - Shell - JavaScript

🏷️ Related Topics:
#react #python #docker #redis #django #jira #typescript #rest_api #nextjs #postgresql #issue_tracker #project_management #kanban #linear #product_management #jira_alternative #work_management


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: RustPython

📝 Description: A Python Interpreter written in Rust

🔗 Repository URL: https://github.com/RustPython/RustPython

🌐 Website: https://rustpython.github.io

📖 Readme: https://github.com/RustPython/RustPython#readme

📊 Statistics:
🌟 Stars: 21K stars
👀 Watchers: 172
🍴 Forks: 1.4K forks

💻 Programming Languages: Rust - Python - JavaScript - NSIS - EJS - CSS

🏷️ Related Topics:
#language #rust #interpreter #compiler #wasm #jit #python3 #hacktoberfest #python_language


==================================
🧠 By: https://t.me/DataScienceM