Data Analytics
28.9K subscribers
490 photos
14 videos
45 files
277 links
Dive into the world of Data Analytics: uncover insights, explore trends, and master data-driven decision making.

Admin: @HusseinSheikho || @Hussein_Sheikho
Pandas vs Polars vs DuckDB: Which Library Should You Choose?

pandas remains the default choice for notebooks, exploratory analysis, visualization, and machine learning workflows. Polars focuses on fast, memory-efficient DataFrame processing, while DuckDB brings a SQL-first approach to querying local files and embedded analytics.

Each tool fits a different kind of local data workflow. The linked article compares pandas, Polars, and DuckDB across performance, architecture, interoperability, and real-world use cases.

More: https://www.analyticsvidhya.com/blog/2026/05/pandas-vs-polars-vs-duckdb/

#DataScience #Pandas #Polars #DuckDB #Python #Analytics

Found an easy way to learn math for ML: Mathematics for Machine Learning

This is a curated collection on GitHub of books, research papers, video lectures, and introductory materials for studying and reviewing the mathematical foundations of machine learning.

It helps build a stronger knowledge base by bringing together trusted resources on topics that machine learning engineers constantly encounter: linear algebra, mathematical analysis, probability theory, statistics, information theory, matrix calculus, and deep learning mathematics.

Free public repository on GitHub.

https://github.com/dair-ai/Mathematics-for-ML

#MachineLearning #Mathematics #DataScience #Learning #GitHub #AI

Forwarded from Learn Python Coding
Data validation with Pydantic!

In the early stages of development, data validation usually doesn't cause problems. In many Python projects, validation initially looks simple:

if not isinstance(age, int):
    raise ValueError("age must be an int")

But then come emails, JSON from APIs, query parameters, nested objects, configs, nullable fields, and type conversion. At some point, the code turns into a tangle of if/else branches and manual checks.
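
For a sense of how quickly this grows, here is a sketch of hand-rolled validation for one small payload. The function name and the payload shape are hypothetical, chosen only to show type conversion, a nullable field, and a nested object in one place.

```python
def validate_user(payload: dict) -> dict:
    """Hand-rolled validation for a hypothetical user payload."""
    if not isinstance(payload.get("name"), str):
        raise ValueError("name must be a str")

    age = payload.get("age")
    if isinstance(age, str) and age.isdigit():
        age = int(age)  # manual type conversion
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("age must be an int")

    nickname = payload.get("nickname")  # nullable field
    if nickname is not None and not isinstance(nickname, str):
        raise ValueError("nickname must be a str or None")

    address = payload.get("address")  # nested object
    if not isinstance(address, dict) or not isinstance(address.get("city"), str):
        raise ValueError("address.city must be a str")

    return {
        "name": payload["name"],
        "age": age,
        "nickname": nickname,
        "address": {"city": address["city"]},
    }


user = validate_user({"name": "Alex", "age": "30", "address": {"city": "Berlin"}})
print(user)
```

Four fields already need four separate checks, and every new field adds another branch.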

For such tasks, Pydantic is often used. Installation:

pip install pydantic
pip install "pydantic[email]"

Create a model:

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

Now the data is validated automatically:

user = User(
    name="Alex",
    age="30"
)

print(user.age)
print(type(user.age))

The result:
30
<class 'int'>

Pydantic will automatically convert the string "30" to an int. If you pass an incorrect value, you'll get a ValidationError:

User(
    name="Alex",
    age="test"
)

This is especially convenient when working with APIs, JSON, query parameters, and incoming data from outside.
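
In practice you usually catch the error and inspect it rather than let it propagate. A minimal sketch, assuming Pydantic v2; the variable name problems is just for illustration.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


try:
    User(name="Alex", age="test")
except ValidationError as exc:
    # errors() returns one dict per failed field, with its location and error type.
    problems = [(err["loc"], err["type"]) for err in exc.errors()]
    print(problems)
```

Each entry names the field that failed, which makes it easy to turn the error into an API response or a log line.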

A common production case is checking email:

from pydantic import BaseModel, EmailStr

class User(BaseModel):
    email: EmailStr

User(email="alex@test.com")

If the email is invalid, Pydantic will raise a ValidationError. You can set default values:

from pydantic import BaseModel

class Config(BaseModel):
    host: str = "localhost"
    port: int = 5432

And allow None:

from pydantic import BaseModel

class User(BaseModel):
    nickname: str | None = None

This field becomes optional. A practical example is processing an API response:

from pydantic import BaseModel

class Product(BaseModel):
    id: int
    title: str
    price: float

data = {
    "id": "1",
    "title": "Keyboard",
    "price": "99.5"
}

product = Product(**data)

print(product)

The types will be converted automatically. Models can also be nested inside one another:

from pydantic import BaseModel

class Address(BaseModel):
    city: str
    zip_code: str

class User(BaseModel):
    name: str
    address: Address

user = User(
    name="Alex",
    address={
        "city": "Berlin",
        "zip_code": "10115"
    }
)

print(user)

The nested object will also be validated. Serialization in Pydantic v2:

print(user.model_dump())
print(user.model_dump_json())
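
Serialization also works in the other direction. The sketch below, assuming Pydantic v2, dumps a nested model to JSON and loads it back with model_validate_json; the variable names are illustrative.

```python
from pydantic import BaseModel


class Address(BaseModel):
    city: str
    zip_code: str


class User(BaseModel):
    name: str
    address: Address


user = User(name="Alex", address={"city": "Berlin", "zip_code": "10115"})

as_dict = user.model_dump()       # plain dict; nested models become dicts
as_json = user.model_dump_json()  # JSON string

# model_validate_json parses and validates in one step.
restored = User.model_validate_json(as_json)
print(restored == user)
```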

Pydantic is actively used in FastAPI, ETL, microservices, data pipelines, and API clients.

For working with environment variables in Pydantic v2, a separate package is usually used:

pip install pydantic-settings

It's important to understand: Pydantic is not an ORM and does not replace business logic. Its task is to validate data, convert types, and describe schemas.
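
The "describe schemas" part can be seen directly. A small sketch, assuming Pydantic v2: constraints declared with Field end up in the JSON Schema generated by model_json_schema. The constraints chosen here are only an example.

```python
from pydantic import BaseModel, Field


class Product(BaseModel):
    id: int
    title: str = Field(min_length=1)  # reject empty titles
    price: float = Field(gt=0)        # price must be strictly positive


schema = Product.model_json_schema()
print(schema["required"])
print(schema["properties"]["price"])
```

The same model therefore serves as validator and as machine-readable documentation of the expected payload.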

Pydantic significantly reduces the amount of manual data validation and makes processing incoming structures more predictable.

#Python #Pydantic #DataValidation #FastAPI #Coding #DevOps

Join Best TG Channels https://t.me/addlist/0f6vfFbEMdAwODBk
Join Our WhatsApp Channel https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A

Assembling GPT-like LLMs from scratch in PyTorch

https://github.com/analyticalrohit/llms-from-scratch

10 notebooks. Step-by-step explanation.

Breaks down the architecture of LLMs into simple parts.

Suitable for beginners.

Completely hands-on.

#PyTorch #LLM #AI #MachineLearning #DeepLearning #Code

Anthropic is rolling out Claude Opus 4.8

The model has become significantly more honest in evaluating its own work and notices problems in its own code four times more often.

Dynamic workflows have also been added: hundreds of AI subagents can work on large projects and migrations in parallel.

More details here
https://www.anthropic.com/news/claude-opus-4-8

#Anthropic #ClaudeOpus48 #AI #ArtificialIntelligence #TechNews #Innovation

HelloEncyclo Presale is LIVE!

Master the skills that matter (Gen-AI, Data Science, Machine Learning and more), all in one place.

First 250 members get a flat 40% OFF

Use code: PRESALE-BOOK-WAVE-2GFG

✅ 13 full courses live right now

✅ 40+ more dropping in the next 2–3 weeks

✅ Complete library within 2 months, built and refined by industry experts

✅ 15-day money-back guarantee: don't love it? Get a full refund.

Note: the coupon works only after you log in with Gmail, and it's valid once per member.

Log in now and start learning:

https://helloencyclo.com

Don't wait: the 40% deal disappears after the first 250 seats.

Learning AI doesn't need another random tutorial rabbit hole.

AI-Study-Group is a public GitHub learning journal for builders trying to navigate AI resources across books, courses, videos, tools, models, datasets, papers, and notes.

It helps you make your own learning path by collecting the materials the author used while learning AI, with quick-start recommendations up front and sections you can scan by resource type.

Key features:

• TL;DR starting path: points to one book, one LLM video, and the Hugging Face Agents Course
• Books section: lists AI/ML/DL books with short notes on where each one helps
• Courses and videos: collects practical lectures, tutorials, and talks from sources like MIT, NVIDIA, Hugging Face, Karpathy, and 3Blue1Brown
• Tools and libraries map: groups frameworks, platforms, visualization tools, and Python libraries for builders
• Broader study material: includes models, model hubs, articles, papers, datasets, and AI notes

Free public GitHub repo.

https://github.com/ArturoNereu/AI-Study-Group

#AI #MachineLearning #DeepLearning #GitHub #StudyGroup #TechLearning

Create an LLM from Scratch!

I came across a great find from Vizuara: a series of 43 lectures that truly delivers on its promise of showing how to build a large language model from scratch.

Most people use ChatGPT.
But only a few actually understand how it works under the hood.

The playlist breaks down all the key concepts step by step, without overloading you with complex explanations.

What you will learn:
→ The Transformer architecture
→ The internal structure of GPT
→ Tokenization and BPE
→ Attention mechanisms
→ The process of training an LLM
→ Full implementations in Python
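
As a taste of the tokenization topic, here is a toy sketch of how BPE learns its merges: count adjacent symbol pairs, merge the most frequent one, repeat. The corpus and function names are invented for illustration and are not taken from the lectures.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get)


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


# Hypothetical word counts; each word starts as a tuple of characters.
corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
words = {tuple(word): freq for word, freq in corpus.items()}

merges = []
for _ in range(3):
    pair = most_frequent_pair(words)
    merges.append(pair)
    words = merge_pair(words, pair)

print(merges)
```

Real tokenizers work on bytes and learn tens of thousands of merges, but the loop is the same idea.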

Suitable for:
• ML engineers
• AI enthusiasts
• Developers entering the GenAI field
• Anyone who is tired of explaining AI as a "black box"

If you really want to understand what lies at the heart of models like ChatGPT, Claude, and Gemini, this material is worth watching.

Link to the playlist:
https://www.youtube.com/playlist?list=PLPTV0NXA_ZSgsLAr8YCgCwhPIJNNtexWu

#LLM #AI #MachineLearning #Python #GenAI #DeepLearning

Found a huge collection on System Design for GenAI and LLMs!

500+ real case studies of GenAI, LLM, and ML systems from OpenAI, Anthropic, Google, Microsoft, Netflix, and dozens of other companies.

A real find for those who are building AI products or want to understand how market leaders do it.

Link to GitHub
https://github.com/themanojdesai/genai-llm-ml-case-studies


#SystemDesign #GenAI #LLM #MachineLearning #AI #Tech

Transformers & LLMs Cheatsheet.pdf
1.4 MB
The only LLM cheat sheet you'll ever need

Covers the main concepts, architectures, and practical applications.

### Basics
- Tokens (tokenization, BPE)
- Embeddings (cosine similarity)
- Attention mechanism (Attention formula, Multi-Head Attention)
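
For reference, the standard scaled dot-product attention formula is softmax(Q·Kᵀ / sqrt(d_k))·V. Below is a dependency-free sketch of it for tiny inputs; it is a teaching aid written for this post, not code from the cheat sheet.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain Python lists (one head, no mask)."""
    d_k = len(keys[0])
    output = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return output


Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))
```

The query matches the first key more closely, so the result lands nearer to the first value vector. Multi-head attention runs several of these in parallel on different learned projections.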

### Transformer architecture and its variants
- BERT (models with only an encoder)
- GPT (models with only a decoder)
- T5 (models with an encoder and a decoder)

### Large language models (LLMs)
- Prompting (context length, Chain-of-Thought)
- Fine-tuning (SFT, PEFT/LoRA)
- Preference tuning (Reward Model, Reinforcement Learning)
- Optimizations (Mixture of Experts, Distillation, Quantization)
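
To illustrate the quantization item, here is a sketch of the simplest variant, symmetric per-tensor int8 quantization. The weights are made-up numbers; production schemes (per-channel scales, 4-bit formats) are more involved.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.03, 0.9, -0.77]  # hypothetical layer weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)
print(max_error <= scale / 2 + 1e-12)
```

Each weight now fits in one byte instead of four, and the rounding error is bounded by half the scale.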

### Applications
- LLM-as-a-Judge (LaaJ)
- RAG (Retrieval-Augmented Generation)
- Agents (ReAct)
- Reasoning models (Scaling)
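
The retrieval step of RAG can be sketched in a few lines. This toy version uses bag-of-words vectors and cosine similarity in place of a learned embedding model, and the documents are invented for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


documents = [
    "polars is a fast dataframe library",
    "duckdb runs sql queries on local files",
    "pydantic validates incoming data",
]
question = "which tool runs sql on local files"
context = retrieve(question, documents, k=1)

# The retrieved text is placed in the prompt before the model is called.
prompt = f"Answer using this context: {context[0]}\nQuestion: {question}"
print(prompt)
```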

#LLM #AI #MachineLearning #DeepLearning #PromptEngineering #Tech

The ultimate guide to fine tuning.pdf
15.2 MB
The Big Book on Fine-Tuning LLMs

A free 115-page book dedicated to fine-tuning large language models.

It's suitable for those who want to understand how to prepare datasets, configure training, and improve the quality of LLMs for their own tasks.

#LLM #FineTuning #AI #MachineLearning #DataScience #Tech

Level up your AI & Data Science skills with HelloEncyclo, a growing all-in-one platform featuring hands-on courses in LLMs, Deep Learning, MLOps, Data Engineering, and more.
✅ 13 courses live + 40+ coming soon
One access, lifetime updates
Use code: PRESALE-BOOK-WAVE-2GFG
https://helloencyclo.com/?ref=HUSSEINSHEIKHO

LLM Scraper: parsing websites with neural networks

The tool converts any web page into structured data using an LLM.

It is useful for data collection, site monitoring, and preparing datasets without writing complex parsers.
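
The general pattern behind such tools can be sketched as: describe the fields you want, ask a model for JSON, then coerce the answer to your schema. Everything below is hypothetical, including the fake_llm stub that stands in for a real model call; it is not the API of the linked project.

```python
import json


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned JSON answer."""
    return '{"title": "Mechanical Keyboard", "price": "99.5"}'


def extract(page_text: str, schema: dict, llm=fake_llm) -> dict:
    """Ask the model for the schema's fields and coerce each to its type."""
    prompt = (
        "Extract the following fields as JSON: "
        + ", ".join(schema)
        + "\n\nPage:\n"
        + page_text
    )
    raw = json.loads(llm(prompt))
    # A missing key raises KeyError; a bad value raises in the cast.
    return {field: cast(raw[field]) for field, cast in schema.items()}


page = "<h1>Mechanical Keyboard</h1><span class='price'>$99.5</span>"
result = extract(page, {"title": str, "price": float})
print(result)
```

The schema replaces hand-written selectors, which is why this approach copes better with pages whose markup keeps changing.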

Link to GitHub: https://github.com/mishushakov/llm-scraper

#LLM #Scraper #WebScraping #DataCollection #AI #Automation

SPOTO Mid-Year Sale: Grab Your IT Certification Success Kit!

๐Ÿ”ฅ Whether you're prepping for #Python, #AI, #Cisco, #PMI, #Fortinet, #AWS, #Azure, #Excel, #Comptia, #ITIL, #Cloud or any other hot certification โ€“ SPOTO has your back with real exam dumps and hands-on training!

โœ… Free Resources:
ใƒปFree Python, Excel, Cyber Security, Cisco, SQL, ITIL, PMP, AWS courses: https://bit.ly/4alTSfk
ใƒปIT Certs E-book: https://bit.ly/49ub0zq
ใƒปIT Exams Skill Test: https://bit.ly/4dVPapB
ใƒปFree AI material and support tools: https://bit.ly/4elzcpl
ใƒปFree Cloud Study Guide: https://bit.ly/4u7sdG0

๐ŸŽ Join SPOTO Mid-Year Lucky Draw:
๐Ÿ“ฑ iPhone 17 ๐Ÿ›’ Free Order
๐Ÿ›’ Amazon Gift $100 ๐Ÿ“˜PMP/ AWS/ CCNA Course


๐Ÿ‘‰ Enter the Draw Now โ†’ https://bit.ly/4uN3lVt

๐Ÿ‘‰ Join Our IT Learning Community for free resources & support:
https://chat.whatsapp.com/FmbIbbqm2QhKglVpVTSH4d
๐Ÿ’ฌ Want exam help? Chat with an admin now:
https://wa.link/knicza

โฐ Mid-Year Deal Ends Soon โ€“ Don't Miss Out!
๐Ÿšจ ONLY THE FIRST 5 GET THIS.

I'm sharing this link with my network once, and only the first 5 people who enroll through it lock in a deal that has never been offered before.

Lifetime access to HelloEncyclo (every AI, ML & Data Science course ever built) for ~$41. Once. Forever.
This isn't a drill. This isn't a rerun.
This is the founding-member price, and it disappears the moment the first 250 seats globally are gone.

✅ 13 courses live right now
✅ 40+ more in 2–3 weeks
✅ Every future course included automatically
✅ 15-day money-back: full refund, no questions

Code: PRESALE-BOOK-WAVE-2GFG

(Log in with Gmail · valid once · applies at checkout)

First 5. That's it.

https://helloencyclo.com/?ref=HUSSEINSHEIKHO

Once those 5 seats go through this link, I'm not sharing it again.