Python | Machine Learning | Coding | R
62K subscribers
1.11K photos
67 videos
138 files
764 links
List of our channels:
https://t.me/addlist/8_rRW2scgfRhOTc0

Discover powerful insights with Python, Machine Learning, Coding, and R, your essential toolkit for data-driven solutions, smart alg

Help and ads: @hussein_sheikho

https://telega.io/?r=nikapsOH
Pandas Data Cleaning (Guide)

🔑 Tags: #Pandas #DataCleaning #ML

https://t.me/DataScienceM ✅
Pandas.pdf
14.9 MB
A Popular Interview Question: Discriminative vs. Generative Models

More Details: https://blog.dailydoseofds.com/p/a-popular-interview-question-discriminative

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses

http://t.me/codeprogrammer ⭐️
Convert PDF to docx using Python

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses

http://t.me/codeprogrammer ⭐️
Confusion matrix (TP, FP, TN, FN), clearly explained
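
The four cells named in the title can be counted directly from a list of true labels and a list of predictions. The snippet below is a minimal sketch, not taken from the original post: the function name `confusion_counts` and the sample labels are made up for illustration, and binary labels with 1 as the positive class are assumed.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for binary labels (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Made-up labels for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
precision = counts["TP"] / (counts["TP"] + counts["FP"])
recall = counts["TP"] / (counts["TP"] + counts["FN"])

print(counts)             # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(precision, recall)  # 0.75 0.75
```

Precision and recall are derived from the same four counts, which is why the matrix is usually the first thing to compute when evaluating a classifier.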

🔑 Tags: #PYTHON #AI #ML

https://t.me/CodeProgrammer ✅
You can buy promotion or ads in our channel

Channel: @codeprogrammer

Format: 4h in top/2days

Price: 13$

Contact t.me/HusseinSheikho
🏳️‍🌈 Python became the top language on GitHub!

👨🏻‍💻 According to a recent GitHub report, Python overtook JavaScript in 2024 to become the most popular language on GitHub, driven by the expansion of artificial intelligence. This ends 10 years of JavaScript dominance, and it is not very surprising.

✔️ With the growth of artificial intelligence, developers are turning to Python more than ever, and Python's applications in data science and analytics are increasing every day. You can read the full GitHub report here: 👇

┌ 🐱 Top programming languages on GitHub
└ 💰 Report

⏪ I also introduced the most important Python libraries for working with data and AI here: 👇

🖥 Data Manipulation & Analysis
▶️ pandas
▶️ Apache Spark
▶️ Polars
▶️ DuckDB

📊 Data Visualization
➡️ matplotlib
➡️ plotly
➡️ seaborn

🖥 Machine & Deep Learning
➡️ TensorFlow
➡️ PyTorch
➡️ Keras
➡️ scikit-learn
➡️ XGBoost
➡️ LightGBM
➡️ Prophet

🌫 NLP & Large Language Models
➡️ Hugging Face Transformers
➡️ LangChain
➡️ LlamaIndex

🔑 Tags: #PYTHON #AI #ML #NLP

https://t.me/CodeProgrammer ✅
Pandas 🐼 to Polars Guide

🔑 Tags: #PYTHON #AI #ML #pandas #Polars

https://t.me/CodeProgrammer ✅
Hey guys,

As you all know, the purpose of this community is to share notes and grow together. Hence, today I am sharing with you an app called DevBytes. It keeps you updated about dev and tech news.

This brilliant app provides curated, bite-sized updates on the latest tech news/dev content. Whether it's new frameworks, AI breakthroughs, or cloud services, DevBytes brings the essentials straight to you.

If you're tired of information overload and want a smarter way to stay informed, give DevBytes a try.

Download here: https://play.google.com/store/apps/details?id=com.candelalabs.devbytes&hl=en-IN
It's time to read less and know more!
What is a Vector Database?

With the rise of Foundation Models, Vector Databases skyrocketed in popularity. The truth is that a Vector Database is also useful outside of a Large Language Model context.

When it comes to Machine Learning, we often deal with Vector Embeddings. Vector Databases were created to perform especially well when working with them:

➡️ Storing.
➡️ Updating.
➡️ Retrieving.

When we talk about retrieval, we mean retrieving the set of vectors that are most similar to a query, where the query is itself a vector embedded in the same Latent space. This retrieval procedure is called Approximate Nearest Neighbour (ANN) search.

A query here could be an object such as an image for which we would like to find similar images. Or it could be a question for which we want to retrieve relevant context that could later be transformed into an answer via an LLM.

Let's look into how one would interact with a Vector Database:

Writing/Updating Data.

1. Choose an ML model to be used to generate Vector Embeddings.
2. Embed any type of information: text, images, audio, tabular. The choice of ML model used for embedding will depend on the type of data.
3. Get a Vector representation of your data by running it through the Embedding Model.
4. Store additional metadata together with the Vector Embedding. This data would later be used to pre-filter or post-filter ANN search results.
5. The Vector DB indexes the Vector Embedding and the metadata separately. Several methods can be used to create vector indexes, including Random Projection, Product Quantization, and Locality-Sensitive Hashing.
6. Vector data is stored together with the indexes for Vector Embeddings and the metadata connected to the Embedded objects.
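
The write path in steps 1 to 6 can be sketched in plain Python. This is a toy illustration under stated assumptions, not how any particular Vector DB is implemented: `toy_embed` is a hypothetical stand-in for a real embedding model, `ToyVectorStore` and the apartment records are invented for the example, and vectors are kept in a flat dictionary instead of a real vector index such as Product Quantization or LSH.

```python
import hashlib
import math

DIM = 8  # toy embedding size (assumption; real models use hundreds of dimensions)


def toy_embed(text, dim=DIM):
    """Hypothetical stand-in for an embedding model: hashes words into a unit vector."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class ToyVectorStore:
    """Keeps embeddings and a separate metadata index (steps 4 to 6)."""

    def __init__(self):
        self.vectors = {}     # id -> embedding (flat storage, no ANN index)
        self.metadata = {}    # id -> metadata dict
        self.meta_index = {}  # (key, value) -> set of ids

    def upsert(self, item_id, text, metadata):
        # On update, drop the stale metadata index entries first
        for key, value in self.metadata.get(item_id, {}).items():
            self.meta_index[(key, value)].discard(item_id)
        self.vectors[item_id] = toy_embed(text)
        self.metadata[item_id] = dict(metadata)
        for key, value in metadata.items():
            self.meta_index.setdefault((key, value), set()).add(item_id)


store = ToyVectorStore()
store.upsert("apt-1", "sunny two bedroom apartment", {"city": "Vilnius"})
store.upsert("apt-2", "dark studio apartment", {"city": "Kaunas"})
# Updating an existing id replaces both its vector and its metadata
store.upsert("apt-1", "sunny two bedroom apartment with balcony", {"city": "Riga"})
print(sorted(store.vectors))  # ['apt-1', 'apt-2']
```

Keeping the metadata index separate from the vectors is what later allows a metadata query to run before or after the similarity search.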

Reading Data.

7. A query to be executed against a Vector Database will usually consist of two parts:

➡️ Data that will be used for ANN search, e.g. an image for which you want to find similar ones.
➡️ A metadata query to exclude Vectors that hold specific qualities known beforehand. E.g. given that you are looking for similar images of apartments, exclude apartments in a specific location.

8. You execute the Metadata Query against the metadata index. It could be done before or after the ANN search procedure.
9. You embed the data into the Latent space with the same model that was used for writing the data to the Vector DB.
10. The ANN search procedure is applied and a set of Vector embeddings is retrieved. Popular similarity measures for ANN search include Cosine Similarity, Euclidean Distance, and Dot Product.
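
The read path in steps 7 to 10 can be sketched the same way. Note the swap: the snippet below does an exact, brute-force search over every candidate, whereas a real Vector DB would use an approximate index to avoid scanning everything. The function names, the three-dimensional vectors, and the apartment metadata are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(items, query_vec, top_k=2, exclude=None):
    """Pre-filter on metadata, then rank the remaining vectors by cosine similarity.

    items: list of (id, vector, metadata) tuples.
    exclude: metadata key/value pairs; any item matching one of them is dropped.
    """
    exclude = exclude or {}
    candidates = [
        (item_id, vec)
        for item_id, vec, meta in items
        if not any(meta.get(key) == value for key, value in exclude.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:top_k]]


# Made-up, already-embedded records
items = [
    ("apt-1", [0.9, 0.1, 0.0], {"city": "Riga"}),
    ("apt-2", [0.8, 0.2, 0.1], {"city": "Kaunas"}),
    ("apt-3", [0.0, 0.1, 0.9], {"city": "Riga"}),
]
query = [1.0, 0.0, 0.0]  # must come from the same embedding model as the stored vectors

print(search(items, query))                              # ['apt-1', 'apt-2']
print(search(items, query, exclude={"city": "Kaunas"}))  # ['apt-1', 'apt-3']
```

Here the metadata filter runs before ranking (pre-filtering). Running it after ranking instead (post-filtering) can return fewer than `top_k` results, which is one reason the order matters.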

How are you using Vector DBs? Let me know in the comment section!

#RAG #LLM #DataEngineering

https://t.me/CodeProgrammer ✅
📹 3Blue1Brown presented the shortest and most understandable lecture on neural networks!

In the new episode, he talks about the attention mechanism and transformers. The lecture has become even more concise and engaging.

Ideal for absolute beginners and even those without a technical background.

The author explains the key aspects of the neural network in just 9 minutes using bright graphics and simple examples.

📌 Original

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses #Transformer

http://t.me/codeprogrammer ⭐️
📕 Python Basics Made Simple!

📷 Course: AI Python for Beginners

👨‍💻 Instructor: Andrew Ng

In the #AIPythonforBeginners course series you'll learn how to identify strings, integers, and floats with the type() function, and build a solid Python foundation for your AI journey.
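
As a quick illustration of the `type()` check mentioned above (the sample values are made up and are not taken from the course):

```python
samples = ["hello", 42, 3.14, "42"]

for value in samples:
    # type() returns the class of the object; __name__ gives its short name
    print(repr(value), "->", type(value).__name__)

# 'hello' -> str
# 42 -> int
# 3.14 -> float
# '42' -> str
```

Note that `"42"` in quotes is a string, not an integer, even though it looks like a number.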

Enroll Free: https://learn.deeplearning.ai/courses/ai-python-for-beginners

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses #Transformer

http://t.me/codeprogrammer ⭐️
💠 All free Kaggle courses for data science
📝 Each comes with a course completion certificate

✅ Python ⬅️ link

✅ Intro to machine learning ⬅️ link

✅ Pandas ⬅️ link

✅ Intermediate machine learning ⬅️ link

✅ Data visualization ⬅️ link

✅ Feature engineering ⬅️ link

✅ Intro to SQL ⬅️ link

✅ Advanced SQL ⬅️ link

✅ Intro to deep learning ⬅️ link

✅ Computer vision ⬅️ link

✅ Time series ⬅️ link

✅ Data cleaning ⬅️ link

✅ Geospatial analysis ⬅️ link

✅ Machine learning explainability ⬅️ link

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses #Transformer

http://t.me/codeprogrammer ⭐️
Python List Methods Clearly Explained
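
The original post's cheat sheet did not survive as text, so here is a short walk through the built-in list methods. The `fruits` list is a made-up example, and each comment shows the list after the call.

```python
fruits = ["apple", "banana"]

fruits.append("cherry")            # add one item at the end
# ['apple', 'banana', 'cherry']
fruits.extend(["date", "apple"])   # add every item from another iterable
# ['apple', 'banana', 'cherry', 'date', 'apple']
fruits.insert(1, "avocado")        # insert before index 1
# ['apple', 'avocado', 'banana', 'cherry', 'date', 'apple']
fruits.remove("apple")             # remove the first matching value only
# ['avocado', 'banana', 'cherry', 'date', 'apple']
last = fruits.pop()                # remove and return the last item ('apple')
# ['avocado', 'banana', 'cherry', 'date']
position = fruits.index("cherry")  # index of the first match (2)
count = fruits.count("banana")     # number of occurrences (1)
fruits.sort(reverse=True)          # sort in place, descending
# ['date', 'cherry', 'banana', 'avocado']
fruits.reverse()                   # reverse in place
# ['avocado', 'banana', 'cherry', 'date']
snapshot = fruits.copy()           # shallow copy, independent of later changes
fruits.clear()                     # remove all items
# []
```

Most of these methods modify the list in place and return `None`; `pop`, `index`, `count`, and `copy` are the ones that return a value.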

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses #Transformer

http://t.me/codeprogrammer ⭐️
🐱 7 of the best GitHub repos
✅ To enter the world of data analysis and data science

1️⃣ 100 Days of ML Code repo

✏️ A hundred-day program for learning and practicing machine learning coding.

2️⃣ Awesome Data Science repo

✏️ A curated list of great data science resources such as books, software, and tools.

3️⃣ Data Science for Beginners repo

✏️ A repository from Microsoft that has a 10-week course with 20 lessons for beginners. Each lesson includes videos, quizzes, challenges and more.

4️⃣ Data Science Interviews repo

✏️ A repository of questions and answers for data science job interviews.

5️⃣ ML Technical Interviews repo

✏️ A good guide for machine learning and artificial intelligence technical interviews.

6️⃣ ML Interviews repo

✏️ A repository containing machine learning interview questions, from basic topics to complex ones such as neural networks and reinforcement learning.

7️⃣ Data Science Python Notebooks repo

✏️ A collection of notebooks in various fields of data science such as deep learning, machine learning, data analysis and Python topics.

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses #Transformer

http://t.me/codeprogrammer ⭐️
List of Running Processes using Python
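
The code from the original post is not preserved here, so the following is only one possible sketch. It assumes the third-party `psutil` package when it is installed and otherwise falls back to reading `/proc`, which works on Linux only. The function name `list_processes` is made up for this example.

```python
import os


def list_processes():
    """Return (pid, name) pairs for the processes visible to this user."""
    try:
        import psutil  # third-party; assumed installed via `pip install psutil`
    except ImportError:
        psutil = None

    if psutil is not None:
        return [
            (proc.info["pid"], proc.info["name"])
            for proc in psutil.process_iter(["pid", "name"])
        ]

    # Fallback (Linux only): every numeric directory in /proc is a process id
    processes = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as handle:
                processes.append((int(entry), handle.read().strip()))
        except OSError:
            continue  # the process exited while we were reading
    return processes


processes = list_processes()
for pid, name in sorted(processes)[:10]:
    print(pid, name)
```

The `psutil` branch is the portable one; the `/proc` fallback is there only so the sketch still runs on a Linux machine without extra packages.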

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses #Transformer

http://t.me/codeprogrammer ⭐️
Pandas is getting outdated.

5 reasons you should move to FireDucks 👇

1. Requires changing ONLY ONE line of code:
↳ Replace "import pandas as pd" with "import fireducks.pandas as pd"
↳ The rest of the code remains the same.
↳ So, if you know Pandas, you already know how to use FireDucks.
↳ Done!

2. Much faster, as per FireDucks' official benchmarks:
↳ Modin had an average speed-up of 0.9x over Pandas.
↳ Polars had an average speed-up of 39x over Pandas.
↳ But FireDucks had an average speed-up of 50x over Pandas.

3. Pandas is single-core; FireDucks is multi-core.

4. Pandas follows eager execution; FireDucks is based on lazy execution. This way, FireDucks can build a logical execution plan and apply possible optimizations.

5. That said, even under eager execution, FireDucks is way faster than Pandas, as depicted in the image attached to the original post.

📂 Tags: #DataScience #Python #ML #AI #LLM #BIGDATA #Courses #Pandas #FireDucks

http://t.me/codeprogrammer ⭐️