Python from scratch
by University of Waterloo
0. Introduction
1. First steps
2. Built-in functions
3. Storing and using information
4. Creating functions
5. Booleans
6. Branching
7. Building better programs
8. Iteration using while
9. Storing elements in a sequence
10. Iteration using for
11. Bundling information into objects
12. Structuring data
13. Recursion
https://open.cs.uwaterloo.ca/python-from-scratch/
#python
Guide to Building an AI Agent
1️⃣ Choose the Right LLM
Not all LLMs are equal. Pick one that:
- Excels in reasoning benchmarks
- Supports chain-of-thought (CoT) prompting
- Delivers consistent responses
📌 Tip: Experiment with models & fine-tune prompts to enhance reasoning.
2️⃣ Define the Agent's Control Logic
Your agent needs a strategy:
- Tool Use: Call tools when needed; otherwise, respond directly.
- Basic Reflection: Generate, critique, and refine responses.
- ReAct: Plan, execute, observe, and iterate.
- Plan-then-Execute: Outline all steps first, then execute.
📌 Choosing the right approach improves reasoning & reliability.
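A minimal ReAct-style loop in Python might look like the sketch below. llm_call() and the TOOLS registry are hypothetical placeholders for your own model client and tools, not any specific framework's API.

# Minimal ReAct-style control loop: think, act with a tool, observe, repeat.
def llm_call(prompt: str) -> str:
    # Placeholder: wire this to your LLM client. The stub always finishes immediately.
    return "FINAL: (stubbed answer)"

TOOLS = {
    "search": lambda query: f"(stub) results for {query}",  # stand-in tool
}

def react_agent(task: str, max_steps: int = 5) -> str:
    history = f"Task: {task}\n"
    for _ in range(max_steps):
        step = llm_call(history + "Reply with 'ACTION: tool|input' or 'FINAL: answer'.")
        if step.startswith("FINAL:"):
            return step.removeprefix("FINAL:").strip()
        if step.startswith("ACTION:"):
            tool_name, tool_input = step.removeprefix("ACTION:").split("|", 1)
            observation = TOOLS[tool_name.strip()](tool_input.strip())
            history += f"{step}\nObservation: {observation}\n"   # feed the result back in
    return "Stopped after max_steps without a final answer."

print(react_agent("What is the capital of France?"))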
3️⃣ Define Core Instructions & Features
Set operational rules:
- How to handle unclear queries? (Ask clarifying questions)
- When to use external tools?
- Formatting rules? (Markdown, JSON, etc.)
- Interaction style?
📌 Clear system prompts shape agent behavior.
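For illustration only, those rules could be packed into one system prompt string; the exact wording below is an example, not a required format.

# Example system prompt encoding the operational rules above.
SYSTEM_PROMPT = """You are a data-analysis assistant.
- If a request is ambiguous, ask one clarifying question before answering.
- Call external tools only when the answer requires data you do not have.
- Format results as Markdown tables; use JSON only when explicitly requested.
- Keep a concise, professional tone."""
print(SYSTEM_PROMPT)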
4️⃣ Implement a Memory Strategy
LLMs forget past interactions. Memory strategies:
- Sliding Window: Retain recent turns, discard old ones.
- Summarized Memory: Condense key points for recall.
- Long-Term Memory: Store user preferences for personalization.
📌 Example: A financial AI recalls risk tolerance from past chats.
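A sliding-window memory can be as simple as a fixed-length deque, as in this sketch (not any particular framework's memory class):

# Sliding-window memory: keep only the last N turns, discard older ones automatically.
from collections import deque

class SlidingWindowMemory:
    def __init__(self, max_turns: int = 10):
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def context(self) -> list:
        return list(self.turns)   # what gets prepended to the next prompt

memory = SlidingWindowMemory(max_turns=4)
memory.add("user", "My risk tolerance is low.")
memory.add("assistant", "Noted - I will favour conservative options.")
print(memory.context())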
5️⃣ Equip the Agent with Tools & APIs
Extend capabilities with external tools:
- Name: Clear, intuitive (e.g., "StockPriceRetriever")
- Description: What does it do?
- Schemas: Define input/output formats
- Error Handling: How to manage failures?
📌 Example: A support AI retrieves order details via CRM API.
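A tool definition usually boils down to a small schema the model can read before deciding to call it. The sketch below reuses the StockPriceRetriever name from the example above; the field layout follows a common JSON-schema-style convention and is not tied to any particular library.

# Hypothetical tool definition: name, description, schemas, and an error-handling rule.
stock_price_tool = {
    "name": "StockPriceRetriever",
    "description": "Fetch the latest closing price for a stock ticker.",
    "input_schema": {
        "type": "object",
        "properties": {"ticker": {"type": "string", "description": "e.g. 'AAPL'"}},
        "required": ["ticker"],
    },
    "output_schema": {"type": "object", "properties": {"price": {"type": "number"}}},
    "on_error": "Apologize briefly and ask the user to retry later.",
}
print(stock_price_tool["name"], "-", stock_price_tool["description"])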
6️⃣ Define the Agent's Role & Key Tasks
Narrowly defined agents perform better. Clarify:
- Mission: (e.g., "I analyze datasets for insights.")
- Key Tasks: (Summarizing, visualizing, analyzing)
- Limitations: ("I don't offer legal advice.")
📌 Example: A financial AI focuses on finance, not general knowledge.
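One simple option is to pin the role down as data and inject it into the system prompt; the structure below just mirrors the three bullets above and is purely illustrative.

# Hypothetical agent profile built from the mission / key tasks / limitations above.
AGENT_PROFILE = {
    "mission": "I analyze datasets for insights.",
    "key_tasks": ["summarizing", "visualizing", "analyzing"],
    "limitations": ["I don't offer legal advice."],
}

ROLE_PROMPT = (
    f"Mission: {AGENT_PROFILE['mission']}\n"
    f"Key tasks: {', '.join(AGENT_PROFILE['key_tasks'])}\n"
    f"Out of scope: {'; '.join(AGENT_PROFILE['limitations'])}"
)
print(ROLE_PROMPT)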
7️⃣ Handling Raw LLM Outputs
Post-process responses for structure & accuracy:
- Convert AI output to structured formats (JSON, tables)
- Validate correctness before user delivery
- Ensure correct tool execution
📌 Example: A financial AI converts extracted data into JSON.
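A typical post-processing step is to pull the JSON object out of the raw model text and validate required fields before anything reaches the user; the "amount" field below is made up for illustration.

# Extract and validate a JSON object from raw LLM output before delivering it.
import json

def parse_llm_json(raw: str) -> dict:
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("No JSON object found in model output")
    data = json.loads(raw[start:end + 1])   # raises JSONDecodeError if malformed
    if "amount" not in data:                # check required fields before delivery
        raise ValueError("Missing required field: amount")
    return data

print(parse_llm_json('Sure, here you go: {"amount": 120.5, "currency": "USD"}'))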
8️⃣ Scaling to Multi-Agent Systems (Advanced)
For complex workflows:
- Info Sharing: What context is passed between agents?
- Error Handling: What if one agent fails?
- State Management: How to pause/resume tasks?
📌 Example:
1️⃣ One agent fetches data
2️⃣ Another summarizes
3️⃣ A third generates a report
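As a toy sketch of those three steps, each "agent" below is just a plain function passing state to the next; in a real system each one would wrap its own LLM, tools, and error handling.

# Toy pipeline: fetch -> summarize -> report, with shared state passed along.
def fetch_agent(query: str) -> dict:
    return {"query": query, "rows": [1, 2, 3]}      # stand-in for a data-fetching agent

def summarize_agent(data: dict) -> str:
    return f"{len(data['rows'])} rows found for '{data['query']}'"

def report_agent(summary: str) -> str:
    return "REPORT\n======\n" + summary

state = fetch_agent("Q3 sales")
print(report_agent(summarize_agent(state)))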
Master the fundamentals, experiment, refine, and then go build something amazing!
Complete Roadmap to Learn Python Programming in 2025
Beginner Level
1. Basics of Python
- Understanding syntax and basic concepts
- Variables and data types
- Basic operators and expressions
- Input and output functions
- Conditional statements (if, elif, else)
- Loops (for, while)
2. Data Structures
- Lists
- Tuples
- Sets
- Dictionaries
3. Functions and Modules
- Defining and calling functions
- Arguments and return values
- Lambda functions
- Built-in modules and importing external modules
Intermediate Level
4. File Handling
- Reading from and writing to files
- Working with CSV, JSON, and other file formats
5. Object-Oriented Programming (OOP)
- Classes and objects
- Methods and constructors
- Inheritance and polymorphism
- Encapsulation and abstraction
6. Error Handling and Exceptions
- Try, except, finally blocks
- Raising exceptions
- Custom exceptions
7. Libraries and Frameworks
- Understanding and using popular libraries (NumPy, Pandas, Matplotlib)
- Introduction to web frameworks (Flask, Django)
Advanced Level
8. Advanced Concepts
- Decorators
- Generators
- Context managers
9. Working with Databases
- SQL and NoSQL databases
- ORM (Object-Relational Mapping) with SQLAlchemy or Django ORM
10. Web Development
- Full-stack development with Django or Flask
- RESTful APIs and backend services
11. Data Science and Machine Learning
- Data analysis with Pandas
- Data visualization with Matplotlib and Seaborn
- Machine learning with Scikit-Learn and TensorFlow
Tools and Best Practices
12. Version Control
- Using Git and GitHub for version control
- Collaboration and branching strategies
13. Testing and Debugging
- Unit testing with Unittest or PyTest
- Debugging techniques and tools
14. Development Environment
- Setting up IDEs (PyCharm, VS Code)
- Virtual environments and dependency management
15. Code Quality
- Writing clean and efficient code
- Adhering to PEP 8 standards
- Code reviews and refactoring
Best Resources to learn Python
Python Interview Questions with Answers
Freecodecamp Python ML Course with FREE Certificate
Python for Data Analysis
Python course for beginners by Microsoft
Scientific Computing with Python
Python course by Google
Python Free Resources
Please give us credits while sharing: -> https://t.me/free4unow_backup
ENJOY LEARNING!
15 Best Project Ideas for Python: 👇
Beginner Level:
1. Simple Calculator
2. To-Do List
3. Number Guessing Game
4. Dice Rolling Simulator
5. Word Counter
Intermediate Level:
6. Weather App
7. URL Shortener
8. Movie Recommender System
9. Chatbot
10. Image Caption Generator
Advanced Level:
11. Stock Market Analysis
12. Autonomous Drone Control
13. Music Genre Classification
14. Real-Time Object Detection
15. Natural Language Processing (NLP) Sentiment Analysis
Here you can find essential Python Resources 👇
https://whatsapp.com/channel/0029VaiM08SDuMRaGKd9Wv0L
Like this post for more resources like this ❤️
MIT's "Machine Learning" lecture notes
PDF: https://introml.mit.edu/_static/spring24/LectureNotes/6_390_lecture_notes_spring24.pdf
Use Python to turn messy data into valuable insights!
Here are the main functions you need to know:
1. dropna(): Clean up your dataset by removing missing values. Use df.dropna() to eliminate rows or columns with NaNs and keep your data clean.
2. fillna(): Replace missing values with a specified value or method. With df.fillna(value) you maintain data integrity without losing valuable information.
3. drop_duplicates(): Ensure your data is unique and accurate. Use df.drop_duplicates() to remove duplicate rows and avoid skewing your analysis with redundant data.
4. replace(): Substitute specific values throughout your dataset. df.replace(to_replace, value) allows for efficient correction of errors and standardization of data.
5. astype(): Convert data types for consistency and accuracy. Use df['column'].astype(dtype) to cast columns into the format your analysis needs.
6. apply(): Apply custom functions to your data. df['column'].apply(func) lets you perform complex transformations and calculations; it works with both standard and lambda functions.
7. str.strip(): Clean up text data by removing leading and trailing whitespace. df['column'].str.strip() helps you avoid hard-to-spot errors in string comparisons.
8. value_counts(): Get a quick summary of the frequency of values in a column. df['column'].value_counts() helps you understand the distribution of your data.
9. pd.to_datetime(): Convert strings to datetime objects for accurate date and time manipulation. For time series analysis, pd.to_datetime(df['column']) is often one of the first steps in data preparation.
10. groupby(): Aggregate data based on specific columns. Use df.groupby('column') to perform operations like sum, mean, or count on grouped data.
Learn these functions and you can turn a pile of messy data into the starting point of an impactful analysis.
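A short, self-contained sketch chaining several of these calls on a tiny made-up DataFrame (the column names are invented for illustration):

import pandas as pd

df = pd.DataFrame({
    "name": ["  Ana ", "Ben", "Ben", None],
    "signup": ["2024-01-05", "2024-02-10", "2024-02-10", "2024-03-01"],
    "spend": ["10", "20", "20", None],
})

df["name"] = df["name"].str.strip()                 # 7. remove stray whitespace
df = df.drop_duplicates()                           # 3. drop the repeated row
df["spend"] = df["spend"].astype(float).fillna(0)   # 5 + 2. cast to float, fill missing values
df["signup"] = pd.to_datetime(df["signup"])         # 9. parse dates
df = df.dropna(subset=["name"])                     # 1. drop rows with no name
print(df.groupby(df["signup"].dt.month)["spend"].sum())   # 10. monthly totals
print(df["name"].value_counts())                    # 8. frequency of each name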
Python project-based interview questions for a data analyst role, along with tips and sample answers [Part-1]
1. Data Cleaning and Preprocessing
- Question: Can you walk me through the data cleaning process you followed in a Python-based project?
- Answer: In my project, I used Pandas for data manipulation. First, I handled missing values by imputing them with the median for numerical columns and the most frequent value for categorical columns using fillna(). I also removed outliers by setting a threshold based on the interquartile range (IQR). Additionally, I standardized numerical columns using StandardScaler from Scikit-learn and performed one-hot encoding for categorical variables using Pandas' get_dummies() function.
- Tip: Mention specific functions you used, like dropna(), fillna(), apply(), or replace(), and explain your rationale for selecting each method.
2. Exploratory Data Analysis (EDA)
- Question: How did you perform EDA in a Python project? What tools did you use?
- Answer: I used Pandas for data exploration, generating summary statistics with describe() and checking for correlations with corr(). For visualization, I used Matplotlib and Seaborn to create histograms, scatter plots, and box plots. For instance, I used sns.pairplot() to visually assess relationships between numerical features, which helped me detect potential multicollinearity. Additionally, I applied pivot tables to analyze key metrics by different categorical variables.
- Tip: Focus on how you used visualization tools like Matplotlib, Seaborn, or Plotly, and mention any specific insights you gained from EDA (e.g., data distributions, relationships, outliers).
3. Pandas Operations
- Question: Can you explain a situation where you had to manipulate a large dataset in Python using Pandas?
- Answer: In a project, I worked with a dataset containing over a million rows. I optimized my operations by using vectorized operations instead of Python loops. For example, I used apply() with a lambda function to transform a column, and groupby() to aggregate data by multiple dimensions efficiently. I also leveraged merge() to join datasets on common keys.
- Tip: Emphasize your understanding of efficient data manipulation with Pandas, mentioning functions like groupby(), merge(), concat(), or pivot().
4. Data Visualization
- Question: How do you create visualizations in Python to communicate insights from data?
- Answer: I primarily use Matplotlib and Seaborn for static plots and Plotly for interactive dashboards. For example, in one project, I used sns.heatmap() to visualize the correlation matrix and sns.barplot() to compare values across categories.
- Tip: Mention the specific plots you created and how you customized them (e.g., adding labels, titles, adjusting axis scales). Highlight the importance of clear communication through visualization.
Here is a list of a few projects (found on Kaggle). They cover Basics of Python, Advanced Statistics, Supervised Learning (regression and classification problems) & Data Science.
Please also check the discussions and notebook submissions for different approaches and solutions after you have tried the problems yourself.
1. Basic Python and statistics
Pima Indians :- https://www.kaggle.com/uciml/pima-indians-diabetes-database
Cardio Goodness fit :- https://www.kaggle.com/saurav9786/cardiogoodfitness
Automobile :- https://www.kaggle.com/toramky/automobile-dataset
2. Advanced Statistics
Game of Thrones:-https://www.kaggle.com/mylesoneill/game-of-thrones
World University Ranking:-https://www.kaggle.com/mylesoneill/world-university-rankings
IMDB Movie Dataset:- https://www.kaggle.com/carolzhangdc/imdb-5000-movie-dataset
3. Supervised Learning
a) Regression Problems
How much did it rain :- https://www.kaggle.com/c/how-much-did-it-rain-ii/overview
Inventory Demand:- https://www.kaggle.com/c/grupo-bimbo-inventory-demand
Property Inspection prediction:- https://www.kaggle.com/c/liberty-mutual-group-property-inspection-prediction
Restaurant Revenue prediction:- https://www.kaggle.com/c/restaurant-revenue-prediction/data
IMDB Box office Prediction:-https://www.kaggle.com/c/tmdb-box-office-prediction/overview
b) Classification problems
Employee Access challenge :- https://www.kaggle.com/c/amazon-employee-access-challenge/overview
Titanic :- https://www.kaggle.com/c/titanic
San Francisco crime:- https://www.kaggle.com/c/sf-crime
Customer satisfaction:-https://www.kaggle.com/c/santander-customer-satisfaction
Trip type classification:- https://www.kaggle.com/c/walmart-recruiting-trip-type-classification
Categorize cuisine:- https://www.kaggle.com/c/whats-cooking
4. Some helpful Data science projects for beginners
https://www.kaggle.com/c/house-prices-advanced-regression-techniques
https://www.kaggle.com/c/digit-recognizer
https://www.kaggle.com/c/titanic
5. Intermediate Level Data science Projects
Black Friday Data : https://www.kaggle.com/sdolezel/black-friday
Human Activity Recognition Data : https://www.kaggle.com/uciml/human-activity-recognition-with-smartphones
Trip History Data : https://www.kaggle.com/pronto/cycle-share-dataset
Million Song Data : https://www.kaggle.com/c/msdchallenge
Census Income Data : https://www.kaggle.com/c/census-income/data
Movie Lens Data : https://www.kaggle.com/grouplens/movielens-20m-dataset
Twitter Classification Data : https://www.kaggle.com/c/twitter-sentiment-analysis2
Share with credits: https://t.me/sqlproject
ENJOY LEARNING!