Data Science Portfolio - Kaggle Datasets & AI Projects | Artificial Intelligence
Free Datasets For Data Science Projects & Portfolio

Data Cleaning Checklist:

If you're just starting out in the world of data analytics, hopefully this checklist helps demystify the concept of "data cleaning"...

☑ Missing data - Decide whether you're going to omit the data point, estimate the missing value using statistical methods (for example, the column mean or median), or fill it in from an external source.

☑ Duplicate data - Identify duplicate data and what it means in context. Is the duplicate an error that needs to be deleted, or could two identical data points legitimately exist?

☑ Formatting errors - Ensure all data is rounded to the correct decimal place, all data is aligned correctly, and the data format is consistent within each column.

☑ Incorrect data types - Ensure all of your data is pulled as the correct data type (e.g., make sure money values with cents are not read in as integers).

☑ Outliers - Identify data points that are more than 2 standard deviations above or below the mean, and double-check that these values are correct. If they are correct, they may require further investigation.
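Here is a minimal sketch of how the checklist could look in code. It uses only the Python standard library so it runs anywhere; in practice you would probably reach for Pandas. The order records, the column names, and the choice of the median as the fill value are all assumptions made for illustration.

```python
from decimal import Decimal
import statistics

# Hypothetical order records; None marks a missing amount.
RAW_ROWS = [
    {"order_id": 1, "amount": "20.00"},
    {"order_id": 2, "amount": "21.5"},
    {"order_id": 2, "amount": "21.5"},   # exact duplicate of the row above
    {"order_id": 3, "amount": "19.99"},
    {"order_id": 4, "amount": None},     # missing value
    {"order_id": 5, "amount": "22.25"},
    {"order_id": 6, "amount": "20.75"},
    {"order_id": 7, "amount": "18.8"},
    {"order_id": 8, "amount": "500"},    # suspicious value
]

CENT = Decimal("0.01")


def drop_duplicates(rows):
    """Duplicate data: keep the first occurrence of each identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def parse_amounts(rows):
    """Data types and formatting: store money as Decimal with two decimal places."""
    parsed = []
    for row in rows:
        amount = row["amount"]
        if amount is not None:
            amount = Decimal(amount).quantize(CENT)
        parsed.append({**row, "amount": amount})
    return parsed


def fill_missing(rows):
    """Missing data: fill gaps with the median of the known amounts."""
    known = [r["amount"] for r in rows if r["amount"] is not None]
    median = statistics.median(known).quantize(CENT)
    return [
        {**r, "amount": median if r["amount"] is None else r["amount"]}
        for r in rows
    ]


def flag_outliers(rows, threshold=2.0):
    """Outliers: return ids of rows more than `threshold` standard deviations from the mean."""
    values = [float(r["amount"]) for r in rows]
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    return [
        r["order_id"]
        for r, v in zip(rows, values)
        if abs(v - mean) > threshold * spread
    ]


cleaned = fill_missing(parse_amounts(drop_duplicates(RAW_ROWS)))
outlier_ids = flag_outliers(cleaned)
print(len(cleaned), "rows after cleaning; outlier order ids:", outlier_ids)
```

Flagged rows are only candidates for review: as the checklist says, a correct but extreme value needs investigation, not automatic deletion.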
5 Handy Tips to Master Data Science ⬇️

1️⃣ Begin with introductory projects that cover the fundamental concepts of data science, such as data exploration, cleaning, and visualization. These projects will help you get familiar with common data science tools and libraries like Python (Pandas, NumPy, Matplotlib), R, SQL, and Excel.

2️⃣ Look for publicly available datasets from sources like Kaggle and the UCI Machine Learning Repository. Working with real-world data will expose you to the challenges of messy, incomplete, and heterogeneous data, which are common in practical scenarios.

3️⃣ Explore various data science techniques like regression, classification, clustering, and time series analysis. Apply these techniques to different datasets and domains to gain a broader understanding of their strengths, weaknesses, and appropriate use cases.

4️⃣ Work on projects that involve the entire data science lifecycle, from data collection and cleaning to model building, evaluation, and deployment. This will help you understand how the different components of the data science process fit together.

5️⃣ Consistent practice is key to mastering any skill. Set aside dedicated time to work on data science projects, and gradually increase the complexity and scope of your projects as you gain more experience.
🚀 Here are 5 fresh project ideas for Data Analysts 👇

🎯 Airbnb Open Data 🏠
https://www.kaggle.com/datasets/arianazmoudeh/airbnbopendata
💡 This dataset describes the listing activity of homestays in New York City

🎯 Top Spotify songs from 2010-2019 🎵
https://www.kaggle.com/datasets/leonardopena/top-spotify-songs-from-20102019-by-year

🎯 Walmart Store Sales Forecasting 📈
https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data
💡 Use historical markdown data to predict store sales

🎯 Netflix Movies and TV Shows 📺
https://www.kaggle.com/datasets/shivamb/netflix-shows
💡 Listings of movies and TV shows on Netflix - Regularly Updated

🎯 LinkedIn Data Analyst jobs listings 💼
https://www.kaggle.com/datasets/cedricaubin/linkedin-data-analyst-jobs-listings
💡 More than 8400 rows of data analyst jobs from the USA, Canada, and Africa.

ENJOY LEARNING 👍👍
Python-2.pdf
5 MB
Python Tutorial in Jupyter Notebook
🔒 Dataset Name: Spotify Songs Album

🔍 This dataset provides concise details about music tracks and their performance across various platforms. It includes essential information like track name, artist(s), release date, and presence in popular playlists and charts on platforms like Spotify, Apple Music, Deezer, and Shazam. Additionally, it features metrics such as BPM, key, mode, danceability, valence, energy, acousticness, instrumentalness, liveness, and speechiness, which offer insights into the musical characteristics and appeal of each track.

💡 With this data, analysts can evaluate the popularity, genre, and audience engagement of different music offerings across multiple streaming services.

🤌 From: Kaggle

🤖 Size: 47.1 kB
🔒 Dataset Name: Employee Data Analysis

🔍 Unlocking Insights for a Thriving Workplace

🚀 Our extensive collection of datasets provides a deep dive into different aspects of employee engagement and organizational dynamics.

🤌 From: Kaggle

🤖 Size: 120 kB
🔥 Step-by-step Data Analysis Projects with SQL

Below are popular data projects from Kaggle, GitHub, Medium, and YouTube. They will:

- Help you gain skills in working with real data
- Introduce you to SQL for data analysis
- Inspire you to undertake your own data analysis projects



🗺 Real World Fake Data Analysis

🏠 Housing sales in Nashville

🛒 Walmart Sales Analysis SQL Project

🧳 Alex the Analyst SQL Project

🤑 Superstore Sales Analysis using SQL

💸 International Debt Analysis using SQL

⚽️ Soccer Game Analysis using SQL

🌍 World Population Analysis 2015 using SQL

📉 SQL Project for Data Analysis

🚍 Public Transportation Data Analysis using SQL

📸 Instagram User Data Analysis using SQL

🙌 HR Data Analysis using SQL

🎬 Data Analyst Project: Step-by-step analysis with SQL

🎼 Music Store Data Analysis Project Using SQL

✅ Top 10 SQL Projects with Datasets

✅ Roadmap to Master SQL


#DataAnalyst #DataAnalytics #DataAnalysis #data_analyst #sql

If you find this useful, give it a 👍
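To give a feel for what "SQL for data analysis" looks like in these projects, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The `sales` table, its columns, and its rows are made up for illustration and are not taken from any of the projects listed above.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "branch TEXT, product_line TEXT, quantity INTEGER, unit_price REAL)"
)

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("A", "Food", 3, 10.0),
        ("A", "Fashion", 1, 40.0),
        ("B", "Food", 5, 10.0),
        ("B", "Food", 2, 10.0),
        ("B", "Fashion", 2, 40.0),
    ],
)

# A typical analysis question: how many orders and how much revenue per branch?
query = """
SELECT branch,
       COUNT(*)                   AS orders,
       SUM(quantity * unit_price) AS revenue
FROM sales
GROUP BY branch
ORDER BY revenue DESC
"""
revenue_by_branch = conn.execute(query).fetchall()
for branch, orders, revenue in revenue_by_branch:
    print(branch, orders, revenue)

conn.close()
```

The same GROUP BY pattern carries over to most of the sales, HR, and population datasets: pick a dimension, pick a measure, aggregate, and sort.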
cryptos historical data.zip
26.5 MB
Dataset Name: Top 1000 cryptos historical data (daily updates)
Instagram fake spammer genuine accounts.zip
6.8 KB
Dataset Name: Instagram fake spammer genuine accounts
    
Don't forget to check these 10 SQL projects with corresponding datasets that you could use to practice your SQL skills:

1. Analysis of Sales Data:

(https://www.kaggle.com/kyanyoga/sample-sales-data)

2. HR Analytics:

(https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset)

3. Social Media Analytics:

(https://www.kaggle.com/datasets/ramjasmaurya/top-1000-social-media-channels)

4. Financial Data Analysis:

(https://www.kaggle.com/datasets/nitindatta/finance-data)

5. Healthcare Data Analysis:

(https://www.kaggle.com/cdc/mortality)

6. Customer Relationship Management:

(https://www.kaggle.com/pankajjsh06/ibm-watson-marketing-customer-value-data)

7. Web Analytics:

(https://www.kaggle.com/zynicide/wine-reviews)

8. E-commerce Analysis:

(https://www.kaggle.com/olistbr/brazilian-ecommerce)

9. Supply Chain Management:

(https://www.kaggle.com/datasets/harshsingh2209/supply-chain-analysis)

10. Inventory Management:

(https://www.kaggle.com/datasets?search=inventory+management)

Share this channel with your friends 🤩

Join for more -> https://t.me/addlist/ID95piZJZa0wYzk5

ENJOY LEARNING 👍👍
The key to starting your data analysis career:

❌ It's not your education
❌ It's not your experience

It's how you apply these principles:

1. Learn the job by doing
2. Build a portfolio
3. Make yourself known

No one starts as an expert, but everyone can become one.

If you're looking for a career in data analysis, start by:

⟶ Watching videos
⟶ Reading expert advice
⟶ Doing internships
⟶ Building a portfolio
⟶ Learning from seniors

You'll be amazed at how fast you learn and how quickly you become an expert.

So start today and let your data analysis career begin.
Here is a list of a few projects found on Kaggle. They cover basic Python, advanced statistics, supervised learning (regression and classification problems), and data science.

After you have tried each one yourself, check the discussions and notebook submissions for different approaches and solutions.

1. Basic Python and statistics

Pima Indians: https://www.kaggle.com/uciml/pima-indians-diabetes-database
CardioGoodFitness: https://www.kaggle.com/saurav9786/cardiogoodfitness
Automobile: https://www.kaggle.com/toramky/automobile-dataset

2. Advanced statistics

Game of Thrones: https://www.kaggle.com/mylesoneill/game-of-thrones
World University Rankings: https://www.kaggle.com/mylesoneill/world-university-rankings
IMDB Movie Dataset: https://www.kaggle.com/carolzhangdc/imdb-5000-movie-dataset

3. Supervised learning

a) Regression problems

How much did it rain: https://www.kaggle.com/c/how-much-did-it-rain-ii/overview
Inventory demand: https://www.kaggle.com/c/grupo-bimbo-inventory-demand
Property inspection prediction: https://www.kaggle.com/c/liberty-mutual-group-property-inspection-prediction
Restaurant revenue prediction: https://www.kaggle.com/c/restaurant-revenue-prediction/data
TMDB box office prediction: https://www.kaggle.com/c/tmdb-box-office-prediction/overview

b) Classification problems

Employee access challenge: https://www.kaggle.com/c/amazon-employee-access-challenge/overview
Titanic: https://www.kaggle.com/c/titanic
San Francisco crime: https://www.kaggle.com/c/sf-crime
Customer satisfaction: https://www.kaggle.com/c/santander-customer-satisfaction
Trip type classification: https://www.kaggle.com/c/walmart-recruiting-trip-type-classification
Categorize cuisine: https://www.kaggle.com/c/whats-cooking

4. Some helpful data science projects for beginners

https://www.kaggle.com/c/house-prices-advanced-regression-techniques

https://www.kaggle.com/c/digit-recognizer

https://www.kaggle.com/c/titanic

5. Intermediate-level data science projects

Black Friday Data: https://www.kaggle.com/sdolezel/black-friday

Human Activity Recognition Data: https://www.kaggle.com/uciml/human-activity-recognition-with-smartphones

Trip History Data: https://www.kaggle.com/pronto/cycle-share-dataset

Million Song Data: https://www.kaggle.com/c/msdchallenge

Census Income Data: https://www.kaggle.com/c/census-income/data

Movie Lens Data: https://www.kaggle.com/grouplens/movielens-20m-dataset

Twitter Classification Data: https://www.kaggle.com/c/twitter-sentiment-analysis2

Share with credits: https://t.me/sqlproject

ENJOY LEARNING 👍👍
SQL Case Studies for Interviews:

Join for more: https://t.me/sqlanalyst

1. Danny's Diner
Restaurant analytics to understand customer ordering patterns.
Link: https://8weeksqlchallenge.com/case-study-1/

2. Pizza Runner
Pizza shop analytics to optimize the efficiency of the operation.
Link: https://8weeksqlchallenge.com/case-study-2/

3. Foodie-Fi
Analytics on a subscription-based food content platform.
Link: https://lnkd.in/gzB39qAT

4. Data Bank: That's money
Analytics based on customer activity at a digital bank.
Link: https://lnkd.in/gH8pKPyv

5. Data Mart: Fresh is Best
Analytics on an online supermarket.
Link: https://lnkd.in/gC5bkcDf

6. Clique Bait: Attention capturing
Analytics on the seafood industry.
Link: https://lnkd.in/ggP4JiYG

7. Balanced Tree: Clothing Company
Analytics on the sales performance of a clothing store.
Link: https://8weeksqlchallenge.com/case-study-7

8. Fresh Segments: Extract maximum value
Analytics on online advertising.
Link: https://8weeksqlchallenge.com/case-study-8
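As a rough idea of the kind of question these case studies ask, here is a small restaurant-style sketch that joins two tables using Python's built-in sqlite3 module. The table names, columns, and rows are invented for illustration and are not copied from the case studies themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data: a menu with prices and a log of customer orders.
conn.executescript("""
CREATE TABLE menu (product_id INTEGER PRIMARY KEY, product_name TEXT, price INTEGER);
CREATE TABLE sales (customer_id TEXT, order_date TEXT, product_id INTEGER);

INSERT INTO menu VALUES (1, 'sushi', 10), (2, 'curry', 15), (3, 'ramen', 12);
INSERT INTO sales VALUES
    ('A', '2021-01-01', 1),
    ('A', '2021-01-01', 2),
    ('A', '2021-01-07', 2),
    ('B', '2021-01-01', 2),
    ('B', '2021-01-04', 1),
    ('C', '2021-01-01', 3);
""")

# Question 1: how much has each customer spent in total?
total_spent = conn.execute("""
    SELECT s.customer_id, SUM(m.price) AS total_spent
    FROM sales AS s
    JOIN menu AS m ON m.product_id = s.product_id
    GROUP BY s.customer_id
    ORDER BY s.customer_id
""").fetchall()

# Question 2: which item is ordered most often, and how many times?
most_ordered = conn.execute("""
    SELECT m.product_name, COUNT(*) AS times_ordered
    FROM sales AS s
    JOIN menu AS m ON m.product_id = s.product_id
    GROUP BY m.product_name
    ORDER BY times_ordered DESC
    LIMIT 1
""").fetchone()

print(total_spent)
print(most_ordered)
conn.close()
```

In an interview setting, being able to explain why the JOIN comes before the GROUP BY tends to matter as much as getting the numbers right.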
Creating a data science portfolio is a great way to showcase your skills and experience to potential employers. Here are some steps to help you create a strong data science portfolio:

1. Choose relevant projects: Select a few data science projects that demonstrate your skills and interests. These projects can be from your previous work experience, personal projects, or online competitions.

2. Clean and organize your code: Make sure your code is well-documented, organized, and easy to understand. Use comments to explain your thought process and the steps you took in your analysis.

3. Include a variety of projects: Try to include a mix of projects that showcase different aspects of data science, such as data cleaning, exploratory data analysis, machine learning, and data visualization.

4. Create visualizations: Data visualizations can help make your portfolio more engaging and easier to understand. Use tools like Matplotlib, Seaborn, or Tableau to create visually appealing charts and graphs.

5. Write project summaries: For each project, provide a brief summary of the problem you were trying to solve, the dataset you used, the methods you applied, and the results you obtained. Include any insights or recommendations that came out of your analysis.

6. Showcase your technical skills: Highlight the programming languages, libraries, and tools you used in each project. Mention any specific techniques or algorithms you implemented.

7. Link to your code and data: Provide links to your code repositories (e.g., GitHub) and any datasets you used in your projects. This allows potential employers to review your work in more detail.

8. Keep it updated: Regularly update your portfolio with new projects and skills as you gain more experience in data science. This will show that you are actively engaged in the field and continuously improving your skills.
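One way to keep project summaries consistent (steps 5 to 7) is to fill in the same small template for every project. The sketch below is only one possible layout: the `ProjectSummary` class and its field names are hypothetical, and the results text is a placeholder for you to replace with your own findings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProjectSummary:
    """One portfolio entry; the field names are illustrative, not a standard."""
    title: str
    problem: str
    dataset: str
    methods: List[str]
    results: str
    tools: List[str] = field(default_factory=list)
    repo_url: str = ""

    def to_markdown(self) -> str:
        """Render the summary as a short README section."""
        lines = [
            f"## {self.title}",
            "",
            f"**Problem:** {self.problem}",
            f"**Dataset:** {self.dataset}",
            f"**Methods:** {', '.join(self.methods)}",
            f"**Results:** {self.results}",
        ]
        # Optional fields are only shown when they have been filled in.
        if self.tools:
            lines.append(f"**Tools:** {', '.join(self.tools)}")
        if self.repo_url:
            lines.append(f"**Code:** {self.repo_url}")
        return "\n".join(lines)


summary = ProjectSummary(
    title="Titanic survival prediction",
    problem="Predict which passengers survived.",
    dataset="https://www.kaggle.com/c/titanic",
    methods=["data cleaning", "exploratory data analysis", "classification"],
    results="(fill in your own evaluation results here)",
    tools=["Python", "Matplotlib"],
)
readme_section = summary.to_markdown()
print(readme_section)
```

Pasting the rendered text at the top of each repository's README gives every project the same problem, dataset, methods, and results structure.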

So start today and let your data analysis career begin.