Kaggle Data Hub
29.2K subscribers
933 photos
14 videos
309 files
1.2K links
Your go-to hub for Kaggle datasets: explore, analyze, and leverage data for Machine Learning and Data Science projects.

Admin: @HusseinSheikho || @Hussein_Sheikho
heart+disease.zip
125.9 KB
Dataset name: Heart Disease

This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to date. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1, 2, 3, 4) from absence (value 0). The names and social security numbers of the patients were recently removed from the database and replaced with dummy values. One file has been "processed": the one containing the Cleveland database. All four unprocessed files also exist in this directory. To see Test Costs (donated by Peter Turney), please see the folder "Costs".

From: UCI Machine Learning Repository

https://t.me/datasets1
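The presence/absence split described for the Heart Disease "goal" field can be sketched in a few lines of Python. This is only an illustration: the sample values and the helper name `to_binary` are made up for the example and do not come from the dataset files.

```python
# Hypothetical sample of "goal" values: 0 = no disease, 1-4 = disease present.
goal_values = [0, 2, 1, 0, 4, 3]


def to_binary(goal):
    """Collapse the 0-4 goal field into absence (0) or presence (1)."""
    if goal not in range(5):
        raise ValueError(f"goal must be an integer from 0 to 4, got {goal!r}")
    return 1 if goal > 0 else 0


binary_target = [to_binary(g) for g in goal_values]
print(binary_target)
```

Collapsing the four "presence" levels into one class matches how the description says experiments have treated the Cleveland data; keeping the original 0-4 values would instead make it a multi-class problem.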
dry+bean+dataset.zip
4.5 MB
Dataset name: Dry Bean Dataset

Seven different types of dry beans were used in this research, taking into account the features such as form, shape, type, and ...

From: UCI Machine Learning Repository

#Biology #Classification

https://t.me/datasets1
adult.zip
605.7 KB
Dataset name: Adult

Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))
The prediction task is to determine whether a person makes over 50K a year.

From: UCI Machine Learning Repository

#Social_Science #Classification

https://t.me/datasets1
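The extraction condition quoted for the Adult dataset reads as a simple row filter. The sketch below applies it in plain Python, assuming each record is a dict keyed by the four field names from the condition; the sample records and the helper name `is_clean` are invented for illustration.

```python
def is_clean(record):
    """Apply ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)) to one record."""
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Hypothetical records; only the first satisfies all four conditions.
records = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},
    {"AAGE": 15, "AGI": 5000, "AFNLWGT": 1200, "HRSWK": 10},   # AAGE too low
    {"AAGE": 52, "AGI": 90, "AFNLWGT": 3400, "HRSWK": 35},     # AGI too low
    {"AAGE": 30, "AGI": 25000, "AFNLWGT": 5600, "HRSWK": 0},   # HRSWK is zero
]

clean_records = [r for r in records if is_clean(r)]
print(len(clean_records))
```

The published files have already been filtered this way, so the check is mainly useful for understanding which records were excluded, not as a step to repeat.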
TopSongs.zip
436.2 KB
Dataset name: Top Songs of the World

"Top Songs of the World" is a collection of information about popular songs spanning various decades and genres. The dataset includes details such as the ranking of songs, the respective artists, titles, release years, sales figures, streaming statistics, download counts, radio play metrics, and a numerical rating. This dataset provides insights into the commercial success, digital presence, and overall popularity of each song, offering a comprehensive overview of the music industry's landscape over time. Researchers, analysts, and music enthusiasts can utilize this dataset to explore trends, patterns, and correlations within the context of the featured songs and artists.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Spotify's Greatest.zip
345.4 KB
Dataset name: Spotify's Greatest

"8000+ Iconic Tracks," a meticulously curated collection capturing the essence of musical evolution. From the heartbeats of timeless classics to the vibrant energy of today's chart-toppers, this dataset offers a harmonious blend of tracks that have left an indelible mark on the global soundscape. Sourced from the vast libraries of Spotify, each entry in this dataset is a note in the symphony of streaming history, inviting enthusiasts and data maestros alike to explore the melodies of data-driven insights.

From: Kaggle

https://t.me/datasets1
Road accidents .zip
42.6 MB
Dataset name: Road accidents in the Czech Republic

Detailed dataset of road accidents in the Czech Republic (2016-2022). The police of the Czech Republic regularly gather and release detailed data on traffic incidents throughout the nation, typically on a monthly basis. This dataset covers various aspects such as geographic locations, weather conditions, vehicle types, casualty counts, and vehicle maneuvers. The wealth of information makes it a compelling and extensive dataset for analysis and research purposes.

Format: CSV file

From: Kaggle

https://t.me/datasets1
COVID-19 Global Statistics .zip
10.4 KB
Dataset name: COVID-19 Global Statistics Dataset

The COVID-19 Global Statistics Dataset offers comprehensive insights into the impact of the COVID-19 pandemic worldwide. It provides detailed statistics on COVID-19 cases, deaths, recoveries, testing, and demographic information across various countries and regions. Sourced from reliable sources, including government health departments and international organizations, this dataset serves as a valuable resource for researchers, policymakers, and public health experts to track the progression of the pandemic, analyze trends, and inform evidence-based decision-making.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Diabetes Dataset.zip
8.9 KB
Dataset name: Diabetes Dataset

This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict, based on diagnostic measurements, whether a patient has diabetes.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Drug overdose death (1).zip
582 B
Dataset name: Drug overdose death

Annual number of deaths in the United States from drug overdose per 100,000 people. Overdoses can result from intentional excessive use of a substance, but can also result from 'poisoning' where substances have been altered or mixed, such that the user is unaware of the drug's potency.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Household Power Consumption.zip
9.1 MB
Dataset name: Household Power Consumption

Individual household electric power consumption dataset collected via submeters placed in 3 distinct areas of a home.

Format: CSV file

From: Kaggle

https://t.me/datasets1
ZARA Sales .zip
16.8 KB
Dataset name: ZARA Sales

This Zara sales dataset contains information about product sales from Zara stores over a specific period of time. The dataset includes various attributes relevant to sales, such as product ID, product name, product category, price, sales volume, sales date, and store location. This data can be used to analyze product sales trends, sales performance across different product categories, the effectiveness of promotions, customer purchasing patterns, and other factors that influence Zara's sales performance. Analyzing this dataset can provide valuable insights for Zara management in optimizing marketing strategies, inventory management, and other decision-making processes to enhance revenue and profitability.

Format: CSV file

From: Kaggle

https://t.me/datasets1
gemma-rewrite-nbroad.zip
3.7 MB
Dataset name: gemma rewrite nbroad

2k+ essays rewritten by Gemma. Generation was done using gemma-7b-it on an A100 in bfloat16 using TGI.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Top Classical Composers.zip
3.7 KB
Dataset name: Top Classical Composers

This dataset contains a comprehensive list of the most famous classical composers. It provides insights into the composers' details, such as their best piece and the duration of that piece. The dataset includes each composer's name, nationality, birth year, death year, most famous works, and the duration of their famous piece.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Diabetes.zip
17.5 KB
Dataset name: Diabetes

The Diabetes Dataset contains information about individuals diagnosed with diabetes, including demographic attributes, medical history, and clinical measurements. This dataset serves as a valuable resource for studying diabetes management, risk factors, and predictive modeling for disease outcomes.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Bank Customer Complaint Analysis.zip
20 MB
Dataset name: Bank Customer Complaint Analysis

The objective of this internship project is to develop an NLP model for bank customer complaint analysis to classify complaints into predefined categories based on their textual narratives. By automating the classification process, the project aims to improve efficiency in dispute resolution and enhance customer satisfaction by ensuring timely and accurate handling of complaints.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Movies Youtube.zip
95.5 KB
Dataset name: Movies Youtube Trailers & Sentiment

This dataset is a result of CinemaFuture, a machine learning project focused on predicting the potential success of films. This ML project aims to revolutionize the film industry by providing data-driven insights into what factors contribute to a film's success.

Utilizing comprehensive data encompassing box office earnings, budgets, cast and crew information, genres, release dates, audience ratings, and social media trends, this tool applies advanced feature engineering and machine learning algorithms to predict outcomes. Additionally, this application scrapes YouTube comments from upcoming movies' trailers and conducts sentiment analysis to predict public hype for a movie. Whether it's estimating box office revenue or categorizing a film as a potential hit or flop, the AI-Powered Film Success Predictor offers valuable predictions and insights.

Format: CSV file

From: Kaggle

https://t.me/datasets1
Anime Dataset.zip
216.6 MB
Dataset name: Anime Dataset

The dataset contains 3 files:

animes.csv contains a list of anime, with title, title synonyms, genre, duration, rank, popularity, score, airing date, episodes, and many other important data about individual anime, providing sufficient information about trends over time in important aspects of anime. Rank is stored in float format in the CSV, but it contains only integer values. This is due to NaN values and their representation in pandas.

profiles.csv contains information about users who watch anime, namely username, birth date, gender, and favorite animes list.

reviews.csv contains information about reviews (users x animes), with text review and scores.

Format: CSV file

From: Kaggle

https://t.me/datasets1
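The note about rank being stored as a float can be handled at load time. The sketch below uses only the standard library and assumes the column is literally named `rank` and that missing ranks appear as empty cells; both are assumptions, and the sample rows are made up.

```python
import csv
import io

# Hypothetical excerpt standing in for animes.csv. Only the "title" and
# "rank" fields are mentioned in the description; the rows are invented.
SAMPLE = """title,rank
Example A,1.0
Example B,
Example C,3.0
"""


def parse_rank(raw):
    """Turn a float-formatted rank such as '3.0' into an int, or None if missing."""
    raw = raw.strip()
    if raw == "" or raw.lower() == "nan":
        return None
    value = float(raw)
    if not value.is_integer():
        raise ValueError(f"rank is not a whole number: {raw!r}")
    return int(value)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ranks = [parse_rank(row["rank"]) for row in rows]
print(ranks)
```

When loading the file with pandas instead, casting the column with `astype("Int64")` (the nullable integer dtype) should give a similar result while keeping missing entries as missing.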
Cars Data.zip
1.2 MB
Dataset name: Cars Data

A collection of over 90,000 used cars spanning from the year 1970 to 2024. This dataset offers a comprehensive glimpse into the world of automobiles, providing valuable insights for researchers, enthusiasts, and industry professionals alike.

Format: CSV file

From: Kaggle

https://t.me/datasets1
eCommerce Customer Service Satisfaction.zip
6.3 MB
Dataset name: eCommerce Customer Service Satisfaction

The dataset captures customer satisfaction scores for a one-month period at an e-commerce platform called Shopzilla (a pseudonym). It includes various features such as category and sub-category of interaction, customer remarks, survey response date, category, item price, agent details (name, supervisor, manager), and CSAT score.

Format: CSV file

From: Kaggle

https://t.me/datasets1