Kaggle Data Hub
29.2K subscribers
933 photos
14 videos
309 files
1.2K links
Your go-to hub for Kaggle datasets – explore, analyze, and leverage data for Machine Learning and Data Science projects.

Admin: @HusseinSheikho || @Hussein_Sheikho
Top YouTubers Worldwide.zip
83.4 KB
📦 Datasets name: Top YouTubers Worldwide

💬 This dataset provides detailed metrics and categories for a diverse range of popular YouTube channels. Explore key statistics such as subscriber count, video views, category, and geographical information for each channel. Ideal for analysis and insights into trends within the dynamic landscape of online content creation.

Format: CSV file

🔐 From: Kaggle

🟢 https://t.me/datasets1
👍121😁1
NHANES_2017-2018.zip
11.7 MB
📦 Datasets name: National Health & Nutrition Exam Survey 2017-2018


💬 This is the most recent NHANES dataset whose data collection was not affected by COVID-19.


Format: CSV file

🔐 From: Kaggle

🟢 https://t.me/datasets1
👍8
In which area do you prefer datasets to be placed in the channel?
Anonymous Poll
8%
Sound processing
12%
Video processing
51%
Natural Language Processing
29%
Image Processing
🤔8🔥3🤩2
Car F and P.csv
1.4 MB
📦 Datasets name: Car Features and Prices Dataset


💬 This dataset covers car features such as model, year, engine, and other properties, along with each car's price. It spans 28 years of data, from 1990 to 2017.


Format: CSV file

🔐 From: Kaggle

🟢 https://t.me/datasets1
👍15🤩3
NFLX.csv
18.2 KB
📦 Datasets name: Netflix stock price data for 2023-24.


💬 Netflix's stock prices, spanning 2023 to 2024. It is a useful resource for data analysts and financial experts who want to perform regression analysis, predict future trends, and create data visualizations. With this dataset, users can study how Netflix's stock price changed over time and use that knowledge to inform decisions about investments and financial strategies.


Format: CSV file

🔐 From: Kaggle


🟢 https://t.me/datasets1
👍16
preprocessed_airline_dataset.csv
1.4 MB
📦 Datasets name: British Airways Review Dataset (2012-2023)


💬 Skytrax dataset for British Airways


Format: CSV file

🔐 From: Kaggle


🟢 https://t.me/datasets1
👍12🤩2
breast-cancer-wisconsin-data.csv
122.2 KB
📦 Datasets name: Breast Cancer Dataset [Wisconsin Diagnostic UCI]

💬 Predict Breast Cancer with ML: A guide to the Wisconsin Diagnostic Dataset. Breast cancer occurs when breast cells mutate into cancerous cells that multiply and form tumors. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. Breast cancer typically affects women and people assigned female at birth (AFAB) aged 50 and older, but it can also affect men and people assigned male at birth (AMAB), as well as younger women. Healthcare providers may treat breast cancer with surgery to remove tumors or with treatment to kill cancerous cells.
Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.



Format: CSV file

🔐 From: Kaggle

🟢 https://t.me/datasets1
👍234
Tumor.zip
83.7 MB
📦 Datasets name: Brain Tumor Image DataSet : Semantic Segmentation

💬 The Tumor Segmentation Dataset is designed specifically for the TumorSeg Computer Vision Project, which focuses on semantic segmentation. The project aims to identify tumor regions accurately within medical images.

Format: Image files

🔐 From: Kaggle

🟢 https://t.me/datasets1
👍134💋1
heart+disease.zip
125.9 KB
📦 Datasets name: Heart Disease

💬 This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to date. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0). The names and social security numbers of the patients were recently removed from the database, replaced with dummy values. One file has been "processed", that one containing the Cleveland database. All four unprocessed files also exist in this directory. To see Test Costs (donated by Peter Turney), please see the folder "Costs"
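The presence/absence split described above can be sketched in a few lines of Python. This is only an illustration, not code shipped with the dataset: the function name binarize_goal and the sample values are made up, and the only thing taken from the description is that 0 means absence and 1 to 4 mean presence.

```python
def binarize_goal(goal: int) -> int:
    """Map the 0-4 'goal' value to 0 (absence) or 1 (presence)."""
    if goal not in range(5):
        raise ValueError(f"goal must be an integer from 0 to 4, got {goal!r}")
    return 0 if goal == 0 else 1


# Made-up values for illustration; these are not rows from the dataset.
sample_goals = [0, 2, 4, 0, 1, 3]
binary_labels = [binarize_goal(g) for g in sample_goals]
print(binary_labels)  # [0, 1, 1, 0, 1, 1]
```

Collapsing values 1 to 4 into one class turns the task into binary classification, which matches how the description says the Cleveland experiments were run.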


🔐 From: UCI Machine Learning repository

🟢 https://t.me/datasets1
👍18🔥31
dry+bean+dataset.zip
4.5 MB
📦 Datasets name: Dry Bean Dataset


💬 Seven different types of dry beans were used in this research, taking into account the features such as form, shape, type, and ...

🔐 From: UCI Machine Learning repository

#Biology #Classification

🟢 https://t.me/datasets1
👍11😍21🔥1
adult.zip
605.7 KB
📦 Datasets name: Adult


💬 Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)).
The prediction task is to determine whether a person makes over 50K a year.
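As a rough sketch, the extraction conditions quoted above can be written as a Python predicate. The field names AAGE, AGI, AFNLWGT and HRSWK come from the description; the function name is_clean_record and the sample records are hypothetical, and the real archive may use different column names.

```python
def is_clean_record(record: dict) -> bool:
    """Apply the four extraction conditions quoted in the dataset description."""
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Made-up records for illustration; these are not rows from the dataset.
sample_records = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},
    {"AAGE": 15, "AGI": 5000, "AFNLWGT": 1200, "HRSWK": 10},  # fails AAGE > 16
    {"AAGE": 52, "AGI": 30000, "AFNLWGT": 2000, "HRSWK": 0},  # fails HRSWK > 0
]
clean_records = [r for r in sample_records if is_clean_record(r)]
print(len(clean_records))  # 1
```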

🔐 From: UCI Machine Learning repository

#Social_Science #Classification

🟢 https://t.me/datasets1
👍335❤‍🔥1🔥1🤔1
Kaggle Data Hub
adult.zip
🇷 🇪 🇦 🇨 🇹 🇵 🇱 🇪 🇦 🇸 🇪
18👍7👏1
TopSongs.zip
436.2 KB
📦 Datasets name: Top Songs of the World

💬 "Top Songs of the World" is a collection of information about popular songs spanning various decades and genres. The dataset includes details such as the ranking of songs, the respective artists, titles, release years, sales figures, streaming statistics, download counts, radio play metrics, and a numerical rating. This dataset provides insights into the commercial success, digital presence, and overall popularity of each song, offering a comprehensive overview of the music industry's landscape over time. Researchers, analysts, and music enthusiasts can utilize this dataset to explore trends, patterns, and correlations within the context of the featured songs and artists.



Format: CSV file

🔐 From: Kaggle

🟢 https://t.me/datasets1
👍20
Spotify's Greatest.zip
345.4 KB
🎁 Datasets name: Spotify's Greatest

💬 "8000+ Iconic Tracks," a meticulously curated collection capturing the essence of musical evolution. From the heartbeats of timeless classics to the vibrant energy of today's chart-toppers, this dataset offers a harmonious blend of tracks that have left an indelible mark on the global soundscape. Sourced from the vast libraries of Spotify, each entry in this dataset is a note in the symphony of streaming history, inviting enthusiasts and data maestros alike to explore the melodies of data-driven insights.



🔐 From: Kaggle

🟢 https://t.me/datasets1
👍221
Road accidents .zip
42.6 MB
📦 Datasets name: Road accidents in the Czech Republic


💬 Detailed dataset of road accidents in the Czech Republic (2016-2022). The police of the Czech Republic regularly gather and release detailed data on traffic incidents throughout the nation, typically on a monthly basis. This dataset covers various aspects such as geographic locations, weather conditions, vehicle types, casualty counts, and vehicle maneuvers. The wealth of information makes it a compelling and extensive dataset for analysis and research purposes.


⚙️ Format: CSV file

🔒 From: Kaggle


👀 https://t.me/datasets1
👍12😍41
COVID-19 Global Statistics .zip
10.4 KB
📦 Datasets name: COVID-19 Global Statistics Dataset


🔅 The COVID-19 Global Statistics Dataset offers comprehensive insights into the impact of the COVID-19 pandemic worldwide. It provides detailed statistics on COVID-19 cases, deaths, recoveries, testing, and demographic information across various countries and regions. Sourced from reliable sources, including government health departments and international organizations, this dataset serves as a valuable resource for researchers, policymakers, and public health experts to track the progression of the pandemic, analyze trends, and inform evidence-based decision-making.


⚙️ Format: CSV file

🔒 From: Kaggle


🟢 https://t.me/datasets1
👍18
Diabetes Dataset.zip
8.9 KB
📦 Datasets name: Diabetes Dataset


🔅 This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict based on diagnostic measurements whether a patient has diabetes.


🆗 Format: CSV file

🔐 From: Kaggle


👁 https://t.me/datasets1
👍105🥰1
Drug overdose death (1).zip
582 B
📦 Datasets name: Drug overdose death


🔅 Annual number of deaths in the United States from drug overdose per 100,000 people. Overdoses can result from intentional excessive use of a substance, but can also result from 'poisoning' where substances have been altered or mixed, such that the user is unaware of the drug's potency.
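For readers unfamiliar with the "per 100,000 people" unit, here is a minimal sketch of how such a rate is computed from raw counts. The function name rate_per_100k and the input numbers are hypothetical and do not come from the dataset, which may already store the rate directly.

```python
def rate_per_100k(deaths: int, population: int) -> float:
    """Return the number of deaths per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


# Made-up numbers for illustration only.
example_rate = rate_per_100k(deaths=50, population=1_000_000)
print(example_rate)  # 5.0
```

Expressing deaths as a rate rather than a raw count makes years with different population sizes comparable.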

📶 Format: CSV file

🔒 From: Kaggle


🙁 https://t.me/datasets1
👍121🥰1👌1
Household Power Consumption.zip
9.1 MB
📦 Datasets name: Household Power Consumption


📶 Individual household electric power consumption dataset collected via submeters placed in 3 distinct areas of a home.

🗣 Format: CSV file

🔒 From: Kaggle


🙁 https://t.me/datasets1
👍173🔥1
ZARA Sales .zip
16.8 KB
📦 Datasets name: ZARA Sales


📶 This Zara sales dataset contains information about product sales from Zara stores over a specific period of time. The dataset includes various attributes relevant to sales, such as product ID, product name, product category, price, sales volume, sales date, and store location. This data can be used to analyze product sales trends, sales performance across different product categories, the effectiveness of promotions, customer purchasing patterns, and other factors that influence Zara's sales performance. Analyzing this dataset can provide valuable insights for Zara management in optimizing marketing strategies, inventory management, and other decision-making processes to enhance revenue and profitability.

🌐 Format: CSV file

🛡️ From: Kaggle


🙁 https://t.me/datasets1
👍164🏆3🔥2
gemma-rewrite-nbroad.zip
3.7 MB
📦 Datasets name: gemma rewrite nbroad


☯️ 2k+ essays rewritten by Gemma. Generated with gemma-7b-it on an A100 in bfloat16 using TGI.


🌐 Format: CSV file

🔒 From: Kaggle

😑 https://t.me/datasets1
🔥73👍3👏1