Kaggle Data Hub
Your go-to hub for Kaggle datasets – explore, analyze, and leverage data for Machine Learning and Data Science projects.

Admin: @HusseinSheikho || @Hussein_Sheikho
🟒 Name Of Dataset: HowTo100M

🟒 Description Of Dataset:
HowTo100M is a large-scale dataset of narrated videos with an emphasis on instructional videos where content creators teach complex tasks with an explicit intention of explaining the visual content on screen. HowTo100M features a total of:
- 136M video clips with captions sourced from 1.2M YouTube videos (15 years of video)
- 23k activities from domains such as cooking, hand crafting, personal care, gardening or fitness
Each video is associated with a narration available as subtitles automatically downloaded from YouTube.
Source: HowTo100M

🟒 Official Homepage: https://www.di.ens.fr/willow/research/howto100m/

🟒 Number of articles that used this dataset: 286

🟒 Dataset Loaders:
Not found

🟒 Articles related to the dataset:
πŸ“ VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

πŸ“ VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding

πŸ“ VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

πŸ“ Self-Supervised MultiModal Versatile Networks

πŸ“ Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization

πŸ“ UnLoc: A Unified Framework for Video Localization Tasks

πŸ“ Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

πŸ“ Harvest Video Foundation Models via Efficient Post-Pretraining

πŸ“ InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

πŸ“ InternVideo: General Video Foundation Models via Generative and Discriminative Learning

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀4
🟒 Name Of Dataset: CoQA (Conversational Question Answering Challenge)

🟒 Description Of Dataset:
CoQA is a large-scale dataset for building Conversational Question Answering systems. The goal of the CoQA challenge is to measure the ability of machines to understand a text passage and answer a series of interconnected questions that appear in a conversation. CoQA contains 127,000+ questions with answers collected from 8,000+ conversations. Each conversation is collected by pairing two crowdworkers to chat about a passage in the form of questions and answers. The unique features of CoQA include 1) the questions are conversational; 2) the answers can be free-form text; 3) each answer also comes with an evidence subsequence highlighted in the passage; and 4) the passages are collected from seven diverse domains. CoQA has a lot of challenging phenomena not present in existing reading comprehension datasets, e.g., coreference and pragmatic reasoning.
Source: https://stanfordnlp.github.io/coqa/

🟒 Official Homepage: https://stanfordnlp.github.io/coqa/

🟒 Number of articles that used this dataset: 277

🟒 Dataset Loaders:
huggingface/datasets (coqa):
https://huggingface.co/datasets/coqa

huggingface/datasets (pcmr):
https://huggingface.co/datasets/Ruohao/pcmr

huggingface/datasets (coqa):
https://huggingface.co/datasets/stanfordnlp/coqa

facebookresearch/ParlAI:
https://parl.ai/docs/tasks.html#conversational-question-answering-challenge

activeloopai/Hub:
https://docs.activeloop.ai/datasets/coqa-dataset

tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/coqa
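
A minimal loading sketch (not from the dataset authors) using the huggingface/datasets entry above; the split and field names follow the Hub dataset card and may differ in mirrored copies:

# Sketch: load CoQA via the stanfordnlp/coqa loader listed above.
# Assumes the `datasets` package is installed (pip install datasets).
from datasets import load_dataset

coqa = load_dataset("stanfordnlp/coqa")        # splits: train / validation
sample = coqa["train"][0]
print(sample["story"][:200])                   # the passage the conversation is about
print(sample["questions"][0])                  # first question in the conversation
print(sample["answers"]["input_text"][0])      # corresponding free-form answer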

🟒 Articles related to the dataset:
πŸ“ MVP: Multi-task Supervised Pre-training for Natural Language Generation

πŸ“ BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

πŸ“ Language Models are Unsupervised Multitask Learners

πŸ“ Unified Language Model Pre-training for Natural Language Understanding and Generation

πŸ“ Language Models are Few-Shot Learners

πŸ“ UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning

πŸ“ Pre-Training with Whole Word Masking for Chinese BERT

πŸ“ StarCoder: may the source be with you!

πŸ“ ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

πŸ“ Natural Questions: a Benchmark for Question Answering Research

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀2
🟒 Name Of Dataset: AISHELL-1

🟒 Description Of Dataset:
AISHELL-1 is a corpus for speech recognition research and for building speech recognition systems for Mandarin.
Source: AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

🟒 Official Homepage: http://www.openslr.org/33/

🟒 Number of articles that used this dataset: 197

🟒 Dataset Loaders:
Not found

🟒 Articles related to the dataset:
πŸ“ FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

πŸ“ Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition

πŸ“ AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

πŸ“ PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

πŸ“ Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition

πŸ“ FunASR: A Fundamental End-to-End Speech Recognition Toolkit

πŸ“ BAT: Boundary aware transducer for memory-efficient and low-latency ASR

πŸ“ SAN-M: Memory Equipped Self-Attention for End-to-End Speech Recognition

πŸ“ Extremely Low Footprint End-to-End ASR System for Smart Device

πŸ“ Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀3
🟒 Name Of Dataset: IDRiD (Indian Diabetic Retinopathy Image Dataset)

🟒 Description Of Dataset:
The Indian Diabetic Retinopathy Image Dataset (IDRiD) consists of typical diabetic retinopathy lesions and normal retinal structures annotated at a pixel level. The dataset also provides information on the disease severity of diabetic retinopathy and diabetic macular edema for each image. It is well suited to the development and evaluation of image analysis algorithms for early detection of diabetic retinopathy.

🟒 Official Homepage: https://idrid.grand-challenge.org/

🟒 Number of articles that used this dataset: 16

🟒 Dataset Loaders:
milan01234/MachineLearning:
https://github.com/milan01234/MachineLearning

🟒 Articles related to the dataset:
πŸ“ A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks

πŸ“ UniverSeg: Universal Medical Image Segmentation

πŸ“ ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

πŸ“ nnMobileNet: Rethinking CNN for Retinopathy Research

πŸ“ Explainable Diabetic Retinopathy Detection and Retinal Image Generation

πŸ“ Segmentation of Blood Vessels, Optic Disc Localization, Detection of Exudates and Diabetic Retinopathy Diagnosis from Digital Fundus Images

πŸ“ Detection and Classification of Diabetic Retinopathy using Deep Learning Algorithms for Segmentation to Facilitate Referral Recommendation for Test and Treatment Prediction

πŸ“ U-Net with Hierarchical Bottleneck Attention for Landmark Detection in Fundus Images of the Degenerated Retina

πŸ“ Beta-Rank: A Robust Convolutional Filter Pruning Method For Imbalanced Medical Image Analysis

πŸ“ Enhancing pretraining efficiency for medical image segmentation via transferability metrics

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀6
🟒 Name Of Dataset: DeepMind Control Suite

🟒 Description Of Dataset:
The DeepMind Control Suite (DMCS) is a set of simulated continuous control environments with a standardized structure and interpretable rewards. The tasks are written in Python and powered by the MuJoCo physics engine, making them easy to use and modify. Control Suite tasks include Pendulum, Acrobot, Cart-pole, Cart-k-pole, Ball in cup, Point-mass, Reacher, Finger, Hopper, Fish, Cheetah, Walker, Manipulator, Manipulator extra, Stacker, Swimmer, Humanoid, Humanoid_CMU and LQR.
Source: Unsupervised Learning of Object Structure and Dynamics from Videos

🟒 Official Homepage: https://github.com/deepmind/dm_control

🟒 Number of articles that used this dataset: 360

🟒 Dataset Loaders:
deepmind/dm_control:
https://github.com/deepmind/dm_control
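
A minimal usage sketch for the dm_control loader above; the cartpole/swingup pair is just one example task, and it assumes dm_control and MuJoCo are installed:

# Sketch: run a few random actions in a Control Suite task.
import numpy as np
from dm_control import suite

env = suite.load(domain_name="cartpole", task_name="swingup")
spec = env.action_spec()

time_step = env.reset()
for _ in range(10):
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    time_step = env.step(action)
    print(time_step.reward, time_step.observation["position"])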

🟒 Articles related to the dataset:
πŸ“ State Entropy Maximization with Random Encoders for Efficient Exploration

πŸ“ Critic Regularized Regression

πŸ“ The Distracting Control Suite -- A Challenging Benchmark for Reinforcement Learning from Pixels

πŸ“ TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

πŸ“ Unsupervised Learning of Object Structure and Dynamics from Videos

πŸ“ Deep Reinforcement Learning

πŸ“ dm_control: Software and Tasks for Continuous Control

πŸ“ DeepMind Control Suite

πŸ“ CoBERL: Contrastive BERT for Reinforcement Learning

πŸ“ Acme: A Research Framework for Distributed Reinforcement Learning

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀3
🟒 Name Of Dataset: VideoSet

🟒 Description Of Dataset:
VideoSet is a large-scale compressed video quality dataset based on just-noticeable-difference (JND) measurement. The dataset consists of 220 5-second sequences in four resolutions (i.e., 1920Γ—1080, 1280Γ—720, 960Γ—540 and 640Γ—360). Each of the 880 video clips is encoded with the H.264 codec at QP = 1, ..., 51, and the first three JND points are measured with 30+ subjects per clip. The dataset is called "VideoSet", an acronym for "Video Subject Evaluation Test (SET)".

🟒 Official Homepage: https://ieee-dataport.org/documents/videoset

🟒 Number of articles that used this dataset: 12

🟒 Dataset Loaders:
Not found

🟒 Articles related to the dataset:
πŸ“ Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling

πŸ“ VideoSet: A Large-Scale Compressed Video Quality Dataset Based on JND Measurement

πŸ“ Full RGB Just Noticeable Difference (JND) Modelling

πŸ“ A user model for JND-based video quality assessment: theory and applications

πŸ“ Prediction of Satisfied User Ratio for Compressed Video

πŸ“ Analysis and prediction of JND-based video quality model

πŸ“ Subjective Image Quality Assessment with Boosted Triplet Comparisons

πŸ“ Subjective and Objective Analysis of Streamed Gaming Videos

πŸ“ A Framework to Map VMAF with the Probability of Just Noticeable Difference between Video Encoding Recipes

πŸ“ On the benefit of parameter-driven approaches for the modeling and the prediction of Satisfied User Ratio for compressed video

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀7
🟒 Name Of Dataset: iNaturalist

🟒 Description Of Dataset:
The iNaturalist 2017 dataset (iNat) contains 675,170 training and validation images from 5,089 natural fine-grained categories. Those categories belong to 13 super-categories including Plantae (Plant), Insecta (Insect), Aves (Bird), Mammalia (Mammal), and so on. The iNat dataset is highly imbalanced, with dramatically different numbers of images per category. For example, the largest super-category β€œPlantae (Plant)” has 196,613 images from 2,101 categories, whereas the smallest super-category β€œProtozoa” has only 381 images from 4 categories.
Source: Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning

🟒 Official Homepage: https://github.com/visipedia/inat_comp/tree/master/2017

🟒 Number of articles that used this dataset: 600

🟒 Dataset Loaders:
pytorch/vision:
https://pytorch.org/vision/stable/generated/torchvision.datasets.INaturalist.html

tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/i_naturalist2017

visipedia/inat_comp:
https://github.com/visipedia/inat_comp
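
A minimal sketch for the torchvision loader above; the version string "2017" and the target semantics follow the torchvision docs (an assumption worth checking against your torchvision release), and the download is several hundred GB:

# Sketch: load iNaturalist 2017 with torchvision.
import torchvision.transforms as T
from torchvision.datasets import INaturalist

dataset = INaturalist(
    root="data/inat",
    version="2017",        # 675,170 train+val images, 5,089 categories
    transform=T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()]),
    download=True,         # very large download
)
image, target = dataset[0]  # target is the fine-grained category index
print(image.shape, target)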

🟒 Articles related to the dataset:
πŸ“ The iNaturalist Species Classification and Detection Dataset

πŸ“ SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

πŸ“ A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems

πŸ“ Class-Balanced Distillation for Long-Tailed Visual Recognition

πŸ“ Ranking Neural Checkpoints

πŸ“ DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

πŸ“ Going deeper with Image Transformers

πŸ“ ResNet strikes back: An improved training procedure in timm

πŸ“ LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

πŸ“ On Data Scaling in Masked Image Modeling

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀3
🟒 Name Of Dataset: Common Voice

🟒 Description Of Dataset:
Common Voice is an audio dataset consisting of unique MP3 files, each paired with a corresponding text file. There are 9,283 recorded hours in the dataset, of which 7,335 hours are validated, covering 60 languages. The dataset also includes demographic metadata such as age, sex, and accent.

🟒 Official Homepage: https://commonvoice.mozilla.org

🟒 Number of articles that used this dataset: 438

🟒 Dataset Loaders:
huggingface/datasets (common_voice_21_0):
https://huggingface.co/datasets/2Jyq/common_voice_21_0

huggingface/datasets (common_voice_16_0):
https://huggingface.co/datasets/eldad-akhaumere/common_voice_16_0

huggingface/datasets (common_voice_16_0_):
https://huggingface.co/datasets/eldad-akhaumere/common_voice_16_0_

huggingface/datasets (c-v):
https://huggingface.co/datasets/xi0v/c-v

huggingface/datasets (common_voice):
https://huggingface.co/datasets/common_voice

huggingface/datasets (common_voice_5_1):
https://huggingface.co/datasets/mozilla-foundation/common_voice_5_1

huggingface/datasets (common_voice_7_0):
https://huggingface.co/datasets/mozilla-foundation/common_voice_7_0

huggingface/datasets (common_voice_7_0_test):
https://huggingface.co/datasets/anton-l/common_voice_7_0_test

huggingface/datasets (common_voice_7_0_test1):
https://huggingface.co/datasets/anton-l/common_voice_7_0_test1

huggingface/datasets (common_voice_1_0):
https://huggingface.co/datasets/anton-l/common_voice_1_0
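
A minimal sketch for the mozilla-foundation loaders above; Common Voice on the Hub is gated, so this assumes you have accepted the dataset terms and are logged in (huggingface-cli login), and "en" is just an example language code:

# Sketch: stream one English validation example from Common Voice 7.0.
from datasets import load_dataset

cv = load_dataset("mozilla-foundation/common_voice_7_0", "en",
                  split="validation", streaming=True)
example = next(iter(cv))
print(example["sentence"])                  # transcript
print(example["audio"]["sampling_rate"])    # decoded audio metadata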

🟒 Articles related to the dataset:
πŸ“ Unsupervised Cross-lingual Representation Learning for Speech Recognition

πŸ“ Robust Speech Recognition via Large-Scale Weak Supervision

πŸ“ YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

πŸ“ Scaling Speech Technology to 1,000+ Languages

πŸ“ Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

πŸ“ Unsupervised Speech Recognition

πŸ“ Simple and Effective Zero-shot Cross-lingual Phoneme Recognition

πŸ“ Towards End-to-end Unsupervised Speech Recognition

πŸ“ Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀7
🟒 Name Of Dataset: SuperGLUE

🟒 Description Of Dataset:
SuperGLUE is a benchmark dataset designed to pose a more rigorous test of language understanding than GLUE. SuperGLUE has the same high-level motivation as GLUE: to provide a simple, hard-to-game measure of progress toward general-purpose language understanding technologies for English. SuperGLUE follows the basic design of GLUE: it consists of a public leaderboard built around eight language understanding tasks, drawing on existing data, accompanied by a single-number performance metric and an analysis toolkit. However, it improves upon GLUE in several ways:
- More challenging tasks: SuperGLUE retains the two hardest tasks in GLUE. The remaining tasks were identified from those submitted to an open call for task proposals and were selected based on difficulty for current NLP approaches.
- More diverse task formats: the task formats in GLUE are limited to sentence- and sentence-pair classification. The authors expand the set of task formats in SuperGLUE to include coreference resolution and question answering (QA).
- Comprehensive human baselines: the authors include human performance estimates for all benchmark tasks, which verify that substantial headroom exists between a strong BERT-based baseline and human performance.
- Improved code support: SuperGLUE is distributed with a new, modular toolkit for work on pretraining, multi-task learning, and transfer learning in NLP, built around standard tools including PyTorch (Paszke et al., 2017) and AllenNLP (Gardner et al., 2017).
- Refined usage rules: the conditions for inclusion on the SuperGLUE leaderboard were revamped to ensure fair competition, an informative leaderboard, and full credit assignment to data and task creators.

🟒 Official Homepage: https://super.gluebenchmark.com/

🟒 Number of articles that used this dataset: 418

🟒 Dataset Loaders:
huggingface/datasets (superglue):
https://huggingface.co/datasets/Hyukkyu/superglue

huggingface/datasets (super_glue):
https://huggingface.co/datasets/super_glue

huggingface/datasets (test_data):
https://huggingface.co/datasets/zzzzhhh/test_data

huggingface/datasets (super_glue):
https://huggingface.co/datasets/aps/super_glue

huggingface/datasets (test):
https://huggingface.co/datasets/ThierryZhou/test

huggingface/datasets (ceshi0119):
https://huggingface.co/datasets/Xieyiyiyi/ceshi0119

facebookresearch/ParlAI:
https://parl.ai/docs/tasks.html#superglue

tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/super_glue
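
A minimal sketch using the aps/super_glue mirror listed above (the script-based "super_glue" loader may not work on recent versions of the datasets library); "boolq" is one of the eight task configs, alongside cb, copa, multirc, record, rte, wic and wsc:

# Sketch: load one SuperGLUE task (BoolQ) via huggingface/datasets.
from datasets import load_dataset

boolq = load_dataset("aps/super_glue", "boolq")   # splits: train / validation / test
ex = boolq["validation"][0]
print(ex["question"])
print(ex["passage"][:200])
print(ex["label"])                                # 0 = False, 1 = True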

🟒 Articles related to the dataset:
πŸ“ Leveraging redundancy in attention with Reuse Transformers

πŸ“ Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

πŸ“ GLU Variants Improve Transformer

πŸ“ Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

πŸ“ Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT

πŸ“ UL2: Unifying Language Learning Paradigms

πŸ“ Few-shot Learning with Multilingual Language Models

πŸ“ Kosmos-2: Grounding Multimodal Large Language Models to the World

πŸ“ Language Models are Few-Shot Learners

πŸ“ ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀7
🟒 Name Of Dataset: ScanNet

🟒 Description Of Dataset:
ScanNet is an instance-level indoor RGB-D dataset that includes both 2D and 3D data. It is a collection of labeled voxels rather than points or objects. ScanNet v2, the newest version of ScanNet, has collected 1,513 annotated scans with approximately 90% surface coverage. In the semantic segmentation task, this dataset is annotated with 20 classes of 3D voxelized objects.
Source: A Review of Point Cloud Semantic Segmentation

🟒 Official Homepage: http://www.scan-net.org/

🟒 Number of articles that used this dataset: 1574

🟒 Dataset Loaders:
Pointcept/Pointcept:
https://github.com/Pointcept/Pointcept

ScanNet/ScanNet:
http://www.scan-net.org/

🟒 Articles related to the dataset:
πŸ“ Mask R-CNN

πŸ“ ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

πŸ“ NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection

πŸ“ ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

πŸ“ FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection

πŸ“ PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

πŸ“ Kaolin: A PyTorch Library for Accelerating 3D Deep Learning Research

πŸ“ PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

πŸ“ SuperGlue: Learning Feature Matching with Graph Neural Networks

πŸ“ MIMIC-IT: Multi-Modal In-Context Instruction Tuning

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀4
🟒 Name Of Dataset: LIDC-IDRI

🟒 Description Of Dataset:
The LIDC-IDRI dataset contains lesion annotations from four experienced thoracic radiologists. LIDC-IDRI contains 1,018 low-dose lung CT scans from 1,010 lung patients.
Source: A 3D Probabilistic Deep Learning System for Detection and Diagnosis of Lung Cancer Using Low-Dose CT Scans

🟒 Official Homepage: https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI

🟒 Number of articles that used this dataset: 237

🟒 Dataset Loaders:
Shwe234/himanshumajordataset:
https://github.com/Shwe234/himanshumajordataset

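Beyond the repository above, a minimal sketch with the pylidc package (not listed here, so treat it as an assumption) is a common way to query the scans; it assumes the DICOM files were downloaded from TCIA and that ~/.pylidcrc points to them:

# Sketch: query LIDC-IDRI scans with pylidc (pip install pylidc).
import pylidc as pl

scan = pl.query(pl.Scan).filter(pl.Scan.slice_thickness <= 1.0).first()
print(scan.patient_id, scan.slice_thickness)

volume = scan.to_volume()               # 3D numpy array of the CT
nodules = scan.cluster_annotations()    # annotations grouped per nodule
print(volume.shape, len(nodules))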

🟒 Articles related to the dataset:
πŸ“ UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation

πŸ“ Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection

πŸ“ Models Genesis

πŸ“ Models Genesis: Generic Autodidactic Models for 3D Medical Image Analysis

πŸ“ nnDetection: A Self-configuring Method for Medical Object Detection

πŸ“ A Probabilistic U-Net for Segmentation of Ambiguous Images

πŸ“ A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities

πŸ“ Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation

πŸ“ FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings

πŸ“ Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀5
🟒 Name Of Dataset: ADNI (Alzheimer's Disease NeuroImaging Initiative)

🟒 Description Of Dataset:
Alzheimer's Disease Neuroimaging Initiative (ADNI) is a multisite study that aims to improve clinical trials for the prevention and treatment of Alzheimer's disease (AD).[1] This cooperative study combines expertise and funding from the private and public sector to study subjects with AD, as well as those who may develop AD and controls with no signs of cognitive impairment.[2] Researchers at 63 sites in the US and Canada track the progression of AD in the human brain with neuroimaging, biochemical, and genetic biological markers.[2][3] This knowledge helps to find better clinical trials for the prevention and treatment of AD. ADNI has made a global impact,[4] firstly by developing a set of standardized protocols to allow the comparison of results from multiple centers,[4] and secondly by its data-sharing policy, which makes all of the data available without embargo to qualified researchers worldwide.[5] To date, over 1,000 scientific publications have used ADNI data.[6] A number of other initiatives related to AD and other diseases have been designed and implemented using ADNI as a model.[4] ADNI has been running since 2004 and is currently funded until 2021.[7] Source: Wikipedia, https://en.wikipedia.org/wiki/Alzheimer%27s_Disease_Neuroimaging_Initiative

🟒 Official Homepage: http://adni.loni.usc.edu/

🟒 Number of articles that used this dataset: 28

🟒 Dataset Loaders:
Not found

🟒 Articles related to the dataset:
πŸ“ Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation

πŸ“ Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer's Disease

πŸ“ Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge

πŸ“ Alzheimer's Disease Diagnostics by Adaptation of 3D Convolutional Network

πŸ“ An automated machine learning framework to optimize radiomics model construction validated on twelve clinical applications

πŸ“ AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans

πŸ“ The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up

πŸ“ TADPOLE Challenge: Accurate Alzheimer's disease prediction through crowdsourced forecasting of future data

πŸ“ Alzheimer's Disease Brain MRI Classification: Challenges and Insights

πŸ“ Inference of nonlinear causal effects with GWAS summary data

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/DataScienceT
❀6
🟒 Name Of Dataset: MegaDepth

🟒 Description Of Dataset:
The MegaDepth dataset is a dataset for single-view depth prediction that includes 196 different locations reconstructed from COLMAP SfM/MVS.
Source: MegaDepth: Learning Single-View Depth Prediction from Internet Photos

🟒 Official Homepage: http://www.cs.cornell.edu/projects/megadepth/

🟒 Number of articles that used this dataset: 150

🟒 Dataset Loaders:
Not found

🟒 Articles related to the dataset:
πŸ“ Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

πŸ“ Depth Anything V2

πŸ“ Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer

πŸ“ LightGlue: Local Feature Matching at Light Speed

πŸ“ LoFTR: Detector-Free Local Feature Matching with Transformers

πŸ“ 3D Ken Burns Effect from a Single Image

πŸ“ Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

πŸ“ Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image

πŸ“ Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction

πŸ“ MegaDepth: Learning Single-View Depth Prediction from Internet Photos

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀8
🟒 Name Of Dataset: CelebA-HQ

🟒 Description Of Dataset:
The CelebA-HQ dataset is a high-quality version of CelebA that consists of 30,000 images at 1024Γ—1024 resolution.
Source: IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis

🟒 Official Homepage: https://github.com/tkarras/progressive_growing_of_gans

🟒 Number of articles that used this dataset: 946

🟒 Dataset Loaders:
tkarras/progressive_growing_of_gans:
https://github.com/tkarras/progressive_growing_of_gans

tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/celeb_a_hq
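
A minimal sketch for the tensorflow/datasets entry above; note that celeb_a_hq is a manual-download dataset, so the catalog page's preparation instructions must be followed before this will run:

# Sketch: read CelebA-HQ at 256x256 through TFDS.
# The prepared files must already be in the TFDS manual directory
# (by default ~/tensorflow_datasets/downloads/manual/).
import tensorflow_datasets as tfds

ds = tfds.load("celeb_a_hq/256", split="train")
for example in ds.take(1):
    print(example["image"].shape)   # (256, 256, 3)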

🟒 Articles related to the dataset:
πŸ“ High-Resolution Image Synthesis with Latent Diffusion Models

πŸ“ DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection

πŸ“ Towards Real-World Blind Face Restoration with Generative Facial Prior

πŸ“ Towards Robust Blind Face Restoration with Codebook Lookup Transformer

πŸ“ A Style-Based Generator Architecture for Generative Adversarial Networks

πŸ“ Vector-quantized Image Modeling with Improved VQGAN

πŸ“ Resolution-robust Large Mask Inpainting with Fourier Convolutions

πŸ“ GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond

πŸ“ Texture Memory-Augmented Deep Patch-Based Image Inpainting

πŸ“ High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀4
This channel is for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ programming Languages

βœ… https://t.me/addlist/8_rRW2scgfRhOTc0

βœ… https://t.me/Codeprogrammer
❀2
🟒 Name Of Dataset: BlendedMVS

🟒 Description Of Dataset:
BlendedMVS is a large-scale dataset that provides sufficient training ground truth for learning-based multi-view stereo (MVS). The dataset was created by applying a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes. These mesh models were then rendered into color images and depth maps.
Source: BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks

🟒 Official Homepage: https://github.com/YoYo000/BlendedMVS

🟒 Number of articles that used this dataset: 104

🟒 Dataset Loaders:
YoYo000/BlendedMVS:
https://github.com/YoYo000/BlendedMVS

🟒 Articles related to the dataset:
πŸ“ Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

πŸ“ Depth Anything V2

πŸ“ NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

πŸ“ Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

πŸ“ Volume Rendering of Neural Implicit Surfaces

πŸ“ Neural Sparse Voxel Fields

πŸ“ BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks

πŸ“ Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction

πŸ“ SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

πŸ“ Geo-Neus: Geometry-Consistent Neural Implicit Surfaces Learning for Multi-view Reconstruction

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀6
🟒 Name Of Dataset: EPIC-KITCHENS-100

🟒 Description Of Dataset:
EPIC-KITCHENS-100 scales up the largest dataset in egocentric vision, EPIC-KITCHENS. It is a collection of 100 hours, 20M frames, and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments using head-mounted cameras. Compared to its previous version (EPIC-KITCHENS-55), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotation of fine-grained actions (+128% more action segments). The collection also enables evaluating the "test of time", i.e., whether models trained on data collected in 2018 can generalise to new footage collected under the same hypotheses albeit "two years on". The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), as well as unsupervised domain adaptation for action recognition. For each challenge, the authors define the task and provide baselines and evaluation metrics.

🟒 Official Homepage: https://epic-kitchens.github.io/2021

🟒 Number of articles that used this dataset: 160

🟒 Dataset Loaders:
Not found

🟒 Articles related to the dataset:
πŸ“ MoViNets: Mobile Video Networks for Efficient Video Recognition

πŸ“ Domain-Adversarial Training of Neural Networks

πŸ“ BMN: Boundary-Matching Network for Temporal Action Proposal Generation

πŸ“ Adversarial Discriminative Domain Adaptation

πŸ“ Attention Bottlenecks for Multimodal Fusion

πŸ“ Audiovisual Masked Autoencoders

πŸ“ Multiview Transformers for Video Recognition

πŸ“ ViViT: A Video Vision Transformer

πŸ“ Magma: A Foundation Model for Multimodal AI Agents

πŸ“ V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
❀3
🟒 Name Of Dataset: CARLA (Car Learning to Act)

🟒 Description Of Dataset:
CARLA (CAR Learning to Act) is an open simulator for urban driving, developed as an open-source layer over Unreal Engine 4. It provides sensors in the form of RGB cameras (with customizable positions), ground-truth depth maps, ground-truth semantic segmentation maps with 12 semantic classes designed for driving (road, lane marking, traffic sign, sidewalk, and so on), bounding boxes for dynamic objects in the environment, and measurements of the agent itself (vehicle location and orientation).
Source: Synthetic Data for Deep Learning

🟒 Official Homepage: https://carla.org/

🟒 Number of articles that used this dataset: 1316

🟒 Dataset Loaders:
joedlopes/carla-simulator-multimodal-sensing:
https://github.com/joedlopes/carla-simulator-multimodal-sensing
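
A minimal sketch of the CARLA Python client API (rather than the repository above); it assumes a simulator server is already running on localhost:2000 with a matching carla package installed:

# Sketch: connect to a running CARLA server and attach an RGB camera to a vehicle.
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

blueprints = world.get_blueprint_library()
vehicle_bp = blueprints.filter("vehicle.*")[0]
spawn_point = world.get_map().get_spawn_points()[0]
vehicle = world.spawn_actor(vehicle_bp, spawn_point)

camera_bp = blueprints.find("sensor.camera.rgb")
camera_tf = carla.Transform(carla.Location(x=1.5, z=2.4))
camera = world.spawn_actor(camera_bp, camera_tf, attach_to=vehicle)
camera.listen(lambda image: image.save_to_disk("out/%06d.png" % image.frame))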

🟒 Articles related to the dataset:
πŸ“ Synthetic Dataset Generation for Adversarial Machine Learning Research

πŸ“ End-to-end Autonomous Driving: Challenges and Frontiers

πŸ“ OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving

πŸ“ On the Practicality of Deterministic Epistemic Uncertainty

πŸ“ D4RL: Datasets for Deep Data-Driven Reinforcement Learning

πŸ“ Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)

πŸ“ Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving

πŸ“ Label Efficient Visual Abstractions for Autonomous Driving

πŸ“ Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

πŸ“ TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

==================================
πŸ”΄ For more datasets resources:
βœ“ https://t.me/Datasets1
πŸ‘1