π’ Name Of Dataset: VideoSet
π’ Description Of Dataset:
VideoSet is a large-scale compressed video quality dataset based on just-noticeable-difference (JND) measurement. The dataset consists of 220 five-second sequences in four resolutions (1920×1080, 1280×720, 960×540 and 640×360). Each of the 880 video clips is encoded with the H.264 codec at QP = 1, ..., 51, and the first three JND points are measured with 30+ subjects per clip. The dataset is called "VideoSet", an acronym for "Video Subject Evaluation Test (SET)".
π’ Official Homepage: https://ieee-dataport.org/documents/videoset
π’ Number of articles that used this dataset: 12
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling
π VideoSet: A Large-Scale Compressed Video Quality Dataset Based on JND Measurement
π Full RGB Just Noticeable Difference (JND) Modelling
π A user model for JND-based video quality assessment: theory and applications
π Prediction of Satisfied User Ratio for Compressed Video
π Analysis and prediction of JND-based video quality model
π Subjective Image Quality Assessment with Boosted Triplet Comparisons
π Subjective and Objective Analysis of Streamed Gaming Videos
π A Framework to Map VMAF with the Probability of Just Noticeable Difference between Video Encoding Recipes
π On the benefit of parameter-driven approaches for the modeling and the prediction of Satisfied User Ratio for compressed video
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: iNaturalist
π’ Description Of Dataset:
The iNaturalist 2017 dataset (iNat) contains 675,170 training and validation images from 5,089 natural fine-grained categories. These categories belong to 13 super-categories, including Plantae (Plant), Insecta (Insect), Aves (Bird), and Mammalia (Mammal). The iNat dataset is highly imbalanced, with dramatically different numbers of images per category. For example, the largest super-category, Plantae (Plant), has 196,613 images from 2,101 categories, whereas the smallest super-category, Protozoa, has only 381 images from 4 categories. Source: Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
π’ Official Homepage: https://github.com/visipedia/inat_comp/tree/master/2017
π’ Number of articles that used this dataset: 600
π’ Dataset Loaders:
pytorch/vision:
https://pytorch.org/vision/stable/generated/torchvision.datasets.INaturalist.html
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/i_naturalist2017
visipedia/inat_comp:
https://github.com/visipedia/inat_comp
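π’ Example Loader Usage (sketch):
A minimal sketch of loading iNaturalist 2017 through the torchvision loader listed above; it assumes a torchvision version that ships datasets.INaturalist, and the root directory is arbitrary (the download is several hundred GB).
```python
import torchvision
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# download=True fetches the 2017 archives into root on first use (very large download)
dataset = torchvision.datasets.INaturalist(
    root="data/inat", version="2017", transform=transform, download=True
)
image, target = dataset[0]   # transformed image tensor and integer class label
print(len(dataset), image.shape, target)
```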
π’ Articles related to the dataset:
π The iNaturalist Species Classification and Detection Dataset
π SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization
π A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems
π Class-Balanced Distillation for Long-Tailed Visual Recognition
π Ranking Neural Checkpoints
π DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
π Going deeper with Image Transformers
π ResNet strikes back: An improved training procedure in timm
π LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
π On Data Scaling in Masked Image Modeling
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: Common Voice
π’ Description Of Dataset:
Common Voice is an audio dataset consisting of MP3 recordings with corresponding text transcripts. There are 9,283 recorded hours in the dataset, of which 7,335 validated hours cover 60 languages. The dataset also includes demographic metadata such as age, sex, and accent.
π’ Official Homepage: https://commonvoice.mozilla.org
π’ Number of articles that used this dataset: 438
π’ Dataset Loaders:
huggingface/datasets (common_voice_21_0):
https://huggingface.co/datasets/2Jyq/common_voice_21_0
huggingface/datasets (common_voice_16_0):
https://huggingface.co/datasets/eldad-akhaumere/common_voice_16_0
huggingface/datasets (common_voice_16_0_):
https://huggingface.co/datasets/eldad-akhaumere/common_voice_16_0_
huggingface/datasets (c-v):
https://huggingface.co/datasets/xi0v/c-v
huggingface/datasets (common_voice):
https://huggingface.co/datasets/common_voice
huggingface/datasets (common_voice_5_1):
https://huggingface.co/datasets/mozilla-foundation/common_voice_5_1
huggingface/datasets (common_voice_7_0):
https://huggingface.co/datasets/mozilla-foundation/common_voice_7_0
huggingface/datasets (common_voice_7_0_test):
https://huggingface.co/datasets/anton-l/common_voice_7_0_test
huggingface/datasets (common_voice_7_0_test1):
https://huggingface.co/datasets/anton-l/common_voice_7_0_test1
huggingface/datasets (common_voice_1_0):
https://huggingface.co/datasets/anton-l/common_voice_1_0
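π’ Example Loader Usage (sketch):
A hedged sketch of loading one of the Hugging Face repos listed above. The mozilla-foundation releases are gated, so this assumes you have accepted the dataset terms on the Hub and are logged in (huggingface-cli login); the "en" language config is just an example.
```python
from datasets import load_dataset, Audio

cv = load_dataset("mozilla-foundation/common_voice_7_0", "en", split="validation")
cv = cv.cast_column("audio", Audio(sampling_rate=16_000))  # decode/resample on access

sample = cv[0]
print(sample["sentence"])              # transcript text
print(sample["audio"]["array"].shape)  # waveform as a NumPy array
```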
π’ Articles related to the dataset:
π Unsupervised Cross-lingual Representation Learning for Speech Recognition
π Robust Speech Recognition via Large-Scale Weak Supervision
π YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
π Scaling Speech Technology to 1,000+ Languages
π Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
π Unsupervised Speech Recognition
π Simple and Effective Zero-shot Cross-lingual Phoneme Recognition
π Towards End-to-end Unsupervised Speech Recognition
π Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: SuperGLUE
π’ Description Of Dataset:
SuperGLUE is a benchmark dataset designed to pose a more rigorous test of language understanding than GLUE. It has the same high-level motivation as GLUE: to provide a simple, hard-to-game measure of progress toward general-purpose language understanding technologies for English. SuperGLUE follows the basic design of GLUE: it consists of a public leaderboard built around eight language understanding tasks, drawing on existing data, accompanied by a single-number performance metric and an analysis toolkit. However, it improves upon GLUE in several ways:
- More challenging tasks: SuperGLUE retains the two hardest tasks in GLUE. The remaining tasks were identified from an open call for task proposals and were selected based on difficulty for current NLP approaches.
- More diverse task formats: the task formats in GLUE are limited to sentence- and sentence-pair classification; SuperGLUE expands the set of task formats to include coreference resolution and question answering (QA).
- Comprehensive human baselines: human performance estimates are included for all benchmark tasks, verifying that substantial headroom exists between a strong BERT-based baseline and human performance.
- Improved code support: SuperGLUE is distributed with a new, modular toolkit for work on pretraining, multi-task learning, and transfer learning in NLP, built around standard tools including PyTorch (Paszke et al., 2017) and AllenNLP (Gardner et al., 2017).
- Refined usage rules: the conditions for inclusion on the SuperGLUE leaderboard were revamped to ensure fair competition, an informative leaderboard, and full credit assignment to data and task creators.
π’ Official Homepage: https://super.gluebenchmark.com/
π’ Number of articles that used this dataset: 418
π’ Dataset Loaders:
huggingface/datasets (superglue):
https://huggingface.co/datasets/Hyukkyu/superglue
huggingface/datasets (super_glue):
https://huggingface.co/datasets/super_glue
huggingface/datasets (test_data):
https://huggingface.co/datasets/zzzzhhh/test_data
huggingface/datasets (super_glue):
https://huggingface.co/datasets/aps/super_glue
huggingface/datasets (test):
https://huggingface.co/datasets/ThierryZhou/test
huggingface/datasets (ceshi0119):
https://huggingface.co/datasets/Xieyiyiyi/ceshi0119
facebookresearch/ParlAI:
https://parl.ai/docs/tasks.html#superglue
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/super_glue
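π’ Example Loader Usage (sketch):
A minimal sketch using the huggingface/datasets "super_glue" loader listed above; each SuperGLUE task is a separate configuration (e.g. "boolq", "cb", "rte"), and recent datasets versions may additionally require trust_remote_code=True for script-based loaders.
```python
from datasets import load_dataset

boolq = load_dataset("super_glue", "boolq")
print(boolq)              # DatasetDict with train/validation/test splits
print(boolq["train"][0])  # {'question': ..., 'passage': ..., 'idx': ..., 'label': ...}
```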
π’ Articles related to the dataset:
π Leveraging redundancy in attention with Reuse Transformers
π Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
π GLU Variants Improve Transformer
π Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
π Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
π UL2: Unifying Language Learning Paradigms
π Few-shot Learning with Multilingual Language Models
π Kosmos-2: Grounding Multimodal Large Language Models to the World
π Language Models are Few-Shot Learners
π ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: ScanNet
π’ Description Of Dataset:
ScanNet is an instance-level indoor RGB-D dataset that includes both 2D and 3D data. It is a collection of labeled voxels rather than points or objects. ScanNet v2, the newest version, contains 1,513 annotated scans with approximately 90% surface coverage. For the semantic segmentation task, the dataset is annotated with 20 classes of 3D voxelized objects. Source: A Review of Point Cloud Semantic Segmentation
π’ Official Homepage: http://www.scan-net.org/
π’ Number of articles that used this dataset: 1574
π’ Dataset Loaders:
Pointcept/Pointcept:
https://github.com/Pointcept/Pointcept
ScanNet/ScanNet:
http://www.scan-net.org/
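π’ Example Loader Usage (sketch):
ScanNet access requires agreeing to the terms of use and downloading with the official script, so the path below is hypothetical; this sketch just reads one reconstructed scan mesh with Open3D and samples a point cloud from it.
```python
import open3d as o3d

# hypothetical path following the release layout of a downloaded scan
mesh = o3d.io.read_triangle_mesh("scans/scene0000_00/scene0000_00_vh_clean_2.ply")
print(mesh)  # vertex and triangle counts

# sample points for point-cloud segmentation pipelines
pcd = mesh.sample_points_uniformly(number_of_points=100_000)
o3d.visualization.draw_geometries([pcd])
```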
π’ Articles related to the dataset:
π Mask R-CNN
π ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
π NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection
π ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
π FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
π PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
π Kaolin: A PyTorch Library for Accelerating 3D Deep Learning Research
π PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
π SuperGlue: Learning Feature Matching with Graph Neural Networks
π MIMIC-IT: Multi-Modal In-Context Instruction Tuning
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: LIDC-IDRI
π’ Description Of Dataset:
The LIDC-IDRI dataset contains lesion annotations from four experienced thoracic radiologists. It comprises 1,018 low-dose lung CT scans from 1,010 patients. Source: A 3D Probabilistic Deep Learning System for Detection and Diagnosis of Lung Cancer Using Low-Dose CT Scans
π’ Official Homepage: https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI
π’ Number of articles that used this dataset: 237
π’ Dataset Loaders:
Shwe234/himanshumajordataset:
https://github.com/Shwe234/himanshumajordataset
your-username/your-repository:
https://github.com/your-username/your-repository
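π’ Example Loader Usage (sketch):
The repositories above are not standard loaders, so here is a hedged, generic sketch of stacking one downloaded LIDC-IDRI CT series (DICOM files from TCIA) into a volume with pydicom; the directory path is hypothetical.
```python
from pathlib import Path

import numpy as np
import pydicom

series_dir = Path("path/to/LIDC-IDRI-0001_ct_series")  # hypothetical path to one CT series
slices = [pydicom.dcmread(p) for p in series_dir.glob("*.dcm")]
slices.sort(key=lambda s: float(s.ImagePositionPatient[2]))  # order slices along the z axis

volume = np.stack([s.pixel_array for s in slices]).astype(np.int16)
print(volume.shape)  # (num_slices, rows, cols), typically (N, 512, 512) for CT
```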
π’ Articles related to the dataset:
π UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation
π Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection
π Models Genesis
π Models Genesis: Generic Autodidactic Models for 3D Medical Image Analysis
π nnDetection: A Self-configuring Method for Medical Object Detection
π A Probabilistic U-Net for Segmentation of Ambiguous Images
π A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities
π Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation
π FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings
π Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: ADNI (Alzheimer's Disease NeuroImaging Initiative)
π’ Description Of Dataset:
Alzheimer's Disease Neuroimaging Initiative (ADNI) is a multisite study that aims to improve clinical trials for the prevention and treatment of Alzheimer's disease (AD).[1] This cooperative study combines expertise and funding from the private and public sectors to study subjects with AD, as well as those who may develop AD, and controls with no signs of cognitive impairment.[2] Researchers at 63 sites in the US and Canada track the progression of AD in the human brain with neuroimaging, biochemical, and genetic biological markers.[2][3] This knowledge helps in designing better clinical trials for the prevention and treatment of AD. ADNI has made a global impact,[4] firstly by developing a set of standardized protocols to allow the comparison of results from multiple centers,[4] and secondly through its data-sharing policy, which makes all of the data available without embargo to qualified researchers worldwide.[5] To date, over 1,000 scientific publications have used ADNI data.[6] A number of other initiatives related to AD and other diseases have been designed and implemented using ADNI as a model.[4] ADNI has been running since 2004 and is currently funded until 2021.[7] Source: Wikipedia, https://en.wikipedia.org/wiki/Alzheimer%27s_Disease_Neuroimaging_Initiative
π’ Official Homepage: http://adni.loni.usc.edu/
π’ Number of articles that used this dataset: 28
π’ Dataset Loaders:
Not found
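π’ Example Loader Usage (sketch):
ADNI data requires an approved access application, so there is no public loader; once downloaded, the MRI volumes come in standard formats, and a NIfTI file can be read with nibabel. The filename below is hypothetical.
```python
import nibabel as nib
import numpy as np

img = nib.load("ADNI_subject_T1w.nii.gz")  # hypothetical filename
data = np.asanyarray(img.dataobj)          # voxel intensities as a NumPy array
print(img.shape, img.header.get_zooms())   # volume shape and voxel spacing (mm)
```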
π’ Articles related to the dataset:
π Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation
π Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer's Disease
π Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge
π Alzheimer's Disease Diagnostics by Adaptation of 3D Convolutional Network
π An automated machine learning framework to optimize radiomics model construction validated on twelve clinical applications
π AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans
π The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up
π TADPOLE Challenge: Accurate Alzheimer's disease prediction through crowdsourced forecasting of future data
π Alzheimer's Disease Brain MRI Classification: Challenges and Insights
π Inference of nonlinear causal effects with GWAS summary data
==================================
π΄ For more datasets resources:
β https://t.me/DataScienceT
π’ Name Of Dataset: MegaDepth
π’ Description Of Dataset:
The MegaDepth dataset is a dataset for single-view depth prediction that includes 196 different locations reconstructed via COLMAP SfM/MVS. Source: MegaDepth: Learning Single-View Depth Prediction from Internet Photos
π’ Official Homepage: http://www.cs.cornell.edu/projects/megadepth/
π’ Number of articles that used this dataset: 150
π’ Dataset Loaders:
Not found
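π’ Example Loader Usage (sketch):
A heavily hedged sketch: in the official MegaDepth v1 release the depth maps are, to the best of our knowledge, distributed as HDF5 files with a "depth" dataset (as read by the original training code); the file path below is hypothetical and both it and the key should be verified against your download.
```python
import h5py
import numpy as np

with h5py.File("path/to/megadepth_depth_map.h5", "r") as f:  # hypothetical path
    depth = np.array(f["depth"])  # per-pixel depth; 0 where no reconstruction exists
print(depth.shape, float(depth.min()), float(depth.max()))
```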
π’ Articles related to the dataset:
π Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
π Depth Anything V2
π Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
π LightGlue: Local Feature Matching at Light Speed
π LoFTR: Detector-Free Local Feature Matching with Transformers
π 3D Ken Burns Effect from a Single Image
π Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
π Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image
π Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction
π MegaDepth: Learning Single-View Depth Prediction from Internet Photos
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: CelebA-HQ
π’ Description Of Dataset:
The CelebA-HQ dataset is a high-quality version of CelebA consisting of 30,000 images at 1024×1024 resolution. Source: IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis
π’ Official Homepage: https://github.com/tkarras/progressive_growing_of_gans
π’ Number of articles that used this dataset: 946
π’ Dataset Loaders:
tkarras/progressive_growing_of_gans:
https://github.com/tkarras/progressive_growing_of_gans
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/celeb_a_hq
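π’ Example Loader Usage (sketch):
A hedged sketch via the tensorflow/datasets catalog entry listed above. The TFDS celeb_a_hq builder expects the data to be prepared manually as described on its catalog page, and the "1024" config (1024x1024 images) is an assumption to verify there.
```python
import tensorflow_datasets as tfds

ds = tfds.load("celeb_a_hq/1024", split="train", shuffle_files=True)
for example in ds.take(1):
    image = example["image"]  # uint8 tensor of shape (1024, 1024, 3)
    print(image.shape)
```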
π’ Articles related to the dataset:
π High-Resolution Image Synthesis with Latent Diffusion Models
π DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection
π Towards Real-World Blind Face Restoration with Generative Facial Prior
π Towards Robust Blind Face Restoration with Codebook Lookup Transformer
π A Style-Based Generator Architecture for Generative Adversarial Networks
π Vector-quantized Image Modeling with Improved VQGAN
π Resolution-robust Large Mask Inpainting with Fourier Convolutions
π GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond
π Texture Memory-Augmented Deep Patch-Based Image Inpainting
π High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: BlendedMVS
π’ Description Of Dataset:
BlendedMVS is a large-scale dataset providing sufficient training ground truth for learning-based multi-view stereo (MVS). The dataset was created by applying a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes; these mesh models were then rendered into color images and depth maps. Source: BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks
π’ Official Homepage: https://github.com/YoYo000/BlendedMVS
π’ Number of articles that used this dataset: 104
π’ Dataset Loaders:
YoYo000/BlendedMVS:
https://github.com/YoYo000/BlendedMVS
π’ Articles related to the dataset:
π Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
π Depth Anything V2
π NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
π Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
π Volume Rendering of Neural Implicit Surfaces
π Neural Sparse Voxel Fields
π BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks
π Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction
π SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views
π Geo-Neus: Geometry-Consistent Neural Implicit Surfaces Learning for Multi-view Reconstruction
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: EPIC-KITCHENS-100
π’ Description Of Dataset:
EPIC-KITCHENS-100 scales up the largest dataset in egocentric vision, EPIC-KITCHENS. It is a collection of 100 hours, 20M frames, and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments, recorded with head-mounted cameras. Compared to its previous version (EPIC-KITCHENS-55), EPIC-KITCHENS-100 was annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotation of fine-grained actions (+128% more action segments). The collection also enables evaluating the "test of time", i.e. whether models trained on data collected in 2018 can generalise to new footage collected under the same hypotheses "two years on". The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), and unsupervised domain adaptation for action recognition. For each challenge, the authors define the task and provide baselines and evaluation metrics.
π’ Official Homepage: https://epic-kitchens.github.io/2021
π’ Number of articles that used this dataset: 160
π’ Dataset Loaders:
Not found
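π’ Example Loader Usage (sketch):
There is no packaged loader, but the action annotations are published as CSV files in the authors' annotations repository; the filename and column names below follow that release and are assumptions to check against the repo you clone.
```python
import pandas as pd

ann = pd.read_csv("epic-kitchens-100-annotations/EPIC_100_train.csv")  # assumed filename
print(len(ann), "action segments")
print(ann[["participant_id", "video_id", "narration", "verb", "noun"]].head())
```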
π’ Articles related to the dataset:
π MoViNets: Mobile Video Networks for Efficient Video Recognition
π Domain-Adversarial Training of Neural Networks
π BMN: Boundary-Matching Network for Temporal Action Proposal Generation
π Adversarial Discriminative Domain Adaptation
π Attention Bottlenecks for Multimodal Fusion
π Audiovisual Masked Autoencoders
π Multiview Transformers for Video Recognition
π ViViT: A Video Vision Transformer
π Magma: A Foundation Model for Multimodal AI Agents
π V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: CARLA (Car Learning to Act)
π’ Description Of Dataset:
CARLA (CAR Learning to Act) is an open simulator for urban driving, developed as an open-source layer over Unreal Engine 4. It provides sensors in the form of RGB cameras (with customizable positions), ground-truth depth maps, ground-truth semantic segmentation maps with 12 semantic classes designed for driving (road, lane marking, traffic sign, sidewalk, and so on), bounding boxes for dynamic objects in the environment, and measurements of the agent itself (vehicle location and orientation). Source: Synthetic Data for Deep Learning
π’ Official Homepage: https://carla.org/
π’ Number of articles that used this dataset: 1316
π’ Dataset Loaders:
joedlopes/carla-simulator-multimodal-sensing:
https://github.com/joedlopes/carla-simulator-multimodal-sensing
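π’ Example Loader Usage (sketch):
CARLA is used through its simulator API rather than as a static download. The sketch below assumes a CARLA server is already running locally on the default port (2000) and that the matching "carla" Python package is installed; it spawns a vehicle and attaches an RGB camera that writes frames to disk.
```python
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

# spawn a vehicle at the first available spawn point
blueprints = world.get_blueprint_library()
vehicle_bp = blueprints.filter("vehicle.*")[0]
vehicle = world.spawn_actor(vehicle_bp, world.get_map().get_spawn_points()[0])

# attach an RGB camera and save every received frame
camera_bp = blueprints.find("sensor.camera.rgb")
camera = world.spawn_actor(
    camera_bp, carla.Transform(carla.Location(x=1.5, z=2.4)), attach_to=vehicle
)
camera.listen(lambda image: image.save_to_disk("out/%06d.png" % image.frame))
```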
π’ Articles related to the dataset:
π Synthetic Dataset Generation for Adversarial Machine Learning Research
π End-to-end Autonomous Driving: Challenges and Frontiers
π OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving
π On the Practicality of Deterministic Epistemic Uncertainty
π D4RL: Datasets for Deep Data-Driven Reinforcement Learning
π Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)
π Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving
π Label Efficient Visual Abstractions for Autonomous Driving
π Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
π TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: Speech Commands
π’ Description Of Dataset:
Speech Commands is an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.
π’ Official Homepage: https://arxiv.org/abs/1804.03209
π’ Number of articles that used this dataset: 384
π’ Dataset Loaders:
activeloopai/Hub:
https://docs.activeloop.ai/datasets/speech-commands-dataset
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/speech_commands
pytorch/audio:
https://pytorch.org/audio/stable/datasets.html#torchaudio.datasets.SPEECHCOMMANDS
tk-rusch/lem:
https://github.com/tk-rusch/lem
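π’ Example Loader Usage (sketch):
A minimal sketch using the torchaudio loader listed above; download=True fetches the archive (v0.02 by default) into the given root directory.
```python
import torchaudio

dataset = torchaudio.datasets.SPEECHCOMMANDS(root=".", download=True)
waveform, sample_rate, label, speaker_id, utterance_number = dataset[0]
print(waveform.shape, sample_rate, label)  # e.g. torch.Size([1, 16000]), 16000, 'backward'
```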
π’ Articles related to the dataset:
π Towards Learning a Universal Non-Semantic Representation of Speech
π Streaming keyword spotting on mobile devices
π MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition
π Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers
π ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
π Efficiently Modeling Long Sequences with Structured State Spaces
π Diagonal State Spaces are as Effective as Structured State Spaces
π Meta-Transformer: A Unified Framework for Multimodal Learning
π AST: Audio Spectrogram Transformer
π Training Keyword Spotters with Limited and Synthesized Speech Data
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: TUM RGB-D
π’ Description Of Dataset:
TUM RGB-D is an RGB-D dataset. It contains the color and depth images of a Microsoft Kinect sensor along the ground-truth trajectory of the sensor. The data was recorded at full frame rate (30 Hz) and sensor resolution (640x480). The ground-truth trajectory was obtained from a high-accuracy motion-capture system with eight high-speed tracking cameras (100 Hz). Source: https://vision.in.tum.de/data/datasets/rgbd-dataset
π’ Official Homepage: https://vision.in.tum.de/data/datasets/rgbd-dataset
π’ Number of articles that used this dataset: 234
π’ Dataset Loaders:
Not found
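π’ Example Loader Usage (sketch):
There is no packaged loader; each sequence ships plain-text rgb.txt and depth.txt index files ("timestamp filename" per line, lines starting with '#' are comments). Since color and depth frames are not synchronized, they are usually associated by nearest timestamp, as sketched below (the sequence directory name is only an example).
```python
def read_index(path):
    """Parse a TUM RGB-D index file into (timestamp, filename) pairs."""
    entries = []
    with open(path) as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            ts, filename = line.split()[:2]
            entries.append((float(ts), filename))
    return entries

rgb = read_index("rgbd_dataset_freiburg1_xyz/rgb.txt")
depth = read_index("rgbd_dataset_freiburg1_xyz/depth.txt")

# associate each color frame with the closest depth frame (20 ms tolerance)
pairs = []
for ts, rgb_file in rgb:
    ts_d, depth_file = min(depth, key=lambda e: abs(e[0] - ts))
    if abs(ts_d - ts) < 0.02:
        pairs.append((rgb_file, depth_file))
print(len(pairs), "associated frame pairs")
```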
π’ Articles related to the dataset:
π ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
π pySLAM: An Open-Source, Modular, and Extensible Framework for SLAM
π DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras
π Gaussian Splatting SLAM
π ORB-SLAM: a Versatile and Accurate Monocular SLAM System
π NICE-SLAM: Neural Implicit Scalable Encoding for SLAM
π How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey
π Robust Keyframe-based Dense SLAM with an RGB-D Camera
π DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments
π Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: ITDD (Industrial Textile Defect Detection)
π’ Description Of Dataset:
The Industrial Textile Defect Detection (ITDD) dataset includes 1,885 industrial textile images in four categories: cotton fabric, dyed fabric, hemp fabric, and plaid fabric. The images were collected from the industrial production sites of WEIQIAO Textile. ITDD is an upgraded version of WFDD that reorganizes three original classes and adds one new class.
π’ Official Homepage: https://github.com/cqylunlun/CRAS?tab=readme-ov-file#dataset-release
π’ Number of articles that used this dataset: 1
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π Center-aware Residual Anomaly Synthesis for Multi-class Industrial Anomaly Detection
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: CAMELS Multifield Dataset
π’ Description Of Dataset:
CMD is a publicly available collection of hundreds of thousands of 2D maps and 3D grids containing different properties of the gas, dark matter, and stars from more than 2,000 different universes. The data was generated from thousands of state-of-the-art (magneto-)hydrodynamic and gravity-only N-body simulations from the CAMELS project. Each 2D map and 3D grid has a set of labels associated with it: 2 cosmological parameters characterizing fundamental properties of the Universe, and 4 astrophysical parameters parametrizing the strength of astrophysical processes such as feedback from supernovae and supermassive black holes. The main task this dataset was designed for is robust inference of the cosmological parameters from each map and grid. The data was generated from two completely different sets of simulations, and it is not obvious that a model trained on one will work when predicting on the other. Since simulations of the real Universe may never be perfect, this dataset provides the data to tackle this problem. Solving it will help cosmologists constrain the cosmological parameters with the highest accuracy and thereby unveil the mysteries of our Universe. CMD can also be used for many other tasks, such as field mapping and super-resolution.
π’ Official Homepage: https://camels-multifield-dataset.readthedocs.io
π’ Number of articles that used this dataset: 6
π’ Dataset Loaders:
franciscovillaescusa/CMD:
https://camels-multifield-dataset.readthedocs.io
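π’ Example Loader Usage (sketch):
A hedged sketch: per the CMD documentation the 2D maps are distributed as NumPy arrays with matching parameter files, but the exact filenames below are illustrative assumptions to take from the dataset's download page.
```python
import numpy as np

maps = np.load("Maps_Mtot_IllustrisTNG_LH_z=0.00.npy")  # assumed name; shape (num_maps, 256, 256)
params = np.loadtxt("params_IllustrisTNG.txt")          # assumed name; cosmological + astrophysical labels
print(maps.shape, params.shape)
```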
π’ Articles related to the dataset:
π The CAMELS Multifield Dataset: Learning the Universe's Fundamental Parameters with Artificial Intelligence
π The CAMELS project: Expanding the galaxy formation model space with new ASTRID and 28-parameter TNG and SIMBA suites
π Augmenting astrophysical scaling relations with machine learning: application to reducing the Sunyaev-Zeldovich flux-mass scatter
π Multifield Cosmology with Artificial Intelligence
π Robust marginalization of baryonic effects for cosmological inference at the field level
π Towards out-of-distribution generalization in large-scale astronomical surveys: robust networks learn similar representations
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Description Of Dataset:
CMD is a publicly available collection of hundreds of thousands 2D maps and 3D grids containing different properties of the gas, dark matter, and stars from more than 2,000 different universes. The data has been generated from thousands of state-of-the-art (magneto-)hydrodynamic and gravity-only N-body simulations from the CAMELS project.Each 2D map and 3D grid has a set of labels associated to it: 2 cosmological parameters characterizing fundamental properties of the Universe, and 4 astrophysical parameters parametrizing the strength of astrophysical processes such as feedback from supernova and supermassive black-holes.The main task this dataset was designed is to perform a robust inference on the value of the cosmological parameters from each map and grid. The data itself was generated from two completely different set of simulations, and it is not obvious that training one model on one will work when predicting on the other. Since simulations of the real Universe may never be perfect, this dataset provides the data to tackle this problem.Solving this problem will help cosmologists to constrain the value of the cosmological parameters with the highest accuracy and therefore unveil the mysteries of our Universe. CMD can also be used for many other tasks, such as field mapping and super-resolution.
π’ Official Homepage: https://camels-multifield-dataset.readthedocs.io
π’ Number of articles that used this dataset: 6
π’ Dataset Loaders:
franciscovillaescusa/CMD:
https://camels-multifield-dataset.readthedocs.io
π’ Articles related to the dataset:
π The CAMELS Multifield Dataset: Learning the Universe's Fundamental Parameters with Artificial Intelligence
π The CAMELS project: Expanding the galaxy formation model space with new ASTRID and 28-parameter TNG and SIMBA suites
π Augmenting astrophysical scaling relations with machine learning: application to reducing the Sunyaev-Zeldovich flux-mass scatter
π Multifield Cosmology with Artificial Intelligence
π Robust marginalization of baryonic effects for cosmological inference at the field level
π Towards out-of-distribution generalization in large-scale astronomical surveys: robust networks learn similar representations
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
β€4π2
π’ Name Of Dataset: ETT (Electricity Transformer Temperature)
π’ Description Of Dataset:
The Electricity Transformer Temperature (ETT) is a crucial indicator in long-term electric power deployment. This dataset consists of 2 years of data from two separate counties in China. To explore different granularities of the long sequence time-series forecasting (LSTF) problem, several subsets are created: {ETTh1, ETTh2} at the 1-hour level and ETTm1 at the 15-minute level. Each data point consists of the target value “oil temperature” and 6 power-load features. The train/val/test split is 12/4/4 months. Source: https://arxiv.org/pdf/2012.07436.pdf
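A minimal loading sketch is given below, assuming the CSV layout in the ETDataset repository (a date column, six load features, and an oil-temperature target column "OT") and the 12/4/4-month split from the Informer paper; column names should be verified against the actual files:

```python
# Minimal sketch: load ETTh1 (hourly) and apply the 12/4/4-month split.
# Assumes the CSV from the ETDataset repo with a "date" column, six
# power-load features, and the target column "OT" (oil temperature).
import pandas as pd

df = pd.read_csv("ETT-small/ETTh1.csv", parse_dates=["date"])

n_train = 12 * 30 * 24   # 12 months of hourly samples
n_val = 4 * 30 * 24      # 4 months
n_test = 4 * 30 * 24     # 4 months

train = df.iloc[:n_train]
val = df.iloc[n_train:n_train + n_val]
test = df.iloc[n_train + n_val:n_train + n_val + n_test]

X_train = train.drop(columns=["date", "OT"])   # 6 power-load features
y_train = train["OT"]                          # oil-temperature target
```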
π’ Official Homepage: https://github.com/zhouhaoyi/ETDataset
π’ Number of articles that used this dataset: 318
π’ Dataset Loaders:
zhouhaoyi/ETDataset:
https://github.com/zhouhaoyi/ETDataset
π’ Articles related to the dataset:
π TSMixer: An All-MLP Architecture for Time Series Forecasting
π A decoder-only foundation model for time-series forecasting
π Logo-LLM: Local and Global Modeling with Large Language Models for Time Series Forecasting
π Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
π Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
π A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
π iTransformer: Inverted Transformers Are Effective for Time Series Forecasting
π TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
π TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting
π FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Description Of Dataset:
TheElectricity Transformer Temperature(ETT) is a crucial indicator in the electric power long-term deployment. This dataset consists of 2 years data from two separated counties in China. To explore the granularity on the Long sequence time-series forecasting (LSTF) problem, different subsets are created, {ETTh1, ETTh2} for 1-hour-level and ETTm1 for 15-minutes-level. Each data point consists of the target value βoil temperatureβ and 6 power load features. The train/val/test is 12/4/4 months.Source:https://arxiv.org/pdf/2012.07436.pdf
π’ Official Homepage: https://github.com/zhouhaoyi/ETDataset
π’ Number of articles that used this dataset: 318
π’ Dataset Loaders:
zhouhaoyi/ETDataset:
https://github.com/zhouhaoyi/ETDataset
π’ Articles related to the dataset:
π TSMixer: An All-MLP Architecture for Time Series Forecasting
π A decoder-only foundation model for time-series forecasting
π Logo-LLM: Local and Global Modeling with Large Language Models for Time Series Forecasting
π Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
π Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
π A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
π iTransformer: Inverted Transformers Are Effective for Time Series Forecasting
π TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
π TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting
π FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
β€6
π’ Name Of Dataset: OoDIS (Anomaly Instance Segmentation Benchmark)
π’ Description Of Dataset:
OoDIS is a benchmark dataset for anomaly instance segmentation, which is crucial for autonomous vehicle safety. It extends existing anomaly segmentation benchmarks to focus on the segmentation of individual out-of-distribution (OOD) objects. The dataset addresses the need to identify and segment unknown objects, which is critical for avoiding accidents. It includes diverse scenes with various anomalies, pushing the boundaries of current segmentation capabilities. The benchmark focuses on evaluating the detection and instance segmentation of unexpected obstacles on roads. For more details, refer to the OoDIS paper.
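To make the instance-level nature of the evaluation concrete, the sketch below shows one common way to match predicted OOD instance masks to ground-truth masks by IoU; it illustrates the general technique only and is not the official OoDIS evaluation code:

```python
# Illustrative sketch (not the official OoDIS protocol): greedily match
# predicted OOD instance masks to ground-truth masks by IoU and count
# true positives, false positives, and false negatives at a threshold.
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

def match_instances(preds, gts, thr=0.5):
    """preds, gts: lists of boolean masks of shape (H, W). Returns (TP, FP, FN)."""
    matched_gt = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched_gt:
                continue
            v = iou(p, g)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_iou >= thr:
            matched_gt.add(best_j)
            tp += 1
    return tp, len(preds) - tp, len(gts) - tp
```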
π’ Official Homepage: https://kumuji.github.io/oodis_website/
π’ Number of articles that used this dataset: 5
π’ Dataset Loaders:
kumuji/ugains:
https://github.com/kumuji/ugains
π’ Articles related to the dataset:
π Unmasking Anomalies in Road-Scene Segmentation
π UGainS: Uncertainty Guided Anomaly Instance Segmentation
π OoDIS: Anomaly Instance Segmentation Benchmark
π Segmenting Known Objects and Unseen Unknowns without Prior Knowledge
π On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Description Of Dataset:
OoDIS is a benchmark dataset for anomaly instance segmentation, crucial for autonomous vehicle safety. It extends existing anomaly segmentation benchmarks to focus on the segmentation of individual out-of-distribution (OOD) objects.The dataset addresses the need for identifying and segmenting unknown objects, which are critical to avoid accidents. It includes diverse scenes with various anomalies, pushing the boundaries of current segmentation capabilities.The benchmark is focused on evaluation of detection and instance segmentation of unexpected obstacles on roads.For more details, refer to theOoDIS paper
π’ Official Homepage: https://kumuji.github.io/oodis_website/
π’ Number of articles that used this dataset: 5
π’ Dataset Loaders:
kumuji/ugains:
https://github.com/kumuji/ugains
π’ Articles related to the dataset:
π Unmasking Anomalies in Road-Scene Segmentation
π UGainS: Uncertainty Guided Anomaly Instance Segmentation
π OoDIS: Anomaly Instance Segmentation Benchmark
π Segmenting Known Objects and Unseen Unknowns without Prior Knowledge
π On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
β€4