π’ Name Of Dataset: BlendedMVS
π’ Description Of Dataset:
BlendedMVS is a large-scale dataset that provides sufficient training ground truth for learning-based multi-view stereo (MVS). The dataset was created by applying a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes; these mesh models were then rendered into color images and depth maps. Source: BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks
π’ Official Homepage: https://github.com/YoYo000/BlendedMVS
π’ Number of articles that used this dataset: 104
π’ Dataset Loaders:
YoYo000/BlendedMVS:
https://github.com/YoYo000/BlendedMVS
π’ Articles related to the dataset:
π Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
π Depth Anything V2
π NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
π Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
π Volume Rendering of Neural Implicit Surfaces
π Neural Sparse Voxel Fields
π BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks
π Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction
π SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views
π Geo-Neus: Geometry-Consistent Neural Implicit Surfaces Learning for Multi-view Reconstruction
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: EPIC-KITCHENS-100
π’ Description Of Dataset:
This paper introduces the pipeline used to scale the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments, using head-mounted cameras. Compared to its previous version (EPIC-KITCHENS-55), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotations of fine-grained actions (+128% more action segments). This collection also enables evaluating the "test of time" - i.e. whether models trained on data collected in 2018 can generalise to new footage collected under the same hypotheses albeit "two years on". The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), as well as unsupervised domain adaptation for action recognition. For each challenge, we define the task, provide baselines and evaluation metrics.
π’ Official Homepage: https://epic-kitchens.github.io/2021
π’ Number of articles that used this dataset: 160
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π MoViNets: Mobile Video Networks for Efficient Video Recognition
π Domain-Adversarial Training of Neural Networks
π BMN: Boundary-Matching Network for Temporal Action Proposal Generation
π Adversarial Discriminative Domain Adaptation
π Attention Bottlenecks for Multimodal Fusion
π Audiovisual Masked Autoencoders
π Multiview Transformers for Video Recognition
π ViViT: A Video Vision Transformer
π Magma: A Foundation Model for Multimodal AI Agents
π V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: CARLA (Car Learning to Act)
π’ Description Of Dataset:
CARLA (CAR Learning to Act) is an open simulator for urban driving, developed as an open-source layer over Unreal Engine 4. It provides sensors in the form of RGB cameras (with customizable positions), ground-truth depth maps, ground-truth semantic segmentation maps with 12 semantic classes designed for driving (road, lane marking, traffic sign, sidewalk, and so on), bounding boxes for dynamic objects in the environment, and measurements of the agent itself (vehicle location and orientation). Source: Synthetic Data for Deep Learning
π’ Official Homepage: https://carla.org/
π’ Number of articles that used this dataset: 1316
π’ Dataset Loaders:
joedlopes/carla-simulator-multimodal-sensing:
https://github.com/joedlopes/carla-simulator-multimodal-sensing
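π’ Example Loader Snippet:
A minimal Python sketch, not part of the official loader above: it assumes a CARLA 0.9.x server is already running on localhost:2000, and the blueprint filter, spawn-point index, and output path are illustrative placeholders.
import carla

# Connect to a running CARLA server (assumed at localhost:2000).
client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

# Spawn a vehicle at the first predefined spawn point (placeholder choice).
blueprints = world.get_blueprint_library()
vehicle_bp = blueprints.filter("vehicle.*")[0]
vehicle = world.spawn_actor(vehicle_bp, world.get_map().get_spawn_points()[0])

# Attach an RGB camera and dump frames to disk.
camera_bp = blueprints.find("sensor.camera.rgb")
camera = world.spawn_actor(camera_bp, carla.Transform(carla.Location(x=1.5, z=2.4)), attach_to=vehicle)
camera.listen(lambda image: image.save_to_disk("_out/%06d.png" % image.frame))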
π’ Articles related to the dataset:
π Synthetic Dataset Generation for Adversarial Machine Learning Research
π End-to-end Autonomous Driving: Challenges and Frontiers
π OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving
π On the Practicality of Deterministic Epistemic Uncertainty
π D4RL: Datasets for Deep Data-Driven Reinforcement Learning
π Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)
π Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving
π Label Efficient Visual Abstractions for Autonomous Driving
π Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
π TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: Speech Commands
π’ Description Of Dataset:
Speech Commands is an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.
π’ Official Homepage: https://arxiv.org/abs/1804.03209
π’ Number of articles that used this dataset: 384
π’ Dataset Loaders:
activeloopai/Hub:
https://docs.activeloop.ai/datasets/speech-commands-dataset
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/speech_commands
pytorch/audio:
https://pytorch.org/audio/stable/datasets.html#torchaudio.datasets.SPEECHCOMMANDS
tk-rusch/lem:
https://github.com/tk-rusch/lem
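π’ Example Loader Snippet:
A minimal sketch using the torchaudio loader listed above; it assumes a recent torchaudio and downloads Speech Commands to ./data on first use.
import torchaudio

# "training", "validation", and "testing" subsets are available.
train_set = torchaudio.datasets.SPEECHCOMMANDS(root="./data", download=True, subset="training")

waveform, sample_rate, label, speaker_id, utterance_number = train_set[0]
print(waveform.shape, sample_rate, label)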
π’ Articles related to the dataset:
π Towards Learning a Universal Non-Semantic Representation of Speech
π Streaming keyword spotting on mobile devices
π MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition
π Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers
π ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
π Efficiently Modeling Long Sequences with Structured State Spaces
π Diagonal State Spaces are as Effective as Structured State Spaces
π Meta-Transformer: A Unified Framework for Multimodal Learning
π AST: Audio Spectrogram Transformer
π Training Keyword Spotters with Limited and Synthesized Speech Data
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: TUM RGB-D
π’ Description Of Dataset:
TUM RGB-D is an RGB-D dataset. It contains the color and depth images of a Microsoft Kinect sensor along the ground-truth trajectory of the sensor. The data was recorded at full frame rate (30 Hz) and sensor resolution (640x480). The ground-truth trajectory was obtained from a high-accuracy motion-capture system with eight high-speed tracking cameras (100 Hz). Source: https://vision.in.tum.de/data/datasets/rgbd-dataset
π’ Official Homepage: https://vision.in.tum.de/data/datasets/rgbd-dataset
π’ Number of articles that used this dataset: 234
π’ Dataset Loaders:
Not found
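π’ Example Loader Snippet:
No official loader is listed, so this is a hedged sketch assuming the standard TUM RGB-D text format ("timestamp tx ty tz qx qy qz qw" per line, '#' for comments); the sequence path is a placeholder.
import numpy as np

def load_tum_trajectory(path):
    # Parse a TUM-format trajectory file into an (N, 8) array.
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            rows.append([float(v) for v in line.split()])
    return np.array(rows)

traj = load_tum_trajectory("rgbd_dataset_freiburg1_xyz/groundtruth.txt")  # placeholder path
print(traj.shape)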
π’ Articles related to the dataset:
π ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
π pySLAM: An Open-Source, Modular, and Extensible Framework for SLAM
π DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras
π Gaussian Splatting SLAM
π ORB-SLAM: a Versatile and Accurate Monocular SLAM System
π NICE-SLAM: Neural Implicit Scalable Encoding for SLAM
π How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey
π Robust Keyframe-based Dense SLAM with an RGB-D Camera
π DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments
π Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: ITDD (Industrial Textile Defect Detection)
π’ Description Of Dataset:
The Industrial Textile Defect Detection (ITDD) dataset includes 1885 industrial textile images categorized into 4 categories: cotton fabric, dyed fabric, hemp fabric, and plaid fabric. These classes are collected from the industrial production sites of WEIQIAO Textile. ITDD is an upgraded version of WFDD that reorganizes three original classes and adds one new class.
π’ Official Homepage: https://github.com/cqylunlun/CRAS?tab=readme-ov-file#dataset-release
π’ Number of articles that used this dataset: 1
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π Center-aware Residual Anomaly Synthesis for Multi-class Industrial Anomaly Detection
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: CAMELS Multifield Dataset
π’ Description Of Dataset:
CMD is a publicly available collection of hundreds of thousands of 2D maps and 3D grids containing different properties of the gas, dark matter, and stars from more than 2,000 different universes. The data has been generated from thousands of state-of-the-art (magneto-)hydrodynamic and gravity-only N-body simulations from the CAMELS project. Each 2D map and 3D grid has a set of labels associated with it: 2 cosmological parameters characterizing fundamental properties of the Universe, and 4 astrophysical parameters parametrizing the strength of astrophysical processes such as feedback from supernovae and supermassive black holes. The main task this dataset was designed for is robust inference of the cosmological parameters from each map and grid. The data was generated from two completely different sets of simulations, and it is not obvious that a model trained on one will work when predicting on the other. Since simulations of the real Universe may never be perfect, this dataset provides the data to tackle this problem. Solving it will help cosmologists constrain the value of the cosmological parameters with the highest accuracy and therefore unveil the mysteries of our Universe. CMD can also be used for many other tasks, such as field mapping and super-resolution.
π’ Official Homepage: https://camels-multifield-dataset.readthedocs.io
π’ Number of articles that used this dataset: 6
π’ Dataset Loaders:
franciscovillaescusa/CMD:
https://camels-multifield-dataset.readthedocs.io
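π’ Example Loader Snippet:
A hedged sketch only: CMD distributes the 2D maps as NumPy arrays with parameter files alongside them, but the exact file names below are assumptions; check the documentation linked above for the real naming scheme.
import numpy as np

maps = np.load("Maps_Mgas_IllustrisTNG_LH_z=0.00.npy")   # assumed file name; array of 2D gas-mass maps
params = np.loadtxt("params_LH_IllustrisTNG.txt")         # assumed file name; cosmological + astrophysical labels
print(maps.shape, params.shape)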
π’ Articles related to the dataset:
π The CAMELS Multifield Dataset: Learning the Universe's Fundamental Parameters with Artificial Intelligence
π The CAMELS project: Expanding the galaxy formation model space with new ASTRID and 28-parameter TNG and SIMBA suites
π Augmenting astrophysical scaling relations with machine learning: application to reducing the Sunyaev-Zeldovich flux-mass scatter
π Multifield Cosmology with Artificial Intelligence
π Robust marginalization of baryonic effects for cosmological inference at the field level
π Towards out-of-distribution generalization in large-scale astronomical surveys: robust networks learn similar representations
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: ETT (Electricity Transformer Temperature)
π’ Description Of Dataset:
The Electricity Transformer Temperature (ETT) is a crucial indicator in long-term electric power deployment. This dataset consists of 2 years of data from two separate counties in China. To explore different granularities for the long sequence time-series forecasting (LSTF) problem, different subsets are provided: {ETTh1, ETTh2} at the 1-hour level and ETTm1 at the 15-minute level. Each data point consists of the target value "oil temperature" and 6 power load features. The train/val/test split is 12/4/4 months. Source: https://arxiv.org/pdf/2012.07436.pdf
π’ Official Homepage: https://github.com/zhouhaoyi/ETDataset
π’ Number of articles that used this dataset: 318
π’ Dataset Loaders:
zhouhaoyi/ETDataset:
https://github.com/zhouhaoyi/ETDataset
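π’ Example Loader Snippet:
A minimal pandas sketch assuming the CSV layout from the ETDataset repo above (a 'date' column, six load features, and the target 'OT'); the 12/4/4-month split mirrors the description and treats each month as 30 days of hourly data.
import pandas as pd

df = pd.read_csv("ETTh1.csv", parse_dates=["date"])

# Chronological 12/4/4-month split (hourly records).
train = df.iloc[: 12 * 30 * 24]
val = df.iloc[12 * 30 * 24 : 16 * 30 * 24]
test = df.iloc[16 * 30 * 24 : 20 * 30 * 24]
print(train.shape, val.shape, test.shape)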
π’ Articles related to the dataset:
π TSMixer: An All-MLP Architecture for Time Series Forecasting
π A decoder-only foundation model for time-series forecasting
π Logo-LLM: Local and Global Modeling with Large Language Models for Time Series Forecasting
π Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
π Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
π A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
π iTransformer: Inverted Transformers Are Effective for Time Series Forecasting
π TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
π TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting
π FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: OoDIS (Anomaly Instance Segmentation Benchmark)
π’ Description Of Dataset:
OoDIS is a benchmark dataset for anomaly instance segmentation, crucial for autonomous vehicle safety. It extends existing anomaly segmentation benchmarks to focus on the segmentation of individual out-of-distribution (OOD) objects. The dataset addresses the need to identify and segment unknown objects, which is critical for avoiding accidents. It includes diverse scenes with various anomalies, pushing the boundaries of current segmentation capabilities. The benchmark focuses on evaluating detection and instance segmentation of unexpected obstacles on roads. For more details, refer to the OoDIS paper.
π’ Official Homepage: https://kumuji.github.io/oodis_website/
π’ Number of articles that used this dataset: 5
π’ Dataset Loaders:
kumuji/ugains:
https://github.com/kumuji/ugains
π’ Articles related to the dataset:
π Unmasking Anomalies in Road-Scene Segmentation
π UGainS: Uncertainty Guided Anomaly Instance Segmentation
π OoDIS: Anomaly Instance Segmentation Benchmark
π Segmenting Known Objects and Unseen Unknowns without Prior Knowledge
π On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: InfoSeek (Visual Information Seeking)
π’ Description Of Dataset:
In this project, we introduce InfoSeek, a visual question answering dataset tailored for information-seeking questions that cannot be answered with only common sense knowledge. Using InfoSeek, we analyze various pre-trained visual question answering models and gain insights into their characteristics. Our findings reveal that state-of-the-art pre-trained multi-modal models (e.g., PaLI-X, BLIP2, etc.) face challenges in answering visual information-seeking questions, but fine-tuning on the InfoSeek dataset elicits models to use fine-grained knowledge that was learned during their pre-training.
π’ Official Homepage: https://open-vision-language.github.io/infoseek/
π’ Number of articles that used this dataset: 35
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
π LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
π Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
π Ming-Omni: A Unified Multimodal Model for Perception and Generation
π Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering
π PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers
π Safety of Multimodal Large Language Models on Images and Texts
π PaLI-X: On Scaling up a Multilingual Vision and Language Model
π MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly
π Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: UIIS10K (General Underwater Image Instance Segmentation dataset 10K)
π’ Description Of Dataset:
We propose a large-scale underwater instance segmentation dataset, UIIS10K, which includes 10,048 images with pixel-level annotations for 10 categories. As far as we know, this is the largest underwater instance segmentation dataset available and can be used as a benchmark for evaluating underwater segmentation methods.
π’ Official Homepage: https://github.com/LiamLian0727/UIIS10K
π’ Number of articles that used this dataset: 3
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π WaterMask: Instance Segmentation for Underwater Imagery
π A Unified Image-Dense Annotation Generation Model for Underwater Scenes
π UWSAM: Segment Anything Model Guided Underwater Instance Segmentation and A Large-scale Benchmark Dataset
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: 1
π’ Description Of Dataset:
111
π’ Official Homepage: Not found
π’ Number of articles that used this dataset: 28
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π NeMo Inverse Text Normalization: From Development To Production
π Open Deep Search: Democratizing Search with Open-source Reasoning Agents
π Deep Learning in Single-Cell Analysis
π Enhancing Fine-grained Sentiment Classification Exploiting Local Context Embedding
π UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
π Representation Learning with Large Language Models for Recommendation
π Short-Term Aggregated Residential Load Forecasting using BiLSTM and CNN-BiLSTM
π K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
π Semi-supervised Sequence Modeling for Elastic Impedance Inversion
π CholecTrack20: A Dataset for Multi-Class Multiple Tool Tracking in Laparoscopic Surgery
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π₯ The coolest AI bot on Telegram
π’ Completely free and knows everything, from simple questions to complex problems.
βοΈ Helps you with anything in the easiest and fastest way possible.
β¨οΈ You can even choose girlfriend or boyfriend mode and chat as if you're talking to a real person π
π΅ Includes weekly and monthly airdrops! βοΈ
π΅βπ« Bot ID: @chatgpt_officialbot
π The best part is, even group admins can use it right inside their groups! β¨
πΊ Try now:
β’ Type FunFact! for jaw-dropping AI trivia.
β’ Type RecipePlease! for a quick, tasty meal idea.
β’ Type JokeTime! for an instant laugh.
Or just say Surprise me! and I'll pick something awesome for you. π€β¨
Forwarded from Machine Learning with Python
This channel is for Programmers, Coders, and Software Engineers.
0οΈβ£ Python
1οΈβ£ Data Science
2οΈβ£ Machine Learning
3οΈβ£ Data Visualization
4οΈβ£ Artificial Intelligence
5οΈβ£ Data Analysis
6οΈβ£ Statistics
7οΈβ£ Deep Learning
8οΈβ£ programming Languages
β
https://t.me/addlist/8_rRW2scgfRhOTc0
β
https://t.me/Codeprogrammer
π’ Name Of Dataset: MIT-BIH Arrhythmia Database
π’ Description Of Dataset:
The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample. The recordings were digitized at 360 samples per second per channel with 11-bit resolution over a 10 mV range. Two or more cardiologists independently annotated each record; disagreements were resolved to obtain the computer-readable reference annotations for each beat (approximately 110,000 annotations in all) included with the database. This directory contains the entire MIT-BIH Arrhythmia Database. About half (25 of 48 complete records, and reference annotation files for all 48 records) of this database has been freely available here since PhysioNet's inception in September 1999. The 23 remaining signal files, which had been available only on the MIT-BIH Arrhythmia Database CD-ROM, were posted here in February 2005. Much more information about this database may be found in the MIT-BIH Arrhythmia Database Directory.
π’ Official Homepage: https://physionet.org/content/mitdb/1.0.0/
π’ Number of articles that used this dataset: 31
π’ Dataset Loaders:
Not found
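π’ Example Loader Snippet:
No dedicated loader is listed, but the records can be read with the wfdb Python package; a minimal sketch, assuming wfdb is installed, that streams record 100 directly from PhysioNet.
import wfdb

record = wfdb.rdrecord("100", pn_dir="mitdb")           # two-channel signal at 360 Hz
annotation = wfdb.rdann("100", "atr", pn_dir="mitdb")   # cardiologist beat annotations

print(record.p_signal.shape)   # roughly (650000, 2) for a half-hour record
print(annotation.symbol[:10])  # first few beat labels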
π’ Articles related to the dataset:
π Inter- and intra- patient ECG heartbeat classification for arrhythmia detection: a sequence to sequence deep learning approach
π ECG Heartbeat Classification: A Deep Transferable Representation
π Multi-module Recurrent Convolutional Neural Network with Transformer Encoder for ECG Arrhythmia Classification
π Subject-Aware Contrastive Learning for Biosignals
π Classification of Arrhythmia by Using Deep Learning with 2-D ECG Spectral Image Representation
π A Personalized Zero-Shot ECG Arrhythmia Monitoring System: From Sparse Representation Based Domain Adaption to Energy Efficient Abnormal Beat Detection for Practical ECG Surveillance
π AQuA: A Benchmarking Tool for Label Quality Assessment
π Spot The Odd One Out: Regularized Complete Cycle Consistent Anomaly Detector GAN
π Arrhythmia Classifier Using Convolutional Neural Network with Adaptive Loss-aware Multi-bit Networks Quantization
π MedFuncta: Modality-Agnostic Representations Based on Efficient Neural Fields
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: RVL-CDIP
π’ Description Of Dataset:
The RVL-CDIP dataset consists of scanned document images belonging to 16 classes such as letter, form, email, resume, memo, etc. The dataset has 320,000 training, 40,000 validation and 40,000 test images. The images are characterized by low quality, noise, and low resolution, typically 100 dpi. Source: Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning
π’ Official Homepage: https://www.cs.cmu.edu/~aharley/rvl-cdip/
π’ Number of articles that used this dataset: Unknown
π’ Dataset Loaders:
huggingface/datasets (rvl_cdip):
https://huggingface.co/datasets/rvl_cdip
huggingface/datasets (rvl-cdip_easyOCR):
https://huggingface.co/datasets/jordyvl/rvl-cdip_easyOCR
huggingface/datasets (rvl_cdip):
https://huggingface.co/datasets/aharley/rvl_cdip
huggingface/datasets (rvl_cdip_easyocr):
https://huggingface.co/datasets/jordyvl/rvl_cdip_easyocr
huggingface/datasets (rvl_cdip_mini):
https://huggingface.co/datasets/dvgodoy/rvl_cdip_mini
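π’ Example Loader Snippet:
A minimal sketch using the Hugging Face loader listed above; streaming is used only because the full training split is large (~320k images), and the field names follow the "aharley/rvl_cdip" dataset card.
from datasets import load_dataset

ds = load_dataset("aharley/rvl_cdip", split="train", streaming=True)
example = next(iter(ds))
print(example["label"], example["image"].size)  # 16-way class label and PIL image size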
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: FUNSD (Form Understanding in Noisy Scanned Documents)
π’ Description Of Dataset:
Form Understanding in Noisy Scanned Documents (FUNSD) comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, making form understanding (FoUn) a challenging task. The proposed dataset can be used for various tasks, including text detection, optical character recognition, spatial layout analysis, and entity labeling/linking. Source: FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
π’ Official Homepage: https://guillaumejaume.github.io/FUNSD/
π’ Number of articles that used this dataset: Unknown
π’ Dataset Loaders:
huggingface/datasets:
https://huggingface.co/datasets/nielsr/FUNSD_layoutlmv2
mindee/doctr:
https://mindee.github.io/doctr/latest/datasets.html#doctr.datasets.FUNSD
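π’ Example Loader Snippet:
A minimal sketch using the docTR loader listed above; it assumes python-doctr is installed and downloads the 199 annotated forms on first use.
from doctr.datasets import FUNSD

train_set = FUNSD(train=True, download=True)
img, target = train_set[0]   # image plus its word-level annotations
print(len(train_set), type(target))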
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: IIIT-AR-13K
π’ Description Of Dataset:
IIIT-AR-13K is created by manually annotating the bounding boxes of graphical or page objects in publicly available annual reports. This dataset contains a total of 13k annotated page images with objects in five different popular categories - table, figure, natural image, logo, and signature. It is the largest manually annotated dataset for graphical object detection. Source: IIIT-AR-13K: A New Dataset for Graphical Object Detection in Documents
π’ Official Homepage: http://cvit.iiit.ac.in/usodi/iiitar13k.php
π’ Number of articles that used this dataset: 6
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π Deep learning for table detection and structure recognition: A survey
π RanLayNet: A Dataset for Document Layout Detection used for Domain Adaptation and Generalization
π The YOLO model that still excels in document layout analysis
π IIIT-AR-13K: A New Dataset for Graphical Object Detection in Documents
π Document AI: Benchmarks, Models and Applications
π Robust Table Detection and Structure Recognition from Heterogeneous Document Images
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1