π’ Name Of Dataset: BAH (Behavioural Ambivalence/Hesitancy)
π’ Description Of Dataset:
Recognizing complex emotions linked to ambivalence and hesitancy (A/H) can play a critical role in the personalization and effectiveness of digital behaviour change interventions. These subtle and conflicting emotions are manifested by a discord between multiple modalities, such as facial and vocal expressions and body language. Although experts can be trained to identify A/H, integrating them into digital interventions is costly and less effective. Automatic learning systems provide a cost-effective alternative that can adapt to individual users and operate seamlessly within real-time, resource-limited environments. However, there are currently no datasets available for designing ML models to recognize A/H. This paper introduces the first Behavioural Ambivalence/Hesitancy (BAH) dataset, collected for subject-based multimodal recognition of A/H in videos. It contains videos from 224 participants captured across 9 provinces in Canada, spanning a range of ages and ethnicities. Through our web platform, we recruited participants to answer 7 questions, some of which were designed to elicit A/H, while recording themselves via webcam and microphone. BAH amounts to 1,118 videos for a total duration of 8.26 hours, including 1.5 hours of A/H. Our behavioural team annotated timestamped segments indicating where A/H occurs and provided frame- and video-level annotations with the A/H cues. Video transcripts and their timestamps are also included, along with cropped and aligned faces in each frame and a variety of participant metadata. Additionally, this paper provides preliminary benchmarking results for baseline models on BAH at frame- and video-level recognition in mono- and multi-modal setups. It also includes results for zero-shot prediction and for personalization using unsupervised domain adaptation. The limited performance of the baseline models highlights the challenges of recognizing A/H in real-world videos. The data, code, and pretrained weights are available.
π’ Official Homepage: https://github.com/sbelharbi/bah-dataset
π’ Number of articles that used this dataset: 1
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Behavioural Change
==================================
π΄ For more data science resources:
β https://t.me/DataScienceT
π’ Name Of Dataset: ITDD (Industrial Textile Defect Detection)
π’ Description Of Dataset:
The Industrial Textile Defect Detection (ITDD) dataset includes 1,885 industrial textile images grouped into 4 categories: cotton fabric, dyed fabric, hemp fabric, and plaid fabric. The images were collected at the industrial production sites of WEIQIAO Textile. ITDD is an upgraded version of WFDD that reorganizes three of the original classes and adds one new class.
π’ Official Homepage: https://github.com/cqylunlun/CRAS?tab=readme-ov-file#dataset-release
π’ Number of articles that used this dataset: 1
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π Center-aware Residual Anomaly Synthesis for Multi-class Industrial Anomaly Detection
==================================
π΄ For more data science resources:
β https://t.me/DataScienceT
π’ Name Of Dataset: EGC-FPHFS (Early Gastric Cancer Data from First People's Hospital of Foshan)
π’ Description Of Dataset:
High-resolution early gastric cancer (EGC) detection and analysis. Patient data: the dataset includes images from patients diagnosed with gastric cancer, distinguishing between early gastric cancer (EGC) and non-pathogenic gastric cancer (NGC). The study utilized data from 341 patients, with 124 classified as EGC and 217 as NGC. Image types: high-resolution images obtained from endoscopy. Data volume: 1,120 images for EGC detection and 2,150 images for NGC.
π’ Official Homepage: https://github.com/liu37972/Fuzzy-Seg-Deep-DuS-KFCM-.git
π’ Number of articles that used this dataset: 1
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π Gastric histopathology image segmentation using a hierarchical conditional random field
==================================
π΄ For more data science resources:
β https://t.me/DataScienceT
π’ Name Of Dataset: MuJoCo
π’ Description Of Dataset:
MuJoCo (Multi-Joint dynamics with Contact) is a physics engine used to implement environments for benchmarking reinforcement learning methods.
π’ Official Homepage: https://www.mujoco.org/
π’ Number of articles that used this dataset: 1613
π’ Dataset Loaders:
deepmind/mujoco:
https://github.com/deepmind/mujoco
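A minimal loading sketch for the deepmind/mujoco Python bindings listed above (pip install mujoco); the toy pendulum XML is purely illustrative and not part of any benchmark task:

import mujoco

PENDULUM_XML = """
<mujoco>
  <worldbody>
    <body name="pole" pos="0 0 1">
      <joint name="hinge" type="hinge" axis="0 1 0"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0 0 0.5"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(PENDULUM_XML)   # compile the model
data = mujoco.MjData(model)                            # simulation state

for _ in range(100):                                   # advance 100 timesteps
    mujoco.mj_step(model, data)

print(data.qpos)                                       # joint positions after stepping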
π’ Articles related to the dataset:
π Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
π Proximal Policy Optimization Algorithms
π Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online
π Model-Based Reinforcement Learning via Meta-Policy Optimization
π Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design
π Primal Wasserstein Imitation Learning
π Unity: A General Platform for Intelligent Agents
π Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
π Physically Embedded Planning Problems: New Challenges for Reinforcement Learning
π Trust Region Policy Optimization
==================================
π΄ For more data science resources:
β https://t.me/DataScienceT
π’ Name Of Dataset: OpenAI Gym
π’ Description Of Dataset:
OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. It includes environments such as Algorithmic, Atari, Box2D, Classic Control, MuJoCo, Robotics, and Toy Text. Source: https://github.com/openai/gym
π’ Official Homepage: https://gym.openai.com/
π’ Number of articles that used this dataset: 1296
π’ Dataset Loaders:
openai/gym:
https://github.com/openai/gym/blob/master/docs/environments.md
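A minimal sketch of the Gym API referenced above, running a random policy in a classic-control environment; the 5-tuple step signature assumes gym>=0.26 (or gymnasium) and differs in older releases:

import gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()                          # random action
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

env.close()
print("episode return:", total_reward)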
π’ Articles related to the dataset:
π Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
π Parameter Space Noise for Exploration
π Proximal Policy Optimization Algorithms
π Continuous control with deep reinforcement learning
π OpenAI Gym
π NoRML: No-Reward Meta Learning
π SDGym: Low-Code Reinforcement Learning Environments using System Dynamics Models
π Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
π FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning
π Dynamic Datasets and Market Environments for Financial Reinforcement Learning
==================================
π΄ For more dataset resources:
β https://t.me/DataScienceT
π’ Name Of Dataset: InBreast
π’ Description Of Dataset:
Rationale and objectives: Computer-aided detection and diagnosis (CAD) systems have been developed over the past two decades to assist radiologists in the detection and diagnosis of lesions seen on breast imaging exams, thus providing a second opinion. Mammographic databases play an important role in the development of algorithms aiming at the detection and diagnosis of mammary lesions. However, available databases often do not take into consideration all the requirements needed for research and study purposes. This article aims to present and detail a new mammographic database. Materials and methods: Images were acquired at a breast center located in a university hospital (Centro Hospitalar de S. João [CHSJ], Breast Centre, Porto) with the permission of the Portuguese National Committee of Data Protection and the hospital's ethics committee. A MammoNovation Siemens full-field digital mammography system, with a solid-state amorphous selenium detector, was used. Results: The new database, INbreast, has a total of 115 cases (410 images), of which 90 cases are from women with both breasts affected (four images per case) and 25 cases are from mastectomy patients (two images per case). Several types of lesions (masses, calcifications, asymmetries, and distortions) are included. Accurate contours made by specialists are also provided in XML format. Conclusion: The strength of the presented database, INbreast, lies in the fact that it was built with full-field digital mammograms (as opposed to digitized mammograms), it presents a wide variability of cases, and it is made publicly available together with precise annotations. We believe that this database can be a reference for future work centered on or related to breast cancer imaging.
π’ Official Homepage: Not found
π’ Number of articles that used this dataset: 106
π’ Dataset Loaders:
ngohongthong1832004/inBreast:
https://github.com/ngohongthong1832004/inBreast
π’ Articles related to the dataset:
π MeLo: Low-rank Adaptation is Better than Fine-tuning for Medical Image Diagnosis
π Deep Learning to Improve Breast Cancer Early Detection on Screening Mammography
π End-to-end Training for Whole Image Breast Cancer Diagnosis using An All Convolutional Design
π Multi-view Local Co-occurrence and Global Consistency Learning Improve Mammogram Classification Generalisation
π medigan: a Python library of pretrained generative models for medical image synthesis
π High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks
π Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification
π Detecting and classifying lesions in mammograms with Deep Learning
π Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers
π Unbiased Mean Teacher for Cross-domain Object Detection
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: SVHN (Street View House Numbers)
π’ Description Of Dataset:
Street View House Numbers (SVHN) is a digit classification benchmark dataset that contains 600,000 32×32 RGB images of printed digits (from 0 to 9) cropped from pictures of house number plates. The cropped images are centered on the digit of interest, but nearby digits and other distractors are kept in the image. SVHN has three sets: a training set, a testing set, and an extra set with 530,000 less difficult images that can be used to help with training. Source: Reading Digits in Natural Images with Unsupervised Feature Learning
π’ Official Homepage: http://ufldl.stanford.edu/housenumbers/
π’ Number of articles that used this dataset: 3388
π’ Dataset Loaders:
huggingface/datasets (svhn):
https://huggingface.co/datasets/ufldl-stanford/svhn
huggingface/datasets (svhn):
https://huggingface.co/datasets/svhn
pytorch/vision:
https://pytorch.org/vision/stable/generated/torchvision.datasets.SVHN.html
activeloopai/deeplake:
https://datasets.activeloop.ai/docs/ml/datasets/the-street-view-house-numbers-svhn-dataset/
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/svhn_cropped
Graviti-AI/datasets:
https://gas.graviti.com/dataset/graviti/SVHN
afagarap/pt-datasets:
https://gitlab.com/afagarap/pt-datasets
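A minimal loading sketch using the torchvision loader listed above; split may be "train", "test", or "extra", and download=True fetches the cropped-digit files automatically:

import torch
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_set = datasets.SVHN(root="./data", split="train", download=True, transform=transform)
extra_set = datasets.SVHN(root="./data", split="extra", download=True, transform=transform)

loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
images, labels = next(iter(loader))
print(images.shape, labels.shape)   # [128, 3, 32, 32] and [128]; labels are digits 0-9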
π’ Articles related to the dataset:
π Domain Separation Networks
π AutoAugment: Learning Augmentation Policies from Data
π Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
π PACT: Parameterized Clipping Activation for Quantized Neural Networks
π A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems
π An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems
π Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
π Robust outlier detection by de-biasing VAE likelihoods
π ASPEST: Bridging the Gap Between Active Learning and Selective Prediction
π Dataset Distillation with Infinitely Wide Convolutional Networks
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: BreakHis (Breast Cancer Histopathological Database)
π’ Description Of Dataset:
The Breast Cancer Histopathological Image Classification (BreakHis) database is composed of 9,109 microscopic images of breast tumor tissue collected from 82 patients using different magnifying factors (40X, 100X, 200X, and 400X). It contains 2,480 benign and 5,429 malignant samples (700×460 pixels, 3-channel RGB, 8-bit depth per channel, PNG format). The database was built in collaboration with the P&D Laboratory - Pathological Anatomy and Cytopathology, Parana, Brazil. Paper: F. A. Spanhol, L. S. Oliveira, C. Petitjean and L. Heutte, "A Dataset for Breast Cancer Histopathological Image Classification," in IEEE Transactions on Biomedical Engineering, vol. 63, no. 7, pp. 1455-1462, July 2016, doi: 10.1109/TBME.2015.2496264. Source: https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/
π’ Official Homepage: https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/
π’ Number of articles that used this dataset: 11
π’ Dataset Loaders:
niconaufal21/cnn-breast-cancer:
https://github.com/niconaufal21/cnn-breast-cancer
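The loader above is a third-party training repo, so a local copy is usually indexed directly from disk. A minimal sketch, assuming the documented filename convention (e.g. SOB_B_TA-14-4659-40-001.png, where B/M encodes benign/malignant and the second-to-last field the magnification); verify this against your download:

from collections import defaultdict
from pathlib import Path

index = defaultdict(list)                     # (label, magnification) -> image paths
for path in Path("./BreaKHis_v1").rglob("*.png"):
    parts = path.stem.split("-")              # e.g. ["SOB_B_TA", "14", "4659", "40", "001"]
    label = "benign" if parts[0].split("_")[1] == "B" else "malignant"
    magnification = parts[-2] + "X"           # "40X", "100X", "200X", or "400X"
    index[(label, magnification)].append(path)

for key, paths in sorted(index.items()):
    print(key, len(paths))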
π’ Articles related to the dataset:
π A Comparative Survey of Deep Active Learning
π Breast Cancer Histopathology Image Classification and Localization using Multiple Instance Learning
π CNN Based Autoencoder Application in Breast Cancer Image Retrieval
π Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images
π VGGIN-Net: Deep Transfer Network for Imbalanced Breast Cancer Dataset
π Which Backbone to Use: A Resource-efficient Domain Specific Comparison for Computer Vision
π Classification of Breast Tumours Based on Histopathology Images Using Deep Features and Ensemble of Gradient Boosting Methods
π Classification of Breast Cancer Histopathology Images using a Modified Supervised Contrastive Learning Method
π Breast-NET: a lightweight DCNN model for breast cancer detection and grading using histological samples
π Attention-Map Augmentation for Hypercomplex Breast Cancer Classification
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: DRIVE (Digital Retinal Images for Vessel Extraction)
π’ Description Of Dataset:
The Digital Retinal Images for Vessel Extraction (DRIVE) dataset is a dataset for retinal vessel segmentation. It consists of a total of 40 JPEG color fundus images, including 7 abnormal pathology cases. The images were obtained from a diabetic retinopathy screening program in the Netherlands and were acquired using a Canon CR5 non-mydriatic 3CCD camera with a 45-degree field of view (FOV). Each image has a resolution of 584×565 pixels with eight bits per color channel (3 channels). The set of 40 images was equally divided into 20 images for the training set and 20 images for the testing set. In both sets, each image is accompanied by a circular FOV mask with a diameter of approximately 540 pixels. In the training set, each image has one manual segmentation by an ophthalmological expert. In the testing set, each image has two manual segmentations by two different observers, where the first observer's segmentation is accepted as the ground truth for performance evaluation. Source: Ant Colony based Feature Selection Heuristics for Retinal Vessel Segmentation
π’ Official Homepage: https://drive.grand-challenge.org/
π’ Number of articles that used this dataset: 309
π’ Dataset Loaders:
open-mmlab/mmsegmentation:
https://github.com/open-mmlab/mmsegmentation/blob/master/docs/dataset_prepare.md
activeloopai/Hub:
https://docs.activeloop.ai/datasets/drive-dataset
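Besides the loaders above, a local copy can be read directly. A minimal sketch, assuming the commonly distributed archive layout (training/images and training/1st_manual); adjust the paths and extensions to your download:

from pathlib import Path

import numpy as np
from PIL import Image

root = Path("./DRIVE/training")
pairs = []
for img_path in sorted((root / "images").glob("*.tif")):
    idx = img_path.name.split("_")[0]                      # e.g. "21" from "21_training.tif"
    mask_path = root / "1st_manual" / f"{idx}_manual1.gif"
    image = np.array(Image.open(img_path))                  # 584x565 RGB fundus image
    vessels = np.array(Image.open(mask_path)) > 0           # binary vessel ground truth
    pairs.append((image, vessels))

print(len(pairs), pairs[0][0].shape, pairs[0][1].shape)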
π’ Articles related to the dataset:
π U-Net: Convolutional Networks for Biomedical Image Segmentation
π Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation
π Learning with Limited Annotations: A Survey on Deep Semi-Supervised Learning for Medical Image Segmentation
π Robust Retinal Vessel Segmentation from a Data Augmentation Perspective
π Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model
π Bi-Directional ConvLSTM U-Net with Densley Connected Convolutions
π Multi-level Context Gating of Embedded Collective Knowledge for Medical Image Segmentation
π UniverSeg: Universal Medical Image Segmentation
π Road Extraction by Deep Residual U-Net
π Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: 100DOH (100 Days Of Hands Dataset)
π’ Description Of Dataset:
The 100 Days Of Hands Dataset (100DOH) is a large-scale video dataset containing hands and hand-object interactions. It consists of 27.3K YouTube videos from 11 categories with nearly 131 days of footage of everyday interaction. The focus of the dataset is hand contact, and it includes both first-person and third-person perspectives. The videos in 100DOH are unconstrained and content-rich, ranging from records of daily life to specific instructional videos. To enforce diversity, the dataset contains no more than 20 videos from each uploader.
π’ Official Homepage: http://fouheylab.eecs.umich.edu/~dandans/projects/100DOH/100DOH.html
π’ Number of articles that used this dataset: 249
π’ Dataset Loaders:
albertogaspar/dts:
https://github.com/albertogaspar/dts
π’ Articles related to the dataset:
π HuggingFace's Transformers: State-of-the-art Natural Language Processing
π Automatic Differentiation in PyTorch
π AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
π Tacotron: Towards End-to-End Speech Synthesis
π SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling
π DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
π Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
π Enriching Word Vectors with Subword Information
π An Aerial Weed Detection System for Green Onion Crops Using the You Only Look Once (YOLOv3) Deep Learning Algorithm
π Real Time Pear Fruit Detection and Counting Using YOLOv4 Models and Deep SORT
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: VITON-HD (High-Resolution VITON-Zalando Dataset)
π’ Description Of Dataset:
The VITON-HD dataset is a dataset for high-resolution (i.e., 1024×768) virtual try-on of clothing items. Specifically, it consists of 13,679 pairs of frontal-view woman images and corresponding top clothing images.
π’ Official Homepage: https://github.com/shadow2496/VITON-HD
π’ Number of articles that used this dataset: 82
π’ Dataset Loaders:
yisol/IDM-VTON:
https://github.com/yisol/IDM-VTON
rizavelioglu/tryoffdiff:
https://github.com/rizavelioglu/tryoffdiff
π’ Articles related to the dataset:
π OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
π Improving Diffusion Models for Authentic Virtual Try-on in the Wild
π Learning Flow Fields in Attention for Controllable Person Image Generation
π CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
π StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On
π VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization
π High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions
π FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
π ViViD: Video Virtual Try-on using Diffusion Models
π GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning
==================================
π΄ For more dataset resources:
β https://t.me/DataScienceT
π’ Name Of Dataset: MMI (MMI Facial Expression Database)
π’ Description Of Dataset:
The MMI Facial Expression Database consists of over 2,900 videos and high-resolution still images of 75 subjects. It is fully annotated for the presence of action units (AUs) in videos (event coding), and partially coded at the frame level, indicating for each frame whether an AU is in the neutral, onset, apex, or offset phase. A small part was annotated for audio-visual laughter. Source: https://mmifacedb.eu/
π’ Official Homepage: https://mmifacedb.eu/
π’ Number of articles that used this dataset: 65
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
π Facial Motion Prior Networks for Facial Expression Recognition
π Deep Facial Expression Recognition: A Survey
π Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions
π Island Loss for Learning Discriminative Features in Facial Expression Recognition
π DeXpression: Deep Convolutional Neural Network for Expression Recognition
π Fine-Grained Expression Manipulation via Structured Latent Space
π AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild
π eMotion-GAN: A Motion-based GAN for Photorealistic and Facial Expression Preserving Frontal View Synthesis
π Beyond FACS: Data-driven Facial Expression Dictionaries, with Application to Predicting Autism
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: AMASS
π’ Description Of Dataset:
AMASS is a large database of human motion unifying different optical marker-based motion capture datasets by representing them within a common framework and parameterization. AMASS is readily useful for animation, visualization, and generating training data for deep learning.
π’ Official Homepage: https://amass.is.tue.mpg.de/
π’ Number of articles that used this dataset: 354
π’ Dataset Loaders:
Not found
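No loader is listed, but the released archives are per-sequence .npz files holding SMPL-family parameters that can be read directly. A minimal sketch; the file name is hypothetical and the key names (poses, trans, betas, mocap_framerate) are assumptions to verify against your copy:

import numpy as np

data = np.load("some_sequence_poses.npz")      # hypothetical file from an AMASS archive
print(data.files)                              # inspect the available keys first

poses = data["poses"]                          # per-frame body pose (axis-angle)
trans = data["trans"]                          # per-frame global translation
betas = data["betas"]                          # body shape coefficients
fps = float(data["mocap_framerate"])
print(poses.shape, trans.shape, betas.shape, fps)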
π’ Articles related to the dataset:
π VIBE: Video Inference for Human Body Pose and Shape Estimation
π MotionGPT: Human Motion as a Foreign Language
π MotionBERT: A Unified Perspective on Learning Human Motion Representations
π DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition
π AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars
π EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
π Animatable and Relightable Gaussians for High-fidelity Human Avatar Modeling
π MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
π WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion
π Deep motifs and motion signatures
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: Office-Home
π’ Description Of Dataset:
Office-Home is a benchmark dataset for domain adaptation which contains 4 domains, each consisting of 65 categories. The four domains are: Art – artistic images in the form of sketches, paintings, ornamentation, etc.; Clipart – a collection of clipart images; Product – images of objects without a background; and Real-World – images of objects captured with a regular camera. It contains 15,500 images, with an average of around 70 images per class and a maximum of 99 images in a class. Source: Multi-component Image Translation for Deep Domain Generalization
π’ Official Homepage: https://www.hemanthdv.org/officeHomeDataset.html
π’ Number of articles that used this dataset: 1064
π’ Dataset Loaders:
activeloopai/Hub:
https://docs.activeloop.ai/datasets/office-home-dataset
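Since Office-Home is distributed as one folder per domain with one sub-folder per class, torchvision's generic ImageFolder can read each domain directly. A minimal sketch; the root path and domain folder names are assumptions to match against your extracted copy:

from torchvision import datasets, transforms

transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

domains = ["Art", "Clipart", "Product", "Real World"]
by_domain = {
    d: datasets.ImageFolder(root=f"./OfficeHomeDataset/{d}", transform=transform)
    for d in domains
}
for name, ds in by_domain.items():
    print(name, len(ds), "images,", len(ds.classes), "classes")   # expect 65 classes per domain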
π’ Articles related to the dataset:
π Domain Conditional Predictors for Domain Adaptation
π Transfer Learning with Dynamic Distribution Adaptation
π FIXED: Frustratingly Easy Domain Generalization with Mixup
π Visual Domain Adaptation with Manifold Embedded Distribution Alignment
π Easy Transfer Learning By Exploiting Intra-domain Structures
π Generalizing to Unseen Domains: A Survey on Domain Generalization
π Learning to Match Distributions for Domain Adaptation
π Unsupervised Domain Adaptation by Backpropagation
π Domain-Adversarial Training of Neural Networks
π A Review of Single-Source Deep Unsupervised Visual Domain Adaptation
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: M3-VOS (M3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation)
π’ Description Of Dataset:
M3-VOS (Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation) is a new benchmark for verifying the ability of models to understand object phases. It consists of 479 high-resolution videos spanning over 10 distinct everyday scenarios, with 205,181 collected masks and an average track duration of 14.27 s. M3-VOS covers 120+ categories of objects across 6 phases within 14 scenarios, encompassing 23 specific phase transitions. Venue: CVPR 2025. Paper: arxiv.org/html/2412.13803v2. Point of contact: Jiaxin Li, Zixuan Chen
π’ Official Homepage: https://zixuan-chen.github.io/M-cube-VOS.github.io/
π’ Number of articles that used this dataset: 4
π’ Dataset Loaders:
Not found
π’ Articles related to the dataset:
π SAM 2: Segment Anything in Images and Videos
π XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
π Putting the Object Back into Video Object Segmentation
π M^3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: Moving MNIST
π’ Description Of Dataset:
The Moving MNIST dataset contains 10,000 video sequences, each consisting of 20 frames. In each video sequence, two digits move independently around the frame, which has a spatial resolution of 64×64 pixels. The digits frequently intersect with each other and bounce off the edges of the frame. Source: Mutual Suppression Network for Video Prediction using Disentangled Features
π’ Official Homepage: http://www.cs.toronto.edu/~nitish/unsupervised_video/
π’ Number of articles that used this dataset: 194
π’ Dataset Loaders:
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/moving_mnist
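A minimal loading sketch using the tensorflow/datasets catalog entry above; TFDS exposes the 10,000 sequences as a single test split, and the feature key name may differ across TFDS versions:

import tensorflow_datasets as tfds

ds = tfds.load("moving_mnist", split="test")
for example in ds.take(1):
    video = example["image_sequence"]        # (20, 64, 64, 1): 20 frames of 64x64 digits
    print(video.shape, video.dtype)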
π’ Articles related to the dataset:
π Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
π What Makes for Good Views for Contrastive Learning?
π VideoGPT: Video Generation using VQ-VAE and Transformers
π Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
π PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs
π OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning
π Eidetic 3D LSTM: A Model for Video Prediction and Beyond
π SimVPv2: Towards Simple yet Powerful Spatiotemporal Predictive Learning
π MogaNet: Multi-order Gated Aggregation Network
π Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model
==================================
π΄ For more dataset resources:
β https://t.me/Datasets1
π’ Name Of Dataset: Cityscapes
π’ Description Of Dataset:
Cityscapes is a large-scale database which focuses on semantic understanding of urban street scenes. It provides semantic, instance-wise, and dense pixel annotations for 30 classes grouped into 8 categories (flat surfaces, humans, vehicles, constructions, objects, nature, sky, and void). The dataset consists of around 5,000 finely annotated images and 20,000 coarsely annotated ones. Data was captured in 50 cities over several months, at various times of day, and in good weather conditions. It was originally recorded as video, so the frames were manually selected to have the following features: a large number of dynamic objects, varying scene layout, and varying background. Source: A Review on Deep Learning Techniques Applied to Semantic Segmentation
π’ Official Homepage: https://www.cityscapes-dataset.com/dataset-overview/
π’ Number of articles that used this dataset: 3679
π’ Dataset Loaders:
facebookresearch/detectron2:
https://detectron2.readthedocs.io/en/latest/tutorials/builtin_datasets.html#expected-dataset-structure-for-cityscapes
open-mmlab/mmdetection:
https://github.com/open-mmlab/mmdetection/blob/master/docs/1_exist_data_model.md
pytorch/vision:
https://pytorch.org/vision/stable/datasets.html#torchvision.datasets.Cityscapes
voxel51/fiftyone:
https://docs.voxel51.com/user_guide/dataset_zoo/datasets.html#cityscapes
open-mmlab/mmsegmentation:
https://github.com/open-mmlab/mmsegmentation/blob/master/docs/dataset_prepare.md
Kaggle/kaggle-api:
https://www.kaggle.com/datasets/sakshaymahna/cityscapes-depth-and-segmentation
tensorflow/datasets:
https://www.tensorflow.org/datasets/catalog/cityscapes
facebookresearch/MaskFormer:
https://github.com/facebookresearch/MaskFormer
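As a quick illustration of one of the loaders above, here is a minimal sketch using torchvision.datasets.Cityscapes; it assumes the official leftImg8bit/ and gtFine/ folders have already been downloaded (registration on cityscapes-dataset.com is required) into a local ./cityscapes directory.

# Minimal sketch: load Cityscapes fine semantic-segmentation pairs with torchvision.
# Assumes ./cityscapes contains the official leftImg8bit/ and gtFine/ directories.
from torchvision import datasets, transforms

dataset = datasets.Cityscapes(
    root="./cityscapes",
    split="val",              # 'train', 'val', or 'test'
    mode="fine",              # fine (gtFine) vs coarse (gtCoarse) annotations
    target_type="semantic",   # per-pixel class-ID mask
    transform=transforms.ToTensor(),
)

image, mask = dataset[0]
print(image.shape, mask.size)  # image tensor (3, H, W); mask is a PIL image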
π’ Articles related to the dataset:
π Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints
π DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
π Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
π Searching for MobileNetV3
π Rethinking Atrous Convolution for Semantic Image Segmentation
π Searching for Efficient Multi-Scale Architectures for Dense Image Prediction
π Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
π Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation
π MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context
π Mask R-CNN
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
π’ Name Of Dataset: Visual Question Answering v2.0 (VQA v2.0)
π’ Description Of Dataset:
Visual Question Answering (VQA) v2.0 is a dataset containing open-ended questions about images. These questions require an understanding of vision, language and commonsense knowledge to answer. It is the second version of the VQA dataset. It provides 265,016 images (COCO and abstract scenes); at least 3 questions (5.4 on average) per image; 10 ground-truth answers per question; 3 plausible (but likely incorrect) answers per question; and an automatic evaluation metric. The first version of the dataset was released in October 2015.
π’ Official Homepage: https://visualqa.org/
π’ Number of articles that used this dataset: 365
π’ Dataset Loaders:
facebookresearch/ParlAI:
https://parl.ai/docs/tasks.html#vqav2
allenai/allennlp-models:
https://docs.allennlp.org/models/main/models/vision/dataset_readers/vqav2/
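The loaders above wrap the raw annotation JSON released on visualqa.org. The sketch below shows how one might parse that JSON directly and score a prediction with the consensus accuracy mentioned in the description; the file name and JSON keys are assumptions based on the official release format, and the metric is the commonly used simplification of the official leave-one-out averaging, so double-check both against the download.

# Minimal sketch: read VQA v2 annotations and score one predicted answer.
# Assumes the official annotation file naming/keys (e.g.
# v2_mscoco_val2014_annotations.json with an "annotations" list whose items hold
# "question_id" and a 10-element "answers" list); verify against your download.
import json

with open("v2_mscoco_val2014_annotations.json") as f:
    annotations = json.load(f)["annotations"]

def vqa_accuracy(predicted, human_answers):
    # Simplified consensus metric: min(#humans who gave this answer / 3, 1).
    # The official evaluation averages this over leave-one-out subsets of the 10 answers.
    matches = sum(a["answer"] == predicted for a in human_answers)
    return min(matches / 3.0, 1.0)

first = annotations[0]
print(first["question_id"], vqa_accuracy("yes", first["answers"]))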
π’ Articles related to the dataset:
π Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
π Language Models are General-Purpose Interfaces
π VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
π MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs
π CoCa: Contrastive Captioners are Image-Text Foundation Models
π BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
π Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
π Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
π InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
π MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond
==================================
π΄ For more datasets resources:
β https://t.me/Datasets1
