Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math, and the applications of the former. To reach editors contact: @haarrp
Contrastive Semi-supervised Learning for ASR

Pseudo-labeling is currently the most common method for pre-training automatic speech recognition (ASR) models, but in low-resource setups and under domain transfer its quality degrades along with that of the supervised teacher model. The authors of this paper suggest using contrastive learning to overcome this problem.

The CSL (Contrastive Semi-supervised Learning) approach uses teacher-generated predictions to select positive and negative examples instead of using the pseudo-labels directly as training targets.
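
The selection idea can be sketched in a few lines. This is only an illustration under assumptions, not the loss from the paper: it uses a generic InfoNCE-style objective in which frames sharing the anchor's teacher pseudo-label count as positives and all other frames as negatives. The function names, the cosine similarity, and the temperature value are hypothetical choices.

```python
import math

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def label_guided_contrastive_loss(embeddings, pseudo_labels, temperature=0.1):
    """Average InfoNCE-style loss; positives share the anchor's pseudo-label."""
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, pseudo_labels)):
        pos, total = 0.0, 0.0
        for j, (other, other_label) in enumerate(zip(embeddings, pseudo_labels)):
            if i == j:
                continue  # an anchor is never compared with itself
            score = math.exp(cosine(anchor, other) / temperature)
            total += score
            if other_label == label:
                pos += score
        if pos > 0.0:  # skip anchors that have no positive in the batch
            losses.append(-math.log(pos / total))
    return sum(losses) / len(losses) if losses else 0.0

# Toy frame embeddings (made up): two tight clusters.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
consistent = label_guided_contrastive_loss(frames, [0, 0, 1, 1])
noisy = label_guided_contrastive_loss(frames, [0, 1, 0, 1])
```

In this toy case the loss is low when the pseudo-labels agree with the embedding clusters and high when they do not, which is the signal an encoder would be trained on.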

Experiments show that CSL achieves a lower WER than standard CE-PL (cross-entropy pseudo-labeling), and that the advantage holds under low-resource and out-of-domain conditions.

To demonstrate its resilience to pseudo-labeling noise, the authors apply CSL pre-training in a low-resource setup with only 10 hours of labeled data, where it reduces WER by 8% compared to standard cross-entropy pseudo-labeling (CE-PL). The WER reduction increases to 19% with a teacher trained on only 1 hour of labels, and to 17% for out-of-domain conditions.
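
For readers less familiar with the metric, here is a minimal sketch of how WER and a reduction in WER are usually computed. It assumes the percentages above are relative reductions, which the post does not state explicitly, and the sentences and WER values in the example are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def relative_reduction(baseline_wer, improved_wer):
    # Fraction by which the baseline WER was reduced.
    return (baseline_wer - improved_wer) / baseline_wer

wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
# Hypothetical numbers: a baseline WER of 25% dropping to 23%.
gain = relative_reduction(0.25, 0.23)
```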


Paper: https://arxiv.org/abs/2103.05149

#deeplearning #asr #contrastivelearning #semisupervised
Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains

This is an interesting paper about learning and combining representations of object shape and appearance from different domains (for example, dogs and cars). This makes it possible to build a model that borrows different properties from each domain and generates images that don't exist in any single domain.
The main idea is the following:
- use FineGAN as a base model;
- represent object appearance with a differentiable histogram of visual features;
- optimize the generator so that images with different shapes but similar appearances produce similar histograms.
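
The histogram idea from the list above can be illustrated with a toy sketch. This is not the paper's implementation: it assumes scalar feature responses scaled to [0, 1], uses a triangular soft-binning kernel (piecewise linear, so gradients could flow through it in an autodiff framework), and compares histograms with an L1 distance. All names and values are hypothetical.

```python
def soft_histogram(values, num_bins=4):
    """Soft-assign values in [0, 1] to evenly spaced bin centers."""
    width = 1.0 / (num_bins - 1)
    centers = [k * width for k in range(num_bins)]
    hist = [0.0] * num_bins
    for v in values:
        for k, c in enumerate(centers):
            # Triangular kernel: weight falls linearly with distance to the center.
            hist[k] += max(0.0, 1.0 - abs(v - c) / width)
    total = sum(hist)
    return [h / total for h in hist]

def histogram_distance(h1, h2):
    # L1 distance between two normalized histograms.
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Made-up feature responses: same appearance statistics in a different
# spatial order (different "shape"), and a clearly different appearance.
car = soft_histogram([0.1, 0.2, 0.8, 0.9])
dog_same_fur = soft_histogram([0.9, 0.1, 0.8, 0.2])
dog_other_fur = soft_histogram([0.5, 0.5, 0.6, 0.4])
```

Because the histogram discards spatial layout, the first two inputs map to the same histogram, while the third is far from both. A loss built on such a distance would push the generator toward matching appearance regardless of shape.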

Paper: https://openreview.net/forum?id=M88oFvqp_9
Project link: https://utkarshojha.github.io/inter-domain-gan/
Code will be available here: https://github.com/utkarshojha/inter-domain-gan

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-furrycars

#cv #gan #deeplearning #contrastivelearning