Spark in me
2.2K subscribers
822 photos
48 videos
116 files
2.68K links
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
Download Telegram
Russian Open Speech To Text (STT/ASR) Dataset
4000 hours of STT data in Russian

Made by us. Yes, really. I am not joking.
It was a lot of work.

The dataset:
https://github.com/snakers4/open_stt/

Accompanying post:
https://spark-in.me/post/russian-open-stt-part1

TLDR:
- On third release, we have ~4000 hours;
- Contributors and help wanted;
- Let's bring the Imagenet moment in STT closer together!;

Please repost this as much as you can.

#stt
#asr
#data_science
#deep_learning
My foray into the STT Dark Forest

My tongue-in-cheek article on ML in general, and how to make your STT model train 3-4x faster with 4-5x less weights with the same quality

https://spark-in.me/post/stt-dark-forest

#data_science
#deep_learning
#stt
Russian Text Normalization for Speech Recognition

Usually no one talks about this, but STT / TTS technologies contain many "small" tasks that have to be solved, to make your STT / TTS pipeline work in real life.

For example:

- Speech recognition / dataset itself;
- Post-processing - beam-search / decoding;
- Domain customizations;
- Normalization (5 => пять);
- De-Normalization (пять => 5);

We want the Imagenet moment to arrive sooner in Speech in general.
So we released the Open STT dataset.
This time we have decided to share our text normalization to support STT research in Russian.

Please like / share / repost:

- Original publication
- Habr.com article
- GitHub repository
- Medium (coming soon!)
- Support dataset on Open Collective

#stt
#deep_learning
#nlp