Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
Russian Open Speech To Text (STT/ASR) Dataset
~4000 hours of STT data in Russian

Made by us. Yes, really. I am not joking.
It was a lot of work.

The dataset:
https://github.com/snakers4/open_stt/

Accompanying post:
https://spark-in.me/post/russian-open-stt-part1

TLDR:
- As of the third release, we have ~4000 hours;
- Contributors and help wanted;
- Let's bring the ImageNet moment in STT closer, together!

Please repost this as much as you can.

#stt
#asr
#data_science
#deep_learning