Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
New in our Open STT dataset

https://github.com/snakers4/open_stt#updates

- An mp3 version of the dataset;
- A torrent for the mp3 dataset;
- A torrent for the original wav dataset;
- Benchmarks on the public dataset / files with "poor" annotation marked;

#deep_learning
#data_science
#dataset
New version of our Open STT dataset: 0.5, now in beta

Please share and repost!

https://github.com/snakers4/open_stt/releases/tag/v0.5-beta

What is new?
- A new domain - radio (1000+ new hours);
- A larger YouTube dataset with 1000+ additional hours;
- A small (300 hours) YouTube dataset downloaded in maximum quality;
- Manually annotated ground-truth validation sets for YouTube / books / public calls;
- We will now focus on cleaning and distilling the dataset, and have published a second list of "bad" data;

I'm back from vacation)

#deep_learning
#data_science
#dataset