Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
LSTM vs TCN vs Trellis network

- Did not try the Trellis network - decided it was too complex;
- All the TCN properties from the digest https://spark-in.me/post/2018_ds_ml_digest_31 hold (I did not test on very long sequences);
- Looks like a really simple and reasonable alternative to RNNs for modeling and ensembling;
- On a sensible benchmark it performs mostly the same as an LSTM from a practical standpoint;

https://github.com/locuslab/TCN/blob/master/TCN/tcn.py
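
The reference implementation is at the link above; below is a compressed, illustrative sketch of the core TCN building block (causal, dilated convolutions plus a residual connection) in PyTorch - layer names and hyper-parameters here are mine, not the repo's.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1D convolution that only sees past timesteps: pad on the left, trim the right."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):  # x: (batch, channels, time)
        out = self.conv(x)
        return out[:, :, :-self.pad] if self.pad else out

class TemporalBlock(nn.Module):
    """Two causal convs + ReLU + dropout, with a residual connection, as in the TCN paper."""
    def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1, dropout=0.2):
        super().__init__()
        self.net = nn.Sequential(
            CausalConv1d(in_ch, out_ch, kernel_size, dilation), nn.ReLU(), nn.Dropout(dropout),
            CausalConv1d(out_ch, out_ch, kernel_size, dilation), nn.ReLU(), nn.Dropout(dropout),
        )
        # 1x1 conv matches channel counts for the residual branch
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        return torch.relu(self.net(x) + self.downsample(x))
```

Stacking such blocks with exponentially growing dilations (1, 2, 4, ...) gives the receptive field that lets a TCN stand in for an LSTM on sequence tasks.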

#deep_learning
Tracking your hardware ... for data science

For a long time I thought that if you really want to track all of your servers' metrics, you need Zabbix (which is very complicated).

A friend recommended an amazing tool to me
- https://prometheus.io/docs/guides/node-exporter/

It installs and runs literally in minutes.
If you want to auto-start it properly, there are even (slightly older) Ubuntu packages and systemd unit examples
- https://github.com/prometheus/node_exporter/tree/master/examples/systemd
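
A quick way to sanity-check that node_exporter is actually serving metrics before pointing Prometheus at it (it listens on port 9100 by default) - a throwaway Python sketch; the host/port and the exact metrics filtered for are just examples:

```python
import urllib.request

# node_exporter exposes plain-text metrics on /metrics (default port 9100)
url = "http://localhost:9100/metrics"

with urllib.request.urlopen(url, timeout=5) as resp:
    for line in resp.read().decode().splitlines():
        # print a couple of standard node_exporter metrics as a smoke test
        if line.startswith(("node_load1 ", "node_memory_MemAvailable_bytes ")):
            print(line)
```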


Dockerized metric exporters for GPUs by Nvidia
- https://github.com/NVIDIA/gpu-monitoring-tools/tree/master/exporters/prometheus-dcgm

Prometheus also has extensive alerting features, but they are hard to get started with - there is no minimal example
- https://prometheus.io/docs/alerting/overview/
- https://github.com/prometheus/docs/issues/581

#linux
Does anyone know anyone from TopCoder?
As usual with competition platforms, the organization sometimes has its issues
Forwarded from Анна
Hi!
In case anyone does not know, besides the prize money for the top places, the satellite challenge had one more cool feature - a student's prize for the _student_ with the highest score. It all turned out to be rather murky: there was no separate leaderboard for students. I spent a long time trying to reach the admins, writing to their email and on the forum to find out more details. A month later an admin finally replied that I was the only candidate for the prize, that everything was supposedly fine, we are sorting it out, just send over your student ID. And then he disappeared again. From time to time I reminded them of my existence and asked how things were going and whether there was any progress - and got ignored. *There is still no answer.* This is my first time taking part in a serious competition and I do not quite understand what can be done in such a situation. Wait for news? Write posts on Twitter? Is there any way to get through to the admins?

Also, I wrote a small article about my solution here: https://spark-in.me/post/spacenet4
5th 2019 DS / ML digest

Highlights of the week
- New Adam version;
- POS tagging and semantic parsing in Russian;
- ML industrialization again;

https://spark-in.me/post/2019_ds_ml_digest_05

#digest
#data_science
#deep_learning
Russian STT datasets

Does anyone know of more proper datasets?

I found this paper (60 hours), but I could not find a link to the dataset itself:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/274_Paper.pdf

Anyway, here is the list I found:

- 20 hours of the Bible: https://github.com/festvox/datasets-CMU_Wilderness;
- https://www.kaggle.com/bryanpark/russian-single-speaker-speech-dataset - does not say how many hours it contains;
- Of course, audiobook datasets - https://www.caito.de/data/Training/stt_tts/ - plus some scraping scripts: https://github.com/ainy/shershe/tree/master/scripts;
- And some disappointment here https://voice.mozilla.org/ru/languages

#deep_learning
Our experiments with Transformers, BERT and generative language pre-training

TLDR

For morphologically rich languages, pre-trained Transformers are not a silver bullet; from a layman's perspective they are not feasible unless someone invests huge computational resources into sub-word tokenization methods that work well and into actually training these large networks.

On the other hand, we have definitively shown that:

- Initializing a Transformer with an embedding bag seeded with FastText vectors works and is relatively feasible (see the sketch below);
- On complicated tasks such a Transformer significantly outperforms training from scratch (as well as naive models) and shows decent results compared to state-of-the-art specialized models;
- Pre-training worked, but it overfitted more than FastText initialization, and given the complexity required for such pre-training it is not useful;
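
A minimal sketch of the FastText-initialized embedding bag idea from the first bullet - the vocabulary, dimensionality and the FastText model here are placeholders, not our actual setup:

```python
import numpy as np
import torch
import torch.nn as nn

vocab = ["привет", "мир", "<unk>"]   # your (sub-)word vocabulary - placeholder
dim = 300                            # FastText vector size - placeholder

# In the real setup, the rows would come from a FastText model, e.g.:
# ft = fasttext.load_model("cc.ru.300.bin")
# weights = np.stack([ft.get_word_vector(w) for w in vocab]).astype("float32")
weights = np.random.randn(len(vocab), dim).astype("float32")  # stand-in for real vectors

# EmbeddingBag averages the embeddings of a token bag into one vector
emb = nn.EmbeddingBag(num_embeddings=len(vocab), embedding_dim=dim, mode="mean")
emb.weight.data.copy_(torch.from_numpy(weights))
# emb.weight.requires_grad_(False)   # optionally freeze, then feed into the Transformer
```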

https://spark-in.me/post/bert-pretrain-ru

All in all this was a relatively large gamble that did not pay off - the Transformer did not excel at the more down-to-earth task we had hoped it would.

#deep_learning
An approach to ranking search results with no annotation

Just a small article with a novel idea:
- Instead of training a network with cross-entropy (CE), just train it with binary cross-entropy (BCE) - see the sketch below;
- Source additional signal from the inner structure of your domain (tags, matrix decomposition methods, heuristics, etc.);
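
A minimal sketch of the CE-to-BCE swap from the first bullet - shapes and labels are toy placeholders; the point is that BCE scores each class independently, so the sigmoid outputs can be used directly for sorting:

```python
import torch
import torch.nn as nn

logits = torch.randn(8, 10)                   # (batch, n_classes) from your classifier
targets = torch.zeros(8, 10)
targets[torch.arange(8), torch.randint(0, 10, (8,))] = 1.0  # one-hot (or multi-hot) labels

# Softmax cross-entropy treats classes as mutually exclusive
ce = nn.CrossEntropyLoss()(logits, targets.argmax(dim=1))

# Binary cross-entropy scores every class independently
bce = nn.BCEWithLogitsLoss()(logits, targets)

# The independent per-class sigmoid scores can then be used to rank / sort results
scores = torch.sigmoid(logits)
```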

https://spark-in.me/post/classifier-result-sorting

Works best if your ontology is relatively simple.

#deep_learning
New tricks for training CNNs
Forwarded from Just links
DropBlock: A regularization method for convolutional networks https://arxiv.org/abs/1810.12890
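
Since the forward is just the link, here is a rough sketch of the DropBlock idea from the paper (dropping contiguous blocks of the feature map instead of single units) - parameter defaults and naming are illustrative, not the authors' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropBlock2d(nn.Module):
    """Sketch of DropBlock (arXiv:1810.12890): zero out block_size x block_size regions
    of the feature map instead of independent units as in plain dropout."""
    def __init__(self, drop_prob=0.1, block_size=7):  # block_size should be odd here
        super().__init__()
        self.drop_prob = drop_prob
        self.block_size = block_size

    def forward(self, x):  # x: (N, C, H, W)
        if not self.training or self.drop_prob == 0.0:
            return x
        n, c, h, w = x.shape
        # gamma from the paper: seed probability so the expected dropped fraction ~ drop_prob
        gamma = (self.drop_prob / (self.block_size ** 2)) * \
                (h * w) / max((h - self.block_size + 1) * (w - self.block_size + 1), 1)
        seeds = (torch.rand_like(x) < gamma).float()
        # grow each seed into a block_size x block_size square via max pooling
        mask = F.max_pool2d(seeds, kernel_size=self.block_size,
                            stride=1, padding=self.block_size // 2)
        mask = 1.0 - mask  # 1 = keep, 0 = drop
        # rescale so the expected activation magnitude is preserved
        return x * mask * mask.numel() / mask.sum().clamp(min=1.0)
```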