A case against Kaggle
If you thought that Kaggle is the home of Data Science - think again.
https://www.kaggle.com/c/airbus-ship-detection/discussion/64355
This is official - they do not know what the hell they are doing.
There have been several appalling cases already, but this takes the prize.
Following this thread, I wrote a small petition to Kaggle:
https://www.kaggle.com/c/airbus-ship-detection/discussion/64393
I doubt that they will listen, but why not.
#data_science
Venn diagrams in python
Comparing sets is as easy as:
# Import the libraries
import matplotlib.pyplot as plt
from matplotlib_venn import venn3
# Make the diagram; s1, s2 and s3 are Python sets defined beforehand
plt.figure(figsize=(10, 10))
venn3(subsets=(s1, s2, s3), set_labels=('synonyms', 'our_tree', 'add_syns'))
plt.show()
Very simple and useful.
#data_science
Crowd-AI maps repo
Just opened my repo for crowd AI maps 2018.
Did not pursue this competition till the end, so it is not polished, .md is not updated. Use it at your own risk!
https://github.com/snakers4/crowdai-maps-2018
https://spark-in.me/post/a-small-case-for-search-of-structure-within-your-data
#deep_learning
AdamW to be integrated into upstream PyTorch?
https://github.com/pytorch/pytorch/pull/3740
#deep_learning
A small hack to spare PyTorch memory when resuming training
When you resume from a checkpoint, consider adding this to save GPU memory:
del checkpoint
torch.cuda.empty_cache()
#deep_learning
Training a MNASNET from scratch ... and failing
As a small side hobby, we tried training Google's new mobile network from scratch and failed:
- https://spark-in.me/post/mnasnet-fail-alas
- https://github.com/snakers4/mnasnet-pytorch
Maybe you know how to train it properly?
Also, you can now upvote articles on spark-in.me! =)
#deep_learning
MySQL - replacing window functions
MySQL versions before 8.0 do not have window functions and some of the other goodness you can find in PostgreSQL. Ofc you can do plain session matching in Python, but sometimes you just need to do it in plain SQL.
In Postgres you would usually use window functions for this purpose if you need PLAIN SQL (ofc there are also stored procedures / views / materialized views etc).
In MySQL it can be elegantly solved with user-defined variables, like this:
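-- Initialize the user variables that carry state from row to row
SET @session_number = 0, @last_uid = '0', @current_uid = '0', @dif = 0;
SELECT
    t1.some_field,
    t2.some_field,
    ...
    -- Remember the previous row's user, then read the current one
    @last_uid := @current_uid,
    @current_uid := t1.uid,
    -- Minutes elapsed between two consecutive rows
    @dif := TIMESTAMPDIFF(MINUTE, t2.session_ts, t1.session_ts),
    -- Same user: new session after a gap of more than 30 minutes; new user: reset to 0
    IF(@last_uid = @current_uid,
       IF(@dif > 30, @session_number := @session_number + 1, @session_number),
       @session_number := 0) AS session
FROM
    table1 t1
    JOIN table2 t2 ON t1.id = t2.id + 1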
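The same sessionization logic can be sketched in plain Python, which may help when sanity-checking the SQL above. This is only a minimal sketch, assuming the events are (uid, timestamp) pairs already sorted by user and then by time. The function name `assign_sessions` and the sample data are hypothetical, and the 30-minute threshold mirrors the query.

```python
from datetime import datetime, timedelta


def assign_sessions(events, gap_minutes=30):
    """Return (uid, ts, session) triples; events must be sorted by uid, then ts."""
    result = []
    last_uid, last_ts, session = None, None, 0
    for uid, ts in events:
        if uid != last_uid:
            # New user: the session counter starts again from 0
            session = 0
        elif ts - last_ts > timedelta(minutes=gap_minutes):
            # Same user, but the gap is too long: start a new session
            session += 1
        result.append((uid, ts, session))
        last_uid, last_ts = uid, ts
    return result


# Hypothetical sample data: user "a" has a 40-minute pause before the third event
t0 = datetime(2018, 9, 1, 12, 0)
events = [
    ("a", t0),
    ("a", t0 + timedelta(minutes=10)),
    ("a", t0 + timedelta(minutes=50)),
    ("b", t0 + timedelta(minutes=5)),
]
sessions = [s for _, _, s in assign_sessions(events)]
```

Here the first two events of user "a" share session 0, the third one opens session 1, and user "b" starts again from 0.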
#data_science
DS/ML digest 23
The key topic of this one: this is insanity
- vid2vid
- unsupervised NMT
https://spark-in.me/post/2018_ds_ml_digest_23
If you like our digests, you can support the channel via:
- Sharing / reposting;
- Giving an article a decent comment / a thumbs-up;
- Buying me a coffee (links on the digest);
Let's spread the right DS/ML ideas together.
#digest
#deep_learning
#data_science
Chainer - a predecessor of PyTorch
It looks like:
- PyTorch was based not only on Torch: its autograd was also forked from Chainer;
- Chainer looks like PyTorch ... but it is made not by Facebook, but by an independent Japanese group;
- A quick glance through the docs confirms that the PyTorch and Chainer APIs look 90% identical (both are numpy-inspired, but use different back-ends);
- Open Images 2nd place was taken by people using Chainer with 512 GPUs;
- I have yet to confirm for myself that PyTorch can work with a cluster (but other people have done it) https://github.com/eladhoffer/convNet.pytorch;
https://www.reddit.com/r/MachineLearning/comments/7lb5n1/d_chainer_vs_pytorch/
https://docs.chainer.org/en/stable/comparison.html
#deep_learning
Also, thanks to everyone who used my DO referral link: hosting of my website is now finally free (at least for the next ~6 months)!
Also, today I published the 200th post on spark-in.me. Ofc not all of these are proper long articles, but nevertheless it's cool.
SeNet
- http://arxiv.org/abs/1709.01507;
- The 2017 ImageNet winner;
- Mostly a ResNet-152 inspired network;
- Transfers well (ResNet);
- Squeeze-and-Excitation (SE) block, which adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels;
- Intuitively it looks like convolution meets the attention mechanism;
- SE block:
- https://pics.spark-in.me/upload/aa50a2559f56faf705ad6639ac973a38.jpg
- The reduction ratio r is set to 16 in all experiments;
- Results:
- https://pics.spark-in.me/upload/db2c98330744a6fd4dab17259d5f9d14.jpg
#deep_learning
Useful Python / PyTorch bits
dot.notation access to dictionary attributes
class dotdict(dict):
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
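A short usage sketch of the class above, with hypothetical keys (`lr`, `epochs`, `batch_size`, `dropout`) chosen only for illustration. One thing worth noticing is that, because `__getattr__` is `dict.get`, reading a missing key returns `None` instead of raising an error.

```python
class dotdict(dict):
    """dict whose keys can also be read, set and deleted as attributes."""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__


# Hypothetical config values, used only to show the access pattern
cfg = dotdict({"lr": 0.001, "epochs": 10})
cfg.batch_size = 32        # stored as cfg["batch_size"]
missing = cfg.dropout      # dict.get returns None for an absent key
del cfg.epochs             # removes the "epochs" key
```

Keys that collide with dict method names (for example `items` or `keys`) are still reachable only via `cfg["items"]`, since regular attribute lookup finds the method first.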
PyTorch embedding layer - ignore padding
nn.Embedding has a padding_idx argument that keeps the padding token embedding from being updated.
#python
#pytorch
Gensim's fast-text subwords
Some monkey patching to get subwords from Gensim's fast-text
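import gensim
from gensim.models.utils_any2vec import _compute_ngrams, _ft_hash

def subword(self, word):
    # Collect the character n-grams of `word` that the model actually knows
    ngram_lst = []
    ngrams = _compute_ngrams(word, self.min_n, self.max_n)
    for ngram in ngrams:
        # Map the n-gram to its hash bucket and keep it only if that bucket is in use
        ngram_hash = _ft_hash(ngram) % self.bucket
        if ngram_hash in self.hash2index:
            ngram_lst.append(ngram)
    return ngram_lst

# Attach the helper as a method of FastTextKeyedVectors
gensim.models.keyedvectors.FastTextKeyedVectors.subword = subword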
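For intuition, the character n-grams that the patch above filters can be reproduced without Gensim. This is a minimal sketch, assuming the usual fastText convention of wrapping the word in `<` and `>` before slicing. `char_ngrams` is a hypothetical helper, not Gensim's implementation, and it leaves out the hashing / bucket lookup step.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """List the character n-grams of '<word>' for every n in [min_n, max_n]."""
    wrapped = "<" + word + ">"
    ngrams = []
    # n never exceeds the length of the wrapped word
    for n in range(min_n, min(len(wrapped), max_n) + 1):
        for start in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[start:start + n])
    return ngrams


example = char_ngrams("cat", min_n=3, max_n=4)
# example == ['<ca', 'cat', 'at>', '<cat', 'cat>']
```

The boundary symbols let the model tell a prefix or suffix apart from the same characters in the middle of a word.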
Understanding the current SOTA NMT / NLP model - transformer
A list of articles that really help to do so:
- Understanding attention https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
- Annotated transformer http://nlp.seas.harvard.edu/2018/04/03/attention.html
- Illustrated transformer https://jalammar.github.io/illustrated-transformer/
Playing with transformer in practice
This repo turned out to be really helpful
https://github.com/huggingface/pytorch-openai-transformer-lm
It features:
- Decent, well-encapsulated model and loss;
- Several heads for different tasks;
- It works;
- Ofc their data-loading scheme is crappy and over-engineered;
My impressions on actually training the transformer model for classification:
- It works;
- It is high capacity;
- Inference time is ~`5x` higher than char-level or plain RNNs;
- It serves as a classifier as well as an LM;
- Capacity is enough to tackle most challenging tasks;
- It can be deployed on CPU for small texts (!);
- On smaller tasks there is no clear difference between plain RNNs and Transformer;
#nlp
Using sklearn pairwise cosine similarity
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html#sklearn.metrics.pairwise.cosine_similarity
On a 7k * 7k example with 300-dimensional vectors it turned out to be MUCH faster than doing the same computation:
- In 10 processes;
- Using numba;
The more you know.
If you have used it - please PM me.
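For reference, the quantity computed for each pair of rows is the dot product divided by the product of the two L2 norms. Below is a pure-Python sketch of that definition, assuming non-zero rows. The helper name `pairwise_cosine` and the sample vectors are hypothetical. It is far slower than the vectorized sklearn function and is only meant to show what is being computed.

```python
import math


def pairwise_cosine(rows):
    """Return the matrix of cosine similarities between all pairs of rows."""
    # L2 norm of every row, computed once
    norms = [math.sqrt(sum(x * x for x in row)) for row in rows]
    return [
        [
            sum(a * b for a, b in zip(u, v)) / (nu * nv)
            for v, nv in zip(rows, norms)
        ]
        for u, nu in zip(rows, norms)
    ]


# Hypothetical toy vectors: the first two are orthogonal
vectors = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
sim = pairwise_cosine(vectors)
```

The result is symmetric with ones on the diagonal, so `sim[0][1]` is 0 for the two orthogonal vectors.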
#nlp
DS/ML digest 24
Key topics of this one:
- New method to calculate phrase/n-gram/sentence embeddings for rare and OOV words;
- So many releases from Google;
https://spark-in.me/post/2018_ds_ml_digest_24
If you like our digests, you can support the channel via:
- Sharing / reposting;
- Giving an article a decent comment / a thumbs-up;
- Buying me a coffee (links on the digest);
#digest
#deep_learning
#data_science
(RU) most popular ML algorithms explained in simple terms
https://vas3k.ru/blog/machine_learning/
#data_science