Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
Yet another proxy - shadowsocks

If someone needs another proxy guide: a commenter with an Arabic username shared some alternative advice on proxy configuration
- http://disq.us/p/1tsy4nk (wait a bit until the link resolves)

#internet
#linux
2018 DS/ML digest 16

Papers / posts
(0) RL now solves Quake
https://venturebeat.com/2018/07/03/googles-deepmind-taught-ai-teamwork-by-playing-quake-iii-arena/
(1) A fast.ai post about AdamW
http://www.fast.ai/2018/07/02/adam-weight-decay/
-- Adam generally requires more regularization than SGD, so be sure to adjust your regularization hyper-parameters when switching from SGD to Adam
-- AMSGrad turns out to be very disappointing
-- Refresher article http://ruder.io/optimizing-gradient-descent/index.html#nadam
(2) How to tackle new classes in CV
https://petewarden.com/2018/07/06/what-image-classifiers-can-do-about-unknown-objects/
(3) A new word in GANs?
-- https://ajolicoeur.wordpress.com/RelativisticGAN/
-- https://arxiv.org/pdf/1807.00734.pdf
(4) Using deep learning representations for search
-- https://goo.gl/R1vhTh
-- a library for fast approximate nearest-neighbor search in Python https://github.com/spotify/annoy
(5) One more paper on GAN convergence
https://avg.is.tuebingen.mpg.de/publications/meschedericml2018
(6) Switchable normalization - adds a bit of accuracy to ResNet50; pre-trained models are available
https://github.com/switchablenorms/Switchable-Normalization

Datasets
(0) Disney starts to release datasets
https://www.disneyanimation.com/technology/datasets


Market / interesting links
(0) A motion to open-source GitHub
https://github.com/dear-github/dear-github/issues/304
(1) Allegedly, GTX 1180 cards are starting to appear on sale in Asia (?)
(2) Some controversy regarding Andrew Ng and self-driving cars https://goo.gl/WNW4E3
(3) National AI strategies overviewed - https://goo.gl/BXDCD7
-- Canada C$135m
-- China has the largest strategy
-- Notably - countries like Finland also have one
(4) Amazon allegedly sells face recognition to the USA https://goo.gl/eDzekn

#data_science
#deep_learning
Ofc such experiments are done on toy datasets - but it's nice to know
Forwarded from Just links
Adaptive Blending Units: Trainable Activation Functions for Deep Neural Networks https://arxiv.org/abs/1806.10064
2018 DS/ML digest 17

Highlights of the week
(0) Troubling trends with ML scholars
http://approximatelycorrect.com/2018/07/10/troubling-trends-in-machine-learning-scholarship/
(1) NLP close to its ImageNet stage?
https://thegradient.pub/nlp-imagenet/

Papers / posts / articles
(0) Working with multi-modal data https://distill.pub/2018/feature-wise-transformations/
- concatenation-based conditioning
- conditional biasing or scaling ("residual" connections)
- sigmoidal gating
- all in all this approach seems like a mixture of attention / gating for multi-modal problems
(1) Glow, a reversible generative model which uses invertible 1x1 convolutions
https://blog.openai.com/glow/
(2) Facebook's moonshots - I kind of do not understand much here
- https://research.fb.com/facebook-research-at-icml-2018/
(3) RL concept flaws?
- https://thegradient.pub/why-rl-is-flawed/
(4) Intriguing failures of convolutions
https://eng.uber.com/coordconv/ - this is fucking amazing
(5) People are only STARTING to apply ML to reasoning
https://deepmind.com/blog/measuring-abstract-reasoning/

Yet another online book on Deep Learning
(1) Kind of standard https://livebook.manning.com/#!/book/grokking-deep-learning/chapter-1/v-10/1

Libraries / code
(0) Data version control continues to develop https://dvc.org/features

#deep_learning
#data_science
#digest

Like this post or have something to say => tell us more in the comments or donate!
Tensorboard + PyTorch

I looked at this 6 months ago - and it was messy;
now it looks really polished
https://github.com/lanpa/tensorboard-pytorch

#data_science
Feeding images / tensors of different size using PyTorch dataloader classes

I struggled to do this properly on DS Bowl (I resorted to random crops for training and batches of a single image for validation).

Suppose your dataset has some internal structure in it.
For example - you may have images of vastly different aspect ratios (3x1, 1x3 and 1x1) and you would like to squeeze every bit of performance from your pipeline.
Of course, you may pad your images / center-crop them / random crop them - but in this case you will lose some of the information.
I played with this on some tasks - sometimes a forced resize works better than crops, but applying the model convolutionally worked really well on SemSeg challenges.
So it may work very well on plain classification as well.

So, if you apply your model convolutionally, you will end up with differently-sized feature maps for each cluster of images.

Within the model, it can be fixed with:
(0) Adaptive avg pooling layers
(1) Some simple logic in the .forward method of the model
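
To illustrate point (0): adaptive average pooling maps a feature map of any height / width to a fixed output grid, so the layers after it always see the same shape. In PyTorch the layer for this is torch.nn.AdaptiveAvgPool2d; the pure-Python function below is only a hypothetical stand-in that shows the idea on nested lists. The function name and the exact bin boundaries are assumptions made for this sketch.

```python
def adaptive_avg_pool2d(fmap, out_h, out_w):
    """Average-pool a 2D feature map (list of rows) down to out_h x out_w."""
    in_h, in_w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(out_h):
        # Bin i covers rows [r0, r1): floor for the start, ceil for the end.
        r0 = (i * in_h) // out_h
        r1 = ((i + 1) * in_h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            c0 = (j * in_w) // out_w
            c1 = ((j + 1) * in_w + out_w - 1) // out_w
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


# Two "feature maps" with very different aspect ratios.
wide = [[float(c) for c in range(12)] for _ in range(4)]  # 4 x 12
tall = [[float(r) for _ in range(4)] for r in range(12)]  # 12 x 4

pooled_wide = adaptive_avg_pool2d(wide, 2, 2)
pooled_tall = adaptive_avg_pool2d(tall, 2, 2)
# Both results are 2 x 2, regardless of the input shape.
```

Note that this fixes the shape only inside the model; the input tensors within one batch still have to share one size, which is the issue discussed next.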

But anyway you end up with a small technical issue - PyTorch cannot stack tensors of different sizes into one batch using the standard collation function.

Theoretically, there are several ways to fix this:
(0) Stupid solution - create N datasets, train on them sequentially.
In practice I tried that on DS Bowl - it worked poorly - the model overfitted to each cluster, and then performed poorly on the next one;
(1) Crop / pad / resize images (suppose you deliberately want to avoid that);
(2) Insert some custom logic into the PyTorch collation function, i.e. resize there;
(3) Just sample images so that only images of one size end up within each batch;

(0) and (1) I would like to avoid intentionally.

(2) seems a bit stupid as well, because resizing should be done as a pre-processing step (the collation function deals with normalized tensors, not images) and it is better not to mix the purposes of your modules.
Ofc, you can try to produce N tensors in (2) - i.e. one tensor for each image size, but that would require an additional loop downstream.

In the end, I decided that (3) is the best approach - because it can be easily transferred to other datasets / domains / tasks.
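
A minimal sketch of idea (3), assuming every sample has a known size key (for example its aspect-ratio cluster). This is not the code from the GitHub comment linked below; the class name SameSizeBatchSampler and its arguments are hypothetical. It only yields lists of indices, so it runs without PyTorch.

```python
import random
from collections import defaultdict


class SameSizeBatchSampler:
    """Yield lists of dataset indices; all indices in a list share one size key."""

    def __init__(self, size_keys, batch_size, shuffle=True, seed=None):
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = random.Random(seed)
        # Group dataset indices by their size key, e.g. (height, width).
        self.clusters = defaultdict(list)
        for idx, key in enumerate(size_keys):
            self.clusters[key].append(idx)

    def __iter__(self):
        batches = []
        for indices in self.clusters.values():
            indices = list(indices)
            if self.shuffle:
                self.rng.shuffle(indices)
            # Cut each cluster into batches; the last one may be smaller.
            for start in range(0, len(indices), self.batch_size):
                batches.append(indices[start:start + self.batch_size])
        if self.shuffle:
            # Mix batches from different clusters so that the model
            # does not see one cluster after another.
            self.rng.shuffle(batches)
        return iter(batches)

    def __len__(self):
        # Number of batches: ceil(cluster_size / batch_size) per cluster.
        return sum(-(-len(v) // self.batch_size) for v in self.clusters.values())


# Toy example: 12 samples in three aspect-ratio clusters.
sizes = [(1, 3)] * 5 + [(3, 1)] * 4 + [(1, 1)] * 3
sampler = SameSizeBatchSampler(sizes, batch_size=2, seed=0)
batches = list(sampler)
```

An object like this should be usable as the batch_sampler argument of torch.utils.data.DataLoader, which expects an iterable that yields one list of indices per batch. Shuffling the batches across clusters is a design choice aimed at the overfitting problem from (0).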

Long story short - here is my solution - I just extended their sampling function:
https://github.com/pytorch/pytorch/issues/1512#issuecomment-405015099

Maybe it is worth a PR on GitHub?
What do you think?

#deep_learning
#data_science

Colab SeedBank

- TF is everywhere (naturally) - but at least they use Keras
- On the other hand - all of the files are (at least now) downloadable via .ipynb or .py
- So - it may be a good place to look for boilerplate code

Also, some interesting facts that are not mentioned openly:
- Looks like they use Tesla K80s, which in practice are 2.5-3x slower than a 1080 Ti
(https://medium.com/initialized-capital/benchmarking-tensorflow-performance-and-cost-across-different-gpu-options-69bd85fe5d58)
- Full screen notebook format is clearly inspired by Jupyter plugins
- Ofc there is a time limit for GPU scripts and GPU availability is not guaranteed (reported by people who used it)
- Personally - to me it looks a bit like the slow instances from FloydHub - time limitations / slow GPU etc.

In a nutshell - a perfect source of boilerplate code + a playground for new people.

#deep_learning
Lazy failsafe in PyTorch Data Loader

Sometimes you train a model and testing all the combinations of augmentations / keys / params in your dataloader is too difficult. Or the dataset is too large, so it would take some time to check it properly.

In such cases I usually used some kind of failsafe try/except.
But it looks like an even simpler approach works:

if img is None:
    # do not return anything
    pass
else:
    return img
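
One caveat, stated as an assumption rather than something from the post: when the snippet above sits inside a dataset's __getitem__, a failed sample comes back as None, and the code that builds the batch still has to cope with it. A minimal sketch of one way to do that is a collate wrapper that drops None entries first. The names drop_none_collate, base_collate and ToyDataset are hypothetical, and list stands in for a real collation function so that the sketch runs without PyTorch.

```python
def drop_none_collate(batch, base_collate=list):
    """Drop samples that came back as None, then collate what is left."""
    kept = [sample for sample in batch if sample is not None]
    if not kept:
        # Every sample in this batch failed; let the training loop skip it.
        return None
    return base_collate(kept)


class ToyDataset:
    """Hypothetical dataset whose odd indices fail to load."""

    def __len__(self):
        return 6

    def __getitem__(self, idx):
        img = None if idx % 2 else [idx, idx]
        if img is None:
            # do not return anything (the method implicitly returns None)
            pass
        else:
            return img


dataset = ToyDataset()
raw_batch = [dataset[i] for i in range(4)]   # contains None for indices 1 and 3
clean_batch = drop_none_collate(raw_batch)   # only the samples that loaded
```

With PyTorch, the wrapper could be passed as collate_fn to the DataLoader, with base_collate bound to the library's default collation function (for example via functools.partial). Batches then become smaller whenever samples fail, so it is worth logging how often that happens.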


#deep_learning
#pytorch