Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
The age of open-source

Recently I started using more and more open-source / CLI tools for mundane everyday tasks.

Sometimes they have a higher barrier to entry (for example, compare Google Slides vs markdown + LaTeX), but they are usually simpler, yet more powerful.

Recently I was appalled by µTorrent's bugs and ads - and I found out that there is even a beta of Transmission for Windows (the alternative being to just run the Transmission daemon on Linux).

The question is - do you know any highly useful open-source / CLI / free tools to replace standard entrenched software, which is getting a bit annoying?

Like this post or have something to say => tell us more in the comments or donate!
Playing with renewing SSL certificates + Cloudflare

I am using certbot, which makes SSL certificate installation for any web-server literally a one-liner (a couple of guides - https://goo.gl/nP2tij / https://goo.gl/X6rVxs).
It also has an amazing command certbot renew for renewing your certificates.

Unsurprisingly, it does not work when you have Cloudflare enabled. The solution in my case was as easy as:
- falling back to the registrar's name-servers (luckily, my registrar stores the old DNS zone settings)
- certbot renew
- reverting back to Cloudflare's DNS servers
- also, in this case, when using a VPN I did not have to wait for the DNS records to propagate - it was instant

#linux
Playing with multi-GPU small batch-sizes

If you play with SemSeg using a big model and large images (HD, FullHD), you may face a situation where only one image fits on one GPU.

This is also relevant if your train-test split is far from ideal and/or you are using pre-trained ImageNet encoders for a SemSeg task - so you cannot really update your batch-norm params.

Also, AFAIK, all the major deep-learning frameworks:
(0) do not have an out-of-the-box option to freeze batch-norm (batch-norm contains two sets of parameters - the learnable affine ones and the running statistics that are updated during training and used at inference);
(1) calculate batch-norm statistics on each GPU separately.

All this may mean that your models severely underperform at inference in these situations.

Solutions?

(0) Sync batch-norm. I believe that to do it properly you would have to modify the framework you are using, but there is a PyTorch implementation done for CVPR 2018 - with an explanation here http://hangzh.com/PyTorch-Encoding/notes/syncbn.html. If its multi-GPU model wrappers can be used with arbitrary models, then we are in the money.
(1) Use affine=False in your batch-norm. But in this case ImageNet initialization will probably not help - you will have to train your model completely from scratch.
(2) Freeze your encoder batch-norm params completely (see the sketch after this list) - https://discuss.pytorch.org/t/how-to-train-with-frozen-batchnorm/12106/10 (though I am not sure - they do not seem to freeze the running statistics) - probably this also requires putting the layers into eval mode or something like that
(3) Use the recent group norm from Facebook - https://arxiv.org/pdf/1803.08494.pdf
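
For option (2), a minimal PyTorch sketch of what "freezing completely" could look like (my own illustration, assuming a model with an ImageNet-pretrained encoder attribute):

```python
import torch.nn as nn

def freeze_bn(module):
    # Put every batch-norm layer into eval mode, so the running mean / variance
    # are no longer updated, and stop gradients for the learnable affine params.
    for m in module.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()
            if m.affine:
                m.weight.requires_grad = False
                m.bias.requires_grad = False

# hypothetical usage: call this after every model.train(),
# because .train() flips the batch-norm layers back into training mode
# model.train()
# freeze_bn(model.encoder)
```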

This is a finicky topic - please tell us about your experiences and tests in the comments

#deep_learning
#cv

Like this post or have something to say => tell us more in the comments or donate!
Interesting links about Internet

- Ben Evans' digest - https://goo.gl/t9zG4y
- China plans to track cars - https://goo.gl/jeroFW
- Ben Evans - content is not king anymore - distribution / eco-system are https://goo.gl/ms2tQd
- Google opens AI center in Ghana - https://goo.gl/PRHBjq

- (RU) A funny case of censorship in Russia - an article deleted from Habr - https://sohabr.net/habr/post/414595/
-- It clearly shows that you cannot safely post just anything to Habr

- India + WhatsApp + lynch mobs - https://goo.gl/tSBUCp
- Tor foundation about web-tracking and Facebook - https://goo.gl/H9DSuL
- Docker image jacking for crypto-mining - https://goo.gl/KrLLuQ
- Ethereum - 75% of transactions are made by automated bots - https://goo.gl/Q9BSNL
- (RU) - analyzing fake elections in Russia - 3-10M votes are fake - https://habr.com/post/358790/

#internet
2018 DS/ML digest 12

As usual, this is whatever I found really interesting / worth reading.

Implementations / papers / ideas
(0)
You can count bees well with UNet - http://matpalm.com/blog/counting_bees/
(1)
A really super cool idea - use 3D affine transformations to stack augmentations at the level of transformation matrices
(3D augmentations are costly; a small sketch follows after this item)

- https://gist.github.com/ematvey/5ca7df5d37c2f6a674390d42ef9e7d59
- both for rotation and scaling
- note a couple of things for easier understanding:
-- there is an offset in the transformations - because the coordinate origin is not in the "center" of the volume
-- zoom essentially scales the unit vectors after applying the offset
- 3Blue1Brown videos about linear algebra - https://www.youtube.com/watch?v=fNk_zzaMoSs
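
To make the matrix-stacking idea more concrete, here is a small numpy/scipy sketch (my own illustration, not taken from the gist above) that composes rotation and zoom as 4x4 homogeneous matrices and resamples the volume only once:

```python
import numpy as np
from scipy.ndimage import affine_transform

def centered(matrix, shape):
    # the "offset" trick - shift the origin to the volume centre,
    # apply the transform, then shift back
    center = (np.asarray(shape) - 1) / 2.0
    to_origin = np.eye(4)
    to_origin[:3, 3] = -center
    back = np.eye(4)
    back[:3, 3] = center
    return back @ matrix @ to_origin

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    m = np.eye(4)
    m[:2, :2] = [[c, -s], [s, c]]
    return m

def zoom(factor):
    m = np.eye(4)
    m[:3, :3] *= factor  # scales the unit vectors
    return m

volume = np.random.rand(64, 64, 64)
# stack augmentations on the matrix level, resample only once
m = centered(rot_z(np.deg2rad(15)) @ zoom(1.1), volume.shape)
# affine_transform maps output coordinates to input coordinates, hence the inverse
augmented = affine_transform(volume, np.linalg.inv(m))
```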
(2)
A top solution from Google's Landmark Challenge - https://goo.gl/pkZULZ
Essentially
- ensemble of features / skip connections from a CNN (ResNeXt)
- KNN
- use KNN + augment the extracted features by averaging with similar images
- query expansion (use the fact that different crops of the same landmark remain the same landmark)
(3)
(RU) A super cool series about interesting clustering algorithms
- Affinity propagation
-- https://habr.com/post/321216/
-- http://www.icmla-conference.org/icmla07/FreyDueckScience07.pdf
- DBSCAN https://habrahabr.ru/post/322034/
- (spoiler - in practice use the awesome HDBSCAN library)
(4)
Brief review of image super-resolution techniques
- https://habr.com/post/359016/
- In a nutshell, try in this order: fully convolutional CNNs, auto-encoders with skip connections, or GANs
(5)
SOTA NLP by open-ai
https://blog.openai.com/language-unsupervised/
Key ideas
- Train a transformer language model on a large corpus in an unsupervised way
- Fine-tune on a smaller task
- Profit
Caveats
- "Our approach requires an expensive pre-training step - 1 month on 8 GPUs" (probably this should be discounted somewhat)
- TF and unreadable enterprise code
(6)
One more claimed SOTA word embedding set
https://allennlp.org/elmo
(7)
A cool github page by Sebastian Ruder to track major NLP tasks
https://github.com/sebastianruder/NLP-progress

Visualizations
(0)
Amazing visual explanations of how decision trees work
- http://www.r2d3.us/visual-intro-to-machine-learning-part-2/
- it explains visually how overfitting occurs in decision tree models
(1)
CIFAR T-SNE can be done in real-time on the GPU + tensorflow.js integration
- Blog https://goo.gl/Pk5Lq3
- Website https://goo.gl/1vpeFf
- Arxiv - http://arxiv.org/abs/1802.03680
- Demo - https://nicola17.github.io/tfjs-tsne-demo/
(2) Why people fail to use d3.js - https://goo.gl/hSt5dL

Datasets
(0) Nice idea - use available tools and videos to collect datasets
- https://goo.gl/HULsyH
- https://goo.gl/7AfRZZ

#digest
A subscriber sent a really decent CS university scientific ranking

http://csrankings.org/#/index?all&worldpu

Useful if you want to apply for a CS/ML-based Ph.D.

#deep_learning
If someone needs a dataset, Kaggle launched ImageNet object detection
- https://www.kaggle.com/c/imagenet-object-localization-challenge#description

There is also the Open Images dataset, which I guess is bigger though

#deep_learning
2018 DS/ML digest 13

Blog posts / articles:
(0) Google notes on CNN generalization - https://goo.gl/XS4KAw
(1) Google teaching robots in a virtual environment and then transferring the models to reality - https://goo.gl/aAYCqE
(2) Google's object tracking via image colorization - https://goo.gl/xchvBQ
(3) Interesting articles about VAEs:
- (RU) A small intro into VAEs https://habr.com/company/otus/blog/358946/
- A small intuitive intro (super super cool and intuitive)
https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf
- KL divergence explained
https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained
- A more formal write-up http://arxiv.org/abs/1606.05908
- Converting a FC layer into a conv layer http://cs231n.github.io/convolutional-networks/#convert (a small sketch after this list)
- A post by Fchollet https://blog.keras.io/building-autoencoders-in-keras.html
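
As a small illustration of the FC-to-conv conversion from the cs231n link above (a sketch with a hypothetical 512x7x7 feature map; the numbers are just an example):

```python
import torch
import torch.nn as nn

# an FC layer acting on a flattened 512 x 7 x 7 feature map
fc = nn.Linear(512 * 7 * 7, 4096)

# the equivalent convolution: one 7x7 kernel per output unit
conv = nn.Conv2d(512, 4096, kernel_size=7)
with torch.no_grad():
    conv.weight.copy_(fc.weight.view(4096, 512, 7, 7))
    conv.bias.copy_(fc.bias)

x = torch.randn(1, 512, 7, 7)
out_fc = fc(x.view(1, -1))        # shape (1, 4096)
out_conv = conv(x).view(1, -1)    # same values, shape (1, 4096)
print(torch.allclose(out_fc, out_conv, atol=1e-5))
```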

A good in-depth write-up on object detection:
- http://machinethink.net/blog/object-detection/
- finally a decent explanation of YOLO parametrization http://machinethink.net/images/object-detection/grid@2x.png
- best comparison of YOLO and SSD ever - http://machinethink.net/images/object-detection/architectures@2x.png


Papers with interesting abstracts (just good to know such things exist)
- Low-bit CNNs - https://ai.intel.com/nervana/wp-content/uploads/sites/53/2018/06/ELQ_CameraReady_CVPR2018.pdf
- Automated Meta ML - https://arxiv.org/abs/1806.06927
- Idea - use ResNet blocks for boosting - https://arxiv.org/abs/1706.04964
- 2D-discrete-Fourier transform (2D-DFT) to encode rotational invariance in neural networks - https://arxiv.org/abs/1805.12301
- Smallify the CNNs - https://arxiv.org/abs/1806.03723
- A review of BLEU as a metric - conclusion: it is good for measuring MT performance on average - https://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00322


"New" ideas in SemSeg:
- UNET + conditional VAE http://arxiv.org/abs/1806.05034
- Dilated convolutions for large satellite images http://arxiv.org/abs/1709.00179 - it looks like this works only if you have high-resolution images with small objects

#digest
#deep_learning
DL Framework choice - 2018

If you are still new to DL / DS / ML and have not yet chosen your framework, consider reading this before proceeding

- https://deepsense.ai/keras-or-pytorch/

#deep_learning
Playing with PyTorch 0.4

It was released some time ago.
If you are not aware, this is the best summary:
https://pytorch.org/2018/04/22/0_4_0-migration-guide.html

My first-hand experiences
- Multi-GPU support works strangely
- If you just launch your 0.3 code, it will work on 0.4 with warnings - not really a breaking change
- All the new features are really cool, useful and make using PyTorch even more delightful
- I especially liked how they added context managers and cleaned up the device mess (a small sketch below)
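
A minimal sketch of the new device handling and the no_grad context manager (based on the migration guide above):

```python
import torch

# the 0.4 idiom - a single device object instead of scattered .cuda() calls
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(10, 2).to(device)
x = torch.randn(4, 10, device=device)

# context manager replacing volatile=True for inference
with torch.no_grad():
    y = model(x)
```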

#deep_learning
Measuring feature importance properly

http://explained.ai/rf-importance/index.html

Once again I stumbled upon an amazing article about measuring feature importance for any ML algorithm:
(0) Permutation importance - if re-training your ML algorithm is costly, you can just shuffle a column and check the drop in the metric (a small sketch below)
(1) Drop-column importance - drop a column, re-train the model, compare the performance metrics
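
A minimal sketch of permutation importance (my own illustration, assuming an already fitted sklearn-style model and plain numpy arrays):

```python
import numpy as np

def permutation_importance(model, X, y, metric, n_repeats=5, seed=42):
    # importance of a feature = drop in the metric after shuffling its column
    rng = np.random.RandomState(seed)
    baseline = metric(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for col in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, col])  # break the link between feature and target
            drops.append(baseline - metric(y, model.predict(X_perm)))
        importances[col] = np.mean(drops)
    return importances

# hypothetical usage with a fitted RandomForestRegressor and r2_score:
# imp = permutation_importance(rf, X_valid, y_valid, r2_score)
```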

Why it is useful / caveats
(0) If you really care about understanding your domain - feature importances are a must have
(1) All of this works only for powerful models
(2) Landmines include correlated or duplicate variables and data normalization

Correlated variables
(0) For RF - correlated variables share permutation importance roughly proportionally to their correlation
(1) Drop column importance can behave unpredictably

I personally like engineering different kinds of features and doing ablation tests:
(0) among feature sets that share a similar purpose
(1) within feature sets

#data_science
2018 DS/ML digest 14

Amazing article - why you do not need ML
- https://cyberomin.github.io/startup/2018/07/01/sql-ml-ai.html
- I personally love plain-vanilla SQL and in 90% of cases people under-use it
- I even wrote 90% of my JSON API on our blog in pure PostgreSQL xD

Practice / papers
(0) Interesting papers from CVPR https://towardsdatascience.com/the-10-coolest-papers-from-cvpr-2018-11cb48585a49
(1) Some down-to-earth obstacles to ML deployment https://habr.com/company/hh/blog/415437/
(2) Using synthetic data for CNNs (by Nvidia) - https://arxiv.org/pdf/1804.06516.pdf
(3) This puzzles me - so much effort and engineering spent on something ... strange and useless - http://taskonomy.stanford.edu/index.html
On paper they do a cool thing - investigating transfer learning between different domains - but in practice it is done in TF and there is no clear conclusion of any kind
(4) VAE + real datasets http://siavashk.github.io/2016/02/22/autoencoder-imagenet/ - only small Imagenet (64x64)
(5) Understanding the speed of models deployed on mobile - http://machinethink.net/blog/how-fast-is-my-model/
(6) A brief overview of multi-modal methods https://medium.com/mlreview/multi-modal-methods-image-captioning-from-translation-to-attention-895b6444256e

Visualizations / explanations
(0) Amazing website with ML explanations http://explained.ai/
(1) PCA and linear auto-encoders are closely related https://pvirie.wordpress.com/2016/03/29/linear-autoencoders-do-pca/

#deep_learning
#digest
#data_science
Open Images Object detection on Kaggle

- https://www.kaggle.com/c/google-ai-open-images-object-detection-track#Description

- Key ideas
-- 1.2M images, high-res, 500 classes
-- decent prizes, but short time-span (2 months)
-- object detection

#deep_learning