Spark in me

An interesting idea from a CV conference

Imagine that you have some kind of algorithm, that is not exactly differentiable, but is "back-propable".

In this case you can have very convoluted logic in your "forward" statement (essentially something in between trees and dynamic programming) - for example a set of clever if-statements.

In this case you will be able to share both of the 2 worlds - both your algorithm (you will have to re-implement in your framework) and backprop + CNN. Nice.

Ofc this works only for dynamic deep-learning frameworks.

#deep_learning
#data_science

805 viewsAlexander, edited 06:50

Spark in me

Machines Can See 2018 adversarial competition

Happened to join forces with a team that won 2nd place in this competition
- https://spark-in.me/post/playing-with-mcs2018-adversarial-attacks

It was very entertaining and a new domain to me.

Read more materials:
- Our repo https://github.com/snakers4/msc-2018-final
- Our presentation https://drive.google.com/file/d/1P-4AdCqw81nOK79vU_m7IsCVzogdeSNq/view
- All presentations https://drive.google.com/file/d/1aIUSVFBHYabBRdolBRR-1RKhTMg-v-3f/view

#data_science
#deep_learning
#adversarial

Spark in me

Playing with adversarial attacks on Machines Can See 2018 competition

This article is about MCS 2018 competition and my participation in it, adversarial attack methods and how out team won
Статьи автора - http://spark-in.me/author/snakers41
Блог - http://spark-in.me

931 viewsAlexander, 06:52

Spark in me

And now the habr.ru article is also live -
https://habr.com/post/413775/

Please support us with your likes!

#deep_learning
#data_science

Habr

Состязательные атаки (adversarial attacks) в соревновании Machines Can See 2018

Или как я оказался в команде победителей соревнования Machines Can See 2018 adversarial competition. Суть любых состязательных атак на примере. Так уж получилось, что мне довелось поучаствовать в...

978 viewsAlexander, 15:35

Spark in me

Interesting links about Internet

- Ben Evans' digest - https://goo.gl/7NkYn6
- Why it took so much time to create previews for Wikipedia - https://goo.gl/xg7N99
- Google postulating its AI principles? https://blog.google/topics/ai/ai-principles/
- Google product alternatives - https://goo.gl/RmA76N - I personally started to switch to more open-source stuff lately, but Docs and Android have no real options
- The future of ML in embedded devices - https://goo.gl/PjWpKj (sound ideas, but a post is by an evangelist)
- Yahoo messenger shutting down (20 years!) - https://goo.gl/uhomds - hi ICQ
- Microsoft Buys GitHub for $7.5 Billion - 16z write-up - https://goo.gl/3znstT
- NYC medallions dropped 5x in price - https://goo.gl/Vi7pG6
- JD covers villages in China with drone delivery already - https://goo.gl/bMGKSY

#digest

776 viewsAlexander, edited 10:48

Spark in me

The age of open-source

Recently I started using more and more open-source / CLI tools for mundane everyday tasks.

Sometimes they have higher barriers to entry (example - compare google slides vs markdown + latex), but usually more simplistic, yet more powerful.

Recently I was just appaled by MuTorrent bugs and ads - and I just found out that there is even a beta of Transmission for Windows (the alternative being - just using transmission daemon on Linux).

The question is - do you know any highly useful open-source / CLI / free tools to replace standard entrenched software, which is getting a bit annoying?

Like this post or have something to say => tell us more in the comments or donate!

899 viewsspark_comment_bot, edited 11:22

11+ Comments Donate

Spark in me

https://youtu.be/SWW0nVQNm2w

YouTube

Neural Image Stitching And Morphing | Two Minute Papers #256

The paper "Neural Best-Buddies: Sparse Cross-Domain Correspondence" is available here:
https://arxiv.org/abs/1805.04140

Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

We would like to thank our generous Patreon supporters…

1.1K viewsAlexander, 16:14

Spark in me

Playing with renewing SSL certificates + Cloudflare

I am using certbot, which makes SSL certificate installation for any web-server literally a one-liner (a couple of guides - https://goo.gl/nP2tij / https://goo.gl/X6rVxs).
It also has an amazing command certbot renew for renewing your certificates.

Unsurprisingly, it does not work, when you have Cloudflare enabled. The solution in my case was as easy as:
- falling back to registrar's name-servers (luckily, my registrar stores its old DNS zone settings)
- certbot renew
- reverting back to cloudflare's DNS servers
- also, in this case when using VPN I did not have to wait for DNS records to propagate - it was instant

#linux

DigitalOcean

How To Use Certbot Standalone Mode for Let's Encrypt Certificates | DigitalOcean

Certbot offers a variety of ways to validate your domain, fetch certificates, and automatically configure Apache and Nginx. In this tutorial, we'll discuss Certbot's standalone mode and how to use it to secure other types of services, such as a mail s

1.0K viewsAlexander, 04:55

Spark in me

Playing with multi-GPU small batch-sizes

If you play with SemSeg with a big model with large images (HD, FullHD) - you may face a situation when only one image fits to one GPU.

Also this is useful if your train-test split is far from ideal and or you are using pre-trained imagenet encoders for a SemSeg task - so you cannot really update your bnorm params.

Also AFAIK - all the major deep-learning frameworks:
(0) do not have batch norm freeze options on evaluation (batch-norm contains 2 sets of parameters - learnable and updated on inference
(1) calculate batch-norm for each GPU separately

It all may mean, that your models may severely underperform in inference for these situations.

Solutions?

(0) Sync batch-norm. I believe to do it properly you will have to modify the framework you are using, but there is a PyTorch implementation done for the CVPR 2018 - also an explanation here http://hangzh.com/PyTorch-Encoding/notes/syncbn.html - I guess if its multi-GPU wrappers for model can be used for any models - then we are in the money)
(1) Use affine=False in your batch-norm. But probably in this case imagenet initialization will not help - you will have to train your model from scratch completely
(2) Freeze your encoder batch-norm params completely
https://discuss.pytorch.org/t/how-to-train-with-frozen-batchnorm/12106/10 (though I am not sure - they do not seem to be freezing the running mean parameters) - probably this also needs m.trainable = False or something like this
(3) Use recent Facebook group norm - https://arxiv.org/pdf/1803.08494.pdf

This is a finicky topic - please tell in comments about your experiences and tests

#deep_learning
#cv

Like this post or have something to say => tell us more in the comments or donate!

PyTorch Forums

How to train with frozen BatchNorm?

Since pytorch does not support syncBN, I hope to freeze mean/var of BN layer while trainning. Mean/Var in pretrained model are used while weight/bias are learnable. In this way, calculation of bottom_grad in BN will be different from that of the novel trainning…

1.0K viewsspark_comment_bot, edited 14:13

1+ Comments Donate

Spark in me

Interesting links about Internet

- Ben Evans' digest - https://goo.gl/t9zG4y
- China plans to track cars - https://goo.gl/jeroFW
- Ben Evans - content is not king anymore - distribution / eco-system are https://goo.gl/ms2tQd
- Google opens AI center in Ghana - https://goo.gl/PRHBjq

- (RU) A funny case on censorship in Russia - funny article deleted from habr - https://sohabr.net/habr/post/414595/
-- It kind of clearly shows that you cannot safely post anything to habr

- India + WhatsApp + lynch mobs - https://goo.gl/tSBUCp
- Tor foundation about web-tracking and Facebook - https://goo.gl/H9DSuL
- Docker image jacking for crypto-mining - https://goo.gl/KrLLuQ
- Ethereum - 75% transactions automated bots - https://goo.gl/Q9BSNL
- (RU) - analyzing fake elections in Russia - 3-10M votes are fake - https://habr.com/post/358790/

#internet

739 viewsAlexander, 12:10

Spark in me

2018 DS/ML digest 12

As usual, this is whatever I found really interesting / worth reading.

Implementations / papers / ideas
(0)
You can count bees well with UNet - http://matpalm.com/blog/counting_bees/
(1)
A really super cool idea - use affine transformations in 3D to stack augmentations on the level of transformation matrices
(3D augs are costly)

- https://gist.github.com/ematvey/5ca7df5d37c2f6a674390d42ef9e7d59
- both for rotation and scaling
- note a couple of things for easier understanding:
-- there is offset in tranformations - because the coordinate center is not in "center"
-- zoom essentially scales unit vectors after applying the offset
- 3Blue1Brown videos about linear algebra - https://www.youtube.com/watch?v=fNk_zzaMoSs
(2)
A top solution from Google's Landmark Challenge - https://goo.gl/pkZULZ
Essentially
- ensemble of features / skip connections from a CNN (ResNeXt)
- KNN
- use KNN + augment the extracted features by averaging with similar images
- query expansion (use the fact that different crops of the same landmark remain the same landmark)
(3)
(RU) A super cool series about interestring clustering algorithms
- Affinity propagation
-- https://habr.com/post/321216/
-- http://www.icmla-conference.org/icmla07/FreyDueckScience07.pdf
- DBSCAN https://habrahabr.ru/post/322034/
- (spoiler - in practice use awesome HDBSCAN library)
(4)
Brief review of image super-resolution techniques
- https://habr.com/post/359016/
- In a nutshell try in this order FCN CNNs, auto-encoders with skip connections or GANs
(5)
SOTA NLP by open-ai
https://blog.openai.com/language-unsupervised/
Key ideas
- Train a transformer language models on large corpus in an unsupervised way
- Fine-tune on a smaller task
- Profit
Caveats
- "Our approach requires an expensive pre-training step - 1 month on 8 GPUs" (probably this should be discounted somewhat)
- TF and unreadable enterprise code
(6)
One more claimed SOTA word embedding set
https://allennlp.org/elmo
(7)
A cool github page by Sebastian Ruder to track major NLP tasks
https://github.com/sebastianruder/NLP-progress

Visualizations
(0)
Amazing visual explanations of how decision trees work
- http://www.r2d3.us/visual-intro-to-machine-learning-part-2/
- it explains visually how overfitting occurs in decisions tree models
(1)
CIFAR T-SNE can be done in real-time on the GPU + tensorflow.js integration
- Blog https://goo.gl/Pk5Lq3
- Website https://goo.gl/1vpeFf
- Arxiv - http://arxiv.org/abs/1802.03680
- Demo - https://nicola17.github.io/tfjs-tsne-demo/
(2) Why people fail to use d3.js - https://goo.gl/hSt5dL

Datasets
(0) Nice idea - use available tools and videos to collect datasets
- https://goo.gl/HULsyH
- https://goo.gl/7AfRZZ

#digest

919 viewsAlexander, 12:12

Spark in me

https://youtu.be/Nq2xvsVojVo

YouTube

Better Video Impersonations with AI | Two Minute Papers #258

The paper "Deep Video Portraits" is available here: http://gvv.mpi-inf.mpg.de/projects/DeepVideoPortraits/ Pick up cool perks on our Patreon page: https://ww...

972 viewsAlexander, 16:46

Spark in me

A subscriber sent a really decent CS university scientific ranking

http://csrankings.org/#/index?all&worldpu

Useful, if you want to apply for CS/ML based Ph.D. there

#deep_learning

792 viewsAlexander, 10:53

Spark in me

Transformer in PyTorch

Looks like somebody implement recent Google's transformer fine-tuning in PyTorch

https://github.com/huggingface/pytorch-openai-transformer-lm

Nice!

#nlp
#deep_learning

GitHub

GitHub - huggingface/pytorch-openai-transformer-lm: 🐥A PyTorch implementation of OpenAI's finetuned transformer language model…

🐥A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI - huggingface/pytorch-openai-transformer-lm

967 viewsAlexander, 10:54

Spark in me

Forwarded from Just links

https://blog.openai.com/openai-five/

15 viewsAlexander, 15:57

Spark in me

If someone needs a dataset, Kaggle launched ImageNet object detection
- https://www.kaggle.com/c/imagenet-object-localization-challenge#description

There is an open images dataset, which I guess is bigger though

#deep_learning

Kaggle

ImageNet Object Localization Challenge

Identify the objects in images

1.0K viewsAlexander, edited 07:02

Spark in me

https://www.youtube.com/watch?v=Te0L5_u_wIg

YouTube

Faceforensics: This AI Detects DeepFakes

The paper "FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces " is available here:
http://niessnerlab.org/projects/roessler2018faceforensics.html

Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers…

1.2K viewsAlexander, 19:25

Spark in me