New cool papers on CNNs
(0) Do Better ImageNet Models Transfer Better?
An implicit hypothesis in modern computer vision research is that models that perform better on ImageNet necessarily perform better on other vision tasks.
However, this hypothesis has never been systematically tested.
- Wow, an empirical study of why ResNets rule - they are simply better feature extractors without fine-tuning, and so are probably easier to fine-tune
- ResNets are the best fixed feature extractors
- Also, ImageNet pretraining accelerates convergence
- Also, my own note: Inception-based models are more difficult to fine-tune
- The top-ranking models include Inception, NASNet and AmoebaNet
- Also a personal remark: any CNN architecture can be fine-tuned to be relatively good, you just need to come up with a proper training regime
The abstract says it all:
Here, we compare the performance of 13 classification models on 12 image classification tasks in three settings: as fixed feature extractors, fine-tuned, and trained from random initialization. We find that, when networks are used as fixed feature extractors, ImageNet accuracy is only weakly predictive of accuracy on other tasks (r^2 = 0.24). In this setting, ResNets consistently outperform networks that achieve higher accuracy on ImageNet. When networks are fine-tuned, we observe a substantially stronger correlation (r^2 = 0.86). We achieve state-of-the-art performance on eight image classification tasks simply by fine-tuning state-of-the-art ImageNet architectures, outperforming previous results based on specialized methods for transfer learning.
(1) Shampoo: Preconditioned Stochastic Tensor Optimization
Looks really cool, but their implementation requires SVD and is slow for real tasks
Also, they tested it only on toy tasks
http://arxiv.org/abs/1802.09568
https://github.com/moskomule/shampoo.pytorch
In a real application the PyTorch implementation takes 175.58 s per iteration (i.e. per batch)
#deep_learning
A very useful combination in tmux
You can resize your panes by pressing:
- first ctrl+b
- hold ctrl
- press the arrow keys several times while holding ctrl
...
- profit
#linux
#deep_learning
Digest about Internet
(0) Ben Evans Internet digest - https://goo.gl/uoQCBb
(1) GitHub purchased by Microsoft - https://goo.gl/49X74r
-- If you want to migrate - there are guides already - https://about.gitlab.com/2018/06/03/movingtogitlab/
(2) And a post on how Microsoft kind of ruined Skype - https://goo.gl/Y7MJJL
-- focus on b2b
-- lack of focus, constant redesigns, faltering service
(3) No drop in FB usage after its controversies - https://goo.gl/V93j2v
(4) Facebook allegedly employs 1200 moderators for Germany - https://goo.gl/VBcYQQ
(5) Looks like many Linux networking tools have been outdated for years
https://dougvitale.wordpress.com/2011/12/21/deprecated-linux-networking-commands-and-their-replacements/
#internet
#digest
2018 DS/ML digest 11
Datasets
(0)
New Andrew Ng paper on radiology datasets
YouTube 8M Dataset post
As mentioned before - this is more or less blatant TF marketing
New papers / models / architectures
(0) Google RL search for optimal augmentations
- Blog, paper
- Finally Google paid attention to augmentations
- 83.54% top1 accuracy on ImageNet
- Discrete search problem: each policy consists of 5 sub-policies, and each operation is associated with two hyperparameters: probability and magnitude
- Training regime: cosine decay for 200 epochs
- Top accuracy on ImageNet
- Best policy
- Typical examples of augmentations
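To make the structure of such a policy concrete, here is a minimal sketch in plain Python. The two operations, their names and all the numbers are hypothetical toys acting on a list of pixel intensities, not the operations or the learned policy from the paper. The only things taken from the description above are the shape (5 sub-policies, each operation carrying a probability and a magnitude) and, as far as I recall from the paper, two operations per sub-policy with one sub-policy picked at random per image.

```python
import random

# Hypothetical toy operations on a list of pixel intensities in [0, 255].
def brighten(pixels, magnitude):
    return [min(255, p + 10 * magnitude) for p in pixels]

def invert(pixels, magnitude):
    # Magnitude is ignored: some operations have no meaningful strength.
    return [255 - p for p in pixels]

OPS = {"brighten": brighten, "invert": invert}

# A policy is a list of sub-policies; a sub-policy is a short sequence of
# (operation name, probability, magnitude) triples. All values are made up.
POLICY = [
    [("brighten", 0.8, 3), ("invert", 0.2, 0)],
    [("invert", 0.5, 0), ("brighten", 0.6, 7)],
    [("brighten", 1.0, 1), ("brighten", 0.3, 9)],
    [("invert", 0.1, 0), ("invert", 0.9, 0)],
    [("brighten", 0.4, 5), ("invert", 0.7, 0)],
]

def apply_policy(pixels, policy, rng):
    """Pick one sub-policy at random, then apply each of its operations
    with that operation's own probability."""
    sub_policy = rng.choice(policy)
    for name, probability, magnitude in sub_policy:
        if rng.random() < probability:
            pixels = OPS[name](pixels, magnitude)
    return pixels

rng = random.Random(0)
augmented = apply_policy([0, 100, 200], POLICY, rng)
```

The search itself (the RL controller that proposes such policies) is not shown here; this only illustrates what is being searched over.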
(1)
Training CNNs with less data
Key idea - with clever selection of data you can decrease annotation costs 2-3x
(2)
Regularized Evolution for Image Classifier Architecture Search (AmoebaNet)
- The first controlled comparison of the two search algorithms (genetic and RL)
- Mobile-size ImageNet (top-1 accuracy = 75.1% with 5.1M parameters)
- ImageNet (top-1 accuracy = 83.1%)
Evolution vs. RL at Large-Compute Scale
• Evolution and RL do equally well on accuracy
• Both are significantly better than Random Search
• Evolution is faster
But the proper description of the architecture is nowhere to be seen...
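For reference, the evolutionary search loop itself is simple. Below is a rough sketch of regularized ("aging") evolution as I understand it: sample a few individuals, mutate the best of the sample, and remove the oldest member of the population rather than the worst. The "architecture" here is a hypothetical toy (a list of 8 bits scored by their sum), and all function names and default values are my own assumptions, not the paper's setup.

```python
import collections
import random

def regularized_evolution(fitness, random_arch, mutate, cycles=200,
                          population_size=20, sample_size=5, seed=0):
    rng = random.Random(seed)
    population = collections.deque()   # ordered from oldest to youngest
    history = []                       # every individual ever evaluated

    # Fill the initial population with random architectures.
    while len(population) < population_size:
        arch = random_arch(rng)
        individual = (arch, fitness(arch))
        population.append(individual)
        history.append(individual)

    while len(history) < cycles:
        # Tournament selection: the best of a small random sample is the parent.
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda ind: ind[1])
        child_arch = mutate(parent[0], rng)
        child = (child_arch, fitness(child_arch))
        population.append(child)
        history.append(child)
        population.popleft()           # aging: drop the oldest, not the worst

    return max(history, key=lambda ind: ind[1])

# Hypothetical toy search space: 8 binary choices, more ones is better.
def random_arch(rng):
    return [rng.randint(0, 1) for _ in range(8)]

def mutate(arch, rng):
    child = list(arch)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]            # flip one randomly chosen bit
    return child

best_arch, best_fitness = regularized_evolution(sum, random_arch, mutate)
```

In the real setting `fitness` would mean training a network and measuring validation accuracy, which is where all the compute goes.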
Libraries / code / frameworks
(0) OpenCV installation for Ubuntu18 from source (if you need e.g. video support)
News / market
(0) Idea: adversarial filters for apps - https://goo.gl/L4Vne7
(1) A list of 30 best practices for amateur ML / DL specialists - http://forums.fast.ai/t/30-best-practices/12344
- Some ideas about tackling naive NLP problems
- PyTorch allegedly supports just freezing bn layers
- Also a neat idea I tried with inception nets - assign different learning rates to larger models when fine-tuning them
(2) Stumbled upon a reference to NAdam as an optimizer that is a bit better than Adam
It is also described in this popular article
(3) Barcode reader via OpenCV
#deep_learning
#digest
Like this post or have something to say => tell us more in the comments or donate!
https://github.com/keras-team/keras/releases/tag/2.2.0
The new Keras release adds model subclassing - does it now follow PyTorch's OOP-style model definitions?)
An amazing article about differential evolution
https://pablormier.github.io/2017/09/05/a-tutorial-on-differential-evolution-with-python/
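If you just want the gist of the algorithm, here is a minimal sketch of the classic DE/rand/1/bin variant in plain Python. The function name, default hyperparameters and the sphere test function are my own assumptions for illustration and may differ from what the linked tutorial uses.

```python
import random

def differential_evolution(objective, bounds, population_size=20,
                           mutation=0.8, crossover=0.7, iterations=200, seed=0):
    rng = random.Random(seed)
    dims = len(bounds)
    # Start from uniformly random points inside the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(population_size)]
    scores = [objective(ind) for ind in population]

    for _ in range(iterations):
        for i in range(population_size):
            # Pick three distinct individuals other than the current one.
            others = [j for j in range(population_size) if j != i]
            a, b, c = (population[j] for j in rng.sample(others, 3))
            forced = rng.randrange(dims)   # at least one mutated coordinate
            trial = []
            for d in range(dims):
                if d == forced or rng.random() < crossover:
                    lo, hi = bounds[d]
                    value = a[d] + mutation * (b[d] - c[d])
                    trial.append(min(hi, max(lo, value)))  # clip to bounds
                else:
                    trial.append(population[i][d])
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score

    best = min(range(population_size), key=scores.__getitem__)
    return population[best], scores[best]

def sphere(x):
    return sum(v * v for v in x)

best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

Note that the objective is only ever called, never differentiated, which is the whole appeal of the method.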
#data_science
An interesting idea from a CV conference
Imagine that you have some kind of algorithm that is not exactly differentiable, but is "back-propable".
In this case you can have very convoluted logic in your "forward" statement (essentially something in between trees and dynamic programming) - for example a set of clever if-statements.
This way you get the best of both worlds - both your algorithm (which you will have to re-implement in your framework) and backprop + CNN. Nice.
Of course, this works only for dynamic deep-learning frameworks.
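To illustrate what "back-propable" means here, below is a toy sketch: a tiny scalar reverse-mode autograd in plain Python, standing in for a dynamic framework. The `Value` class and the `forward` function are hypothetical and written just for this example. The point is that the graph is recorded while the code runs, so an ordinary if-statement in `forward` simply decides which operations end up on the tape.

```python
class Value:
    """A scalar that remembers how it was computed (a tiny dynamic graph)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._local_grads = local_grads

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Order the recorded graph topologically, then apply the chain rule
        # from the output back to the inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


def forward(x, w):
    h = x * w
    if h.data > 0:      # data-dependent branch, decided at run time
        return h * h
    return h + x


x, w = Value(3.0), Value(2.0)
out = forward(x, w)     # takes the h * h branch
out.backward()
```

Here `out.data` is 36.0, `x.grad` is 24.0 and `w.grad` is 36.0; with a negative product the other branch is recorded and differentiated instead. The gradient is only valid for the branch actually taken, which is what makes such logic "back-propable" rather than differentiable in the strict sense.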
#deep_learning
#data_science
Machines Can See 2018 adversarial competition
Happened to join forces with a team that won 2nd place in this competition
- https://spark-in.me/post/playing-with-mcs2018-adversarial-attacks
It was very entertaining and a new domain to me.
Read more materials:
- Our repo https://github.com/snakers4/msc-2018-final
- Our presentation https://drive.google.com/file/d/1P-4AdCqw81nOK79vU_m7IsCVzogdeSNq/view
- All presentations https://drive.google.com/file/d/1aIUSVFBHYabBRdolBRR-1RKhTMg-v-3f/view
#data_science
#deep_learning
#adversarial
And now the habr.ru article is also live -
https://habr.com/post/413775/
Please support us with your likes!
#deep_learning
#data_science
Interesting links about Internet
- Ben Evans' digest - https://goo.gl/7NkYn6
- Why it took so much time to create previews for Wikipedia - https://goo.gl/xg7N99
- Google postulating its AI principles? https://blog.google/topics/ai/ai-principles/
- Google product alternatives - https://goo.gl/RmA76N - I personally started to switch to more open-source stuff lately, but Docs and Android have no real alternatives
- The future of ML in embedded devices - https://goo.gl/PjWpKj (sound ideas, but the post is by an evangelist)
- Yahoo messenger shutting down (20 years!) - https://goo.gl/uhomds - hi ICQ
- Microsoft Buys GitHub for $7.5 Billion - a16z write-up - https://goo.gl/3znstT
- NYC medallions dropped 5x in price - https://goo.gl/Vi7pG6
- JD covers villages in China with drone delivery already - https://goo.gl/bMGKSY
#digest
The age of open-source
Recently I started using more and more open-source / CLI tools for mundane everyday tasks.
Sometimes they have higher barriers to entry (for example, compare Google Slides vs markdown + LaTeX), but they are usually simpler, yet more powerful.
Recently I was just appalled by µTorrent bugs and ads - and I just found out that there is even a beta of Transmission for Windows (the alternative being just using the Transmission daemon on Linux).
The question is - do you know any highly useful open-source / CLI / free tools to replace standard entrenched software, which is getting a bit annoying?
Playing with renewing SSL certificates + Cloudflare
I am using certbot, which makes SSL certificate installation for any web-server literally a one-liner (a couple of guides - https://goo.gl/nP2tij / https://goo.gl/X6rVxs).
It also has an amazing command, certbot renew, for renewing your certificates.
Unsurprisingly, it does not work when you have Cloudflare enabled. The solution in my case was as easy as:
- falling back to registrar's name-servers (luckily, my registrar stores its old DNS zone settings)
- certbot renew
- reverting back to cloudflare's DNS servers
- also, in this case when using VPN I did not have to wait for DNS records to propagate - it was instant
#linux
Playing with multi-GPU small batch-sizes
If you do SemSeg (semantic segmentation) with a big model and large images (HD, FullHD), you may face a situation where only one image fits on one GPU.
This is also relevant if your train-test split is far from ideal and / or you are using pre-trained ImageNet encoders for a SemSeg task - so you cannot really update your batch-norm params.
Also AFAIK - all the major deep-learning frameworks:
(0) do not have batch norm freeze options on evaluation (batch-norm contains 2 sets of parameters - learnable ones and running statistics, which are updated on every forward pass in training mode)
(1) calculate batch-norm for each GPU separately
All of this may mean that your models severely underperform at inference in these situations.
Solutions?
(0) Sync batch-norm. I believe to do it properly you will have to modify the framework you are using, but there is a PyTorch implementation done for CVPR 2018 - also an explanation here http://hangzh.com/PyTorch-Encoding/notes/syncbn.html - I guess if its multi-GPU model wrappers can be used with any model, then we are in the money)
(1) Use affine=False in your batch-norm. But probably in this case ImageNet initialization will not help - you will have to train your model from scratch completely
(2) Freeze your encoder batch-norm params completely
https://discuss.pytorch.org/t/how-to-train-with-frozen-batchnorm/12106/10 (though I am not sure - they do not seem to be freezing the running mean parameters) - probably this also needs m.trainable = False or something like this
(3) Use recent Facebook group norm - https://arxiv.org/pdf/1803.08494.pdf
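To show why the two sets of batch-norm parameters matter, here is a toy sketch in plain Python. `ToyBatchNorm` is a hypothetical, simplified stand-in (one channel, a batch of scalars, biased variance), not any framework's actual layer. It shows that the running statistics move on every training-mode forward pass even if the learnable scale and shift are never touched, and that with a single sample the batch statistics are degenerate. Real convolutional batch-norm also averages over spatial positions, so one image per GPU is noisy rather than strictly degenerate.

```python
class ToyBatchNorm:
    """Batch norm over a batch of scalars (one channel), for illustration."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.gamma, self.beta = 1.0, 0.0                 # learnable parameters
        self.running_mean, self.running_var = 0.0, 1.0   # running statistics
        self.momentum, self.eps = momentum, eps
        self.training = True

    def forward(self, batch):
        if self.training:
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # The running statistics are updated here, on the forward pass,
            # regardless of what the optimizer does with gamma and beta.
            self.running_mean += self.momentum * (mean - self.running_mean)
            self.running_var += self.momentum * (var - self.running_var)
        else:
            # "Frozen" behaviour: normalize with the stored statistics.
            mean, var = self.running_mean, self.running_var
        return [self.gamma * (x - mean) / (var + self.eps) ** 0.5 + self.beta
                for x in batch]


bn = ToyBatchNorm()
out_train = bn.forward([5.0])   # batch of one: the mean is the sample itself
bn.training = False
stats_before = (bn.running_mean, bn.running_var)
out_eval = bn.forward([5.0])    # uses running statistics, leaves them alone
stats_after = (bn.running_mean, bn.running_var)
```

In training mode the single sample is normalized to exactly 0.0 whatever its value, while the running mean has already drifted from 0.0 to 0.5. In PyTorch terms, freezing both sets usually means putting the batch-norm modules into eval mode (and calling that again after every model.train()) as well as setting requires_grad = False on their weight and bias.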
This is a finicky topic - please tell in comments about your experiences and tests
#deep_learning
#cv
Interesting links about Internet
- Ben Evans' digest - https://goo.gl/t9zG4y
- China plans to track cars - https://goo.gl/jeroFW
- Ben Evans - content is not king anymore, distribution / ecosystem are - https://goo.gl/ms2tQd
- Google opens AI center in Ghana - https://goo.gl/PRHBjq
- (RU) A funny case on censorship in Russia - funny article deleted from habr - https://sohabr.net/habr/post/414595/
-- It kind of clearly shows that you cannot safely post anything to habr
- India + WhatsApp + lynch mobs - https://goo.gl/tSBUCp
- Tor foundation about web-tracking and Facebook - https://goo.gl/H9DSuL
- Docker image jacking for crypto-mining - https://goo.gl/KrLLuQ
- Ethereum - 75% of transactions are made by automated bots - https://goo.gl/Q9BSNL
- (RU) - analyzing fake elections in Russia - 3-10M votes are fake - https://habr.com/post/358790/
#internet
2018 DS/ML digest 12
As usual, this is whatever I found really interesting / worth reading.
Implementations / papers / ideas
(0)
You can count bees well with UNet - http://matpalm.com/blog/counting_bees/
(1)
A really super cool idea - use affine transformations in 3D to stack augmentations on the level of transformation matrices
(3D augs are costly)
- https://gist.github.com/ematvey/5ca7df5d37c2f6a674390d42ef9e7d59
- both for rotation and scaling
- note a couple of things for easier understanding:
-- there is an offset in the transformations, because the coordinate origin is not at the "center" of the volume
-- zoom essentially scales unit vectors after applying the offset
- 3Blue1Brown videos about linear algebra - https://www.youtube.com/watch?v=fNk_zzaMoSs
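The idea of stacking augmentations at the matrix level can be sketched in plain Python with 4x4 homogeneous matrices. The helper names below are my own and are not taken from the linked gist. Rotation and zoom are wrapped in a shift to the volume center and back, and the whole chain collapses into a single matrix, so the costly resampling of the 3D volume needs to happen only once.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def zoom(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def apply(matrix, point):
    x, y, z = point
    vec = [x, y, z, 1.0]
    return tuple(sum(matrix[i][k] * vec[k] for k in range(4))
                 for i in range(3))

def augmentation_matrix(center, angle, scale):
    cx, cy, cz = center
    m = translation(-cx, -cy, -cz)           # move the center to the origin
    m = matmul(rotation_z(angle), m)         # rotate around the origin
    m = matmul(zoom(scale), m)               # scale the unit vectors
    m = matmul(translation(cx, cy, cz), m)   # move the center back
    return m

center = (32.0, 32.0, 16.0)
m = augmentation_matrix(center, math.pi / 2, 2.0)
moved = apply(m, (33.0, 32.0, 16.0))
```

The center maps to itself, and a point one voxel to the right of it ends up two voxels "above" it, at roughly (32, 34, 16). Keep in mind that resampling routines often expect the inverse mapping (from output coordinates to input coordinates), so in practice you may need to invert the combined matrix.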
(2)
A top solution from Google's Landmark Challenge - https://goo.gl/pkZULZ
Essentially
- ensemble of features / skip connections from a CNN (ResNeXt)
- KNN
- use KNN + augment the extracted features by averaging with similar images
- query expansion (use the fact that different crops of the same landmark remain the same landmark)
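As a rough illustration of the KNN + feature averaging part, here is a sketch of average query expansion in plain Python. This is one common variant and the linked solution may well differ in the details; the 2D "features" are made-up toy vectors and all function names are hypothetical.

```python
import math

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    # Inputs are assumed to be unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def knn(query, database, k):
    ranked = sorted(range(len(database)),
                    key=lambda i: cosine(query, database[i]), reverse=True)
    return ranked[:k]

def expand_query(query, database, k):
    """Average the query with its k nearest neighbours and re-normalize."""
    neighbours = knn(query, database, k)
    mean = [(query[d] + sum(database[i][d] for i in neighbours)) / (k + 1)
            for d in range(len(query))]
    return normalize(mean)

# Toy "CNN features" for four database images.
database = [normalize(v) for v in
            [[1.0, 0.1], [0.9, 0.3], [0.8, 0.6], [0.0, 1.0]]]
query = normalize([1.0, 0.0])

first_pass = knn(query, database, k=2)
expanded = expand_query(query, database, k=2)
second_pass = knn(expanded, database, k=3)
```

The expanded query is pulled towards its confident neighbours, so the second retrieval pass can pick up images that were a bit too far from the original query.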
(3)
(RU) A super cool series about interesting clustering algorithms
- Affinity propagation
-- https://habr.com/post/321216/
-- http://www.icmla-conference.org/icmla07/FreyDueckScience07.pdf
- DBSCAN https://habrahabr.ru/post/322034/
- (spoiler - in practice use the awesome HDBSCAN library)
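For intuition only, here is a naive sketch of plain DBSCAN in Python (quadratic neighbour search, toy 2D points). The function name and the parameter names eps and min_samples are my own choices, and this is not how you would run it on real data, where the library mentioned above is the better option.

```python
import math

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""

    def neighbours(i):
        # Brute-force search; the point itself counts as its own neighbour.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1              # noise for now, may become a border point
            continue
        cluster += 1                    # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is also a core point: keep expanding
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
```

On this toy input the two tight groups become clusters 0 and 1, and the lone point at (50, 50) is labelled as noise.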
(4)
Brief review of image super-resolution techniques
- https://habr.com/post/359016/
- In a nutshell, try in this order: FCN CNNs, auto-encoders with skip connections, or GANs
(5)
SOTA NLP by OpenAI
https://blog.openai.com/language-unsupervised/
Key ideas
- Train a transformer language model on a large corpus in an unsupervised way
- Fine-tune on a smaller task
- Profit
Caveats
- "Our approach requires an expensive pre-training step - 1 month on 8 GPUs" (probably this should be discounted somewhat)
- TF and unreadable enterprise code
(6)
One more claimed SOTA word embedding set
https://allennlp.org/elmo
(7)
A cool github page by Sebastian Ruder to track major NLP tasks
https://github.com/sebastianruder/NLP-progress
Visualizations
(0)
Amazing visual explanations of how decision trees work
- http://www.r2d3.us/visual-intro-to-machine-learning-part-2/
- it explains visually how overfitting occurs in decision tree models
(1)
CIFAR T-SNE can be done in real-time on the GPU + tensorflow.js integration
- Blog https://goo.gl/Pk5Lq3
- Website https://goo.gl/1vpeFf
- Arxiv - http://arxiv.org/abs/1802.03680
- Demo - https://nicola17.github.io/tfjs-tsne-demo/
(2) Why people fail to use d3.js - https://goo.gl/hSt5dL
Datasets
(0) Nice idea - use available tools and videos to collect datasets
- https://goo.gl/HULsyH
- https://goo.gl/7AfRZZ
#digest