Using groupby in pandas in multi-thread fashion
Sometimes you just need to use all of your CPUs to process some nasty thing in pandas quick and dirty (because you are too lazy to do it properly).
Pandas' "GroupBy: Split, Apply, Combine" seems to have been built exactly for that, but there is also a lazy workaround.
Solution I googled:
- https://gist.github.com/tejaslodaya/562a8f71dc62264a04572770375f4bba
My lazy way using tqdm + Pool:
- https://gist.github.com/snakers4/b246de548669543dc3b5dbb49d4c2f0c
(Savva, if you read this, I know that your version is better, you can also send it to me to share xD)
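A minimal sketch of the lazy approach, assuming you just want to apply some heavy per-group function in parallel (process_group and the toy 'key' / 'value' columns are illustrative, not taken from the gists):

import pandas as pd
from multiprocessing import Pool, cpu_count
from tqdm import tqdm

def process_group(group):
    # any expensive per-group logic goes here
    return group.assign(group_sum=group['value'].sum())

def parallel_groupby_apply(df, key, func, workers=None):
    groups = [g for _, g in df.groupby(key)]
    with Pool(workers or cpu_count()) as pool:
        # tqdm shows progress over the groups as workers finish them
        results = list(tqdm(pool.imap(func, groups), total=len(groups)))
    return pd.concat(results)

if __name__ == '__main__':
    df = pd.DataFrame({'key': ['a', 'a', 'b', 'b'], 'value': [1, 2, 3, 4]})
    print(parallel_groupby_apply(df, 'key', process_group))

Note that this spawns processes rather than threads, which is what you usually want with pandas anyway because of the GIL.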
#ds
New competitions on Kaggle
Kaggle has started a new competition with video... which is one of those competitions (read between the lines: blatant marketing).
https://www.kaggle.com/c/youtube8m-2018
I.e.:
- TensorFlow Record files
- Each of the top 5 ranked teams will receive $5,000 per team as a travel award - no real prizes
- The complete frame-level features take about 1.53TB of space (and yes, these are not videos, but extracted CNN features)
So, they are indeed using their platform to promote their business interests.
Released free datasets are really cool, but only when you can use them for transfer learning, which implies also seeing the underlying ground-level data (i.e. images or videos).
#data_science
#deep_learning
A couple of neat tricks in PyTorch to make code more compact and more useful for hyper-param tuning
You may have seen that today one can use CNNs even for tabular data.
In this case you may have to resort to a lot of fiddling with model capacity and hyper-params.
It is kind of easy to do so in Keras, but doing this in PyTorch requires a bit more fiddling.
Here are a couple of patterns that may help with this:
(0) Clever use of nn.Sequential()
# ConvLayer here is your own block (e.g. conv + activation + dropout); stack `blocks` copies of it
self.layers = nn.Sequential(*[
    ConvLayer(in_channels=channels,
              out_channels=channels,
              kernel_size=kernel_size,
              activation=activation,
              dropout=dropout)
    for _ in range(blocks)
])
(1) Clever use of lists (which is essentially the same as above)
Just this construction may save a lot of space and give a lot of flexibility
modules = []
modules.append(...)
self.classifier = nn.Sequential(*modules)
(2) Pushing as many hyper-params as possible into flags for console scripts
You can even encode something like 1024_512_256 to be passed as a list to your model constructor, i.e.
1024_512_256 => [1024, 512, 256] => an MLP with the corresponding number of neurons per layer
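For example, a minimal sketch of this pattern (the --mlp_sizes flag name and the plain ReLU MLP are illustrative, not from my baseline):

import argparse
import torch.nn as nn

parser = argparse.ArgumentParser()
parser.add_argument('--mlp_sizes', type=str, default='1024_512_256')
args = parser.parse_args()

# '1024_512_256' => [1024, 512, 256]
sizes = [int(s) for s in args.mlp_sizes.split('_')]

modules = []
for in_features, out_features in zip(sizes[:-1], sizes[1:]):
    modules.append(nn.Linear(in_features, out_features))
    modules.append(nn.ReLU())
mlp = nn.Sequential(*modules)

This way model capacity becomes just another command-line hyper-parameter to sweep over.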
(3) (Obvious) Using OOP where it makes sense
Example I recently used for one baseline
#deep_learning
Like this post or have something to say => tell us more in the comments or donate!
Dealing with class imbalance with CNNs
For small datasets / problems oversampling works best; for large datasets it's unclear:
- http://arxiv.org/abs/1710.05381
Interestingly enough, they did not test oversampling + augmentations.
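A minimal sketch of plain oversampling in PyTorch via WeightedRandomSampler (toy data, just to show the mechanics; not from the paper):

import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])        # heavily imbalanced toy labels
class_counts = torch.bincount(labels).float()
sample_weights = 1.0 / class_counts[labels]             # rarer class => larger sampling weight
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)

dataset = TensorDataset(torch.randn(len(labels), 3), labels)
loader = DataLoader(dataset, batch_size=4, sampler=sampler)  # minority class is now oversampled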
Transforms in PyTorch
They added a lot of useful stuff lately:
- https://pytorch.org/docs/master/torchvision/transforms.html
Basically this enables you to build decent pre-processing out of the box for simple tasks (just images).
I believe it will be much slower than OpenCV, but for small tasks it's ideal, if you do not look under the hood.
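For example, a typical out-of-the-box pipeline (ImageNet-style normalization values; pick your own augmentations):

from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])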
#deep_learning
MobileNetv2
New light-weight architecture from Google with 72%+ top1 on ImageNet
(0) Links
- Performance: https://goo.gl/2czk9t
- Paper: http://arxiv.org/abs/1801.04381
- Pre-trained implementation: https://github.com/tonylins/pytorch-mobilenet-v2
-- but this one took much more memory than I expected
-- did not debug it
(1) Gist - new light-weight architecture from Google with 72%+ top1 on ImageNet
Of course, Google promotes only its own papers there.
No mention of SqueezeNet.
This is somewhat disturbing.
(2) Novel ideas
- the shortcut connections are between the thin bottleneck layers
- the intermediate expansion layer uses lightweight depthwise convolutions
- it is important to remove non-linearities in the narrow layers in order to maintain representational power
(3) Very novel idea - it is argued that non-linearities collapse some information.
When the dimensionality of useful information is low, you can do without them without loss of accuracy.
(4) Building blocks (a rough sketch of the block follows below)
- Recent small networks' key features (except for SqueezeNet's) - https://goo.gl/mQtrFM
- MobileNet building block explanation - https://goo.gl/eVnWQL https://goo.gl/Gj8eQ5
- Overall architecture - https://goo.gl/RRhxdp
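A minimal sketch of the inverted residual block as I read the paper (1x1 expansion -> depthwise 3x3 -> linear 1x1 bottleneck, shortcut between the thin ends); this is my own illustration, not the authors' code:

import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1, expansion=6):
        super().__init__()
        hidden = in_ch * expansion
        self.use_shortcut = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),                              # 1x1 expansion
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),   # depthwise 3x3
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, out_ch, 1, bias=False),                             # linear bottleneck, no ReLU
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_shortcut else out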
#deep_learning
Forwarded from Just links
https://github.com/Randl/MobileNetV2-pytorch
My implementation of MobileNetV2 - currently the top low-computation model - on PyTorch 0.4. RMSProp didn't work (I have a feeling there are issues with it in PyTorch), so training is done with SGD (scheme similar to ShuffleNet's - reducing lr after 200 and 300 epochs). The results are a bit better than claimed in the paper and achieved by other repos - 72.1% top1. Supports any scaling factor / input size (divisible by 32) as described in the paper.
Some insights about why the recent TF speech recognition challenge dataset was so poor in quality:
- https://petewarden.com/2018/05/28/why-you-need-to-improve-your-training-data-and-how-to-do-it/
Cool ideas there, e.g. using the last CNN layer as an embedding in the TensorBoard projector, plus how to do it.
#deep_learning
New cool papers on CNNs
(0) Do Better ImageNet Models Transfer Better?
An implicit hypothesis in modern computer vision research is that models that perform better on ImageNet necessarily perform better on other vision tasks.
However, this hypothesis has never been systematically tested.
- Wow, an empirical study of why ResNets rule - they are just better non-fine-tuned feature extractors and are then probably easier to fine-tune
- ResNets are the best fixed feature extractors
- Also, ImageNet pretraining accelerates convergence
- Also, my note is that Inception-based models are more difficult to fine-tune
- Among the top-ranking models are Inception, NASNet and AmoebaNet
- Also, my personal remark - any CNN architecture can be fine-tuned to be relatively good, you just need to invent a proper training regime
Just the abstract says it all:
Here, we compare the performance of 13 classification models on 12 image classification tasks in three settings: as fixed feature extractors, fine-tuned, and trained from random initialization. We find that, when networks are used as fixed feature extractors, ImageNet accuracy is only weakly predictive of accuracy on other tasks (r2 = 0.24). In this setting, ResNets consistently outperform networks that achieve higher accuracy on ImageNet. When networks are fine-tuned, we observe a substantially stronger correlation (r2 = 0.86). We achieve state-of-the-art performance on eight image classification tasks simply by fine-tuning state-of-the-art ImageNet architectures, outperforming previous results based on specialized methods for transfer learning.
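A minimal sketch of the two settings the paper compares - fixed feature extractor vs. fine-tuning - assuming a torchvision ResNet and a made-up number of classes:

import torch.nn as nn
from torchvision import models

num_classes = 10  # illustrative

# Setting 1: fixed feature extractor - freeze the backbone, train only a new head
extractor = models.resnet50(pretrained=True)
for p in extractor.parameters():
    p.requires_grad = False
extractor.fc = nn.Linear(extractor.fc.in_features, num_classes)  # new head stays trainable

# Setting 2: fine-tuning - keep everything trainable (usually with a smaller lr for the backbone)
finetuned = models.resnet50(pretrained=True)
finetuned.fc = nn.Linear(finetuned.fc.in_features, num_classes)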
(1) Shampoo: Preconditioned Stochastic Tensor Optimization
Looks really cool, but their implementation requires SVD and is slow for real tasks.
Also, they tested it only on toy tasks.
http://arxiv.org/abs/1802.09568
https://github.com/moskomule/shampoo.pytorch
In a real application the PyTorch implementation takes 175.58s per batch.
#deep_learning
A very useful combination in tmux
You can resize your panes by pressing:
- first ctrl+b
- then hold ctrl
- press the arrow keys several times while holding ctrl
- ...
- profit
#linux
#deep_learning
Digest about Internet
(0) Ben Evans' Internet digest - https://goo.gl/uoQCBb
(1) GitHub purchased by Microsoft - https://goo.gl/49X74r
-- If you want to migrate, there are guides already - https://about.gitlab.com/2018/06/03/movingtogitlab/
(2) And a post on how Microsoft kind of ruined Skype - https://goo.gl/Y7MJJL
-- focus on b2b
-- lack of focus, constant redesigns, faltering service
(3) No drop in FB usage after its controversies - https://goo.gl/V93j2v
(4) Facebook allegedly employs 1,200 moderators for Germany - https://goo.gl/VBcYQQ
(5) Looks like many Linux networking tools have been outdated for years
https://dougvitale.wordpress.com/2011/12/21/deprecated-linux-networking-commands-and-their-replacements/
#internet
#digest
2018 DS/ML digest 11
Datasets
(0) New Andrew Ng paper on radiology datasets
(1) YouTube 8M Dataset post - as mentioned before, this is more or less blatant TF marketing
New papers / models / architectures
(0) Google RL search for optimal augmentations
- Blog, paper
- Finally Google paid attention to augmentations
- 83.54% top1 accuracy on ImageNet
- Discrete search problem: each policy consists of 5 sub-policies, with each operation associated with two hyper-parameters: probability and magnitude
- Training regime: cosine decay for 200 epochs
- Top accuracy on ImageNet
- Best policy
- Typical examples of augmentations
(1) Training CNNs with less data
Key idea - with clever selection of data you can decrease annotation costs 2-3x
(2) Regularized Evolution for Image Classifier Architecture Search (AmoebaNet)
- The first controlled comparison of the two search algorithms (genetic and RL)
- Mobile-size ImageNet (top-1 accuracy = 75.1% with 5.1M parameters)
- ImageNet (top-1 accuracy = 83.1%)
Evolution vs. RL at Large-Compute Scale
• Evolution and RL do equally well on accuracy
• Both are significantly better than Random Search
• Evolution is faster
But the proper description of the architecture is nowhere to be seen...
Libraries / code / frameworks
(0) OpenCV installation for Ubuntu18 from source (if you need e.g. video support)
News / market
(0) Idea: adversarial filters for apps - https://goo.gl/L4Vne7
(1) A list of 30 best practices for amateur ML / DL specialists - http://forums.fast.ai/t/30-best-practices/12344
- Some ideas about tackling naive NLP problems
- PyTorch allegedly supports just freezing BN layers
- Also a neat idea I tried with Inception nets - assign different learning rates when fine-tuning larger models (see the sketch after this list)
(2) Stumbled upon a reference to NAdam as an optimizer being a bit better than Adam
It is also described in this popular article
(3) Barcode reader via OpenCV
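One way to read the different-learning-rates trick above is discriminative learning rates per parameter group; a minimal PyTorch sketch under that assumption (the model and lr values are illustrative):

import torch
from torchvision import models

model = models.resnet34(pretrained=True)
head_params = list(model.fc.parameters())
head_ids = {id(p) for p in head_params}
backbone_params = [p for p in model.parameters() if id(p) not in head_ids]

optimizer = torch.optim.SGD([
    {'params': backbone_params, 'lr': 1e-4},  # pretrained layers: small lr
    {'params': head_params, 'lr': 1e-2},      # freshly initialized head: larger lr
], momentum=0.9)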
#deep_learning
#digest
Like this post or have something to say => tell us more in the comments or donate!
https://github.com/keras-team/keras/releases/tag/2.2.0
New Keras follows PyTorch's OOP-style model definitions?)
Areas of improvement:
- New model definition API: Model subclassing.
- New input mode: ability to call models on TensorFlow tensors directly (TensorFlow backend only).
- Improved feature coverage of Keras...
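A tiny sketch of the new subclassing API (my own toy example, not from the release notes) - it does look a lot like PyTorch's nn.Module style:

import keras

class TinyMLP(keras.Model):
    def __init__(self):
        super().__init__()
        self.hidden = keras.layers.Dense(64, activation='relu')
        self.out = keras.layers.Dense(10, activation='softmax')

    def call(self, inputs):
        return self.out(self.hidden(inputs))

model = TinyMLP()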
An amazing article about differential evolution
https://pablormier.github.io/2017/09/05/a-tutorial-on-differential-evolution-with-python/
#data_science
An interesting idea from a CV conference
Imagine that you have some kind of algorithm that is not exactly differentiable, but is "back-propable".
In this case you can have very convoluted logic in your "forward" statement (essentially something in between trees and dynamic programming) - for example, a set of clever if-statements.
This way you get the best of both worlds - your algorithm (which you will have to re-implement in your framework) and backprop + CNN. Nice.
Of course, this works only for dynamic deep-learning frameworks.
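A toy sketch of the idea in PyTorch (a made-up model just to show that data-dependent if-statements inside forward() still backprop fine):

import torch
import torch.nn as nn

class BranchyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.small = nn.Linear(8, 1)
        self.big = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        # the branching decision itself is not differentiable,
        # but whichever branch is taken is an ordinary differentiable graph
        if x.abs().mean() > 0.5:
            return self.big(x)
        return self.small(x)

model = BranchyModel()
out = model(torch.randn(4, 8))
out.sum().backward()  # gradients flow through the branch that was actually executed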
#deep_learning
#data_science
Machines Can See 2018 adversarial competition
Happened to join forces with a team that won 2nd place in this competition:
- https://spark-in.me/post/playing-with-mcs2018-adversarial-attacks
It was very entertaining and a new domain for me.
Read more materials:
- Our repo https://github.com/snakers4/msc-2018-final
- Our presentation https://drive.google.com/file/d/1P-4AdCqw81nOK79vU_m7IsCVzogdeSNq/view
- All presentations https://drive.google.com/file/d/1aIUSVFBHYabBRdolBRR-1RKhTMg-v-3f/view
#data_science
#deep_learning
#adversarial
Author's articles - http://spark-in.me/author/snakers41
Blog - http://spark-in.me