Spark in me

Transformer in PyTorch

Looks like somebody implemented OpenAI's recent transformer fine-tuning in PyTorch

https://github.com/huggingface/pytorch-openai-transformer-lm

Nice!

#nlp
#deep_learning

GitHub

GitHub - huggingface/pytorch-openai-transformer-lm: 🐥A PyTorch implementation of OpenAI's finetuned transformer language model…

🐥A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI - huggingface/pytorch-openai-transformer-lm

967 views, Alexander, 10:54

Spark in me

Forwarded from Just links

https://blog.openai.com/openai-five/

15 views, Alexander, 15:57

Spark in me

If someone needs a dataset, Kaggle launched ImageNet object detection
- https://www.kaggle.com/c/imagenet-object-localization-challenge#description

There is also the Open Images dataset, which I guess is bigger, though

#deep_learning

Kaggle

ImageNet Object Localization Challenge

Identify the objects in images

1.0K views, Alexander, edited 07:02

Spark in me

https://www.youtube.com/watch?v=Te0L5_u_wIg

YouTube

Faceforensics: This AI Detects DeepFakes

The paper "FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces " is available here:
http://niessnerlab.org/projects/roessler2018faceforensics.html


1.2K views, Alexander, 19:25

Spark in me

2018 DS/ML digest 13

Blog posts / articles:
(0) Google notes on CNN generalization - https://goo.gl/XS4KAw
(1) Google on teaching robots in a virtual environment and then transferring the models to reality - https://goo.gl/aAYCqE
(2) Google's object tracking via image colorization - https://goo.gl/xchvBQ
(3) Interesting articles about VAEs:
- A small intro to VAEs (in Russian) https://habr.com/company/otus/blog/358946/
- A small intuitive intro (super cool and intuitive)
https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf
- KL divergence explained
https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained
- A more formal write-up http://arxiv.org/abs/1606.05908
- Converting a FC layer into a conv layer http://cs231n.github.io/convolutional-networks/#convert
- A post by Fchollet https://blog.keras.io/building-autoencoders-in-keras.html
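
A minimal sketch of the KL divergence mentioned in the list above, using only the standard library. The function names and the toy distributions are made up for illustration; the second function is the usual closed form for the KL term between a diagonal Gaussian and a standard normal (per dimension), which is how it typically appears in a VAE loss.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def gaussian_kl_to_standard_normal(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) for a single dimension."""
    return 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - math.log(sigma ** 2))

# Toy distributions (hypothetical numbers)
p = [0.5, 0.5]
q = [0.9, 0.1]

forward = kl_divergence(p, q)  # KL(p || q)
reverse = kl_divergence(q, p)  # KL(q || p), generally a different number
```

Note that KL is not symmetric, so `forward` and `reverse` differ, and both functions return zero when the two distributions coincide.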

A good in-depth write-up on object detection:
- http://machinethink.net/blog/object-detection/
- finally a decent explanation of YOLO parametrization http://machinethink.net/images/object-detection/grid@2x.png
- best comparison of YOLO and SSD ever - http://machinethink.net/images/object-detection/architectures@2x.png

Papers with interesting abstracts (just good to know such things exist)
- Low-bit CNNs - https://ai.intel.com/nervana/wp-content/uploads/sites/53/2018/06/ELQ_CameraReady_CVPR2018.pdf
- Automated Meta ML - https://arxiv.org/abs/1806.06927
- Idea - use ResNet blocks for boosting - https://arxiv.org/abs/1706.04964
- 2D-discrete-Fourier transform (2D-DFT) to encode rotational invariance in neural networks - https://arxiv.org/abs/1805.12301
- Smallify the CNNs - https://arxiv.org/abs/1806.03723
- A review of BLEU as a metric; the conclusion is that on average it is good for measuring MT performance - https://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00322

"New" ideas in SemSeg:
- UNET + conditional VAE http://arxiv.org/abs/1806.05034
- Dilated convolutions for large satellite images http://arxiv.org/abs/1709.00179 - it looks like this works only if you have high resolution with small objects

#digest
#deep_learning

Google AI Blog

How Can Neural Network Similarity Help Us Understand Training and Generalization?

Posted by Maithra Raghu, Google Brain Team and Ari S. Morcos, DeepMind In order to solve tasks, deep neural networks (DNNs) progressively...

856 views, Alexander, 07:43

Spark in me

Forwarded from Hacker News

Python 3.7 released (Score: 100+ in 2 hours)

Link: https://readhacker.news/s/3MawZ
Comments: https://readhacker.news/c/3MawZ

Python.org

Python Release Python 3.7.0

The official home of the Python Programming Language

18 views, Alexander, 07:59

Spark in me

DL Framework choice - 2018

If you are still new to DL / DS / ML and have not yet chosen your framework, consider reading this before proceeding

- https://deepsense.ai/keras-or-pytorch/

#deep_learning

832 views, Alexander, 11:18

Spark in me

Playing with PyTorch 0.4

It was released some time ago
If you are not aware - this is the best summary
https://pytorch.org/2018/04/22/0_4_0-migration-guide.html

My first-hand experiences
- Multi-GPU support works strangely
- If you just launch your 0.3 code, it will work on 0.4 with warnings - not really a breaking change
- All the new features are really cool, useful and make using PyTorch even more delightful
- I especially liked how they added context managers and cleaned up the device mess
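
A small sketch of what the context managers and the device clean-up look like in 0.4-style code. The model and tensor shapes are made up; the calls themselves (`torch.device`, `.to(device)`, `torch.no_grad()`, `torch.set_grad_enabled()`, `.item()`) are the ones described in the migration guide above.

```python
import torch

# Pick the device once; the rest of the code stays device-agnostic.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)   # hypothetical toy model
batch = torch.randn(8, 4, device=device)   # hypothetical toy batch

# Inference without building the autograd graph (replaces volatile=True from 0.3).
with torch.no_grad():
    preds = model(batch)

# Gradient tracking can also be switched by a flag, e.g. train vs. eval loops.
is_train = False
with torch.set_grad_enabled(is_train):
    eval_out = model(batch)

# .item() extracts a plain Python number from a 0-dim tensor.
mean_pred = preds.mean().item()
```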

#deep_learning

984 views, Alexander, 15:22

Spark in me

Measuring feature importance properly

http://explained.ai/rf-importance/index.html

Once again stumbled upon an amazing article about measuring feature importance for any ML algorithm:
(0) Permutation importance - if your ML algorithm is costly to re-train, you can just shuffle a column and check how much the metric drops
(1) Drop column importance - drop a column, re-train a model, check performance metrics
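
Both approaches can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from the article: the nearest-centroid classifier, the synthetic data (feature 0 is informative, feature 1 is pure noise) and all function names are made up here.

```python
import random

def fit_centroids(X, y):
    """Toy model: one mean vector per class (nearest-centroid classifier)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def accuracy(model, X, y):
    def predict(row):
        return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(row, model[c])))
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def drop(X, col):
    return [row[:col] + row[col + 1:] for row in X]

def permutation_importance(model, X, y, col, rng):
    """Accuracy drop after shuffling one column; the model is NOT re-trained."""
    shuffled = [row[col] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
    return accuracy(model, X, y) - accuracy(model, X_perm, y)

def drop_column_importance(X_tr, y_tr, X_val, y_val, col):
    """Accuracy drop after removing one column and re-training the model."""
    full = fit_centroids(X_tr, y_tr)
    reduced = fit_centroids(drop(X_tr, col), y_tr)
    return accuracy(full, X_val, y_val) - accuracy(reduced, drop(X_val, col), y_val)

def make_data(n, rng):
    y = [i % 2 for i in range(n)]
    X = [[5.0 * label + rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
    return X, y

rng = random.Random(0)
X_tr, y_tr = make_data(400, rng)
X_val, y_val = make_data(400, rng)
model = fit_centroids(X_tr, y_tr)

perm = [permutation_importance(model, X_val, y_val, c, rng) for c in range(2)]
dropped = [drop_column_importance(X_tr, y_tr, X_val, y_val, c) for c in range(2)]
```

On this toy data both measures should be large for the informative feature and close to zero for the noise feature; the difference is only in cost, since drop-column importance re-trains the model once per feature.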

Why it is useful / caveats
(0) If you really care about understanding your domain - feature importances are a must have
(1) All of this works only for powerful models
(2) Landmines include - correlated or duplicate variables, data normalization

Correlated variables
(0) For RF - correlated variables share permutation importance roughly proportionally to their correlation
(1) Drop column importance can behave unpredictably

I personally like engineering different kinds of features and doing ablation tests:
(0) Among feature sets that share a similar purpose
(1) Within feature sets

#data_science

1.1K views, Alexander, 11:48

Spark in me

2018 DS/ML digest 14

Amazing article - why you do not need ML
- https://cyberomin.github.io/startup/2018/07/01/sql-ml-ai.html
- I personally love plain-vanilla SQL and in 90% of cases people under-use it
- I even wrote 90% of my JSON API on our blog in pure PostgreSQL xD
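
To give a flavour of the kind of question plain SQL answers without any ML, here is a small sketch using Python's built-in `sqlite3`. The `orders` table, its rows and the 90-day threshold are all hypothetical and are not taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, amount REAL, ordered_on TEXT);
INSERT INTO orders VALUES
  ('alice', 120.0, '2018-06-28'),
  ('alice',  80.0, '2018-03-02'),
  ('bob',    40.0, '2018-01-15'),
  ('carol', 300.0, '2018-06-30');
""")

# Customers with no order in the 90 days before a reference date:
# a plain GROUP BY / HAVING query, no model required.
lapsed = conn.execute("""
    SELECT customer, MAX(ordered_on) AS last_order, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING MAX(ordered_on) < DATE('2018-07-01', '-90 days')
    ORDER BY total DESC
""").fetchall()
```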

Practice / papers
(0) Interesting papers from CVPR https://towardsdatascience.com/the-10-coolest-papers-from-cvpr-2018-11cb48585a49
(1) Some down-to-earth obstacles to deploying ML https://habr.com/company/hh/blog/415437/
(2) Using synthetic data for CNNs (by Nvidia) - https://arxiv.org/pdf/1804.06516.pdf
(3) This puzzles me - so much effort and engineering spent on something ... strange and useless - http://taskonomy.stanford.edu/index.html
On paper they do a cool thing - investigate transfer learning between different domains, but in practice it is done on TF and there is no clear conclusion of any kind
(4) VAE + real datasets http://siavashk.github.io/2016/02/22/autoencoder-imagenet/ - only small Imagenet (64x64)
(5) Understanding the speed of models deployed on mobile - http://machinethink.net/blog/how-fast-is-my-model/
(6) A brief overview of multi-modal methods https://medium.com/mlreview/multi-modal-methods-image-captioning-from-translation-to-attention-895b6444256e

Visualizations / explanations
(0) Amazing website with ML explanations http://explained.ai/
(1) PCA and linear VAEs are close https://pvirie.wordpress.com/2016/03/29/linear-autoencoders-do-pca/
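
A short sketch of why item (1) holds, stated for a plain linear autoencoder rather than a full VAE. The symbols are assumptions introduced here, not notation from the linked post: X is the centered n x d data matrix, W_e (d x k) the encoder, W_d (k x d) the decoder, and V_k the top-k right singular vectors of X.

```latex
\min_{W_e,\,W_d} \; \lVert X - X W_e W_d \rVert_F^2,
\qquad \operatorname{rank}(W_e W_d) \le k

% By the Eckart-Young theorem the best rank-k approximation of X is
% X V_k V_k^\top, so the minimum is attained at
W_e W_d = V_k V_k^\top,
\qquad\text{e.g.}\quad W_e = V_k A,\;\; W_d = A^{-1} V_k^\top
\;\;\text{for any invertible } A \in \mathbb{R}^{k \times k}
```

So the learned code spans the same subspace as the first k principal components, but the individual weights are only determined up to the invertible matrix A.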

#deep_learning
#digest
#data_science

cyberomin.github.io

No, you don't need ML/AI. You need SQL

A while ago, I did a Twitter thread about the need to use traditional and existing tools to solve everyday business problems other than jumping on new buzzwords, sexy and often times complicated technologies.

1.1K views, Alexander, 04:51

Spark in me

A cool article from Ben Evans about how to think about ML

https://www.ben-evans.com/benedictevans/2018/06/22/ways-to-think-about-machine-learning-8nefy

Benedict Evans

Ways to think about machine learning — Benedict Evans

Everyone has heard of machine learning now, and every big company is working on projects around ‘AI’. We know this is a Next Big Thing. But we don’t yet have a settled sense of quite what machine learning means - what it will mean for tech companies or…

768 views, Alexander, 07:15

Spark in me

My recent PyTorch 0.4 Dockerfile for CV

https://gist.github.com/snakers4/72ccc3d936f04a3307d20f1810b2fa81

#deep_learning

Gist

My PyTorch 0.4 Dockerfile

My PyTorch 0.4 Dockerfile. GitHub Gist: instantly share code, notes, and snippets.

948 views, Alexander, 07:16

Spark in me

Open Images Object detection on Kaggle

- https://www.kaggle.com/c/google-ai-open-images-object-detection-track#Description

- Key ideas
-- 1.2 images, high-res, 500 classes
-- decent prizes, but short time-span (2 months)
-- object detection

#deep_learning

Kaggle

Google AI Open Images - Object Detection Track

Detect objects in varied and complex images.

752 views, Alexander, 05:12

Spark in me

2018 DS/ML digest 15

What I filtered through this time

Market / news
(0) Letters by big company employees against using ML for weapons
- Microsoft
- Amazon
(1) Facebook open-sources DensePose (essentially this is Mask-RCNN)
- https://research.fb.com/facebook-open-sources-densepose/

Papers / posts / NLP
(0) One more blog post about text / sentence embeddings https://goo.gl/Zm8C2c
- key idea: a different weighting scheme

(1) One more sentence embedding calculation method
- https://openreview.net/pdf?id=SyK00v5xx ?

(2) Posts explaining NLP embeddings
- http://www.offconvex.org/2015/12/12/word-embeddings-1/ - some basics - SVD / Word2Vec / GloVe
-- SVD improves embedding quality (compared to one-hot encoding)?
-- use log-weighting, use TF-IDF weighting (the above weighting)
- http://www.offconvex.org/2016/02/14/word-embeddings-2/ - word embedding properties
-- dimensions vs. embedding quality http://www.cs.princeton.edu/~arora/pubs/LSAgraph.jpg
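
A minimal sketch of the log-weighting plus TF-IDF idea from the notes above, in plain Python. This is one common variant, (1 + log count) * log(N / document frequency), chosen here as an assumption; the linked posts may use a different formula, and the toy documents are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {token: weight} dict per tokenized document."""
    n_docs = len(docs)
    # In how many documents does each token appear?
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            tok: (1 + math.log(c)) * math.log(n_docs / doc_freq[tok])
            for tok, c in counts.items()
        })
    return weights

# Toy corpus (hypothetical)
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran", "ran"],
]
w = tf_idf(docs)
```

A token that occurs in every document ("the") gets zero weight, while rare tokens are up-weighted, and repeated occurrences grow only logarithmically.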

(3) Spacy + Cython = 100x speed boost - https://goo.gl/9TwVqu - good to know about this as a last resort
- described use-cases:
-- you are pre-processing a large training set for a DeepLearning framework like pyTorch/TensorFlow
-- or you have a heavy processing logic in your DeepLearning batch loader that slows down your training

(4) Once again stumbled upon this - https://blog.openai.com/language-unsupervised/

(5) Papers
- Simple NLP embedding baseline https://goo.gl/nGujzS
- NLP decathlon for question answering https://goo.gl/6HHi7q
- Debiasing embeddings https://arxiv.org/abs/1806.06301
- Once again, transfer learning in NLP by OpenAI - https://goo.gl/82VR4U

#deep_learning
#digest
#data_science

837 views, Alexander, edited 07:57

Spark in me

Forwarded from SK

http://nlp.town/blog/sentence-similarity/

824 views, Alexander, 08:11

Spark in me

https://www.youtube.com/watch?utm_campaign=Revue+newsletter&utm_medium=Newsletter&utm_source=NLP+News&v=3o4VzEyJ0WA

YouTube

Machine Learning Research & Interpreting Neural Networks

Machine learning and neural networks change how computers and humans interact, but they can be complicated to understand. In this episode of Coffee with a Googler, Laurence Moroney (@lmoroney) sits down with Christoper Olah (@ch402) from the Google Brain…

870 views, Alexander, 14:16

Spark in me

Forwarded from Just links

https://twitter.com/Foone/status/1014267515696922624

Twitter

foone

You want to know something about how bullshit insane our brains are? OK, so there's a physical problem with our eyes: We move them in short fast bursts called "saccades", right? very quick, synchronized movements. The only problem is: they go all blurry and…

12 views, Alexander, 16:56

Spark in me

XGB - now on GPU properly?
https://twitter.com/i/web/status/1014192185510629378

Twitter

Joshua Patterson

#XGBoost is faster than ever, with better scaling, on #GPU thanks to the hard work of @nvidia & @h2oai! Check out the latest paper https://t.co/P2m31idljB, and more is coming very soon! #lightgbm #catboost #GBDT

1.2K views, Alexander, 05:54

Spark in me

Forwarded from Админим с Буквой (bykva)

Bash shortcuts

I wrote a micro lab exercise for learning bash hotkeys.

https://medium.com/@bykvaadm/bash-shortcuts-d6f275a6ce9d

#bash_tips_and_tricks #junior

Medium

bash shortcuts

A small lab exercise for learning the main bash hotkeys. Prepare a line like this for yourself:

17 views, Alexander, 15:33

Spark in me

https://youtu.be/qS4H6PEcCCA

YouTube

Epicycles, complex Fourier series and Homer Simpson's orbit

Today’s video was motivated by an amazing animation of a picture of Homer…

786 views, Alexander, 08:27

Spark in me

Playing with VAEs and their practical use

So, I played a bit with Variational Autoencoders (VAEs) and wrote a small blog post on this topic

https://spark-in.me/post/playing-with-vae-umap-pca

Please like, share and repost!

#deep_learning
#data_science

Like this post or have something to say => tell us more in the comments or donate!

938 views, spark_comment_bot, 12:29
