Internet digest
(0) Ben Evans - https://goo.gl/72b4pm
ML / industry
(1) FB to design its own FPGAs / ML chips - https://goo.gl/nh2Wph ?
(2) Google willing to replicate iMessage, again https://goo.gl/MtwCet
-- No mention of Telegram - but all Google's attempts are aeons behind Telegram
-- Google is willing to go the hardest route - a standard enforced on the carriers + replacing the messaging app
-- None of the previous attempts really worked
(3) Facebook media backlash - https://goo.gl/rKd9E5
(4) Who makes LIDARs - https://goo.gl/uD5qc5
(5) Tesla over automation - https://goo.gl/1WBMj3
Telecom
(1) British Telecom to switch to VOIP - https://goo.gl/MCbZgq
(2) Flickr purchased - https://goo.gl/AMcE6f
#internet
TechCrunch
Facebook has a new job posting calling for chip designers
Facebook has posted a job opening looking for an expert in ASIC and FPGA, two custom silicon designs that companies can gear toward specific use cases — particularly in machine learning and artific…
PyTorch 0.4 released
https://github.com/pytorch/pytorch/releases/tag/v0.4.0
Key
(1) Tensor / Variable merged
(2) Zero-dimensional Tensors
(3) dtypes
(4) migration guide http://pytorch.org/2018/04/22/0_4_0-migration-guide.html
#pytorch
GitHub
Release Trade-off memory for compute, Windows support, 24 distributions with cdf, variance etc., dtypes, zero-dimensional Tensors…
PyTorch 0.4.0 release notes
Table of Contents
Major Core Changes
Tensor / Variable merged
Zero-dimensional Tensors
dtypes
migration guide
New Features
Tensors
Full support for advanced indexi...
On the surface looks like an interesting competition
Well, I said that about Power Laws - but then it turned out otherwise.
So far I can see CV, NLP and tables in one mix.
https://www.kaggle.com/c/avito-demand-prediction/
#data_science
Kaggle
Avito Demand Prediction Challenge
Predict demand for an online classified ad
Forwarded from Админим с Буквой (bykva)
Ubuntu 18.04 LTS released
Ubuntu 18.04 "Bionic Beaver" has been released. It is a long-term support (LTS) release, which receives updates for 5 years. Installation images are available for Ubuntu Desktop, Ubuntu Server, Ubuntu Cloud, Kubuntu, Ubuntu Budgie, Lubuntu, Ubuntu Studio, Ubuntu Kylin, Ubuntu MATE and Xubuntu.
A handy snippet for `IOU` calculation
https://stackoverflow.com/questions/25349178/calculating-percentage-of-bounding-box-overlap-for-image-detector-evaluation
#deep_learning
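A minimal sketch of the IoU (intersection over union) computation for two axis-aligned boxes, in the spirit of the linked answer. The `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions made for illustration, not copied from the linked snippet.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    """
    # Corners of the intersection rectangle
    x_left = max(box_a[0], box_b[0])
    y_top = max(box_a[1], box_b[1])
    x_right = min(box_a[2], box_b[2])
    y_bottom = min(box_a[3], box_b[3])

    # The boxes do not overlap at all
    if x_right <= x_left or y_bottom <= y_top:
        return 0.0

    intersection = (x_right - x_left) * (y_bottom - y_top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])

    # Union = sum of both areas minus the part counted twice
    return intersection / (area_a + area_b - intersection)
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of 1 and a union of 7.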
Stack Overflow
Calculating percentage of Bounding box overlap, for image detector evaluation
In testing an object detection algorithm in large images, we check our detected bounding boxes against the coordinates given for the ground truth rectangles.
According to the Pascal VOC challenges,
Widen Jupyter editor to 100% wide screen
Just apply this CSS
#texteditor-container {
width: 95%
}
#data_science
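One way to apply the rule without editing files by hand is a small helper that appends it to a user stylesheet. This is only a sketch: the helper name `add_css_rule` is made up for illustration, and the stylesheet location in the usage note is an assumption about the classic notebook setup, so check where your installation actually reads custom CSS from.

```python
from pathlib import Path

# The rule from the post, kept as-is
RULE = "#texteditor-container {\n    width: 95%\n}\n"


def add_css_rule(css_file, rule=RULE):
    """Append `rule` to `css_file` unless it is already present.

    Returns True if the file was changed, False otherwise.
    """
    css_file = Path(css_file)
    # Create the folder if it does not exist yet
    css_file.parent.mkdir(parents=True, exist_ok=True)
    existing = css_file.read_text() if css_file.exists() else ""
    if rule in existing:
        return False
    # Start on a fresh line if the file does not end with a newline
    separator = "\n" if existing and not existing.endswith("\n") else ""
    with css_file.open("a") as f:
        f.write(separator + rule)
    return True
```

Usage, assuming the stylesheet lives in `~/.jupyter/custom/custom.css`: `add_css_rule(Path.home() / ".jupyter" / "custom" / "custom.css")`, then reload the page.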
Using Mendeley to read papers
It looks like when you move to a new PC, it can also migrate your literature library.
Nice.
#data_science
Forwarded from Админим с Буквой (bykva)
Downgrading PyTorch from 0.4 to 0.3
The newest PyTorch has some issues with multi-GPU operation.
If you want to install the previous version, the downgrade docs are a bit outdated, but you can simply run:
conda install pytorch=0.3.0 cuda90 -c pytorch
#deep_learning
A small saga about OpenVPN
TLDR:
(0) Purchase a cheap VDS from a noname provider with decent bandwidth => install OpenVPN => forget about problems => share with friends and family;
(1) This guide just works https://goo.gl/K2xjby (do not be afraid of its length - it is just verbose);
(2) I tested it with DigitalOcean and hostus.us;
From a financial standpoint US$1-5 per month per 3-5 users without any 3rd party services seems to be a bargain.
Hosting options:
(0) With DO it just works (just follow the guide step by step). But the cheapest VDS (which is overkill for this) costs US$5 per month. If you use my link - https://m.do.co/c/6f8e77dddc23 - you will get US$10 for free;
(1) Tested it with hostus.us. Follow my link, if you would like to support us - https://my.hostus.us/aff.php?aff=2169. A decent VPS can be found in Amsterdam for as cheap as US$5-8 for 3 months. Be careful - their UX is a bit misleading at times - (!!!) the country choice does not seem to flow from one menu to another (!!!). This seems to be more than enough - https://goo.gl/GyPZ6u;
(2) If you want to search yourself - go here - http://lowendstock.com/ - the best 2 options seem to be VirMach and hostus, but the former is sold out;
hostus.us caveats:
(0) If you would like to follow the DO guide but use hostus, then for the cheapest options do not forget to enable this in the admin https://goo.gl/DRx3UX;
(1) VPS provisioning time there is 0-8 hours. In my case it was ~40 mins;
(2) I also faced this bug - https://goo.gl/BTqeTX;
What if I have a problem with ssh keys on Windows?
(0) This will give you some basic info about managing Linux servers https://goo.gl/TgL61G;
(1) Here we explain how to use Putty and ssh keys on Windows https://goo.gl/xxvGBb (also just google it);
Why OpenVPN:
(0) Seems to be the most well-known open-source VPN software with easily accessible clients for all major platforms;
(1) I know people who used it;
Alternatives:
(0) https://github.com/trailofbits/algo - seems to be newer and cooler, but I do not know living people who reported actually using it;
#linux
#digital_freedom
DigitalOcean
How To Set Up an OpenVPN Server on Ubuntu 16.04 | DigitalOcean
Want secure and protected Internet access from your smartphone or laptop when connected to an untrusted network such as hotel or cafe WiFi? A Virtual Private Network (VPN) allows...
Playing with unsupervised learning in genetics
A small blog post on this topic
https://spark-in.me/post/playing-with-genetics
The first thing that springs to mind is an RNN, but what if there is no annotation and it is not known whether the data consists of valid sequences?
#data_science
Spark in me
Playing with genetic markers, clustering and visualization
Mesmerizing structures found in data: encoding, dimension reduction, clustering and visualization of a dataset with genetic markers
Author's articles - http://spark-in.me/author/yara_tchk
Blog - http://spark-in.me
Pinned post
What is this channel about?
(0) This channel is a practitioner's channel on the following topics: Internet, Data Science, Deep Learning, Python
(1) Don't get in a twist if your opinion differs. You are welcome to contact me via telegram @snakers41 and email - aveysov@gmail.com
(2) No BS and ads
Donations
(0) Become a patron 🤟 - https://www.patreon.com/bePatron?u=6159641
(1) Buy me a coffee 🤟 https://buymeacoff.ee/8oneCIN
Give us a rating:
(0) https://telegram.me/tchannelsbot?start=snakers4
Our chat
(0) https://t.me/joinchat/Bv9tjkH9JHaAEL-FVtw9Tw
More links
(0) Our website http://spark-in.me
(1) Our chat https://goo.gl/IS6Kzz
(2) DS courses review
http://goo.gl/5VGU5A
https://spark-in.me/post/learn-data-science
(3) GAN papers review
https://spark-in.me/post/gan-paper-review
(4) SpaceNet Challenge
https://spark-in.me/post/spacenet-three-challenge
(5) DS Bowl 2018
https://spark-in.me/post/playing-with-dwt-and-ds-bowl-2018
(6) Data Science tag on the website
https://spark-in.me/tag/data-science
Patreon
Patron Checkout | Patreon
Patreon is empowering a new generation of creators.
Support and engage with artists and creators as they live out their passions!
Showing more images in Tensorboard
TB is super cool (also together with this script https://gist.github.com/gyglim/1f8dfb1b5c82627ae3efcfbbadb9f514), but it shows only ~10 images in its image preview.
This can be fixed.
(0) Find your TB folder
import tensorboard
tensorboard.__file__
In my case it shows
'/opt/conda/lib/python3.6/site-packages/tensorboard/__init__.py'
(1) cd there, open backend/application.py
(2) Change this line
image_metadata.PLUGIN_NAME: 400,
(3) Profit - now it shows ~400 images on each view tab
#deep_learning
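Steps (0) and (1) above can also be done in one go without importing the package. A rough sketch: the helper name `package_file` is made up for illustration, and whether `backend/application.py` exists depends on your TensorBoard version (the post only confirms it for the version used at the time).

```python
import importlib.util
from pathlib import Path


def package_file(package, relative_path):
    """Return the path of `relative_path` inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        # The package is not installed (or is a namespace package)
        return None
    # spec.origin points at the package's __init__.py, so its parent is the package folder
    candidate = Path(spec.origin).parent / relative_path
    return candidate if candidate.exists() else None
```

Usage: `package_file("tensorboard", "backend/application.py")` gives the file to open in step (1), or `None` if it is not there.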
Gist
Logging to tensorboard without tensorflow operations. Uses manually generated summaries instead of summary ops
Logging to tensorboard without tensorflow operations. Uses manually generated summaries instead of summary ops - tensorboard_logging.py
Exploring GANs and unsupervised learning
Here are my findings from my hobby project about using GANs and unsupervised methods to build some decent semantic search on a large dataset of images without annotation:
(0) https://spark-in.me/post/unsupervised-learning-limits
Lots of cool images.
TLDR
(0) Features from a pre-trained Imagenet encoder => PCA => UMAP => HDBSCAN work really well for image clustering;
(1) Any siamese network / hard negative mining inspired methods just did not work - the annotation data is too coarse;
(2) GANs kind of work, but I could not achieve the boasted photo-realistic levels;
#deep_learning
Spark in me
Exploring the limits of unsupervised Machine Learning in Computer Vision
In this article I share my experience with GANs, progressive growing of GANs, image clustering and unsupervised learning
Author's articles - http://spark-in.me/author/snakers41
Blog - http://spark-in.me
2018 DS/ML digest 9
Market / libraries
(0) Tensorflow + Swift - wtf - https://goo.gl/FDvLM4
(1) Geektimes / Habrhabr.ru going international - https://goo.gl/dbGNwD
(2) A service for renting GPUs ... from people
- Reddit https://goo.gl/HxQ54x
- Link https://vectordash.com/hosting/
- Looks LXC based (afaik - the only user friendly alternative to Docker)
- Cool in theory, no idea how secure this is - we can assume it is as secure as providing a docker container to a stranger
- They did not reply to me within a week
(3) A friend sent me a list of ... yet more new PyTorch NLP libraries
- https://goo.gl/kasRfZ, https://goo.gl/XXnbJy (AllenNLP is the biggest library like this)
- I believe that such libraries are more or less useless for real tasks, but cool to know they exist
(4) New SpaceNet 4? https://goo.gl/CsSS6P
(5) A new super cool competition on Kaggle about particle physics? https://www.kaggle.com/c/trackml-particle-identification
Tutorials / basics
(0) Bias vs. Variance (RU) https://goo.gl/4Y7tH7
(1) Yet another magic Jupyter guideline collection - https://goo.gl/AFWMuq
Real world ML applications
(0) Resnet + object detection (RU) - detecting people without helmets, 90% accuracy - https://goo.gl/7xpQnE
(1) Fast.ai about using embeddings with Tabular data - http://www.fast.ai/2018/04/29/categorical-embeddings/
- Very similar to our approach on electricity
- I personally do not recommend using their library by any means
(2) Comparing Google TPU vs. V100 with ResNet50 - https://goo.gl/s6dhsy
- speed - https://goo.gl/Pww2sm
- pricing - https://goo.gl/Rtkp8Q
- but ... buying GPUs is much cheaper
(3) Other blog posts about embeddings + tabular data
- Sales prediction http://blog.kaggle.com/2016/01/22/rossmann-store-sales-winners-interview-3rd-place-cheng-gui/
- Taxi drive prediction http://blog.kaggle.com/2015/07/27/taxi-trajectory-winners-interview-1st-place-team-%F0%9F%9A%95/
- MLP + classification + embeddings - https://goo.gl/AMNGNG / https://arxiv.org/pdf/1508.00021.pdf
(4) Albu's solution to SpaceNet - augmentations https://github.com/SpaceNetChallenge/RoadDetector/tree/master/albu-solution/src/augmentations
CNN overview
Neural network part:
Split the data into 4 folds randomly, but with the same number of tiles from each city in every fold
Use resnet34 as the encoder and a unet-like decoder (conv-relu-upsample-conv-relu) with skip connections from every layer of the network. Loss function: 0.8*binary_cross_entropy + 0.2*(1 - dice_coeff). Optimizer - Adam with default params.
Train on 512*512 image crops with batch size 11 for 30 epochs (8 times more images in one epoch)
Train 20 epochs with lr 1e-4
Train 5 epochs with lr 2e-5
Train 5 epochs with lr 4e-6
Predict on the full image with padding of 22 on the borders (1344*1344).
Merge folds by mean
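The loss and the learning-rate steps from the overview above can be sketched in plain Python. This is an illustration under assumptions, not the code from the linked solution: the soft Dice form, the `eps` smoothing, the 0-based epoch counting and all function names are mine; only the 0.8 / 0.2 weights and the 20 / 5 / 5 epoch schedule come from the write-up.

```python
import math


def dice_coeff(pred, target, eps=1e-7):
    """Soft Dice coefficient for flat lists of probabilities and 0/1 labels."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def combined_loss(pred, target):
    """0.8 * BCE + 0.2 * (1 - Dice), the weighting quoted in the overview."""
    bce = binary_cross_entropy(pred, target)
    return 0.8 * bce + 0.2 * (1.0 - dice_coeff(pred, target))


def learning_rate(epoch):
    """Step schedule: 20 epochs at 1e-4, then 5 at 2e-5, then 5 at 4e-6.

    `epoch` is counted from 0 (an assumption), so valid values are 0..29.
    """
    if epoch < 20:
        return 1e-4
    if epoch < 25:
        return 2e-5
    return 4e-6
```

In a real training loop the same formulas would be applied to tensors of per-pixel probabilities instead of Python lists.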
Jobs / job market
(0) Developers by country by scraping GitHub - https://goo.gl/n8gnLi
- developers count vs. GDP http://prntscr.com/j9v80e R^2 = 84%
- developers count vs. population - R^2 = 50%
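For reference, this is how an R^2 figure like the ones above can be computed for a simple one-variable linear fit. A sketch only: the post does not say how the linked analysis fitted its model (for example, whether on raw or log-scaled values), and the function name `r_squared` is made up for illustration.

```python
def r_squared(x, y):
    """R^2 of an ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Least-squares slope and intercept
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Share of the variance in y that the fitted line explains
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

With toy numbers (not real country data), `r_squared([1, 2, 3, 4], [1, 3, 2, 4])` comes out at 0.64, meaning the line explains 64% of the variance.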
Visualization
(0) Interactive tool for visualizing convolutions - https://ezyang.github.io/convolution-visualizer/
Datasets
(0) Open Images v4 released
- https://research.googleblog.com/2018/04/announcing-open-images-v4-and-eccv-2018.html
- the dataset itself https://storage.googleapis.com/openimages/web/download.html
- categories https://storage.googleapis.com/openimages/2018_04/bbox_labels_600_hierarchy_visualizer/circle.html
#data_science
#deep_learning
#digest
GitHub
tensorflow/swift
swift - Swift for TensorFlow documentation repository.
Spark in me via @vote
Add comment button below major posts?
anonymous poll
Yes, definitely! – 26
👍👍👍👍👍👍👍 53%
Meh... – 14
👍👍👍👍 29%
No, why? – 7
👍👍 14%
Your option (PM / chat) – 2
👍 4%
👥 49 people voted so far.
The current state of ML
https://goo.gl/rzKUiQ
(1) Do not call it AI
(2) Distinguish ML from Intelligent Infrastructure and Intelligence Augmentation
(3) Human-imitative AI is not tractable now
(4) Developments which are now being called "AI" arose mostly in the engineering fields associated with low-level pattern recognition and movement control
#deep_learning
Medium
Artificial Intelligence — The Revolution Hasn’t Happened Yet
Artificial Intelligence (AI) is the mantra of the current era. The phrase is intoned by technologists, academicians, journalists and…
A decent explanation about decorators in Python
http://book.pythontips.com/en/latest/decorators.html
#python
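A minimal example of the idea, for those who do not want to read the whole chapter: a decorator is a function that takes a function and returns a wrapped version of it. The `timed` decorator and the `last_elapsed` attribute below are made up for illustration and are not taken from the linked book.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the latest call to `func` took."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Stored even if `func` raises
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None  # nothing measured before the first call
    return wrapper


@timed
def add(a, b):
    """Add two numbers."""
    return a + b
```

Writing `@timed` above `add` is the same as `add = timed(add)`; after a call, `add.last_elapsed` holds the duration in seconds.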