Spark in me – Telegram

Spark in me

2.18K subscribers

973 photos

56 videos

116 files

2.74K links

Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.

Internet digest

(0) Ben Evans - https://goo.gl/72b4pm

ML / industry
(1) FB to design its own FPGAs / ML chips - https://goo.gl/nh2Wph ?
(2) Google willing to replicate iMessage, again https://goo.gl/MtwCet
-- No mention of Telegram, but all of Google's attempts are aeons behind Telegram
-- Google is willing to take the hardest route: a standard enforced on the carriers plus replacing the messaging app
-- None of the previous attempts really worked
(3) Facebook media backlash - https://goo.gl/rKd9E5
(4) Who makes LIDARs - https://goo.gl/uD5qc5
(5) Tesla over automation - https://goo.gl/1WBMj3

Telecom
(1) British Telecom to switch to VOIP - https://goo.gl/MCbZgq
(2) Flickr purchased - https://goo.gl/AMcE6f

#internet

Facebook has a new job posting calling for chip designers

Facebook has posted a job opening looking for an expert in ASIC and FPGA, two custom silicon designs that companies can gear toward specific use cases — particularly in machine learning and artific…

1.0K viewsAlexander, edited 18:00

PyTorch 0.4 released

https://github.com/pytorch/pytorch/releases/tag/v0.4.0

Key
(1) Tensor / Variable merged
(2) Zero-dimensional Tensors
(3) dtypes
(4) migration guide http://pytorch.org/2018/04/22/0_4_0-migration-guide.html

#pytorch

Release Trade-off memory for compute, Windows support, 24 distributions with cdf, variance etc., dtypes, zero-dimensional Tensors…

1.5K viewsAlexander, edited 06:13

On the surface looks like an interesting competition

Well, I said that about Power Laws - but then it turned out otherwise.
So far I can see CV, NLP and tables in one mix.

https://www.kaggle.com/c/avito-demand-prediction/

#data_science

Avito Demand Prediction Challenge

Predict demand for an online classified ad

1.1K viewsAlexander, 04:24

https://youtu.be/m9XyXiL6n8w

AI Learns Real-Time 3D Face Reconstruction | Two Minute Papers #245

The paper "Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network" and its source code is available here:
https://arxiv.org/abs/1803.07835
https://github.com/YadiraF/PRNet

Addicted? Pick up cool perks on our Patreon page! …

972 viewsAlexander, 19:31

Forwarded from Админим с Буквой (bykva)

Ubuntu 18.04 LTS release

Ubuntu 18.04 "Bionic Beaver" has been released. It is a long-term support (LTS) release, for which updates are provided for 5 years. Installation images are available for Ubuntu Desktop, Ubuntu Server, Ubuntu Cloud, Kubuntu, Ubuntu Budgie, Lubuntu, Ubuntu Studio, Ubuntu Kylin, Ubuntu MATE and Xubuntu.

17 viewsAlexander, 07:51

A handy snippet for `IOU` calculation

https://stackoverflow.com/questions/25349178/calculating-percentage-of-bounding-box-overlap-for-image-detector-evaluation
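
For quick reference, here is a minimal sketch of the idea (not the code from the linked answer). It assumes axis-aligned boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2; the function name `iou` is just an illustrative choice.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Corners of the intersection rectangle
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)

    # No overlap (or the boxes only touch along an edge)
    if ix2 <= ix1 or iy2 <= iy1:
        return 0.0

    inter = (ix2 - ix1) * (iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Union = sum of the areas minus the part counted twice
    return inter / (area_a + area_b - inter)


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7, i.e. about 0.143
```

Two 2x2 boxes shifted by one unit share a 1x1 square, so the result is 1 / (4 + 4 - 1).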

#deep_learning

Calculating percentage of Bounding box overlap, for image detector evaluation

In testing an object detection algorithm in large images, we check our detected bounding boxes against the coordinates given for the ground truth rectangles.

According to the Pascal VOC challenges,

1.1K viewsAlexander, 09:58

Widen the Jupyter text editor to (almost) the full screen width

Just apply this CSS

#texteditor-container {
    width: 95%
}
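
The post does not say where to put the CSS. A minimal sketch, assuming the classic Jupyter Notebook picks up user styles from `~/.jupyter/custom/custom.css` (check your own setup; the helper name `install_css` is hypothetical):

```python
from pathlib import Path

# The rule from the post (with a trailing semicolon added)
CSS_RULE = "#texteditor-container {\n    width: 95%;\n}\n"


def install_css(custom_dir=None):
    """Append the rule to custom.css unless it is already there; return the file path."""
    # Assumption: the classic notebook reads ~/.jupyter/custom/custom.css
    custom_dir = Path(custom_dir) if custom_dir else Path.home() / ".jupyter" / "custom"
    custom_dir.mkdir(parents=True, exist_ok=True)
    css_file = custom_dir / "custom.css"
    existing = css_file.read_text() if css_file.exists() else ""
    if CSS_RULE not in existing:
        css_file.write_text(existing + CSS_RULE)
    return css_file


# Usage: install_css(), then reload the Jupyter page in the browser
```

The function is idempotent, so running it twice does not duplicate the rule.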

1.2K viewsAlexander, 09:59

Using Mendeley to read papers

It looks like when you migrate to a new PC, it can also migrate your literature library.
Nice.

#data_science

1.0K viewsAlexander, 08:45

Forwarded from Админим с Буквой (bykva)

Docker pull via proxy

# systemctl edit docker.service

add the following lines:

[Service]
Environment=ALL_PROXY=socks5://user:password@host:port

then reload systemd and restart Docker:

# systemctl daemon-reload
# systemctl restart docker.service

27 viewsAlexander, 08:38

Downgrading PyTorch from 0.4 to 0.3

The newest PyTorch (0.4) has some issues with multi-GPU operation.

If you want to install the previous version, the downgrade docs are a bit outdated, but you can simply run:

conda install pytorch=0.3.0 cuda90 -c pytorch

830 viewsAlexander, edited 18:32

A small saga about OpenVPN

TLDR:
(0) Purchase a cheap VDS from a noname provider with decent bandwidth => install OpenVPN => forget about problems => share with friends and family;
(1) This guide just works https://goo.gl/K2xjby (do not be afraid of its length - it is just verbose);
(2) I tested it with DigitalOcean and hostus.us;

From a financial standpoint US$1-5 per month per 3-5 users without any 3rd party services seems to be a bargain.

Hosting options:
(0) With DO it just works (just follow the guide step by step). But the cheapest VDS (which is overkill for this) costs US$5 per month. If you use my link - https://m.do.co/c/6f8e77dddc23 - you will get US$10 for free;
(1) Tested it with hostus.us. Follow my link, if you would like to support us - https://my.hostus.us/aff.php?aff=2169. A decent VPS can be found in Amsterdam for as cheap as US$5-8 for 3 months. Be careful - their UX is a bit misleading at times - (!!!) the country choice does not seem to flow from one menu to another (!!!). This seems to be more than enough - https://goo.gl/GyPZ6u;
(2) If you want to search yourself - go here - http://lowendstock.com/ - the best 2 options seem to be VirMach and hostus, but the former is sold out;

Hostus caveats:
(0) If you would like to follow the DO guide but use hostus, then for the cheapest options do not forget to enable this in the admin panel https://goo.gl/DRx3UX;
(1) VPS provisioning time there is 0-8 hours. In my case it was ~40 mins;
(2) I also faced this bug - https://goo.gl/BTqeTX;

What if I have a problem with ssh keys on Windows?
(0) This will give you some basic info about managing Linux servers https://goo.gl/TgL61G;
(1) Here we explain how to use Putty and ssh keys on Windows https://goo.gl/xxvGBb (also just google it);

Why OpenVPN:
(0) Seems to be the most well-known open-source VPN software with easily accessible clients for all major platforms;
(1) I know people who used it;

Alternatives:
(0) https://github.com/trailofbits/algo - seems to be newer and cooler, but I do not know anyone who has actually used it;

#linux
#digital_freedom

How to set up an OpenVPN server on Ubuntu 16.04 | DigitalOcean

Want secure and protected access to the Internet from your smartphone or laptop when connecting to an untrusted network over hotel or cafe WiFi? A Virtual Private Network (VPN) allows...

1.3K viewsAlexander, 05:37

Playing with unsupervised learning in genetics

A small blog post on this topic
https://spark-in.me/post/playing-with-genetics

The first thing that springs to mind is an RNN, but what if there is no annotation and it is not known whether the data consists of valid sequences?

#data_science

Playing with genetic markers, clustering and visualization

Mesmerizing structures found in data: encoding, dimension reduction, clustering and visualization of a dataset with genetic markers
Author's articles - http://spark-in.me/author/yara_tchk
Blog - http://spark-in.me

1.0K viewsAlexander, 05:37

Pinned post

What is this channel about?
(0) This channel is a practitioner's channel on the following topics: Internet, Data Science, Deep Learning, Python
(1) Don't get in a twist if your opinion differs. You are welcome to contact me via Telegram @snakers41 and email - aveysov@gmail.com
(2) No BS or ads

Donations
(0) Become a patron 🤟 - https://www.patreon.com/bePatron?u=6159641
(1) Buy me a coffee 🤟 https://buymeacoff.ee/8oneCIN

Give us a rating:
(0) https://telegram.me/tchannelsbot?start=snakers4

Our chat
(0) https://t.me/joinchat/Bv9tjkH9JHaAEL-FVtw9Tw

More links
(0) Our website http://spark-in.me
(1) Our chat https://goo.gl/IS6Kzz
(2) DS courses review
http://goo.gl/5VGU5A
https://spark-in.me/post/learn-data-science
(3) GAN papers review
https://spark-in.me/post/gan-paper-review
(4) SpaceNet Challenge
https://spark-in.me/post/spacenet-three-challenge
(5) DS Bowl 2018
https://spark-in.me/post/playing-with-dwt-and-ds-bowl-2018
(6) Data Science tag on the website
https://spark-in.me/tag/data-science

Patron Checkout | Patreon

Patreon is empowering a new generation of creators.
Support and engage with artists and creators as they live out their passions!

1.1K viewsAlexander, 06:51

Showing more images in Tensorboard

TB is super cool (especially together with this script: https://gist.github.com/gyglim/1f8dfb1b5c82627ae3efcfbbadb9f514), but it shows only ~10 images in its image preview.

This can be fixed.
(0) Find your TB folder

import tensorboard
tensorboard.__file__

In my case it shows '/opt/conda/lib/python3.6/site-packages/tensorboard/__init__.py'
(1) cd there and open backend/application.py
(2) Change this line to:

image_metadata.PLUGIN_NAME: 400,

(3)
Profit - now it shows ~400 images on each view tab
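
Steps (0)-(2) can be sketched in Python. This is only an illustration under stated assumptions: the helper names are hypothetical, and the sample text is a made-up stand-in for the relevant part of `application.py`, which may look different in your TensorBoard version.

```python
from pathlib import Path


def application_py(tensorboard_init):
    """Path of backend/application.py, given the value of tensorboard.__file__."""
    return Path(tensorboard_init).parent / "backend" / "application.py"


def find_limit_lines(source, key="image_metadata.PLUGIN_NAME"):
    """Return (line_number, stripped_text) for every line mentioning the image plugin key."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if key in line
    ]


# Path from the post; in practice pass tensorboard.__file__ instead
path = application_py("/opt/conda/lib/python3.6/site-packages/tensorboard/__init__.py")
print(path.as_posix())  # /opt/conda/lib/python3.6/site-packages/tensorboard/backend/application.py

# Hypothetical stand-in for the file contents (the real file is much longer)
sample = "{\n    image_metadata.PLUGIN_NAME: 10,\n}\n"
print(find_limit_lines(sample))  # [(2, 'image_metadata.PLUGIN_NAME: 10,')]
```

On a real install you would run `find_limit_lines(path.read_text())` to get the line number to edit by hand.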

#deep_learning

Logging to tensorboard without tensorflow operations. Uses manually generated summaries instead of summary ops - tensorboard_logging.py

829 viewsAlexander, 06:58

Exploring GANs and unsupervised learning

Here are my findings from my hobby project about using GANs and unsupervised methods to build some decent semantic search on a large dataset of images without annotation:
(0) https://spark-in.me/post/unsupervised-learning-limits

Lots of cool images.

TLDR
(0) Features from a pre-trained ImageNet encoder => PCA => UMAP => HDBSCAN work really well for image clustering;
(1) Any siamese network / hard negative mining inspired methods just did not work - the annotation data is too coarse;
(2) GANs kind of work, but I could not achieve the boasted photo-realistic levels;

#deep_learning

Exploring the limits of unsupervised Machine Learning in Computer Vision

In this article I share my experience with GANs, progressive growing of GANs, image clustering and unsupervised learning
Author's articles - http://spark-in.me/author/snakers41
Blog - http://spark-in.me

914 viewsAlexander, 07:53

2018 DS/ML digest 9

Market / libraries
(0) Tensorflow + Swift - wtf - https://goo.gl/FDvLM4
(1) Geektimes / Habrhabr.ru going international - https://goo.gl/dbGNwD
(2) A service for renting GPUs ... from people
- Reddit https://goo.gl/HxQ54x
- Link https://vectordash.com/hosting/
- Looks LXC-based (afaik the only user-friendly alternative to Docker)
- Cool in theory, no idea how secure this is - we can assume it is as secure as giving a Docker container to a stranger
- They did not reply to me within a week
(3) A friend sent me a list of ... yet more new PyTorch NLP libraries
- https://goo.gl/kasRfZ, https://goo.gl/XXnbJy (AllenNLP is the biggest library like this)
- I believe that such libraries are more or less useless for real tasks, but cool to know they exist
(4) New SpaceNet 4? https://goo.gl/CsSS6P
(5) A new super cool competition on Kaggle about particle physics? https://www.kaggle.com/c/trackml-particle-identification

Tutorials / basics
(0) Bias vs. Variance (RU) https://goo.gl/4Y7tH7
(1) Yet another magic Jupyter guideline collection - https://goo.gl/AFWMuq

Real world ML applications
(0) ResNet + object detection (RU) - detecting people without helmets, 90% accuracy - https://goo.gl/7xpQnE
(1) Fast.ai about using embeddings with tabular data - http://www.fast.ai/2018/04/29/categorical-embeddings/
Very similar to our approach on electricity
I personally do not recommend using their library at all
(2) Comparing Google TPU vs. V100 with ResNet50 - https://goo.gl/s6dhsy
- speed - https://goo.gl/Pww2sm
- pricing - https://goo.gl/Rtkp8Q
- but ... buying GPUs is much cheaper
(3) Other blog posts about embeddings + tabular data
- Sales prediction http://blog.kaggle.com/2016/01/22/rossmann-store-sales-winners-interview-3rd-place-cheng-gui/
- Taxi drive prediction http://blog.kaggle.com/2015/07/27/taxi-trajectory-winners-interview-1st-place-team-%F0%9F%9A%95/
MLP + classification + embeddings - https://goo.gl/AMNGNG / https://arxiv.org/pdf/1508.00021.pdf
(4) Albu's solution to SpaceNet - augmentations https://github.com/SpaceNetChallenge/RoadDetector/tree/master/albu-solution/src/augmentations
CNN overview

Neural network part:

    Split the data into 4 folds randomly, but with the same number of tiles from each city in every fold
    Use resnet34 as the encoder and a unet-like decoder (conv-relu-upsample-conv-relu) with a skip connection from every layer of the network. Loss function: 0.8*binary_cross_entropy + 0.2*(1 - dice_coeff). Optimizer: Adam with default params.
    Train on 512*512 image crops with batch size 11 for 30 epochs (8 times more images in one epoch)
    Train 20 epochs with lr 1e-4
    Train 5 epochs with lr 2e-5
    Train 5 epochs with lr 4e-6
    Predict on the full image with padding 22 on the borders (1344*1344).
    Merge folds by mean
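
The loss and the learning-rate schedule above can be sketched in plain Python. This is an illustration, not the code from the linked repository. It works on flat lists of predicted probabilities and 0/1 targets, and the clamping `eps` and the `smooth` term in the dice coefficient are my own assumptions.

```python
import math


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; eps clamps probabilities away from 0 and 1 (assumption)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def dice_coeff(probs, targets, smooth=1.0):
    """Soft dice coefficient; the smooth term (assumption) avoids division by zero."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return (2.0 * inter + smooth) / (sum(probs) + sum(targets) + smooth)


def combined_loss(probs, targets):
    """0.8 * binary_cross_entropy + 0.2 * (1 - dice_coeff), as in the description."""
    return 0.8 * bce(probs, targets) + 0.2 * (1.0 - dice_coeff(probs, targets))


def learning_rate(epoch):
    """Zero-based epoch -> lr: 20 epochs at 1e-4, then 5 at 2e-5, then 5 at 4e-6."""
    if epoch < 20:
        return 1e-4
    if epoch < 25:
        return 2e-5
    return 4e-6


good = combined_loss([0.9, 0.1, 0.8], [1, 0, 1])
bad = combined_loss([0.1, 0.9, 0.2], [1, 0, 1])
print(good < bad)  # True
```

In a real training loop the same formula would be applied to tensors (for example in PyTorch), with the schedule applied per epoch.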

Jobs / job market
(0) Developers by country by scraping GitHub - https://goo.gl/n8gnLi
- developers count vs. GDP http://prntscr.com/j9v80e R^2 = 84%
- developers count vs. population - R^2 = 50%

Visualization
(0) Interactive tool for visualizing convolutions - https://ezyang.github.io/convolution-visualizer/

Datasets
(0) Open Images v4 released
- https://research.googleblog.com/2018/04/announcing-open-images-v4-and-eccv-2018.html
- the dataset itself https://storage.googleapis.com/openimages/web/download.html
- categories https://storage.googleapis.com/openimages/2018_04/bbox_labels_600_hierarchy_visualizer/circle.html

#data_science
#deep_learning
#digest

tensorflow/swift

swift - Swift for TensorFlow documentation repository.

1.2K viewsAlexander, 16:52

https://youtu.be/kbOsDFtvYZk

How Computers Find Naked People in Photos

Why isn't the internet just covered in naked people? Algorithms! However, designing them to distinguish between pornography and people in skin tone clothing or swimsuits is harder than you'd think.

Hosted by: Michael Aranda

SciShow has a spinoff podcast!…

1.2K viewsAlexander, 02:38

Spark in me via @vote

Add comment button below major posts?
anonymous poll

Yes, definitely! – 26
👍👍👍👍👍👍👍 53%

Meh... – 14
👍👍👍👍 29%

No, why? – 7
👍👍 14%

Your option (PM / chat) – 2
👍 4%

👥 49 people voted so far.

852 viewsAlexander, 05:47

The current state of ML

https://goo.gl/rzKUiQ
(1) Do not call it AI
(2) Distinguish ML from Intelligent Infrastructure and Intelligence Augmentation
(3) Human-imitative AI is not tractable now
(4) Developments which are now being called "AI" arose mostly in the engineering fields associated with low-level pattern recognition and movement control

#deep_learning

Artificial Intelligence — The Revolution Hasn’t Happened Yet

Artificial Intelligence (AI) is the mantra of the current era. The phrase is intoned by technologists, academicians, journalists and…

1.0K viewsAlexander, 06:59

A decent explanation of decorators in Python

http://book.pythontips.com/en/latest/decorators.html
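
For a quick taste, here is a minimal sketch of a decorator (my own example, not taken from the linked book): `timed` wraps a function and records how long the last call took. The names `timed`, `square` and `last_elapsed` are illustrative.

```python
import functools
import time


def timed(func):
    """Decorator that stores the duration of the last call in wrapper.last_elapsed."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result

    wrapper.last_elapsed = None
    return wrapper


@timed
def square(x):
    """Return x squared."""
    return x * x


print(square(4))        # 16
print(square.__name__)  # square
```

Writing `@timed` above the definition is just shorthand for `square = timed(square)`.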

#python

1.1K viewsAlexander, edited 07:45