Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
Forwarded from Roem.ru (Ivan Illyn)
Yandex "fixed" the low resolution of the "internet versions" of 7 classic WWII films. Formally, the film reels could simply be re-digitized properly, but the search giant ran a different experiment: it turned on a neural network and used it to add detail and remove technical defects from the existing, not-so-great digitizations.

https://roem.ru/10-05-2018/270806/yandex-correcting-old-movies-18/
Testing comments on Spark-in.me

Some time ago we saw how some channels (like HackerNews) format their posts in Telegram:
(0) The post itself contains a button with a link to a comment section of their website
(1) The comment counter is updated mostly in line with real comment count on the website
(2) Sometimes there are additional buttons

This can be easily done using third-party bots, but in our case, given our custom-built blog, there was only one way:
(0) Create a post on the blog with an embedded Telegram snippet, hide it from all of the feeds
(1) Create a simple bot for posting messages (like this one)
(2) Create a simple bot connected to the Disqus API that updates the comment counter (a rough sketch follows this list)
(3) Profit!
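
A rough sketch of what step (2) could look like (not our actual code; the forum name, tokens, IDs and URL below are placeholders):

import requests

# placeholders, not real credentials or IDs
DISQUS_KEY, BOT_TOKEN = 'your_disqus_api_key', 'your_bot_token'
CHANNEL_ID, MESSAGE_ID = '@your_channel', 123
POST_URL = 'https://example.com/post/some-post'

def get_comment_count(post_url):
    # Disqus threads/details returns the number of comments in the "posts" field
    r = requests.get('https://disqus.com/api/3.0/threads/details.json',
                     params={'api_key': DISQUS_KEY, 'forum': 'your_forum', 'thread': 'link:' + post_url})
    return r.json()['response']['posts']

def update_button(count, post_url):
    # refresh the inline button under the channel post with the current counter
    markup = {'inline_keyboard': [[{'text': 'Comments (%d)' % count, 'url': post_url}]]}
    requests.post('https://api.telegram.org/bot%s/editMessageReplyMarkup' % BOT_TOKEN,
                  json={'chat_id': CHANNEL_ID, 'message_id': MESSAGE_ID, 'reply_markup': markup})

update_button(get_comment_count(POST_URL), POST_URL)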

For some topics the community opinion / knowledge / experience may be worth much more than a single post (like GANs or proxy setup).

So, if you like this feature - please tell us what you think in the comments!
Running mean programming pattern in PyTorch

Sometimes you just need to apply exponential weighting:
(0) When tracking some metric
(1) When weighting a loss
(2) When applying something inspired by Adam

I used to do it in quite an ugly way:
(0) Feed a list => calculate averages
(1) Do the same, but using a separate class

Found out from my colleague that it can be done in PyTorch using torch.nn.Module.register_buffer
Very cool
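
A minimal sketch of the pattern (hypothetical class and names, just to illustrate the idea):

import torch
import torch.nn as nn

class RunningMean(nn.Module):
    # keeps an exponentially weighted running mean as a buffer:
    # it moves with .to(device) / .cuda() and is saved in state_dict,
    # but it is not a trainable parameter
    def __init__(self, momentum=0.9):
        super().__init__()
        self.momentum = momentum
        self.register_buffer('mean', torch.zeros(1))

    def forward(self, value):
        # value is the current scalar tensor, e.g. a batch loss or metric
        self.mean.mul_(self.momentum).add_((1.0 - self.momentum) * value.detach())
        return self.mean

# usage inside a training loop:
# tracker = RunningMean(momentum=0.98)
# smoothed_loss = tracker(loss)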

Like this post or have something to say => tell us more in the comments or donate!
Google Duplex

In a nutshell - a combination of ML components (RNN + speech recognition + WaveNet + Tacotron) that can call a human and pretend to be a human. It only works for narrow, specific domains (e.g. calling a restaurant).

Links:
(0) Blog post
(1) MKBHD video

Usually I include such links in digests, but this time it looks like insanity:
(0) It looks heavily doctored, but believable
(1) All the components are kind of known to be at least 90-95% as good as presented
(2) Once again - it is very domain focused

One of the key research insights was to constrain Duplex to closed domains, which are narrow enough to explore extensively.
Duplex can only carry out natural conversations after being deeply trained in such domains.
It cannot carry out general conversations.


Like this post or have something to say => tell us more in the comments or donate!
Presentation from a winning solution in DS Bowl 2018

https://goo.gl/LGNWL5
2018 DS/ML digest 10

Market
(0) Some moonshots by Google in working with electronic health records
(1) Google duplex - a narrow domain bot that makes calls for you
(2) Nature wants to make its ML journal ... paid
(3) Stanford DAWNBench - training ImageNet encoders as quickly and cheaply as possible
(4) Facebook achieves 85% on ImageNet by training on 1bn images on 336 GPUs in a week
(5) Learning models of the surrounding world based on a DOOM-like game

Practice / libraries / code
(0) A smarter, newer way to ensemble CNNs
- Traditional approach - ensemble CNNs with different architectures and just vote / average / apply linear regression on top
- Newer approach - use a cyclic learning rate (CLR)
- Even newer approach - model snapshot ensembling
- Stochastic Weight Averaging (SWA, see the sketch after this list)
-- store a running average of the model weights
-- train one model with CLR
-- at the end of each LR cycle (or epoch) - update the running average of the weights
-- the gist of the method is located on this line
-- I do understand why they update the batch-norm parameters, but I do not understand why it cannot be done by just running one training epoch
- Papers on CNN ensembling 1 2 3
(1) (RU) Few technical details, but face detection + face hashing (+ a human operator) works in retail given an HD camera
(2) (RU) Pose estimation
(3) Numpy autograd
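
A rough sketch of the weight-averaging part of SWA (assuming identical architectures; not the authors' code):

import copy
import torch

def swa_update(swa_model, model, n_averaged):
    # running average of weights: swa = (swa * n + current) / (n + 1)
    with torch.no_grad():
        for swa_p, p in zip(swa_model.parameters(), model.parameters()):
            swa_p.mul_(n_averaged / (n_averaged + 1.0)).add_(p / (n_averaged + 1.0))
    return n_averaged + 1

# usage sketch: train one model with a cyclic LR and average at the end of each cycle
# swa_model, n = copy.deepcopy(model), 0
# for epoch in range(epochs):
#     train_one_epoch(model, loader, optimizer)   # hypothetical helper
#     if is_end_of_lr_cycle(epoch):               # hypothetical helper
#         n = swa_update(swa_model, model, n)
# the batch-norm running stats of swa_model then need to be recomputed
# with one forward pass over the training data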

"New" papers worth mentioning
(0) SqueezeNext
- Module comparison
- Key changes (a rough sketch of such a block follows this list):
(i) more aggressive channel reduction by incorporating a two-stage squeeze module
(ii) separable 3 × 3 convolutions
(iii) element-wise addition skip connection similar to ResNet
- Performance
(1) GANs to generate full-body anime characters in different poses
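
A rough illustration of a block with the listed SqueezeNext ingredients (my reading of the list above, not the exact block from the paper):

import torch
import torch.nn as nn

class SqueezeNextLikeBlock(nn.Module):
    # assumes stride 1 and an equal number of input / output channels
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            # (i) two-stage squeeze: two 1x1 convolutions reducing channels
            nn.Conv2d(ch, ch // 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // 2, ch // 4, 1), nn.ReLU(inplace=True),
            # (ii) separable 3x3 convolution as a 3x1 followed by a 1x3
            nn.Conv2d(ch // 4, ch // 4, (3, 1), padding=(1, 0)), nn.ReLU(inplace=True),
            nn.Conv2d(ch // 4, ch // 4, (1, 3), padding=(0, 1)), nn.ReLU(inplace=True),
            # expand back to the input width with a 1x1 convolution
            nn.Conv2d(ch // 4, ch, 1),
        )

    def forward(self, x):
        # (iii) element-wise addition skip connection, as in ResNet
        return torch.relu(self.body(x) + x)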

Visualizations:
(0) (does not work in Firefox) Visualizing encoder-decoder networks for translation

#data-science
#deep-learning
#digest

Like this post or have something to say => tell us more in the comments or donate!
How to show links?
anonymous poll

Full if short, hide if long – 29
👍👍👍👍👍👍👍 58%

Hide them behind markup – 11
👍👍👍 22%

Post full links – 9
👍👍 18%

Shorten them – 1
▫️ 2%

👥 50 people voted so far.
A great presentation about the current state of particle tracking + ML

Also Kaggle failed to share this for some reason
https://indico.cern.ch/event/702054/attachments/1606643/2561698/tr180307_davidRousseau_CERN_trackML-FINAL.pdf

Key problem - the current algorithm (a Kalman filter) faces time constraints

#data_science
Internet / tech

- Google I/O news https://goo.gl/1FszFA
- MS to give custom voice option to its apps - https://goo.gl/5e2oMw
- Katzenberg (former Disney executive) raises US$800m to make YouTube-like short series - https://goo.gl/wLpTGi => Internet + video is a commodity now?
- _Reportedly_ Lyft has 35% market share in the USA https://goo.gl/tvzDTu
- Google becoming evil and doing military contracts - https://goo.gl/3HjYDg - wtf?
- Apple's autonomous driving fleet is _reportedly_ now at 55 https://goo.gl/94tfkk

#internet
Playing with 3D interactive scatter plots

Turns out you can do this using ipython widgets + ipyvolume.

Best example:
- Playing with particle data (https://nbviewer.jupyter.org/urls/gist.githubusercontent.com/maartenbreddels/04575b217aaf527d4417173f397253c7/raw/926a0e57403c0c65eb55bc52d5c7401dc1019fdf/trackml-ipyvolume.ipynb)

All of this looks kind of wobbly / new and a bit useless, but it works, is free, and is fast.

I was also trying to assign each point a colour like here.

But in the end a much simpler approach just worked:
import numpy as np
import matplotlib.cm
import matplotlib.colors
import ipyvolume as ipv

# hits: dataframe with x, y, z, volume_id columns (loaded elsewhere)
fig = ipv.figure()

# one colour per detector volume, sampled from the tab20 colormap
N = len(hits.volume_id.unique())
cmap = matplotlib.cm.get_cmap("tab20", N)
colors = cmap(np.linspace(0, 1.0, N))
colors = ["#%02x%02x%02x" % tuple([int(k * 255) for k in matplotlib.colors.to_rgb(color)[:3]]) for color in colors]

for i, volume_id in enumerate(hits.volume_id.unique()):
    hits_v = hits[hits.volume_id == volume_id]
    ipv.scatter(hits_v.x.values, hits_v.y.values, hits_v.z.values, marker="diamond", size=0.1, color=colors[i])

ipv.show()


Like this post or have something to say => tell us more in the comments or donate!
Using ncdu with exclude

A really good extension of standard du

sudo ncdu --exclude /exclude_folder /

Useful when something is mounted in /media or /mnt

#linux
2018 DS/ML digest 11

Cool thing this week
(0) ML vs. compute study since 2012 - chart / link

Market
(0) Once again about Google Duplex
(1) Google announcements from Google IO
-- Email autocomplete (a rough sketch of the described model follows this list):

"We encode the subject and previous email by averaging the word embeddings in each field. We then join those averaged embeddings, and feed them to the target sequence RNN-LM at every decoding step, as the model diagram below shows."

-- Learning Semantic Textual Similarity from Conversations blog, paper. Something along the lines of Sentence2Vec, but for conversations: self-supervised, uses attention and embedding averaging
-- Google Clips device + interesting moment estimation on the device. Looks like MobileNet distillation into a small network with some linear models on top
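
A rough sketch of the autocomplete idea quoted above (hypothetical sizes and names, not Google's model):

import torch
import torch.nn as nn

class AutocompleteSketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # the decoder RNN-LM sees its own token embedding plus the joined context
        self.rnn = nn.LSTM(emb_dim + 2 * emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, subject_ids, prev_email_ids, target_ids):
        # encode subject and previous email by averaging their word embeddings
        ctx = torch.cat([self.emb(subject_ids).mean(dim=1),
                         self.emb(prev_email_ids).mean(dim=1)], dim=-1)
        # feed the joined context to the target-sequence RNN-LM at every decoding step
        tgt = self.emb(target_ids)
        ctx = ctx.unsqueeze(1).expand(-1, tgt.size(1), -1)
        h, _ = self.rnn(torch.cat([tgt, ctx], dim=-1))
        return self.out(h)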

Libraries / tools / papers
(0) SaaS NLP annotation tool
(1) CNNs can allegedly reconstruct low-light images? Blog, paper. Looks cool AF
(2) Cool thing to try in a new project - a Postgres RESTful API wrapper - such things require a lot of care, but can eliminate a lot of useless work for small projects.

For my blog I had to write a simple business tier layer myself. I doubt that I could use this w/o over-engineering, because I constructed open-graph tags in SQL queries, for example.

Job / job market
(0) (RU) Realistic IT immigration story

Datasets
(0) Last week the Open Images dataset was updated. I downloaded the small one for the sake of the images, though the download process itself is a bit murky

#machine-learning
#digest
#deep-learning

Like this post or have something to say => tell us more in the comments or donate!
Using groupby in pandas in multi-thread fashion

Sometimes you just need to use all of your CPUs to process some nasty thing in pandas quick and dirty (because you are too lazy to do it properly).

Pandas' GroupBy: Split, Apply, Combine seems to have been built exactly for that, but there is also a lazy workaround.

Solution I googled
- https://gist.github.com/tejaslodaya/562a8f71dc62264a04572770375f4bba

My lazy way using tqdm + Pool
- https://gist.github.com/snakers4/b246de548669543dc3b5dbb49d4c2f0c

(Savva, if you read this, I know that your version is better, you can also send it to me to share xD)
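
A rough sketch of the tqdm + Pool approach (the column name and the per-group function here are made up):

from multiprocessing import Pool

import pandas as pd
from tqdm import tqdm

def process_group(key_and_group):
    # hypothetical per-group function: (group key, group dataframe) -> result
    key, group = key_and_group
    return key, group['value'].mean()

def parallel_groupby_apply(df, by, func, n_workers=8):
    groups = list(df.groupby(by))  # list of (key, sub-dataframe) pairs
    with Pool(n_workers) as pool:
        results = list(tqdm(pool.imap_unordered(func, groups), total=len(groups)))
    return results

# usage sketch:
# results = parallel_groupby_apply(df, by='user_id', func=process_group)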

#ds
New competitions on Kaggle

Kaggle has started a new competition with video ... which is one of those competitions (read between the lines - blatant marketing)
https://www.kaggle.com/c/youtube8m-2018

I.e.
- TensorFlow Record files
- Each of the top 5 ranked teams will receive $5,000 per team as a travel award - no real prizes
- The complete frame-level features take about 1.53TB of space (and yes, these are not videos, but extracted CNN features)

So, they are indeed using their platform to promote their business interests.
Released free datasets are really cool, but only when you can use them for transfer learning, which implies also seeing the underlying ground-level data (i.e. the images or videos).

#data_science
#deep_learning