Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @haarrp
A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks

HMTL is a Hierarchical Multi-Task Learning model which combines a set of four carefully selected semantic tasks. The model achieves state-of-the-art results on Named Entity Recognition, Entity Mention Detection and Relation Extraction. Using SentEval, we show that as we move from the bottom to the top layers of the model, the model tends to learn more complex semantic representations.

ArXiV: https://arxiv.org/abs/1811.06031
Github: https://github.com/huggingface/hmtl

#SOTA #NLP #MultiTask
California wildfire #visualization

How weather conditions during California's fire season have evolved over time.
Nice paper from the #GoogleAI team on grading prostate cancer in prostatectomy specimens.

The model outperforms human pathologists on the silver-standard labels (set by a panel of experts), but there is no clear winner for outcome prediction in the Kaplan-Meier (K-M) plot or the c-index.

«the mean accuracy among 29 general pathologists was 0.61. The DLS achieved an... accuracy of 0.70 (p=0.002) and trended towards better patient risk stratification»
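
For readers new to the c-index mentioned above: it is the fraction of comparable patient pairs that a risk score ranks in the right order. Below is a minimal sketch of Harrell's version in plain Python. The function name, the toy numbers, and the choice to skip pairs with tied times are assumptions for illustration only; this is not the evaluation code from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index: share of comparable pairs ranked correctly.

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is comparable only if
        # the earlier time is an observed event, not a censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event got the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the third one is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so two graders with similar c-index values stratify patient risk about equally well.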

Post: https://ai.googleblog.com/2018/11/improved-grading-of-prostate-cancer.html
ArXiV: https://arxiv.org/abs/1811.06497

#DL #medical #cancer
Difference between machine learning and AI:

If it is written in Python, it's probably machine learning

If it is written in PowerPoint, it's probably AI
Are Pop Lyrics Getting More Repetitive?

Well-written article on pop music analysis. Is the repetitiveness in songs rising? Does it influence song popularity?

The article presents well-designed and well-illustrated research.

Link: https://pudding.cool/2017/05/song-repetition/

#popularDS #research #statistics #visualization
Measuring the Effects of Data Parallelism on Neural Network Training

Important paper from Google on large batch optimization. They do impressively careful experiments measuring iterations needed to achieve target validation error at various batch sizes. The main "surprise" is the lack of surprises.

There is a rather long and thoughtful Twitter thread about the paper.

Twitter thread: https://twitter.com/RogerGrosse/status/1066392375570894849
ArXiV: https://arxiv.org/abs/1811.03600
Telegra.ph for instant view: https://telegra.ph/Roger-Grosses-thread-on-Measuring-the-Effects-of-Data-Parallelism-on-Neural-Network-Training-11-24
🎓 Free «Advanced Deep Learning and Reinforcement Learning» course.

#DeepMind researchers have released video recordings of lectures from «Advanced Deep Learning and Reinforcement Learning», a course on deep RL taught at #UCL earlier this year.

YouTube Playlist: https://www.youtube.com/playlist?list=PLqYmG7hTraZDNJre23vqCGIVpfZ_K2RZs

#course #video #RL #DL
QUESTION ANSWER ARCHITECTURES – SQUAD 2.0 + U-NET

Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.
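
To make the "span or unanswerable" idea concrete, here is a small sketch of how one question record can be resolved against its passage. The field names (`answers`, `text`, `answer_start`, `is_impossible`) follow the public SQuAD 2.0 JSON layout as far as I know; the passage, the questions, and the helper `extract_answer` are made up for illustration.

```python
def extract_answer(context, qa):
    """Return the answer span for one question, or None if it is unanswerable."""
    if qa.get("is_impossible", False):
        return None
    answer = qa["answers"][0]
    start = answer["answer_start"]       # character offset into the passage
    end = start + len(answer["text"])
    span = context[start:end]
    # The stored text should be exactly the slice of the passage.
    if span != answer["text"]:
        raise ValueError("answer_start does not match the answer text")
    return span


# Hypothetical passage and questions, not taken from the dataset.
context = "The Amazon rainforest covers much of the Amazon basin of South America."

answerable = {
    "question": "On which continent is the Amazon basin?",
    "answers": [
        {"text": "South America", "answer_start": context.index("South America")}
    ],
    "is_impossible": False,
}

unanswerable = {
    "question": "How many tree species grow in the Amazon rainforest?",
    "answers": [],
    "is_impossible": True,
}

span = extract_answer(context, answerable)
no_span = extract_answer(context, unanswerable)
```

A model for this task has to do both things: point at a start and end position in the passage, and decide when no span should be returned at all.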

The article provides an easy intro to NLP, covering very basic methods and definitions.


Link: https://betterlearningforlife.com/2018/11/16/question-answer-architectures-squad-2-0-u-net/

#NLP #QA #SQuAD #novice
Visual Model-Based Reinforcement Learning as a Path towards Generalist Robots

Model-based RL, from pixels, controlling a robot and generalizing to new objects (clothing, toys, etc.). All trained with unsupervised interaction!

Article: https://bair.berkeley.edu/blog/2018/11/30/visual-rl/

#RL #CV
Deep Counterfactual Regret Minimization

A modification of the tabular CFR algorithm, popular for games like poker, that uses deep learning function approximation.
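
For context, tabular CFR keeps a table of cumulative regrets per decision point and turns it into a strategy with regret matching. A minimal sketch of that one step is below; the function names and the toy action values are assumptions for illustration, and the network that Deep CFR puts in place of the table is not shown.

```python
def regret_matching(cumulative_regrets):
    """Turn cumulative regrets into action probabilities.

    Actions get probability proportional to their positive regret;
    if no action has positive regret, play uniformly at random.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


def update_regrets(cumulative_regrets, action_values, strategy):
    """Add the instantaneous regret of each action to the running table."""
    # Value of playing the current mixed strategy.
    expected = sum(p * v for p, v in zip(strategy, action_values))
    # Regret of an action = what it would have earned minus what we earned.
    return [r + (v - expected) for r, v in zip(cumulative_regrets, action_values)]


# Hypothetical single decision point with three actions.
regrets = [0.0, 0.0, 0.0]
strategy = regret_matching(regrets)              # uniform at the start
regrets = update_regrets(regrets, [1.0, 0.0, -1.0], strategy)
next_strategy = regret_matching(regrets)         # shifts toward the best action
```

In the tabular setting `regrets` is stored for every information set, which is what becomes infeasible in large games and motivates function approximation.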

ArXiV: https://arxiv.org/abs/1811.00164

#NIPS2018 #RL #GamesTheory
STACL: Simultaneous Translation with Integrated Anticipation and Controllable Latency

Baidu technology presented at #NIPS2018

Website: https://simultrans-demo.github.io
ArXiV: https://arxiv.org/abs/1810.08398

#NLP #translation #Baidu
🔥 AlphaFold: Using AI for scientific discovery.

#DeepMind has significantly improved protein folding prediction.

Protein folding is important because knowing a protein's structure makes it possible to predict its function along with the mechanism behind it.

Website: https://deepmind.com/blog/alphafold/
Guardian: https://www.theguardian.com/science/2018/dec/02/google-deepminds-ai-program-alphafold-predicts-3d-shapes-of-proteins

#bioinformatics #alphafold #genetics
Live demo of GAN paint brush

Now you can paint with textures on any image, drawing buildings, doors and complex objects by selecting the area where you want an object to appear. The #GAN takes care of merging the new part into the picture.

Link: http://gandissect.res.ibm.com/ganpaint.html
Dimensionality reduction for visualizing single-cell data using UMAP

UMAP is a t-SNE replacement for #visualization.

UMAP is being increasingly accepted as a powerful tool for visualizing single-cell datasets. This paper compares UMAP to #TSNE.

While UMAP is unquestionably better than default t-SNE in preserving global structure, it's worth mentioning that (very recently) it was shown that this limitation of t-SNE appears to be addressable with better parameters/initialization.

Article link: https://www.nature.com/articles/nbt.4314