Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach editors contact: @haarrp
Brilliant thread on ROC curve usage

A ROC curve measures how well a classifier discriminates between two distributions. This is a nice thread to refresh your memory, or to finally understand how ROC AUC works.
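
As a quick refresher, the AUC can be read as the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one. Here is a minimal sketch of that reading in Python; the `roc_auc` helper and the sample scores are made up for illustration and are not taken from the thread:

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(random positive outscores random negative); ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, for illustration only.
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.3, 0.2]

auc = roc_auc(positives, negatives)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic, so it only suits small samples; for real data a library routine based on ranking is the more practical choice.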

Link: https://threadreaderapp.com/thread/1104134423673479169.html
An introduction to prediction research: http://www.cecilejanssens.org/wp-content/uploads/2018/01/PredictionManual2.0.pdf

#ROC #AUC
Lessons learned building natural language processing systems in health care

It's surprising, but #NLP in #healthcare doesn't work the way a researcher might expect, because of domain-specific semantics.
Lesson #1: Off-the-shelf NLP models don't work
Lesson #2: Build trainable NLP pipelines
Lesson #3: Start with labeling ground truth

Link: https://www.oreilly.com/ideas/lessons-learned-building-natural-language-processing-systems-in-health-care
Website using Deep Learning to colorize pictures.

Link: https://colourise.sg/#colorize

#DL #CV #demo
A deep learning framework for nucleus segmentation using image style transfer

One of the challenges in applying DL to tissue and cell analysis (which can be used for, but is not limited to, cancer diagnostics) is enlarging annotated training sets. This paper may help with that.

Link: https://www.biorxiv.org/content/10.1101/580605v1
#deeplearning #microscopy
Neural network that turns sketches into realistic photos.

The paper is called «Semantic Image Synthesis with Spatially-Adaptive Normalization».

#CVPR19 oral paper on a new conditional normalization layer for semantic image synthesis #SPADE and its demo app #GauGAN

arXiv: https://arxiv.org/abs/1903.07291
Website: https://nvlabs.github.io/SPADE/

#GAN #CV #DL
Important article in Nature about statistical significance

"Scientists rise up against statistical significance": a call to move away from routinely using and quoting statistical significance and toward confidence intervals.
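
To make the idea concrete, here is a rough sketch of reporting an interval instead of a significant / not significant verdict. It assumes a normal approximation with z = 1.96, which is crude for small samples; the `mean_ci` helper and the measurements are hypothetical and do not come from the article:

```python
import math
import statistics


def mean_ci(sample, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    mean = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * se, mean + z * se


# Hypothetical effect measurements, for illustration only.
effects = [1.0, 2.0, 3.0, 4.0, 5.0]

low, high = mean_ci(effects)
print(f"mean = {statistics.mean(effects):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The interval shows both the estimated effect and how uncertain it is, which a bare p-value threshold hides.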

Link: https://www.nature.com/articles/d41586-019-00857-9

#statistics #statsignificance #nature #science
Next-level learning approach: using MRI to peek into baby brains to improve CV

MRI scanning of 17 babies for 26 hours to see how face-recognizing brain regions mature. At just 4-6 months old, babies prefer to look at faces and socially relevant things. This suggests face recognition is learned via evolution, a data-hungry and sample-inefficient process.

Link: https://www.nature.com/articles/ncomms13995

#nature
One-shot object detection

A long, thorough post explaining how one-shot detectors work and how they are trained and evaluated.

Link: https://machinethink.net/blog/object-detection/

#cv #dl #objectdetection
Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

A deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). The #nn achieves an #AUC of 0.895 in predicting whether there is cancer in the breast when tested on the screening population.

Link: https://arxiv.org/abs/1903.08297

#cv #dl #cancer #objectdetection
Hey, a recent questionnaire showed that some people are willing to volunteer to help with the channel content. If you consider yourself a volunteer who can help spread the word about worthy content, please fill in this Google Form. We will contact participants about next steps.

Please note that data from questionnaires and feedback show that our editorial style and channel update frequency satisfy the majority of our dear channel audience, so the editors will have the final word on any submissions and suggestions.

Junior editors, we are looking for you!
Big article on how #uber's ML system Michelangelo works

Michelangelo enables internal teams to seamlessly build, deploy, and operate machine learning solutions at Uber's scale. It is designed to cover the end-to-end ML workflow: manage data, train, evaluate, and deploy models, make predictions, and monitor predictions. The system also supports traditional ML models, time series forecasting, and deep learning.


Link: https://eng.uber.com/michelangelo/

#ML #MLSystem #MLatwork #practical
🔥Quasi-Breaking: An Algorithm Inks a Record Deal With Warner Music

Endel uses machine learning to create personalized tracks meant to help people focus, relax and sleep better by inputting factors such as heart rate, time of day, location and weather.
Looking forward to an actual music-generating algorithm being signed to a label.

Link: https://hypebeast.com/2019/3/endel-algorithm-record-deal-warner-music

#MLHype #audiolearning #DL #Endel
Reducing the Need for Labeled Data in Generative Adversarial Networks

How a combination of self-supervision and semi-supervision can help GANs learn from partially labeled data.

Link: https://ai.googleblog.com/2019/03/reducing-need-for-labeled-data-in.html

#GAN #DL #Google #supervisedvsunsupervised
💫Prefect (Airflow alternative) has gone Open Source

Prefect is capable of:
* Handling data processing timelines
* Orchestrating the backend of a Cloud execution platform
* Parameterizing machine learning models
* Executing other ETL patterns

Docs: https://docs.prefect.io
Link: https://medium.com/the-prefect-blog/prefect-is-open-source-744e3c00cf35
GitHub: https://github.com/prefecthq/prefect

#ml_pipeline #mlflow
A comprehensive beginner's guide to create a Time Series Forecast (with Codes in Python)

An intermediate-level article on #TS forecasting in #Python.

Link: https://www.analyticsvidhya.com/blog/2016/02/time-series-forecasting-codes-python/