The 5 graph algorithms that you should know
Rahul Agarwal describes some of the most important graph algorithms you should know and how to implement them using Python.
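The post does not say which algorithms the article covers, so the following is only an illustrative sketch of one widely used graph algorithm, connected components found with breadth-first search. The function name `connected_components` and the toy `example_graph` are assumptions made for this example and are not taken from the article.

```python
from collections import deque


def connected_components(graph):
    """Return a list of sets, one per connected component.

    `graph` is an adjacency mapping {node: [neighbours]} for an
    undirected graph (each edge listed in both directions).
    """
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search from every node not visited yet.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical toy graph: two small clusters and one isolated node.
example_graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
    "f": [],
}
components = connected_components(example_graph)
```

Each node is visited once and each edge is inspected at most twice, so the run time grows linearly with the size of the graph.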
A simple neural network with Python and Keras
https://www.pyimagesearch.com/2016/09/26/a-simple-neural-network-with-python-and-keras
PyImageSearch
A simple neural network with Python and Keras - PyImageSearch
Learn how to create a simple neural network using the Keras neural network and deep learning library along with the Python programming language.
Detecting and treating outliers is necessary in any dataset, because outliers inevitably distort model estimates. It can make the difference between winning and losing a data science competition.
https://lnkd.in/fMV6GaY
This article walks through several ideas for detecting outliers in time series data, each one improving on the previous, and ends with a way of treating the outliers once they are found.
Ideas covered include:
Idea #1: Winsorization
Idea #2: Standard deviation, etc.
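The two ideas named above can be sketched roughly as follows. This is a minimal illustration and not the article's own code: the helper names, the 5%/95% percentile limits, the 2-sigma threshold, and the `demand` series are all assumptions chosen for the example.

```python
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] on a pre-sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def winsorize(values, lower_q=0.05, upper_q=0.95):
    """Idea #1: clip values to the chosen lower and upper percentiles."""
    ordered = sorted(values)
    low = percentile(ordered, lower_q)
    high = percentile(ordered, upper_q)
    return [min(max(v, low), high) for v in values]


def clip_by_std(values, n_sigma=2.0):
    """Idea #2: clip values lying more than n_sigma standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    low, high = mean - n_sigma * std, mean + n_sigma * std
    return [min(max(v, low), high) for v in values]


# Hypothetical demand series with one obvious spike.
demand = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
winsorized = winsorize(demand)
clipped = clip_by_std(demand)
```

Note that the spike itself inflates the mean and the standard deviation, so a wider threshold such as 3 sigma would leave it untouched in this short series.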
Medium
Forecasting: how to detect outliers?
(the article below is an extract from the book Data Science for Supply Chain Forecast, available here)
An article presenting a case study on "Customer Transaction Prediction using LightGBM".
https://medium.com/analytics-vidhya/https-medium-com-kushagrarajtiwari-customer-transaction-prediction-3191c6c634dc
It comprehensively covers:
1. General Business Significance of this problem
2. Exploratory Data Analysis
3. Feature Engineering
4. Why use LightGBM for this problem
A good read if you want to explore problems in the banking/financial domain.
Medium
Customer Transaction Prediction using LightGBM
Exploratory Data Analysis and modelling with imbalanced data.
Automating the end-to-end lifecycle of Machine Learning applications
#CD4ML #software_engineering #ML
Discoverable and Accessible Data
Reproducible Model Training
Model Serving (Embedded model, Model as service)
Testing and Quality in Machine Learning
Experiments Tracking
Model Deployment (Multiple models, Shadow models)
Model Monitoring and Observability
https://martinfowler.com/articles/cd4ml.html
martinfowler.com
Continuous Delivery for Machine Learning
How to apply Continuous Delivery to build Machine Learning applications
Microsoft open-sourced scripts and notebooks to pre-train and fine-tune the BERT natural language model on domain-specific texts
GitHub: https://github.com/microsoft/AzureML-BERT
Earlier: https://t.me/opendatascience/837
#Bert #Microsoft #NLP #dl
GitHub
GitHub - microsoft/AzureML-BERT: End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
A great collection of practical rules for routine DS engineering and research work.
Machine learning in a company is 10% data science and 90% other challenges; this PDF offers a great many principles and solutions for dealing with them.
We can only recommend saving this post to your Saved Messages by forwarding it to yourself.
Link: http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf
#cheatsheet #advice #practical #common #shouldbesaved
CS238: Decision Making under Uncertainty (AA 228)
Textbook: Decision Making Under Uncertainty: Theory and Application by Mykel J. Kochenderfer et al. (MIT Lincoln Laboratory Series)
See course materials
http://web.stanford.edu/class/aa228/
web.stanford.edu
AA228/CS238 | Decision Making under Uncertainty
Description: This course introduces decision making under uncertainty from a computational perspective and provides an overview of the necessary tools for building autonomous and decision-support systems. Following an introduction to probabilistic models and…
Estimating the success of re-identifications in incomplete datasets using generative models
99.98% of Americans would be correctly re-identified in any dataset using 15 demographic attributes, suggesting that even heavily sampled anonymized datasets are unlikely to satisfy the modern standards for anonymization set forth by GDPR.
This is a big privacy concern and a problem for data engineering, especially for those working with anonymized personal information. The paper provides a way to estimate how likely a person in an anonymized dataset is to be correctly re-identified; this can be useful for people who work for government or security companies.
https://www.reddit.com/r/science/comments/chko43/9998_of_americans_would_be_correctly_reidentified/
#privacy #gdpr #federatedlearning #ml
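To make the idea of re-identification from demographic attributes concrete, here is a minimal sketch that measures how many records in a toy table are unique on a chosen set of attributes. It is not the paper's generative-model method; the function name `uniqueness`, the attribute names, and the four records are all hypothetical.

```python
from collections import Counter


def uniqueness(records, attributes):
    """Fraction of records whose combination of `attributes` appears exactly once."""
    keys = [tuple(record[a] for a in attributes) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Hypothetical "anonymized" records: no names, only demographic attributes.
records = [
    {"zip": "10001", "birth_year": 1980, "gender": "F"},
    {"zip": "10001", "birth_year": 1980, "gender": "M"},
    {"zip": "10001", "birth_year": 1975, "gender": "F"},
    {"zip": "10002", "birth_year": 1980, "gender": "F"},
]

by_zip = uniqueness(records, ["zip"])
by_zip_and_year = uniqueness(records, ["zip", "birth_year"])
by_all = uniqueness(records, ["zip", "birth_year", "gender"])
```

In this toy table every added attribute makes more records unique, which is the effect behind the post's warning about datasets with many demographic columns.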
Reddit
From the science community on Reddit: 99.98% of Americans would be correctly re-identified in any dataset using 15 demographic…
Posted by FvDijk - 348 votes and 29 comments
Great article on text preprocessing, covering cleaning, #tokenization, #lemmatization and other aspects
Link: https://medium.com/@datamonsters/text-preprocessing-in-python-steps-tools-and-examples-bf025f872908
#NLP #NLU #datacleaning
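A minimal sketch of the cleaning and tokenization steps, using only the Python standard library. It is not taken from the article: the `preprocess` function, the tiny `STOP_WORDS` set, and the sample sentence are assumptions for illustration. Lemmatization is left out because it typically needs an NLP library such as NLTK or spaCy.

```python
import re
import string

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "are", "to", "in"}


def preprocess(text):
    """Lowercase, strip markup/URLs/punctuation, tokenize, drop stop words."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)        # remove HTML tags
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()                       # whitespace tokenization
    return [token for token in tokens if token not in STOP_WORDS]


sample = "<b>Cleaning</b> text is the first step: see https://example.com for details!"
tokens = preprocess(sample)
```

URLs are removed before punctuation so that a link is dropped as a whole instead of being broken into fragments.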
Medium
Text Preprocessing in Python: Steps, Tools, and Examples
by Olga Davydova, Data Monsters
New paper on training with pseudo-labels for semantic segmentation
Semi-Supervised Segmentation of Salt Bodies in Seismic Images:
SOTA (1st place) at TGS Salt Identification Challenge.
GitHub: https://github.com/ybabakhin/kaggle_salt_bes_phalanx
arXiv: https://arxiv.org/abs/1904.04445
#GCPR2019 #Segmentation #CV
Unified rational protein engineering with sequence-only deep representation learning
UniRep predicts amino-acid sequences that form stable bonds. In industry, that's vital for determining the production yields, reaction rates, and shelf life of protein-based products.
Link: https://www.biorxiv.org/content/10.1101/589333v1.full
#biolearning #rnn #Harvard #sequence #protein
Exploring Weight Agnostic Neural Networks
Exploration of agents that can already perform well in their environment without the need to learn weight parameters.
Link: https://ai.googleblog.com
Code: https://github.com/google/brain-tokyo-workshop/tree/master/WANNRelease
Deep learning cheatsheets, covering content of Stanford's CS 230 class.
CNN: https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-convolutional-neural-networks
RNN: https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
TipsAndTricks: https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-deep-learning-tips-and-tricks
#cheatsheet #Stanford #cnn #rnn #tipsntricks #dnn
stanford.edu
CS 230 - Convolutional Neural Networks Cheatsheet
Teaching page of Shervine Amidi, Adjunct Lecturer at Stanford University.
Data science jobs are going to be in increasingly massive demand. Prepare your resume!
Internet of Things companies will dominate the 2020s.
https://www.androidauthority.com/internet-of-things-companies-1023404/
"Each year, Internet of Things companies tell us 'this is the year of IoT,' and each year, we get an expensive fridge that no one buys. But IoT is coming. In fact, it is already here! There are currently 6.7 billion 'data collecting devices' in use today, with 20 billion projected for 2020 according to Amazon. Gartner predicts there will be 25 billion connected devices by 2021, and Accenture suggests that the global IoT market will be worth $14.2 trillion in 2030 (as reported by CRN)."
"Data science jobs are going to be in increasingly massive demand for these companies. Collecting all that data from users is only useful if businesses know what to do with it, and how to infer actionable advice from it. How do you turn billions of purchases across millions of users in multiple different countries into a better marketing campaign? That's where data science comes in."
"One of the biggest areas of concern for Internet of Things companies is data security."
Android Authority
Internet of Things companies will dominate the 2020s: Prepare your resume!
Internet of things companies will dominate the 2020s. Discover the businesses making waves and how to prepare your resume to land a job with them.
Our conceptual understanding of how best to represent words and sentences in a way that best captures underlying meanings and relationships is rapidly evolving. Among the latest milestones are BERT and ELMo.
This article covers the concepts you need to know to properly get your head around BERT.
http://jalammar.github.io/illustrated-bert/
#BERT #ELMo #NLP
jalammar.github.io
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
Discussions:
Hacker News (98 points, 19 comments), Reddit r/MachineLearning (164 points, 20 comments)
Translations: Chinese (Simplified), French 1, French 2, Japanese, Korean, Persian, Russian, Spanish
2021 Update: I created this brief and highly accessible…
Transformers and self-attention explained from scratch, with accompanying Python code.
#BERT #GPT_2 #Transformer
#python
http://www.peterbloem.nl/blog/transformers
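For a rough feel of what self-attention computes, here is a minimal pure-Python sketch of its simplest form, with no learned weights: each input vector acts as query, key, and value at once. This is an illustration and not the blog's own code; the function names and the two-vector input are assumptions, and the division by the square root of the dimension is the commonly used scaling.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Basic self-attention over a sequence `x` of equal-length vectors.

    No learned parameters: queries, keys and values are all the inputs.
    """
    dim = len(x[0])
    output = []
    for query in x:
        # Scaled dot product of this query with every key.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in x
        ]
        weights = softmax(scores)  # attention weights sum to 1
        # Each output vector is a weighted average of all value vectors.
        output.append(
            [sum(w * value[d] for w, value in zip(weights, x)) for d in range(dim)]
        )
    return output


# Hypothetical two-token sequence with two-dimensional embeddings.
attended = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

A full transformer layer adds learned query, key, and value projections plus multiple heads on top of this core operation.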