⚠️ The most cliché things that some data scientists actually say and do
On deep learning:
✔️ Say: «Deep learning is overkill in most cases.»
❌ Do: Use deep learning for everything.
On Excel:
✔️ Say: «Excel is such a useless and outdated tool.»
❌ Do: Use Excel every day.
On statistics:
✔️ Say: «My biggest strength is math and statistics.»
❌ Do: Google the definition of standard deviation.
On correlation:
✔️ Say: «Correlation is not causation.»
❌ Do: Base feature selection entirely on correlation.
On big data:
✔️ Say: «We should use a big data store for this.»
❌ Do: Put everything in an SQL database.
On careers:
✔️ Say: «I truly believe in the mission of this company. We’re going to change the world.»
❌ Do: Change jobs every year or whenever someone offers slightly more money.
✔️ Say: «I’m a senior data scientist and machine learning expert.»
❌ Do: Still haven’t shipped a model to production.
On science:
✔️ Say: «We use the scientific method. Every hypothesis needs to be tested.»
❌ Do: Deploy a model straight to production because it converged on the training set.
On academic papers:
✔️ Say: «I read a lot of papers.»
❌ Do: Read the abstract of a DeepMind paper once.
On p-values:
✔️ Say: «The p-value is very often misunderstood.»
❌ Do: Offer a flawed explanation of p-values.
❇️ @AI_Python
🗣 @AI_Python_arXiv
✴️ @AI_Python_EN
New documentation about how differentiable programming works in Swift:
• Differentiable Functions and Differentiation APIs: https://github.com/tensorflow/swift/blob/master/docs/DifferentiableFunctions.md
• Differentiable Types: https://github.com/tensorflow/swift/blob/master/docs/DifferentiableTypes.md
With language integration, autodiff is just a compiler implementation detail.
✴️ @AI_Python_EN
Machine Learning with Python, Jupyter, KSQL and TensorFlow https://ift.tt/2FbgQq6 #python #tensorflow #jupyter #ksql https://ift.tt/2TFCqNC
✴️ @AI_Python_EN
Discover Computer Vision Datasets with this search engine
#dataset #image #visual #search #engine #vision
https://www.visualdata.io
✴️ @AI_Python_EN
MIT Deep Learning Basics — Introduction and Overview with TensorFlow #robotics #game #games
bit.ly/2E4xnx6
✴️ @AI_Python_EN
Deep Learning Project Building with Python and Keras ☞ http://bit.ly/2HADriH #DeepLearning #ai
✴️ @AI_Python_EN
A Step-by-Step Guide to Machine Learning Problem Framing: Diving into Machine Learning (ML) without knowing what you’re trying to achieve is a recipe for disaster. #MachineLearning #DeepLearning #DataScience
https://medium.com/thelaunchpad/a-step-by-step-guide-to-machine-learning-problem-framing-6fc17126b981
✴️ @AI_Python_EN
Rust: Programming Language Cheat Sheet.
#BigData #Analytics #DataScience #AI #MachineLearning #IoT #IIoT #PyTorch #Python #RStats #Rust #TensorFlow #Java #JavaScript #ReactJS #GoLang #CloudComputing #Serverless #Linux #Programming
http://bit.ly/2HEVJzl
✴️ @AI_Python_EN
It was interesting to work on classifying duplicate questions on Quora. I’ve just uploaded the code.
Approach 1: Siamese network with Manhattan distance as the objective function.
Code: https://lnkd.in/fhzV_HU
Approach 2: XGBoost + TF-IDF + NLP feature engineering.
Code: https://lnkd.in/f3zqm37
Competition: https://lnkd.in/fYHtwJq
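The core of Approach 1 can be shown in a few lines. This is a minimal pure-Python sketch (not the repo’s code) of the MaLSTM-style similarity: the twin networks each encode a question, and the objective is exp of the negative Manhattan (L1) distance between the two encodings, so identical questions score 1.0.

```python
import math

def manhattan_similarity(a, b):
    """MaLSTM-style objective: exp(-L1 distance) between the two
    encodings produced by the twin networks.  Identical encodings
    give 1.0; the score decays toward 0 as they diverge."""
    return math.exp(-sum(abs(x - y) for x, y in zip(a, b)))

# Toy sentence encodings standing in for the twin-network outputs.
q1 = [0.2, 0.7, 0.1]
q2 = [0.2, 0.7, 0.1]   # duplicate question -> identical encoding
q3 = [0.9, 0.0, 0.5]

print(manhattan_similarity(q1, q2))            # 1.0
print(round(manhattan_similarity(q1, q3), 3))  # 0.165
```

In the actual network the encodings come from shared-weight LSTMs over the two questions; the similarity above is what the loss is computed against.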
✴️ @AI_Python_EN
Deep Learning Drizzle
"Read enough so you start developing intuitions and then trust your intuitions and go for it!" - Geoffrey Hinton
By Marimuthu K.: https://lnkd.in/e6BBDVJ
#artificialintelligence #deeplearning #machinelearning
✴️ @AI_Python_EN
Curated list of awesome deep learning tutorials, projects, and communities.
Github Link - https://lnkd.in/fJdpFMn
#deeplearning #machinelearning #datascience
✴️ @AI_Python_EN
This guide gives a complete understanding of various #machinelearning algorithms, along with R & Python code to run them. These #algorithms can be applied to any data problem:
Linear Regression,
Logistic Regression,
Decision Tree,
SVM,
Naive Bayes,
kNN,
K-Means,
Random Forest.
If you are keen to master machine learning, start right away.
Link : bit.ly/2CpWIjH
#machinelearning #deeplearning #python #coding #linkedin #decisiontrees #logisticregression #linearregression #forest #analytics #randomization #computervision
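Several of the listed algorithms fit in a screenful of code. As a hedged illustration (my own sketch, not taken from the linked guide), here is kNN from scratch in pure Python: majority vote among the k nearest training points by Euclidean distance.

```python
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points
    (squared Euclidean distance, no external libraries needed)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    top_k = [label for _, label in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Two toy clusters.
X = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.0)]
y = ["a", "a", "b", "b"]
print(knn_predict(X, y, (1.1, 0.9)))  # a
print(knn_predict(X, y, (4.1, 4.1)))  # b
```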
✴️ @AI_Python_EN
Forwarded from DLeX: AI Python (Meysam Asgari)
Have you heard of SuperTML?
SuperTML: Two-Dimensional Word Embedding and Transfer Learning Using ImageNet Pretrained CNN Models for the Classifications on Tabular Data
Tabular data is the most commonly used form of data in industry. Gradient Boosting Trees, Support Vector Machines, Random Forests, and Logistic Regression are typically used for classification tasks on tabular data.
DNN models using categorical embeddings have also been applied to this task, but all attempts thus far have used one-dimensional embeddings. The recent Super Characters method, which uses two-dimensional word embeddings, achieved state-of-the-art results in text classification, showcasing the promise of this new approach.
The SuperTML method borrows the Super Characters idea, using two-dimensional embeddings to address classification on tabular data. It has achieved state-of-the-art results on both large and small datasets.
Here’s the paper: https://lnkd.in/djGFf63
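To make the tabular-row-to-image idea concrete, here is an illustrative sketch only, not the paper’s code: SuperTML draws each feature’s value as text onto a 2D image and feeds it to an ImageNet-pretrained CNN; this simplified stand-in just fills one horizontal band per feature with its min-max-scaled value, showing how a row becomes a CNN-ready array.

```python
import numpy as np

def supertml_row_to_image(row, size=32):
    """Toy stand-in for SuperTML's row-to-image step: each tabular
    feature gets its own horizontal band of a 2D array, filled with
    the feature's min-max-scaled value.  (The real method renders the
    values as text glyphs instead.)"""
    row = np.asarray(row, dtype=float)
    scaled = (row - row.min()) / (row.max() - row.min() + 1e-9)
    img = np.zeros((size, size))
    band = size // len(row)
    for i, v in enumerate(scaled):
        img[i * band:(i + 1) * band, :] = v
    return img

img = supertml_row_to_image([5.1, 3.5, 1.4, 0.2])  # an Iris-style row
print(img.shape)  # (32, 32)
```

The resulting array can then be stacked to three channels and passed to any pretrained CNN, which is where the transfer learning happens.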
❇️ @AI_Python
🗣 @AI_Python_arXiv
✴️ @AI_Python_EN
Data-Driven Careers Deciphered
Different from what I believed, but still worth discussing.
#business #technology #datascience
❇️ @AI_Python
🗣 @AI_Python_arXiv
✴️ @AI_Python_EN
Yann LeCun Tweeted:
I gave two talks in Harvard's Mind, Brain, & Behavior Distinguished Lecture Series last week.
The slides are here:
- 2019-03-13 "The Power and Limits of Deep Learning":...
https://t.co/Pp4flTGlZ8
✴️ @AI_Python_EN
Our Computer Vision and Deep Learning Group:
https://t.me/joinchat/ECtp7VVFvEwjIrMrdgI-2w
🗣 @AI_Python_Arxiv
✴️ @AI_Python_EN
❇️ @AI_Python
Many misunderstandings persist regarding logistic regression (LR).
Though it can be used for classification, it is not a classification method. Its predictions are model-based estimates of the probabilities of group/class membership.
Cutoffs can be drawn anywhere that are meaningful to decision makers, not just at probability = .50. Automatically using the .50 cutoff - the default for most LR programs - is a mistake, especially when group/class sizes are highly imbalanced.
Relationships between the predictors (independent variables) and outcome (dependent variable) do not have to be "linear" - any sort of relationship, including curvilinear and moderated relationships (interactions), can be modeled.
Binary LR is just one member of the GLM family. There are also LR models for multinomial, ordinal and count data, as well as probit analysis.
More advanced extensions include multilevel and mixture models, and SEM with multiple categorical outcomes. There are also Bayesian alternatives to maximum likelihood estimation.
BIG topic, with very practical implications for marketing research, data science and many other fields. I'm still learning about it.
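The cutoff point is easy to demonstrate. A minimal pure-Python sketch with hypothetical fitted coefficients (not from any particular dataset): the model outputs a probability, and where you cut it is a separate, decision-maker's choice.

```python
import math

def predict_proba(x, w, b):
    """Model-based probability of class membership from fitted
    logistic-regression coefficients."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def classify(p, cutoff):
    """The cutoff is a decision rule layered on top of the model,
    not part of the model itself."""
    return int(p >= cutoff)

w, b = 1.2, -2.0                # hypothetical fitted coefficients
p = predict_proba(1.0, w, b)    # ~0.31
print(classify(p, cutoff=0.5))  # 0: the default .50 cutoff misses the case
print(classify(p, cutoff=0.2))  # 1: a lower cutoff flags the rare class
```

With highly imbalanced classes, sweeping the cutoff against costs or a precision-recall trade-off is usually more defensible than accepting the .50 default.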
🗣 @AI_Python_Arxiv
✴️ @AI_Python_EN
❇️ @AI_Python
A Brief History of Data Science (Pre-2010, i.e. prior to rise of deep learning & popular usage of the term "data science")
Note: The original infographic has been modified to add three seminal developments in the history of Artificial Intelligence:
- 1943: Artificial neuron model (McCulloch & Pitts)
- 1950: Turing Test (Alan Turing)
- 1956: Dartmouth Conference (McCarthy, Minsky, Shannon)
#datascience #statistics #analytics #machinelearning #bigdata #artificialintelligence #innovation #technology #history #ai #datamining #informatics #infographics #informationtechnology #computerscience #dataanalysis #deeplearning #neuroscience #mathematics #science
🗣 @AI_Python_Arxiv
✴️ @AI_Python_EN
❇️ @AI_Python
Comprehensive Collection of #DataScience and #MachineLearning Resources for #DataScientists includes “Great Articles on Natural Language Processing” +much more 👉https://bit.ly/2nvMXIx #abdsc #BigData #AI #DeepLearning #Databases #Coding #Python #Rstats #NeuralNetworks #NLProc
✴️ @AI_Python_EN