How would a rockstar improve their machine learning models?
To get better at playing the guitar, you play the guitar more. You try different songs, different chords. Practice, practice, practice.
All the practice adds up to more experience, more examples of different notes.
And to try something totally different, you might merge two songs together. Or even take a song written originally for the piano but play it on your guitar.
After a while, you're ready to play a show. But the show won't sound any good if all the speakers are set to different settings. Steve the sound guy takes care of this.
How does this relate to #machinelearning?
1. More practice = more data
More examples of playing different notes = more data. Machine learning models love more data.
2. Combining different songs = feature engineering
If the #data you have isn't in the form you want, transforming it into a different shape may give you a better way of looking at it.
3. Tuning the speakers = hyperparameter tuning
There's a reason tuning the speakers is the last step in playing a rock show. Working speakers don't mean anything without all the practice (collecting data) and songwriting (feature engineering). If you've done 1 and 2 right, this is the easy part.
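If you want to see what step 3 looks like in code, here is a minimal scikit-learn sketch; the dataset and parameter grid are arbitrary examples, not anything from the post:

```python
# Minimal hyperparameter-tuning sketch; the dataset and grid are arbitrary examples.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# "Tuning the speakers": try a few settings and keep the combination that scores best.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```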
How do you detect outliers in #data?
Do you use a blanket rule of anything outside 3 standard deviations?
Or do you use a more robust method?
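For comparison, here is a quick sketch of both rules in pandas; the column and values are made up for illustration:

```python
import pandas as pd

# Toy column with one obvious outlier (made up for illustration).
df = pd.DataFrame({"value": [10, 12, 11, 13, 12, 11, 14, 120]})

# Blanket rule: flag anything more than 3 standard deviations from the mean.
z = (df["value"] - df["value"].mean()) / df["value"].std()
outliers_z = df[z.abs() > 3]

# More robust rule: flag anything beyond 1.5 * IQR outside the quartiles.
q1, q3 = df["value"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers_iqr = df[(df["value"] < q1 - 1.5 * iqr) | (df["value"] > q3 + 1.5 * iqr)]

print(outliers_z)
print(outliers_iqr)
```

In this toy sample the 3-standard-deviation rule actually misses the obvious outlier, because the extreme value inflates the standard deviation, while the IQR rule catches it. That is one argument for the more robust method.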
If you have a resource you learned from or one you created, I'd love to reference it in my article on exploratory data analysis.
If you want to read it, there's a link in the comments. #EDA is one of the areas I've learned the most about over the past year.
I remember things best if I write about them. So that's what I did.
PS There are more pretty pictures like this one in there too.
#datascience
Been working through the Google Cloud Certified Professional Data Engineer track on Linux Academy the past few days.
Why?
Because it's one thing to build a #datascience or #machinelearning pipeline in a Jupyter Notebook but it's another thing to have something deployed in production.
Cloud services like #GoogleCloud provide a framework for ingesting, storing, analysing and visualising #data.
My exam is booked for a couple of weeks from now.
The quizzes they have at the end of each module are incredibly helpful.
When I pass the exam, I'll do up a post with some of my favourite resources.
In the meantime, you can check out The Data Dossier book (pictured) here: https://lnkd.in/gmZMcGk
And if you're interested in the full Google Cloud Professional Data Engineer course, it's here: https://lnkd.in/gfBwXRF
Most data scientists use Python to solve data science problems, writing code either as scripts or as notebooks.
People used to write everything in scripts, where you have to re-run the entire file again and again, which is time-consuming.
Now, most of us use Jupyter notebooks. They save you from re-executing the whole program; instead, you can run individual chunks of code.
To get the most out of them, though, it helps to know the shortcuts; otherwise you can waste a lot of time. Knowing the tips and shortcuts for Jupyter notebooks makes you more productive at work.
Here is a resource for learning them.
28 Jupyter Notebook tips, tricks, and shortcuts: https://lnkd.in/f6VczRV
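As a taste of what the linked article covers, a few built-in IPython magics are worth knowing on their own; the cell below is just an illustrative notebook cell, not taken from the article:

```python
# A few built-in IPython/Jupyter conveniences (run inside a notebook cell, not plain Python).
%timeit sum(range(1000))   # quick micro-benchmark of a single statement
%who                       # list the variables currently defined in the session
%lsmagic                   # show every available magic command

# Appending ? to an object opens its documentation in the pager, e.g.:
len?
```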
#datascience #python #machinelearning #artificialintelligence #data #deeplearning #jupyternotebook
The Best #FREE Books for Learning #DataScience
Link => bit.ly/AIFreeBooks
#ai #analytics #artificialintelligence #bi #bigdata #data #machinelearning
Introducing TensorFlow Datasets
By TensorFlow: https://lnkd.in/d2yEjSr
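For anyone who hasn't tried it yet, loading a dataset is about this short (a minimal sketch, assuming the tensorflow and tensorflow-datasets packages are installed):

```python
# Minimal sketch; assumes the tensorflow and tensorflow-datasets packages are installed.
import tensorflow_datasets as tfds

# Downloads MNIST on first use and returns a ready-to-train tf.data.Dataset.
ds_train = tfds.load("mnist", split="train", as_supervised=True, shuffle_files=True)

for image, label in ds_train.take(1):
    print(image.shape, label.numpy())  # (28, 28, 1) and an integer class label
```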
#MachineLearning #Data #Dataset #TensorFlow
"AI Needs Better Data, Not Just More Data"
> https://lnkd.in/gR5E7Re
#AI #ArtificialIntelligence #MI #MachineIntelligence
#ML #MachineLearning #DataScience #Analytics
#Data #BigData #IoT #4IR #DataPedigree
#Veracity #Trust #DataQuality #BetterData
One of my favorite tricks is adding a constant to each of the independent variables in a regression so as to shift the intercept. Of course just shifting the data will not change R-squared, slopes, F-scores, P-values, etc., so why do it?
Because just about any software package capable of doing regression, even Excel, can give you standard errors and confidence intervals for the Intercept, but it is much harder to get most packages to give you standard errors and confidence intervals around the predicted value of the dependent variable for OTHER combinations of the independent variables. Shifting the intercept is an easy way to get confidence intervals for arbitrary combinations of the independent variables.
This sort of thing becomes especially important at a time when the Statistics community is loudly calling for a move away from P-values. Instead it is recommended that researchers give confidence intervals in clinically meaningful terms.
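A minimal sketch of the trick with statsmodels, using made-up variable names and an arbitrary point of interest: shift each predictor so the combination you care about becomes the new origin, and the intercept of the shifted model is then the prediction there, with its standard error and confidence interval reported for free.

```python
# Sketch of the intercept-shifting trick with statsmodels; names and values are made up.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2 + 1.5 * df["x1"] - 0.5 * df["x2"] + rng.normal(size=100)

# The combination of predictor values we want a confidence interval for.
point = {"x1": 1.0, "x2": -2.0}

# Shift each predictor so this combination becomes the new origin; the intercept of the
# shifted model is then the predicted y at that point, with its own SE and CI.
shifted = sm.add_constant(df[["x1", "x2"]] - pd.Series(point))
fit = sm.OLS(df["y"], shifted).fit()

print(fit.params["const"])          # predicted value at the chosen point
print(fit.conf_int().loc["const"])  # its confidence interval
```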
#data #researchers #statistics #r #excel #regression
Mathematics for Machine Learning
Credits - Garrett Thomas
#data #datascience #machinelearning #deeplearning #dataanalytics #dataanalysis
Top 10 FREE Deep Learning Courses via T. Scott Clendaniel
Link => http://bit.ly/10FreeDL
#ai #education #success #training #bigdata #data #datascience #artificialintelligence
Got an unbalanced dataset to analyse and confused about how to use the data strategically to get unbiased results?
Here's an approach to handling unbalanced data:
https://lnkd.in/dZHXigP
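One common first step (not necessarily what the linked article recommends) is to tell the model about the imbalance via class weights and to judge it with per-class metrics rather than plain accuracy; a minimal scikit-learn sketch:

```python
# Illustrative sketch: class weights plus per-class metrics for an unbalanced target.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic 95/5 class split standing in for a real unbalanced dataset.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Weight errors on the rare class more heavily instead of training on raw counts.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)

# Judge the result with per-class precision/recall, not overall accuracy.
print(classification_report(y_test, clf.predict(X_test)))
```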
#datascience #unbalanced #data #analyse
Very interesting paper on machine learning algorithms. It compares polynomial regression with neural networks on several well-known datasets (including MNIST). The results are worth a look.
Other datasets tested: (1) census data of engineers' salaries in Silicon Valley; (2) million song data; (3) concrete strength data; (4) letter recognition data; (5) New York city taxi data; (6) forest cover type data; (7) Harvard/MIT MOOC course completion data; (8) amateur athletic competitions; (9) NCI cancer genomics; (10) MNIST image classification; and (11) United States 2016 Presidential Election.
I haven't reproduced the paper myself but I am very tempted in doing it.
Link here: https://lnkd.in/fd-VNtk
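I haven't seen the paper's code, but the comparison itself is easy to prototype; a rough scikit-learn sketch on a small built-in dataset (nothing below comes from the paper):

```python
# Rough sketch of the comparison on a small built-in dataset; not the paper's code or setup.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = load_diabetes(return_X_y=True)

poly = make_pipeline(StandardScaler(), PolynomialFeatures(degree=2), LinearRegression())
net = make_pipeline(StandardScaler(), MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000))

print("polynomial regression R^2:", cross_val_score(poly, X, y, cv=5).mean())
print("neural network R^2:       ", cross_val_score(net, X, y, cv=5).mean())
```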
#machinelearning #petroleumengineering #artificialintelligence #data #algorithms #neuralnetworks #predictiveanalytics
K-means Clustering - In-depth Tutorial with Example
Credits - Data Flair
Link - https://lnkd.in/eCQUriR
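Before diving into the tutorial, this is roughly all a basic K-means run takes in scikit-learn (toy blob data, purely illustrative):

```python
# Basic K-means run on toy blob data (illustrative only).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print(kmeans.cluster_centers_)  # coordinates of the three learned centroids
print(labels[:10])              # cluster assignments of the first ten points
```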
#machinelearning #supervisedlearning #unsupervisedlearning #datascience #data #ai #technology #deeplearning #artificialintelligence
Sampling is a deceptively complex subject, and some academic statisticians have devoted the bulk of their careers to it.
It's not a subject that thrills everyone but is a very important one, and one which seems underappreciated in marketing research and #data science.
Here are some books on or related to sampling I've found helpful:
- Survey Sampling (Kish)
- Sampling Techniques (Cochran)
- Model Assisted Survey Sampling (Särndal et al.)
- Sampling: Design and Analysis (Lohr)
- Practical Tools for Designing and Weighting Survey Samples (Valliant et al.)
- Survey Weights: A Step-by-step Guide to Calculation (Valliant and Dever)
- Complex Surveys (Lumley)
- Hard-to-Survey Populations (Tourangeau et al.)
- Small Area Estimation (Rao and Molina)
The first three are regarded as classics (though still relevant). Sharon Lohr's book is the friendliest introduction I know of on this subject. Standard marketing research textbooks also give simple overviews of sampling but do not go into depth.
There are also academic journals that feature articles on sampling, such as the Public Opinion Quarterly (AAPOR) and the Journal of Survey #Statistics and Methodology (AAPOR and ASA).
Empowering you to use machine learning to get valuable insights from data.
- Implement basic ML algorithms and deep neural networks with PyTorch (a minimal sketch follows the GitHub link below).
- Run everything in the browser without any setup, using Google Colab.
- Learn object-oriented ML to code for products, not just tutorials.
Github Link - https://lnkd.in/f8nu8UR
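For anyone who has never touched PyTorch, the core training loop is shorter than it sounds; a minimal sketch (purely illustrative, not taken from the linked repo):

```python
# Minimal PyTorch training-loop sketch; purely illustrative, not taken from the linked repo.
import torch
from torch import nn

# Tiny synthetic regression problem: y = 3x + noise.
X = torch.randn(256, 1)
y = 3 * X + 0.1 * torch.randn(256, 1)

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

print(model.weight.item())  # should land close to 3 after training
```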
#datascience #data #dataanalysis #ml #machinelearning #deeplearning #ai #artificialintelligence
Commonly used Machine Learning Algorithms
Here is a list of commonly used machine learning algorithms. The code is provided in both #R and #Python, and these algorithms can be applied to almost any data problem (a quick #Python sketch is included below):
- Linear Regression
- Logistic Regression
- Decision Tree
- SVM
- Naive Bayes
- kNN
- K-Means
- Random Forest
- Dimensionality Reduction Algorithms
- Gradient Boosting algorithms:
  - GBM
  - XGBoost
  - LightGBM
  - CatBoost
Credit: Analytics Vidhya, Sunil Ray
Thanks for the share, Steve Nouri.
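For a taste of how little code most of these take in #Python, here is a minimal scikit-learn sketch (not taken from the attached PDF; the dataset is just an example):

```python
# Quick sketch (not from the attached PDF): two of the listed algorithms in scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LogisticRegression(max_iter=5000), RandomForestClassifier(random_state=0)):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))
```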
#datascience #deeplearning #ai #artificialintelligence #machinelearning #data #r #python
If you've read data job descriptions lately, you are probably confused. Are you a data scientist, machine learning engineer, or research scientist? Instead of title matching, try asking yourself these questions:
1. Can you use statistics to answer questions about a situation that is new to you? Meaning, is your comfort with stats solid enough that you can bring it to bear appropriately depending on scenario?
2. Can you explain why a particular model performs well in a scenario, rather than just noting it does well? Meaning, do you understand the inner workings of models to tune and make sense of why they do what they do?
3. If someone mentions time and space complexity to you, does it make sense? In a big data world, thinking carefully about the computational load of a particular algorithm is extremely important. This matters particularly for MLE and science positions.
4. Can you build something new? Maybe there isn't a perfect algorithm for what you want. Maybe the package in R doesn't exist. Can you make it happen if you need to?
5. Do you know what it means to put something into production? Do you have examples of how you've succeeded or failed with this?
These questions are not all-encompassing, but they point to some of the key skill sets you'll need.
#datascience #analytics #data
Python Machine Learning Tutorial
- Python Machine Learning - Tasks and Applications (https://lnkd.in/fZcs-xE)
- Python Machine Learning Environment Setup - Installation Process (https://lnkd.in/fJHwbjr)
- Data Preprocessing, Analysis & Visualization (https://lnkd.in/fVz58kJ)
- Train and Test Set (https://lnkd.in/fq_GXjn) - see the sketch after this list
- Machine Learning Techniques with Python (https://lnkd.in/fjdsQzd)
- Top Applications of Machine Learning (https://lnkd.in/f-CNyK2)
- Machine Learning Algorithms in Python - You Must Learn (https://lnkd.in/fTxCA23)
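The "Train and Test Set" entry in particular boils down to a single call in scikit-learn; a minimal illustration, not taken from the linked tutorial:

```python
# Minimal train/test split illustration; not taken from the linked tutorial.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for evaluation; fix the seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
```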
#python #machinelearning #datascience #data #dataanalysis #artificialintelligence #ai #visualization #algorithms
It is a good feeling when a popular Python package adds a new feature based on your article :-)
#Yellowbrick is a great little #ML #visualization library in the Python universe, which extends the Scikit-Learn API to allow human steering of the model selection process, and adds statistical plotting capability for common diagnostics tests on ML.
Based on my article "How do you check the quality of your regression model in Python?", they are adding a new feature to the library - a Cook's distance stem plot (outlier detection) for regression models.
#python #datascience #machinelearning #data #model
https://www.scikit-yb.org/en/latest/
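In recent Yellowbrick releases this ships as CooksDistance in yellowbrick.regressor; a minimal usage sketch following Yellowbrick's usual fit/show pattern (check your installed version, since the exact import path is an assumption here):

```python
# Sketch of the Cook's distance visualizer; assumes a Yellowbrick release that includes it.
# The import path follows Yellowbrick's usual pattern but may differ in older versions.
from sklearn.datasets import load_diabetes
from yellowbrick.regressor import CooksDistance

X, y = load_diabetes(return_X_y=True)

visualizer = CooksDistance()
visualizer.fit(X, y)   # computes Cook's distance per instance and draws the stem plot
visualizer.show()
```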
Welcome to @ai_machinelearning_big_data, the world of:
* #Artificial #Intelligence,
* #Deep #Learning,
* #Machine #Learning,
* #Data #Science
* #Python Programming language
* and more advanced research links, and more of what you wanted.
Join us and learn the hot topics of Computer Science together.
@ai_machinelearning_big_data