Must-Read Articles for Data Science Enthusiasts
1) Every Intro to Data Science Course on the Internet, Ranked.
(https://lnkd.in/fQDMiNX)
2) What would be useful for aspiring data scientists to know?
(https://lnkd.in/fmcFyN7)
3) 8 Essential Tips for People starting a Career in Data Science.
(https://lnkd.in/f5vUg6i)
4) Cheat sheet: How to become a data scientist.
(https://lnkd.in/fMEhi4D)
5) The Art of Learning Data Science.
(https://lnkd.in/fruY2AC)
6) The Periodic Table of Data Science.
(https://lnkd.in/fxReDab)
7) Aspiring Data Scientists! Start to learn Statistics with these 6 books!
(https://lnkd.in/fXSE-us)
8) 8 Skills You Need to Be a Data Scientist.
(https://lnkd.in/f8S3Ygd)
9) Top 10 Essential Books for the Data Enthusiast
(https://lnkd.in/fKugicE)
10) Aspiring data scientist? Master these fundamentals.
(https://lnkd.in/fTGDkju)
11) How to Become a Data Scientist - On your own.
(https://lnkd.in/f_Zhpzf)
#datascience #neverstoplearning
✴️ @AI_Python_EN
Fashion++: Minimal Edits for Outfit Improvement https://arxiv.org/pdf/1904.09261.pdf
✴️ @AI_Python_EN
Swift + TensorFlow
Create a simple NN and CNN.
Notebook by Zaid Alyafeai: https://lnkd.in/e5zWxZ5
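The notebook itself is in Swift, but the same "simple NN" exercise can be sketched in plain Python/NumPy. The toy two-layer network below, trained on XOR, is purely an illustration of the idea and is not taken from Zaid Alyafeai's notebook:

```python
import numpy as np

rng = np.random.default_rng(42)

# XOR inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 sigmoid units, one sigmoid output
W1 = rng.normal(scale=1.0, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=1.0, size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass (mean squared error, full batch)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
```

A CNN adds convolutional layers in front of this kind of dense head; the Swift notebook walks through both.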
#ArtificialIntelligence #DeepLearning #NeuralNetworks
✴️ @AI_Python_EN
Many people working in data analysis believe that there's something special about Python (or R, or Scala). They will tell you that you have to use one of those because otherwise, you will not get the best result.
That is of course not true. The choice of language should be based on two factors:
1) How well your final product will integrate with the existing ecosystem.
2) The availability of production-grade data analysis libraries.
Currently, almost any popular language has one or more powerful libraries for data analysis. Java is an excellent example: thanks to the multitude of JVM languages, much of the hottest development is happening there right now.
C++ historically offers a huge choice of implemented algorithms. Even proprietary ecosystems such as .NET today contain implementations of most state-of-the-art algorithms and machine learning paradigms.
So, if the person you consider hiring to work on your data analysis project tells you that only #Python is the way to go, I would be skeptical and look for someone who embraces diversity.
✴️ @AI_Python_EN
Kaggle offers an ultra-short micro-course that gives you a fast way to try Python and start using it for data visualization.
This micro-course won’t teach you computer science, and it skips most parts of the Python programming language. But you’ll learn enough to impress colleagues or potential employers with nicer graphics than anyone makes in Excel.
https://www.kaggle.com/learn/data-visualization-from-non-coder-to-coder
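As a taste of the kind of chart the course works toward, here is a minimal matplotlib line plot. The data and file name are invented for illustration; the course itself teaches seaborn, which builds on matplotlib:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Made-up monthly figures, purely for demonstration
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 150, 170]

fig, ax = plt.subplots()
ax.plot(months, sales, marker="o")
ax.set_title("Monthly sales")
ax.set_ylabel("Units")
fig.savefig("sales.png")
```

A few lines like these already give cleaner output than most spreadsheet defaults.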
✴️ @AI_Python_EN
https://github.com/pytorch/pytorch/releases
Official TensorBoard Support, Attributes, Dicts, Lists and User-defined types in JIT / TorchScript, Improved Distributed
✴️ @AI_Python_EN
Current State of Deep Learning from Francois Chollet (Keras creator)
While its future prospects are very strong, deep learning is still a data-fitting model and requires lots and lots of data. In contrast, the human brain can work with far less data because it uses an abstract model of the world.
Video link here
#deeplearning
✴️ @AI_Python_EN
George Box's observation that "Essentially, all models are wrong, but some are useful" is one of the most quoted of all statistical proverbs.
However, we need to ask "Useful for what?"
There are many #machinelearning #algorithms and statistical models that are difficult to interpret but useful in predictive analytics. Sometimes all we need are predictions and classifications that are sufficiently accurate for decision purposes.
Often in the business world there is little or no theory to guide statisticians and data scientists. Moreover, we may not have the data necessary for a good understanding of why some customers are heavier purchasers of our product than others, for instance.
That said, predictions and classifications that are "accurate enough" often aren't good enough. We may need a reasonable - if imperfect - understanding of the Why for these predictions and classifications to be useful.
This is obvious in "hard" scientific research but just as true in the behavioral and social sciences, marketing included.
Being able to design primary studies and having a good grasp of causal analysis, IMO, is necessary to be a full stack analytics professional, as opposed to being a full stack programmer or IT professional. These are different occupations.
"Causation: The Why Beneath The What," an interview with Tyler VanderWeele, a Harvard epidemiologist and authority on causal analysis, might be of interest:
http://www.greenbookblog.org/2017/07/17/causation-the-why-beneath-the-what/
✴️ @AI_Python_EN
Transfer Learning is a boon to #DeepLearning when you don't have much data of your own.
It lets you build on models pretrained on large datasets that have already tackled similar problems in #computervision or #nlp.
A higher starting accuracy, a steeper learning curve, and a higher asymptote are the key signs that transfer learning is improving your model.
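The "higher start, higher slope, higher asymptote" pattern can be sketched with toy learning curves. The exponential functional form and every number below are invented purely for illustration:

```python
import numpy as np

epochs = np.arange(1, 51)

def curve(start, slope, asymptote, t):
    # Toy saturating learning curve: rises from `start` toward `asymptote`
    return asymptote - (asymptote - start) * np.exp(-slope * t)

# Training from scratch vs. fine-tuning a pretrained model (made-up numbers)
scratch  = curve(start=0.10, slope=0.05, asymptote=0.80, t=epochs)
transfer = curve(start=0.40, slope=0.10, asymptote=0.90, t=epochs)
```

Plotting the two arrays makes the three advantages visible at a glance: `transfer` begins higher, climbs faster early on, and levels off higher.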
#performance #machinelearning #transferlearning #model
✴️ @AI_Python_EN
In the context of analytics, the terms longitudinal and time-series refer to data covering more than one time-period (i.e., not cross-sectional data).
The terms are often seen as interchangeable but time-series tends to be used more often when there are many periods, e.g., four years of weekly sales data. On the other hand, some see longitudinal as a generic term that includes high-frequency data.
Whatever you call it, there's a lot more of it now than ever and more ways to analyze it than ever, including neural networks architectures such as LSTM.
Here are some books I've found helpful which cover this topic from a statistical angle:
- Longitudinal Analysis (Hoffman)
- Longitudinal Structural Equation Modeling (Newsom)
- Growth Modeling (Grimm et al.)
- Age-Period-Cohort Analysis (Yang and Land)
- Analysis of Longitudinal Data (Diggle et al.)
- Applied Longitudinal Data Analysis for Epidemiology (Twisk)
- Modeling Dynamic Relations (Pauwels)
- Time Series Analysis and Its Applications (Shumway and Stoffer)
- Time Series Analysis (Wei)
(Some titles are abbreviated.)
There are many more, including those focused on fields such as meteorology, environmental studies and financial econometrics, but these are good places to start if you'd like to learn more about this topic.
Richly Parameterized Linear Models: Additive, Time Series, and Spatial Models Using Random Effects (Hodges) is a very interesting, if (for me) challenging, book that might be of interest to some of you -
https://www.crcpress.com/Richly-Parameterized-Linear-Models-Additive-Time-Series-and-Spatial/Hodges/p/book/9781439866832
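As a minimal hands-on illustration of time-series dependence (a toy example of my own, unrelated to any of the books above), here is a first-order autoregression simulated and then re-estimated by least squares with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series: y_t = 0.7 * y_{t-1} + noise
phi_true = 0.7
n = 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.normal(scale=0.5)

# Least-squares estimate of phi from the lagged pairs (y_{t-1}, y_t)
phi_hat = (y[:-1] @ y[1:]) / (y[:-1] @ y[:-1])
```

With 2000 observations the estimate lands close to the true 0.7; this serial dependence is exactly what distinguishes longitudinal and time-series data from cross-sectional data.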
✴️ @AI_Python_EN
Cornell University - Machine Learning for Intelligent Systems (CS4780/ CS5780)
I highly recommend Cornell University's "Machine Learning for Intelligent Systems (CS4780/CS5780)" course, taught by Associate Professor Kilian Q. Weinberger.
Youtube Video Lectures:
https://www.youtube.com/playlist?list=PLl8OlHZGYOQ7bkVbuRthEsaLr7bONzbXS
Course Lecture Notes:
http://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/
#artificialintelligence #machinelearning #deeplearning #AI #algorithms #computerscience #datascience
✴️ @AI_Python_EN
This is lecture 3 in the series on the Wasserstein #GAN. In this lecture, a basic understanding of the Wasserstein Generative Adversarial Network (WGAN) is developed.
videos
✴️ @AI_Python_EN
Ironically, Yuval Noah Harari's equation B × C × D = HH (where B = biological knowledge, C = computing power, D = data, and HH = human hacking) comes just days after the first report of synthesizing speech directly from #brain activity.
Fei-Fei Li to YNH: "Okay, can I be specific? First of all the birth of AI is AI scientists talking to biologists, specifically neuroscientists, right. The birth of AI is very much inspired by what the brain does. Fast forward to 60 years later, today's AI is making great improvements in healthcare. There's a lot of data from our physiology and pathology being collected and using machine learning to help us. But I feel like you're talking about something else."
https://www.wired.com/story/will-artificial-intelligence-enhance-hack-humanity/
✴️ @AI_Python_EN
What are histograms?
• Histograms are collected counts of data organized into a set of predefined bins.
• By "data" we are not restricting ourselves to intensity values (as in the previous tutorial). The data collected can be whatever feature you find useful to describe your image.
• Let's see an example. Imagine that a matrix contains the information of an image (i.e., intensities in the range 0-255). What if we want to count this data in an organized way? Since the range covers 256 values, we can segment it into subparts (called bins) like:
[0, 255] = [0, 15] ∪ [16, 31] ∪ … ∪ [240, 255]
range = bin_1 ∪ bin_2 ∪ … ∪ bin_16
and we can keep count of the number of pixels that fall in the range of each bin_i. Applying this to the example above, we get a histogram plot in which the x axis represents the bins and the y axis the number of pixels in each of them.
This was just a simple example of how a histogram works and why it is useful. A histogram can keep count not only of color intensities, but of whatever image features we want to measure (e.g., gradients, directions, etc.).
• Let's identify some parts of the histogram:
1. dims: the number of parameters you want to collect data on. In our example, dims = 1 because we are only counting the intensity values of each pixel (in a greyscale image).
2. bins: the number of subdivisions in each dim. In our example, bins = 16.
3. range: the limits for the values to be measured. In this case, range = [0, 255].
• What if you want to count two features? In that case the resulting histogram would be a 3D plot, in which x and y would be bin_x and bin_y for each feature, and z the number of counts for each combination (bin_x, bin_y). The same applies for more features (of course it gets trickier).
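The 16-bin example above can be reproduced in a few lines. Here NumPy's `np.histogram` stands in for OpenCV's `calcHist`, and the tiny 4×4 "image" is made up for illustration:

```python
import numpy as np

# Hypothetical 4x4 greyscale image with intensities in [0, 255]
img = np.array([[  0,  15,  16,  31],
                [240, 255, 128, 129],
                [  7, 200,  64,  90],
                [ 33,  47, 160, 175]], dtype=np.uint8)

# 16 bins of width 16 covering [0, 256): [0,15], [16,31], ..., [240,255]
counts, edges = np.histogram(img, bins=16, range=(0, 256))
```

Each entry of `counts` is the number of pixels whose intensity falls in the corresponding bin, exactly the bin_1 ∪ … ∪ bin_16 partition described above.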
The OpenCV Tutorials, Release 2.4.13.0
✴️ @AI_Python_EN
Activation Atlases: a new technique for visualizing what interactions between neurons can represent
By Google and OpenAI.
Blog: https://blog.openai.com/introducing-activation-atlases/
Paper: https://distill.pub/2019/activation-atlas
Code: https://github.com/tensorflow/lucid/…
Demo: https://distill.pub/2019/activation-atlas/app.html
#artificialintelligence #deeplearning #machinelearning #neuralnetworks
✴️ @AI_Python_EN
Free courses
🔸 Machine Learning (University of Washington)
🔸 Machine Learning (University of Wisconsin-Madison)
🔸 Algorithms (in journalism) (Columbia University)
🔸 Practical Deep Learning (Yandex Data School)
🔸 Big Data in 30 Hours (Krakow Technical University)
🔸 Deep Reinforcement Learning Bootcamp (UC Berkeley & others)
🔸 Introduction to Artificial Intelligence (University of Washington)
🔸 Brains, Minds and Machines Summer Course (MIT)
🔸 Design and Analysis of Algorithms (MIT)
🔸 Natural Language Processing (University of Washington)
🌎 link
#MachineLearning #DataScience #Course #DeepLearning #BigData #AI
✴️ @AI_Python_EN
Real numbers, data science and chaos: How to fit any dataset with a single parameter
Paper:
https://arxiv.org/abs/1904.12320
Code:
https://github.com/Ranlot/single-parameter-fit/
#artificialintelligence #datascience #dataset #machinelearning
✴️ @AI_Python_EN
Artificial Intelligence and Games by Georgios N. Yannakakis and Julian Togelius
🌎 Book
#artificialintelligence
✴️ @AI_Python_EN
What will be the #programming_language for machine learning in the next few years?
Right now it's #Python, but I would wish for a better language because:
1. Python is too slow; it's slower than JS! To work around this, we either use #C++ in the backend or use other technologies to make Python faster, such as Cython or PyPy.
2. It doesn't support smooth, true multiprocessing.
So we need a new programming language that has:
1. A nice syntax that is easy to learn.
2. Enough speed (at least not slower than #JS) without complex tools.
3. A numerical computing and machine learning ecosystem.
4. (Optional) Integrated linear algebra operations, such as adding two vectors.
I feel it could be either Julia or Swift.
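The speed complaint is easy to demonstrate: summing a million integers in a pure-Python loop versus NumPy's vectorized sum, which delegates to compiled C code and is exactly the kind of workaround described above.

```python
import time
import numpy as np

data = list(range(1_000_000))
arr = np.arange(1_000_000)

# Pure-Python loop: every iteration goes through the interpreter
t0 = time.perf_counter()
total_py = 0
for x in data:
    total_py += x
py_time = time.perf_counter() - t0

# Vectorized NumPy sum: one call into compiled C code
t0 = time.perf_counter()
total_np = int(arr.sum())
np_time = time.perf_counter() - t0
```

Both compute the same total, but the NumPy call is typically faster by an order of magnitude or more, which is why Python for ML is, in practice, glue code around compiled libraries.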
#machinelearning
✴️ @AI_Python_EN