Andriy Burkov
I often receive questions from people in my network about what they should learn and master to become a data scientist. While I personally think that the term "data scientist" is very unfortunate and lacks a clear definition, this is what a good modern #dataanalyst has to master:
#DataScience
– Data structures (local and distributed)
– Data indexing
– Data privacy and anonymization
– Data lifecycle management
– Data transformation (deduplication, handling outliers and missing values, dimensionality reduction)
– Data analysis (experiment design, classification, regression, unsupervised methods)
– #Machinelearning methods (feature engineering, regularization, hyperparameter tuning, ensemble methods, and #neuralnetworks)
– Computer and database programming, numerical optimization
– Distributed data processing
– Real-time and high-frequency data processing
– Linux (my personal bias)
A modern data analyst also has to be a good popularizer of complex ideas. Having a Ph.D. is not a requirement, but it is a very big plus: it builds the skill of popularizing ideas and teaches the scientific approach to problem-solving.
✴️ @AI_Python_EN