Here's the link to the complete 100-page PDF for anyone who wants to learn statistics for Machine Learning roles for FREE.
https://topmate.io/codingdidi/1177807
https://www.instagram.com/reel/C_M7GNXyHFk/?utm_source=ig_web_copy_link&igsh=MzRlODBiNWFlZA==
topmate.io
Statistics for Machine Learning with Codingdidi
Complete statistics notes for data science
Many people reached out to me saying Telegram may get banned in their countries, so I've decided to create a WhatsApp channel.
Follow the CODING DIDI channel on WhatsApp:
https://whatsapp.com/channel/0029VaiVMpH2kNFyMWeMDV2Z
Don't worry, guys, your contact number will stay hidden!
ENJOY LEARNING
WhatsApp.com
CODING DIDI | WhatsApp Channel
CODING DIDI WhatsApp Channel. I will provide free resources, for learning machine learning, data analytics, data science and many more in the AI domain. 0 followers
Here's an opportunity for you!
*Webinar highlights:-*
- Data acquisition
- Data cleaning
- Data analysis
- Data visualization
- Dashboard creation
- Story creation
*Tools that will be covered in the webinar:-*
- Python
- MySQL
- Power BI
*Goal of the webinar:-*
- A complete data analyst project.
- A good decision maker.
- Practical real-world project.
*ADD ONS*:-
- Pandas notes, worth rupees 299/-
- Statistics notes
- Power BI guide
- Data analysis notes
- SQL notes
- Power BI 7-day live session access, early bird access.
*Highlights*:-
- 3-hour live session.
- Access to the recording for 2 months
*Here's how you can enroll!!*
- Pay 249/- INR
- Fill out the Google form.
*Date:-* 22nd Sept 2024
*Timing:-* 7 pm - 10 pm IST
FREE RESOURCES TO LEARN PYTHON
Free Udacity Course to learn Python
https://imp.i115008.net/5bK93j
Data Structure and OOPS in Python Free Courses
https://bit.ly/3t1WEBt
Free Certified Python course by Freecodecamp
https://www.freecodecamp.org/learn/scientific-computing-with-python/
Free Python Course from Google
https://developers.google.com/edu/python
Free Python Tutorials from Kaggle
https://www.kaggle.com/learn/python
Python hands-on Project
https://t.me/Programming_experts/23
Free Python Books Collection
https://cfm.ehu.es/ricardo/docs/python/Learning_Python.pdf
https://static.realpython.com/python-basics-sample-chapters.pdf
Websites to Practice Python
1. http://codingbat.com/python
2. https://www.hackerrank.com/
3. https://www.hackerearth.com/practice/
4. https://projecteuler.net/archives
5. http://www.codeabbey.com/index/task_list
6. http://www.pythonchallenge.com/
Beginner's guide to Python Free Book
https://t.me/pythondevelopersindia/144
Official Documentation
https://docs.python.org/3/
ENJOY LEARNING
Thinking of starting FREE Python live sessions on Zoom in Hindi.
What do you guys think?
Anonymous Poll
94%
Excited
6%
No not
Here's the link for the PDF of *PYTHON handwritten notes*:-
https://drive.google.com/file/d/1wBEz2Nt9s3pjIRdRIxZpUrwclX8Lt-hg/view?usp=drivesdk
Don't forget to thank me in the comments.
Alert
Many people reached out to me saying Telegram may get banned in their countries, so I've decided to create a WhatsApp channel.
Follow the CODING DIDI channel on WhatsApp:
https://whatsapp.com/channel/0029VaiVMpH2kNFyMWeMDV2Z
Don't worry, guys, your contact number will stay hidden!
ENJOY LEARNING
JPMorgan is hiring!
Position: Analyst/ Junior Analyst
Qualification: Bachelor's/ Master's Degree/ Undergraduate
Salary: 5 - 8 LPA (Expected)
Experience: Freshers/ Experienced
Location: Hyderabad; Bengaluru; Mumbai, India
Apply Now: https://jpmc.fa.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1001/job/210546045?keyword=Analyst&location=India&locationId=300000000289360&locationLevel=country&mode=location
https://jpmc.fa.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1001/job/210540888
https://jpmc.fa.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1001/requisitions/preview/210547435/?keyword=Analyst&location=India&locationId=300000000289360&locationLevel=country&mode=location
Like for more
All the best
JPMC Candidate Experience page
Analyst, Operations Support & Process Control Transport
Team Leader Operations Support & Process Control
https://www.linkedin.com/posts/akansha-yadav24_100-dbms-questions-activity-7239652499761086465-gJbH?utm_source=share&utm_medium=member_android
100 DBMS interview Questions
Linkedin
100 DBMs Questions | Akansha Yadav
100 DBMS interview Questions!!
Follow Akansha Yadav For more informational posts.
#dbms #sql #interview #Questions
10 commonly asked data science interview questions along with their answers
1. What is the difference between supervised and unsupervised learning?
Supervised learning learns from labeled data to predict outcomes, while unsupervised learning finds patterns in unlabeled data.
2. Explain the bias-variance tradeoff in machine learning.
The bias-variance tradeoff is a key concept in machine learning. High-bias models are too simple and underfit, while high-variance models are more complex and overfit the training data. The goal is to find the balance between the two that minimizes error on unseen data.
3. What is the Central Limit Theorem and why is it important in statistics?
The Central Limit Theorem (CLT) states that the sampling distribution of the sample mean is approximately normal regardless of the underlying population distribution, as long as the sample size is sufficiently large and the population has finite variance. It is important because it justifies normal-based inference, such as hypothesis tests and confidence intervals, even when the population itself is not normally distributed.
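A quick simulation can make the CLT answer concrete. The sketch below is only an illustration, not part of the original notes: it assumes an exponential population (skewed, with mean 1 and standard deviation 1), and the helper names `sample_means` and `draw_exponential` are made up for this example.

```python
import random
import statistics

def sample_means(draw, sample_size, n_samples, seed=0):
    """Draw n_samples samples of size sample_size and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

def draw_exponential(rng):
    # Assumed population: exponential with rate 1 (strongly right-skewed).
    return rng.expovariate(1.0)

means = sample_means(draw_exponential, sample_size=50, n_samples=2000)
center = statistics.fmean(means)   # should sit near the population mean, 1
spread = statistics.stdev(means)   # should sit near 1 / sqrt(50)

# For a roughly normal distribution, about 68% of values fall within 1 SD.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(f"mean of means={center:.3f}, sd of means={spread:.3f}, "
      f"share within 1 sd={within_one_sd:.2f}")
```

Even though single draws are skewed, the sample means cluster symmetrically around 1 with a spread close to 1/sqrt(50), which is what the theorem predicts.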
4. Describe the process of feature selection and why it is important in machine learning.
Feature selection is the process of selecting the most relevant features (variables) from a dataset. This is important because unnecessary features can lead to overfitting, slower training times, and reduced accuracy.
5. What is the difference between overfitting and underfitting in machine learning? How do you address them?
Overfitting occurs when a model is too complex and fits the training data too well, resulting in poor performance on unseen data. Underfitting occurs when a model is too simple and cannot fit the training data well enough, resulting in poor performance on both training and unseen data. Techniques to address overfitting include regularization, early stopping, and gathering more training data, while techniques to address underfitting include using more complex models or adding more informative features.
6. What is regularization and why is it used in machine learning?
Regularization is a technique used to prevent overfitting in machine learning. It adds a penalty term to the loss function to limit the complexity of the model, which shrinks the model's coefficients and reduces the influence of less useful features.
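To see the penalty term at work, here is a minimal sketch of L2 (ridge) regularization for a model with a single weight and no intercept. The data values and the function name `fit_ridge_1d` are assumptions made for this illustration; real projects would normally use a library implementation.

```python
def fit_ridge_1d(xs, ys, penalty):
    """Minimize sum((y - w*x)**2) + penalty * w**2 for a single weight w.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

# Hypothetical data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# A larger penalty pulls the weight toward zero (a simpler model).
weights = {p: fit_ridge_1d(xs, ys, p) for p in (0.0, 1.0, 10.0, 100.0)}
for penalty, weight in weights.items():
    print(f"penalty={penalty:>5}: w={weight:.3f}")
```

With no penalty the weight is the ordinary least-squares fit; as the penalty grows, the weight shrinks, trading a little training error for a less flexible model.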
7. How do you handle missing data in a dataset?
Missing data can be handled by deleting the affected samples, imputing the missing values, or using models that can handle missing data directly.
8. What is the difference between classification and regression in machine learning?
Classification is a type of supervised learning where the goal is to predict a categorical or discrete outcome, while regression is a type of supervised learning where the goal is to predict a continuous or numerical outcome.
9. Explain the concept of cross-validation and why it is used.
Cross-validation is a technique used to evaluate the performance of a machine learning model. It involves splitting the data into training and validation sets, and then training and evaluating the model on multiple such splits. Cross-validation gives a more reliable estimate of the model's generalization ability and helps detect overfitting.
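The splitting step can be sketched in plain Python. This is an illustrative k-fold routine under simplifying assumptions: folds are contiguous and unshuffled, and the "model" is just a baseline that predicts the training mean. The names `k_fold_indices` and `cross_val_mae` are made up for this example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def cross_val_mae(ys, k):
    """Mean absolute error per fold for a 'predict the training mean' baseline."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(ys), k):
        train_mean = sum(ys[i] for i in train_idx) / len(train_idx)
        mae = sum(abs(ys[i] - train_mean) for i in val_idx) / len(val_idx)
        scores.append(mae)
    return scores

# Hypothetical target values.
targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fold_scores = cross_val_mae(targets, k=3)
average_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, round(average_score, 4))
```

Every sample is used for validation exactly once, and the spread of the per-fold scores hints at how stable the model is. In practice the rows are usually shuffled before splitting.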
10. What evaluation metrics would you use to evaluate a binary classification model?
Some commonly used evaluation metrics for binary classification models are accuracy, precision, recall, F1 score, and ROC-AUC. The choice of metric depends on the specific requirements of the problem.
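As a rough sketch of how the threshold-based metrics are computed from the confusion-matrix counts (ROC-AUC is left out because it needs predicted scores rather than hard labels), assuming labels are encoded as 0 and 1. The function name and the sample labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision matters more when false positives are costly, recall when false negatives are costly, and F1 balances the two.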
Like if you need similar content
Hope this helps you
Company name: Intent Sourcer
Job role: Data Analyst Trainee
Job type: Internship, entry level
Job location: Nashik Division
Qualifications
Bachelor's degree or equivalent experience
Expertise with SPSS, Excel, and PowerPoint
Previous quantitative and qualitative research experience
Fresher: Less than 1 year
₹ 15K - ₹ 20K (Per Month)
expertia.ai
Research Analyst Job | Nashik Division | Fresher
Responsibility:Job DescriptionContact Discovery, D... | Nashik Division | Fresher | Analytical Skills, Research, Microsoft Excel, Microsoft PowerPoint | Full-Time.
https://www.linkedin.com/company/intent-sourcer/
Check out the LinkedIn profile link
Python project-based interview questions for a data analyst role, along with tips and sample answers [Part-1]
1. Data Cleaning and Preprocessing
- Question: Can you walk me through the data cleaning process you followed in a Python-based project?
- Answer: In my project, I used Pandas for data manipulation. First, I handled missing values by imputing them with the median for numerical columns and the most frequent value for categorical columns using fillna(). I also removed outliers by setting a threshold based on the interquartile range (IQR). Additionally, I standardized numerical columns using StandardScaler from Scikit-learn and performed one-hot encoding for categorical variables using Pandas' get_dummies() function.
- Tip: Mention specific functions you used, like dropna(), fillna(), apply(), or replace(), and explain your rationale for selecting each method.
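The cleaning steps in the answer above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the column names, the sample data, the `clean` helper and the common 1.5 × IQR cutoff are all assumptions, and scaling with StandardScaler is left out to keep the example dependent on pandas only.

```python
import pandas as pd

def clean(df, numeric_cols, categorical_cols):
    """Impute missing values, drop IQR outliers, one-hot encode categoricals."""
    df = df.copy()
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].median())        # median imputation
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])  # most frequent value
    for col in numeric_cols:
        q1, q3 = df[col].quantile(0.25), df[col].quantile(0.75)
        iqr = q3 - q1
        # Keep rows inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (assumed threshold).
        df = df[df[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]
    return pd.get_dummies(df, columns=categorical_cols)

# Hypothetical raw data with gaps and one obvious outlier (age 120).
raw = pd.DataFrame({
    "age": [25, 30, None, 28, 27, 120],
    "city": ["Pune", None, "Delhi", "Pune", "Delhi", "Pune"],
})
cleaned = clean(raw, numeric_cols=["age"], categorical_cols=["city"])
print(cleaned)
```

In an interview, it helps to explain why each choice was made, for example that the median is less sensitive to outliers than the mean.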
2. Exploratory Data Analysis (EDA)
- Question: How did you perform EDA in a Python project? What tools did you use?
- Answer: I used Pandas for data exploration, generating summary statistics with describe() and checking for correlations with corr(). For visualization, I used Matplotlib and Seaborn to create histograms, scatter plots, and box plots. For instance, I used sns.pairplot() to visually assess relationships between numerical features, which helped me detect potential multicollinearity. Additionally, I applied pivot tables to analyze key metrics by different categorical variables.
- Tip: Focus on how you used visualization tools like Matplotlib, Seaborn, or Plotly, and mention any specific insights you gained from EDA (e.g., data distributions, relationships, outliers).
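The tabular part of that EDA workflow might look like the sketch below. The `sales` DataFrame and its column names are hypothetical, and the plotting calls are omitted so the example only needs pandas.

```python
import pandas as pd

# Hypothetical dataset for illustration.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "East", "East"],
    "units": [10, 12, 7, 9, 15, 14],
    "revenue": [100.0, 125.0, 68.0, 95.0, 160.0, 138.0],
})

summary = sales.describe()                        # count, mean, std, quartiles
correlations = sales[["units", "revenue"]].corr() # pairwise Pearson correlation
by_region = sales.pivot_table(index="region", values="revenue", aggfunc="mean")

print(summary)
print(correlations)
print(by_region)
```

A strong correlation between two predictors, as between units and revenue here, is the kind of signal that would prompt a closer multicollinearity check.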
3. Pandas Operations
- Question: Can you explain a situation where you had to manipulate a large dataset in Python using Pandas?
- Answer: In a project, I worked with a dataset containing over a million rows. I optimized my operations by using vectorized column operations instead of Python loops or row-wise apply() calls, which run a Python function per row and are much slower on large data. For example, I computed derived columns with direct column arithmetic, used groupby() to aggregate data by multiple dimensions efficiently, and leveraged merge() to join datasets on common keys.
- Tip: Emphasize your understanding of efficient data manipulation with Pandas, mentioning functions like groupby(), merge(), concat(), or pivot().
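A small sketch of the operations named above: a vectorized column computation, a merge() on a common key, and a groupby() aggregation. The tables and column names are hypothetical and tiny; the same calls apply unchanged to much larger frames.

```python
import pandas as pd

# Hypothetical order lines and customer lookup table.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3, 2],
    "quantity": [2, 1, 5, 3, 4],
    "unit_price": [10.0, 20.0, 10.0, 15.0, 20.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "wholesale", "retail"],
})

# Vectorized: whole-column arithmetic, no Python-level loop over rows.
orders["total"] = orders["quantity"] * orders["unit_price"]

# Join on the shared key, then aggregate by segment.
enriched = orders.merge(customers, on="customer_id", how="left")
per_segment = enriched.groupby("segment", as_index=False)["total"].sum()
print(per_segment)
```

Using how="left" keeps every order even if a customer record is missing, which makes unmatched keys easy to spot as NaN values.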
4. Data Visualization
- Question: How do you create visualizations in Python to communicate insights from data?
- Answer: I primarily use Matplotlib and Seaborn for static plots and Plotly for interactive dashboards. For example, in one project, I used sns.heatmap() to visualize the correlation matrix and sns.barplot() for comparing categorical data. For time-series data, I used Matplotlib to create line plots that displayed trends over time. When presenting the results, I tailored visualizations to the audience, ensuring clarity and simplicity.
- Tip: Mention the specific plots you created and how you customized them (e.g., adding labels, titles, adjusting axis scales). Highlight the importance of clear communication through visualization.
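For the time-series case mentioned in the answer, a labeled Matplotlib line plot could be sketched like this. The monthly figures are made up for illustration, and the non-interactive "Agg" backend is an assumption so the script also runs on machines without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical monthly sales figures.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
monthly_sales = [120, 135, 128, 150, 165, 172]

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(months, monthly_sales, marker="o")
ax.set_title("Monthly sales (hypothetical data)")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.grid(True, alpha=0.3)
fig.tight_layout()

# Render to an in-memory PNG; use fig.savefig("plot.png") to write a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Titles, axis labels and a light grid are small customizations, but they are usually what makes a chart readable for a non-technical audience.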
Like this post if you want the next part of this interview series
Hope it helps :)