What does this query do
SELECT order_id, amount FROM orders ORDER BY amount DESC LIMIT 5;
SELECT order_id, amount FROM orders ORDER BY amount DESC LIMIT 5;
Anonymous Quiz
6%
A. Returns 5 random orders
20%
B. Returns 5 smallest orders
10%
C. Returns all orders sorted by amount
64%
D. Returns top 5 highest value orders
❤8
SQL vs NoSQL Databases: Quick Comparison ✅
SQL Databases
- Structured data
- Fixed schema
- Table-based storage
- Strong consistency
- Popular tools: MySQL, PostgreSQL, SQL Server, Oracle
- Best use cases: Banking systems, ERP and CRM, transaction-heavy apps, reporting and analytics
- Job roles: Data Analyst, Backend Developer, Database Engineer, BI Developer
- Hiring reality: Mandatory in enterprises, core skill for analytics roles, used in almost every company
- India salary range: Fresher (4-7 LPA), Mid-level (8-18 LPA)
- Real tasks: Write complex queries, join multiple tables, build reports, ensure data integrity
NoSQL Databases
- Semi-structured or unstructured data
- Flexible schema
- Document, key-value, or graph based
- High scalability
- Popular tools: MongoDB, Cassandra, DynamoDB, Redis
- Best use cases: Real-time apps, big data systems, IoT platforms, rapidly changing products
- Job roles: Backend Developer, Data Engineer, Cloud Engineer, Platform Engineer
- Hiring reality: Strong demand in startups, common in cloud-native systems, often paired with SQL
- India salary range: Fresher (5-8 LPA), Mid-level (10-22 LPA)
- Real tasks: Store JSON documents, handle large traffic, design scalable schemas, optimize read and write speed
Quick Comparison
- Schema: SQL (fixed), NoSQL (flexible)
- Scaling: SQL (vertical), NoSQL (horizontal)
- Consistency: SQL (strong), NoSQL (eventual)
- Queries: SQL (powerful), NoSQL (simpler)
Role-based Choice
- Data Analyst: SQL required
- Backend Developer: Both useful
- Data Engineer: SQL + NoSQL
- Startup products: NoSQL preferred
Best Career Move
- Learn SQL first
- Add NoSQL for modern systems
- Use both in real projects
Which one do you prefer?
SQL ❤️
NoSQL 👍
Both 🙏
None 😮
SQL Databases
- Structured data
- Fixed schema
- Table-based storage
- Strong consistency
- Popular tools: MySQL, PostgreSQL, SQL Server, Oracle
- Best use cases: Banking systems, ERP and CRM, transaction-heavy apps, reporting and analytics
- Job roles: Data Analyst, Backend Developer, Database Engineer, BI Developer
- Hiring reality: Mandatory in enterprises, core skill for analytics roles, used in almost every company
- India salary range: Fresher (4-7 LPA), Mid-level (8-18 LPA)
- Real tasks: Write complex queries, join multiple tables, build reports, ensure data integrity
NoSQL Databases
- Semi-structured or unstructured data
- Flexible schema
- Document, key-value, or graph based
- High scalability
- Popular tools: MongoDB, Cassandra, DynamoDB, Redis
- Best use cases: Real-time apps, big data systems, IoT platforms, rapidly changing products
- Job roles: Backend Developer, Data Engineer, Cloud Engineer, Platform Engineer
- Hiring reality: Strong demand in startups, common in cloud-native systems, often paired with SQL
- India salary range: Fresher (5-8 LPA), Mid-level (10-22 LPA)
- Real tasks: Store JSON documents, handle large traffic, design scalable schemas, optimize read and write speed
Quick Comparison
- Schema: SQL (fixed), NoSQL (flexible)
- Scaling: SQL (vertical), NoSQL (horizontal)
- Consistency: SQL (strong), NoSQL (eventual)
- Queries: SQL (powerful), NoSQL (simpler)
Role-based Choice
- Data Analyst: SQL required
- Backend Developer: Both useful
- Data Engineer: SQL + NoSQL
- Startup products: NoSQL preferred
Best Career Move
- Learn SQL first
- Add NoSQL for modern systems
- Use both in real projects
Which one do you prefer?
SQL ❤️
NoSQL 👍
Both 🙏
None 😮
❤26🔥1
✅ End to End Data Analytics Project Roadmap
Step 1. Define the business problem
Start with a clear question.
Example: Why did sales drop last quarter?
Decide success metric.
Example: Revenue, growth rate.
Step 2. Understand the data
Identify data sources.
Example: Sales table, customers table.
Check rows, columns, data types.
Spot missing values.
Step 3. Clean the data
Remove duplicates.
Handle missing values.
Fix data types.
Standardize text.
Tools: Excel or Power Query SQL for large datasets.
Step 4. Explore the data
Basic summaries.
Trends over time.
Top and bottom performers.
Examples: Monthly sales trend, top 10 products, region-wise revenue.
Step 5. Analyze and find insights
Compare periods.
Segment data.
Identify drivers.
Examples: Sales drop in one region, high churn in one customer segment.
Step 6. Create visuals and dashboard
KPIs on top.
Trends in middle.
Breakdown charts below.
Tools: Power BI or Tableau.
Step 7. Interpret results
What changed?
Why it changed?
Business impact.
Step 8. Give recommendations
Actionable steps.
Example: Increase ads in high margin regions.
Step 9. Validate and iterate
Cross-check numbers.
Ask stakeholder questions.
Step 10. Present clearly
One-page summary.
Simple language.
Focus on impact.
Sample project ideas
• Sales performance analysis.
• Customer churn analysis.
• Marketing campaign analysis.
• HR attrition dashboard.
Mini task
• Choose one project idea.
• Write the business question.
• List 3 metrics you will track.
Example: For Sales Performance Analysis
Business Question: Why did sales drop last quarter?
Metrics:
1. Revenue growth rate
2. Sales target achievement (%)
3. Customer acquisition cost (CAC)
Double Tap ♥️ For More
Step 1. Define the business problem
Start with a clear question.
Example: Why did sales drop last quarter?
Decide success metric.
Example: Revenue, growth rate.
Step 2. Understand the data
Identify data sources.
Example: Sales table, customers table.
Check rows, columns, data types.
Spot missing values.
Step 3. Clean the data
Remove duplicates.
Handle missing values.
Fix data types.
Standardize text.
Tools: Excel or Power Query SQL for large datasets.
Step 4. Explore the data
Basic summaries.
Trends over time.
Top and bottom performers.
Examples: Monthly sales trend, top 10 products, region-wise revenue.
Step 5. Analyze and find insights
Compare periods.
Segment data.
Identify drivers.
Examples: Sales drop in one region, high churn in one customer segment.
Step 6. Create visuals and dashboard
KPIs on top.
Trends in middle.
Breakdown charts below.
Tools: Power BI or Tableau.
Step 7. Interpret results
What changed?
Why it changed?
Business impact.
Step 8. Give recommendations
Actionable steps.
Example: Increase ads in high margin regions.
Step 9. Validate and iterate
Cross-check numbers.
Ask stakeholder questions.
Step 10. Present clearly
One-page summary.
Simple language.
Focus on impact.
Sample project ideas
• Sales performance analysis.
• Customer churn analysis.
• Marketing campaign analysis.
• HR attrition dashboard.
Mini task
• Choose one project idea.
• Write the business question.
• List 3 metrics you will track.
Example: For Sales Performance Analysis
Business Question: Why did sales drop last quarter?
Metrics:
1. Revenue growth rate
2. Sales target achievement (%)
3. Customer acquisition cost (CAC)
Double Tap ♥️ For More
❤27👏2
Data Analyst Interview Preparation Roadmap ✅
Technical skills to revise
- SQL
Write queries from scratch.
Practice joins, group by, subqueries.
Handle duplicates and NULLs.
Window functions basics.
- Excel
Pivot tables without help.
XLOOKUP and IF confidently.
Data cleaning steps.
- Power BI or Tableau
Explain data model.
Write basic DAX.
Explain one dashboard end to end.
- Statistics
Mean vs median.
Standard deviation meaning.
Correlation vs causation.
- Python. If required
Pandas basics.
Groupby and filtering.
Interview question types
- SQL questions
Top N per group.
Running totals.
Duplicate records.
Date based queries.
- Business case questions
Why did sales drop.
Which metric matters most and why.
- Dashboard questions
Explain one KPI.
How users will use this report.
- Project questions
Data source.
Cleaning logic.
Key insight.
Business action.
Resume preparation
- Must have Tools section.
- One strong project.
- Metrics driven points.
Example: Improved reporting time by 30 percent using Power BI.
Mock interviews
- Practice explaining out loud.
- Time your answers.
- Use real datasets.
Daily prep plan
1 SQL problem.
1 dashboard review.
10 interview questions.
- Common mistakes
Memorizing queries.
No project explanation.
Weak business reasoning.
- Final task
- Prepare one project story.
- Prepare one SQL solution on paper.
- Prepare one business metric explanation.
Double Tap ♥️ For More
Technical skills to revise
- SQL
Write queries from scratch.
Practice joins, group by, subqueries.
Handle duplicates and NULLs.
Window functions basics.
- Excel
Pivot tables without help.
XLOOKUP and IF confidently.
Data cleaning steps.
- Power BI or Tableau
Explain data model.
Write basic DAX.
Explain one dashboard end to end.
- Statistics
Mean vs median.
Standard deviation meaning.
Correlation vs causation.
- Python. If required
Pandas basics.
Groupby and filtering.
Interview question types
- SQL questions
Top N per group.
Running totals.
Duplicate records.
Date based queries.
- Business case questions
Why did sales drop.
Which metric matters most and why.
- Dashboard questions
Explain one KPI.
How users will use this report.
- Project questions
Data source.
Cleaning logic.
Key insight.
Business action.
Resume preparation
- Must have Tools section.
- One strong project.
- Metrics driven points.
Example: Improved reporting time by 30 percent using Power BI.
Mock interviews
- Practice explaining out loud.
- Time your answers.
- Use real datasets.
Daily prep plan
1 SQL problem.
1 dashboard review.
10 interview questions.
- Common mistakes
Memorizing queries.
No project explanation.
Weak business reasoning.
- Final task
- Prepare one project story.
- Prepare one SQL solution on paper.
- Prepare one business metric explanation.
Double Tap ♥️ For More
❤27
What is the main purpose of WHERE in SQL
Anonymous Quiz
12%
A. Sort rows
32%
B. Filter groups
51%
C. Filter individual rows
5%
D. Limit rows
❤6
When should you use HAVING instead of WHERE
Anonymous Quiz
9%
A. When filtering text values
15%
B. When filtering before SELECT
69%
C. When filtering aggregated results
7%
D. When filtering columns
❤7
What will this query return
SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id HAVING SUM(amount) > 10000;
SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id HAVING SUM(amount) > 10000;
Anonymous Quiz
4%
C. All customers and their totals
78%
B. Customers with total spend above 10,000
16%
A. Orders above 10,000
3%
D. Orders grouped by amount
❤6
In which order does SQL process these clauses
Anonymous Quiz
47%
A. SELECT → WHERE → GROUP BY → HAVING
13%
B. WHERE → FROM → GROUP BY → HAVING
35%
C. FROM → WHERE → GROUP BY → HAVING
6%
D. FROM → GROUP BY → WHERE → HAVING
❤14
Top 100 Data Analyst Interview Questions
✅ Data Analytics Basics
1. What is data analytics?
2. Difference between data analytics and data science?
3. What problems does a data analyst solve?
4. What are the types of data analytics?
5. What tools do data analysts use daily?
6. What is a KPI?
7. What is a metric vs KPI?
8. What is descriptive analytics?
9. What is diagnostic analytics?
10. What does a typical day of a data analyst look like?
Data and Databases
11. What is structured data?
12. What is semi-structured data?
13. What is unstructured data?
14. What is a database?
15. Difference between OLTP and OLAP?
16. What is a primary key?
17. What is a foreign key?
18. What is a fact table?
19. What is a dimension table?
20. What is a data warehouse?
SQL for Data Analysts
21. What is SELECT used for?
22. Difference between WHERE and HAVING?
23. What is GROUP BY?
24. What are aggregate functions?
25. Difference between INNER and LEFT JOIN?
26. What are subqueries?
27. What is a CTE?
28. How do you handle duplicates in SQL?
29. How do you handle NULL values?
30. What are window functions?
Excel for Data Analysis
31. What are pivot tables?
32. Difference between VLOOKUP and XLOOKUP?
33. What is conditional formatting?
34. What are COUNTIFS and SUMIFS?
35. What is data validation?
36. How do you remove duplicates in Excel?
37. What is IF formula used for?
38. Difference between relative and absolute reference?
39. How do you clean data in Excel?
40. What are common Excel mistakes analysts make?
Data Cleaning and Preparation
41. What is data cleaning?
42. How do you handle missing data?
43. How do you treat outliers?
44. What is data normalization?
45. What is data standardization?
46. How do you check data quality?
47. What is duplicate data?
48. How do you validate source data?
49. What is data transformation?
50. Why is data preparation important?
Statistics for Data Analysts
51. Difference between mean and median?
52. What is standard deviation?
53. What is variance?
54. What is correlation?
55. Difference between correlation and causation?
56. What is an outlier?
57. What is sampling?
58. What is distribution?
59. What is skewness?
60. When do you use median over mean?
Data Visualization
61. Why is data visualization important?
62. Difference between bar and line chart?
63. When do you use a pie chart?
64. What is a dashboard?
65. What makes a good dashboard?
66. What is a KPI card?
67. Common visualization mistakes?
68. How do you choose the right chart?
69. What is drill down?
70. What is data storytelling?
Power BI or Tableau
71. What is Power BI or Tableau used for?
72. What is a data model?
73. What is a relationship?
74. What is DAX?
75. Difference between measure and calculated column?
76. What is Power Query?
77. What are filters and slicers?
78. What is row level security?
79. What is refresh schedule?
80. How do you optimize reports?
Business and Case Questions
81. How do you analyze a sales drop?
82. How do you define success metrics?
83. What business metrics have you worked on?
84. How do you prioritize insights?
85. How do you validate insights?
86. What questions do you ask stakeholders?
87. How do you handle vague requirements?
88. How do you measure business impact?
89. How do you explain numbers to managers?
90. How do you recommend actions?
Projects and Real World
91. Explain your best project.
92. What data sources did you use?
93. How did you clean the data?
94. What insight had the most impact?
95. What challenge did you face?
96. How did you solve it?
97. How did stakeholders use your dashboard?
98. What would you improve in your project?
99. How do you handle tight deadlines?
100. Why should we hire you as a data analyst?
Double Tap ♥️ For Detailed Answers
✅ Data Analytics Basics
1. What is data analytics?
2. Difference between data analytics and data science?
3. What problems does a data analyst solve?
4. What are the types of data analytics?
5. What tools do data analysts use daily?
6. What is a KPI?
7. What is a metric vs KPI?
8. What is descriptive analytics?
9. What is diagnostic analytics?
10. What does a typical day of a data analyst look like?
Data and Databases
11. What is structured data?
12. What is semi-structured data?
13. What is unstructured data?
14. What is a database?
15. Difference between OLTP and OLAP?
16. What is a primary key?
17. What is a foreign key?
18. What is a fact table?
19. What is a dimension table?
20. What is a data warehouse?
SQL for Data Analysts
21. What is SELECT used for?
22. Difference between WHERE and HAVING?
23. What is GROUP BY?
24. What are aggregate functions?
25. Difference between INNER and LEFT JOIN?
26. What are subqueries?
27. What is a CTE?
28. How do you handle duplicates in SQL?
29. How do you handle NULL values?
30. What are window functions?
Excel for Data Analysis
31. What are pivot tables?
32. Difference between VLOOKUP and XLOOKUP?
33. What is conditional formatting?
34. What are COUNTIFS and SUMIFS?
35. What is data validation?
36. How do you remove duplicates in Excel?
37. What is IF formula used for?
38. Difference between relative and absolute reference?
39. How do you clean data in Excel?
40. What are common Excel mistakes analysts make?
Data Cleaning and Preparation
41. What is data cleaning?
42. How do you handle missing data?
43. How do you treat outliers?
44. What is data normalization?
45. What is data standardization?
46. How do you check data quality?
47. What is duplicate data?
48. How do you validate source data?
49. What is data transformation?
50. Why is data preparation important?
Statistics for Data Analysts
51. Difference between mean and median?
52. What is standard deviation?
53. What is variance?
54. What is correlation?
55. Difference between correlation and causation?
56. What is an outlier?
57. What is sampling?
58. What is distribution?
59. What is skewness?
60. When do you use median over mean?
Data Visualization
61. Why is data visualization important?
62. Difference between bar and line chart?
63. When do you use a pie chart?
64. What is a dashboard?
65. What makes a good dashboard?
66. What is a KPI card?
67. Common visualization mistakes?
68. How do you choose the right chart?
69. What is drill down?
70. What is data storytelling?
Power BI or Tableau
71. What is Power BI or Tableau used for?
72. What is a data model?
73. What is a relationship?
74. What is DAX?
75. Difference between measure and calculated column?
76. What is Power Query?
77. What are filters and slicers?
78. What is row level security?
79. What is refresh schedule?
80. How do you optimize reports?
Business and Case Questions
81. How do you analyze a sales drop?
82. How do you define success metrics?
83. What business metrics have you worked on?
84. How do you prioritize insights?
85. How do you validate insights?
86. What questions do you ask stakeholders?
87. How do you handle vague requirements?
88. How do you measure business impact?
89. How do you explain numbers to managers?
90. How do you recommend actions?
Projects and Real World
91. Explain your best project.
92. What data sources did you use?
93. How did you clean the data?
94. What insight had the most impact?
95. What challenge did you face?
96. How did you solve it?
97. How did stakeholders use your dashboard?
98. What would you improve in your project?
99. How do you handle tight deadlines?
100. Why should we hire you as a data analyst?
Double Tap ♥️ For Detailed Answers
❤95🔥7👍3
7 Misconceptions About Data Analytics (and What’s Actually True): 📊🚀
❌ You need to be a math or statistics genius
✅ Basic math + logical thinking is enough. Most real-world analytics is about understanding data, not complex formulas.
❌ You must learn every tool before applying for jobs
✅ Start with core tools (Excel, SQL, one BI tool). Master fundamentals — tools can be learned on the job.
❌ Data analytics is only about numbers
✅ It’s about storytelling with data — explaining insights clearly to non-technical stakeholders.
❌ You need coding skills like a software developer
✅ Not required. SQL + basic Python/R is enough for most analyst roles. Deep coding is optional, not mandatory.
❌ Analysts just make dashboards all day
✅ Dashboards are just one part. Real work includes data cleaning, business understanding, ad-hoc analysis, and decision support.
❌ You need huge datasets to be a “real” data analyst
✅ Even small datasets can provide powerful insights if the questions are right.
❌ Once you learn analytics, your learning is done
✅ Data analytics evolves constantly — new tools, business problems, and techniques mean continuous learning.
💬 Tap ❤️ if you agree
❌ You need to be a math or statistics genius
✅ Basic math + logical thinking is enough. Most real-world analytics is about understanding data, not complex formulas.
❌ You must learn every tool before applying for jobs
✅ Start with core tools (Excel, SQL, one BI tool). Master fundamentals — tools can be learned on the job.
❌ Data analytics is only about numbers
✅ It’s about storytelling with data — explaining insights clearly to non-technical stakeholders.
❌ You need coding skills like a software developer
✅ Not required. SQL + basic Python/R is enough for most analyst roles. Deep coding is optional, not mandatory.
❌ Analysts just make dashboards all day
✅ Dashboards are just one part. Real work includes data cleaning, business understanding, ad-hoc analysis, and decision support.
❌ You need huge datasets to be a “real” data analyst
✅ Even small datasets can provide powerful insights if the questions are right.
❌ Once you learn analytics, your learning is done
✅ Data analytics evolves constantly — new tools, business problems, and techniques mean continuous learning.
💬 Tap ❤️ if you agree
❤31🔥1
✅ Data Analyst Interview Questions with Answers
1. What is data analytics?
Data analytics is the process of collecting, cleaning, analyzing, and interpreting data to support business decisions. The goal is to turn raw data into meaningful insights.
2. Difference between data analytics and data science?
Data analytics focuses on analyzing historical data to answer what happened and why. Data science focuses on building predictive models to answer what will happen next using machine learning.
3. What problems does a data analyst solve?
- Identifying trends and patterns
- Explaining business performance
- Finding reasons behind growth or decline
- Supporting decision-making with data
4. What are the types of data analytics?
- Descriptive – What happened
- Diagnostic – Why it happened
- Predictive – What may happen
- Prescriptive – What action to take
5. What tools do data analysts use daily?
- Excel for quick analysis
- SQL for querying databases
- Power BI or Tableau for dashboards
- Python (sometimes) for automation
- Statistics for interpretation
6. What is a KPI?
A KPI (Key Performance Indicator) is a measurable value that shows how well a business or team is achieving its objectives. Example: Monthly revenue, churn rate.
7. Difference between a metric and a KPI?
Metric: Any measurable value (page views, clicks).
KPI: A critical metric directly linked to business goals (conversion rate, revenue growth).
8. What is descriptive analytics?
Descriptive analytics summarizes historical data to understand past performance. Example: Total sales last month, average order value.
9. What is diagnostic analytics?
Diagnostic analytics explains why something happened by comparing data and identifying root causes. Example: Sales dropped because website traffic decreased.
10. What does a typical day of a data analyst look like?
- Pull data using SQL
- Clean data in Excel or Power Query
- Build or update dashboards
- Analyze trends and metrics
- Share insights with stakeholders
Double Tap ♥️ For Part-2
1. What is data analytics?
Data analytics is the process of collecting, cleaning, analyzing, and interpreting data to support business decisions. The goal is to turn raw data into meaningful insights.
2. Difference between data analytics and data science?
Data analytics focuses on analyzing historical data to answer what happened and why. Data science focuses on building predictive models to answer what will happen next using machine learning.
3. What problems does a data analyst solve?
- Identifying trends and patterns
- Explaining business performance
- Finding reasons behind growth or decline
- Supporting decision-making with data
4. What are the types of data analytics?
- Descriptive – What happened
- Diagnostic – Why it happened
- Predictive – What may happen
- Prescriptive – What action to take
5. What tools do data analysts use daily?
- Excel for quick analysis
- SQL for querying databases
- Power BI or Tableau for dashboards
- Python (sometimes) for automation
- Statistics for interpretation
6. What is a KPI?
A KPI (Key Performance Indicator) is a measurable value that shows how well a business or team is achieving its objectives. Example: Monthly revenue, churn rate.
7. Difference between a metric and a KPI?
Metric: Any measurable value (page views, clicks).
KPI: A critical metric directly linked to business goals (conversion rate, revenue growth).
8. What is descriptive analytics?
Descriptive analytics summarizes historical data to understand past performance. Example: Total sales last month, average order value.
9. What is diagnostic analytics?
Diagnostic analytics explains why something happened by comparing data and identifying root causes. Example: Sales dropped because website traffic decreased.
10. What does a typical day of a data analyst look like?
- Pull data using SQL
- Clean data in Excel or Power Query
- Build or update dashboards
- Analyze trends and metrics
- Share insights with stakeholders
Double Tap ♥️ For Part-2
❤55👍2🔥1
✅ Data Analyst Interview Questions with Answers: Part-2
11. What is structured data?
Structured data is organized in rows and columns with a fixed schema, making it easy to store and query using SQL. Example: Sales tables, customer databases.
12. What is semi-structured data?
Semi-structured data does not follow a strict table format but contains tags or keys. Example: JSON files, XML data, API responses.
13. What is unstructured data?
Unstructured data has no predefined format. Example: Emails, images, videos, customer reviews text.
14. What is a database?
A database is an organized system used to store, manage, and retrieve data efficiently. Example: MySQL, PostgreSQL, SQL Server.
15. Difference between OLTP and OLAP?
OLTP (Online Transaction Processing) → Handles daily transactions (e.g., orders, payments).
OLAP (Online Analytical Processing) → Used for reporting and analysis.
16. What is a primary key?
A primary key uniquely identifies each record in a table. Example: Customer_ID in a customer table.
17. What is a foreign key?
A foreign key links one table to another using the primary key of another table. Example: Customer_ID in Orders table linking to Customers table.
18. What is a fact table?
Fact table contains measurable business data like sales, revenue, or quantity.
19. What is a dimension table?
Dimension table contains descriptive details like customer name, region, product category.
20. What is a data warehouse?
A data warehouse is a centralized system that stores large volumes of historical data for analysis and reporting.
Double Tap ♥️ For Part-3
11. What is structured data?
Structured data is organized in rows and columns with a fixed schema, making it easy to store and query using SQL. Example: Sales tables, customer databases.
12. What is semi-structured data?
Semi-structured data does not follow a strict table format but contains tags or keys. Example: JSON files, XML data, API responses.
13. What is unstructured data?
Unstructured data has no predefined format. Example: Emails, images, videos, customer reviews text.
14. What is a database?
A database is an organized system used to store, manage, and retrieve data efficiently. Example: MySQL, PostgreSQL, SQL Server.
15. Difference between OLTP and OLAP?
OLTP (Online Transaction Processing) → Handles daily transactions (e.g., orders, payments).
OLAP (Online Analytical Processing) → Used for reporting and analysis.
16. What is a primary key?
A primary key uniquely identifies each record in a table. Example: Customer_ID in a customer table.
17. What is a foreign key?
A foreign key links one table to another using the primary key of another table. Example: Customer_ID in Orders table linking to Customers table.
18. What is a fact table?
Fact table contains measurable business data like sales, revenue, or quantity.
19. What is a dimension table?
Dimension table contains descriptive details like customer name, region, product category.
20. What is a data warehouse?
A data warehouse is a centralized system that stores large volumes of historical data for analysis and reporting.
Double Tap ♥️ For Part-3
❤32👍3🔥1
Data Analyst Interview Questions with Answers: Part-3
21. What is SELECT used for?
SELECT is used to fetch specific columns or data from a table.
Example:
SELECT customer_name, sales FROM orders;
This query returns customer names and their sales from the orders table.
22. Difference between WHERE and HAVING?
WHERE filters rows before aggregation.
HAVING filters results after aggregation.
Example:
SELECT product, SUM(sales) AS total_sales
FROM orders
WHERE region = 'East'
GROUP BY product
HAVING SUM(sales) > 100000;
Here, WHERE filters region first, HAVING filters aggregated sales.
23. What is GROUP BY?
GROUP BY groups rows with the same values so aggregate functions can be applied.
Example:
SELECT region, SUM(sales) AS total_sales
FROM orders
GROUP BY region;
This gives total sales per region.
24. What are aggregate functions?
Aggregate functions perform calculations on multiple rows.
Common examples:
• COUNT → total rows
• SUM → total value
• AVG → average
• MIN / MAX → smallest or largest value
Example:
SELECT COUNT(order_id), AVG(sales)
FROM orders;
25. Difference between INNER JOIN and LEFT JOIN?
INNER JOIN: Returns only matching records.
LEFT JOIN: Returns all rows from left table and matching rows from right table.
Example:
SELECT o.order_id, c.customer_name
FROM orders o
LEFT JOIN customers c
ON o.customer_id = c.customer_id;
All orders appear even if customer info is missing.
26. What are subqueries?
A subquery is a query inside another query.
Example:
SELECT *
FROM orders
WHERE sales > (SELECT AVG(sales) FROM orders);
Returns orders with sales above average.
27. What is a CTE?
CTE (Common Table Expression) is a temporary named result set that improves readability.
Example:
WITH sales_summary AS (
SELECT region, SUM(sales) AS total_sales
FROM orders
GROUP BY region
)
SELECT *
FROM sales_summary
WHERE total_sales > 500000;
28. How do you handle duplicates in SQL?
Identify duplicates:
SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 1;
Remove duplicates (using ROW_NUMBER):
DELETE FROM orders
WHERE order_id IN (
SELECT order_id
FROM (
SELECT order_id, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) rn
FROM orders
) t
WHERE rn > 1
);
29. How do you handle NULL values?
Check NULL:
SELECT *
FROM orders
WHERE sales IS NULL;
Replace NULL:
SELECT COALESCE(sales, 0) AS sales_amount
FROM orders;
30. What are window functions?
Window functions perform calculations across rows without grouping them.
Example:
SELECT customer_id, sales, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sales DESC) AS rn
FROM orders;
This ranks sales per customer without collapsing rows.
Double Tap ♥️ For Part-4
21. What is SELECT used for?
SELECT is used to fetch specific columns or data from a table.
Example:
SELECT customer_name, sales FROM orders;
This query returns customer names and their sales from the orders table.
22. Difference between WHERE and HAVING?
WHERE filters rows before aggregation.
HAVING filters results after aggregation.
Example:
SELECT product, SUM(sales) AS total_sales
FROM orders
WHERE region = 'East'
GROUP BY product
HAVING SUM(sales) > 100000;
Here, WHERE filters region first, HAVING filters aggregated sales.
23. What is GROUP BY?
GROUP BY groups rows with the same values so aggregate functions can be applied.
Example:
SELECT region, SUM(sales) AS total_sales
FROM orders
GROUP BY region;
This gives total sales per region.
24. What are aggregate functions?
Aggregate functions perform calculations on multiple rows.
Common examples:
• COUNT → total rows
• SUM → total value
• AVG → average
• MIN / MAX → smallest or largest value
Example:
SELECT COUNT(order_id), AVG(sales)
FROM orders;
25. Difference between INNER JOIN and LEFT JOIN?
INNER JOIN: Returns only matching records.
LEFT JOIN: Returns all rows from left table and matching rows from right table.
Example:
SELECT o.order_id, c.customer_name
FROM orders o
LEFT JOIN customers c
ON o.customer_id = c.customer_id;
All orders appear even if customer info is missing.
26. What are subqueries?
A subquery is a query inside another query.
Example:
SELECT *
FROM orders
WHERE sales > (SELECT AVG(sales) FROM orders);
Returns orders with sales above average.
27. What is a CTE?
CTE (Common Table Expression) is a temporary named result set that improves readability.
Example:
WITH sales_summary AS (
SELECT region, SUM(sales) AS total_sales
FROM orders
GROUP BY region
)
SELECT *
FROM sales_summary
WHERE total_sales > 500000;
28. How do you handle duplicates in SQL?
Identify duplicates:
SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 1;
Remove duplicates (using ROW_NUMBER):
DELETE FROM orders
WHERE order_id IN (
SELECT order_id
FROM (
SELECT order_id, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) rn
FROM orders
) t
WHERE rn > 1
);
29. How do you handle NULL values?
Check NULL:
SELECT *
FROM orders
WHERE sales IS NULL;
Replace NULL:
SELECT COALESCE(sales, 0) AS sales_amount
FROM orders;
30. What are window functions?
Window functions perform calculations across rows without grouping them.
Example:
SELECT customer_id, sales, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sales DESC) AS rn
FROM orders;
This ranks sales per customer without collapsing rows.
Double Tap ♥️ For Part-4
❤37👍2
Which JOIN returns only the rows that exist in both tables?
Anonymous Quiz
7%
A. LEFT JOIN
6%
B. RIGHT JOIN
68%
C. INNER JOIN
18%
D. FULL JOIN
❤6
What will this query return?
SELECT c.name, o.amount FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id;
SELECT c.name, o.amount FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id;
Anonymous Quiz
28%
A. Only customers who placed orders
54%
B. All customers with their order amounts or NULL
5%
C. Only orders without customers
13%
D. Only matching rows from both tables
❤7
Which JOIN is mainly used to find records missing in another table?
Anonymous Quiz
7%
A. INNER JOIN
67%
B. LEFT JOIN with NULL check
14%
C. FULL JOIN
12%
D. CROSS JOIN
❤6
What happens if both tables contain duplicate values on the JOIN key?
Anonymous Quiz
9%
A. Query fails
40%
B. Only unique rows are returned
30%
C. Rows get multiplied
21%
D. JOIN ignores duplicates
❤8👍1
Which JOIN allows a table to join with itself?
Anonymous Quiz
18%
A. INNER JOIN
10%
B. FULL JOIN
70%
C. SELF JOIN
2%
D. RIGHT JOIN
❤7
Data Analyst Interview Questions with Answers: Part-4
31. What are Pivot Tables?
Pivot tables summarize large datasets quickly.
Example: Rows → Product, Values → Sum of Sales
Result: Total sales per product in seconds.
32. Difference between VLOOKUP and XLOOKUP?
VLOOKUP works left to right only. XLOOKUP works both ways and handles missing values better.
Example: =XLOOKUP(A2, Products!A:A, Products!B:B)
Fetches product name using product ID.
33. What is conditional formatting?
Highlights data based on rules.
Example: Highlight sales > 10000 in green.
Helps spot top performers instantly.
34. What are COUNTIFS and SUMIFS?
They apply conditions while counting or summing.
Example: =SUMIFS(C:C, A:A, "East", B:B, "Laptop")
Total sales of laptops in East region.
35. What is data validation?
Restricts incorrect data entry.
Example: Create dropdown for Region (East, West, North).
Data → Data Validation → List.
36. How do you remove duplicates in Excel?
Select data, Data → Remove Duplicates
Example: Remove duplicate customer IDs.
37. What is IF formula used for?
Applies logical conditions.
Example: =IF(C2>5000,"High Sales","Low Sales")
38. Difference between relative and absolute reference?
Relative → A2 changes when copied
Absolute → $A$2 stays fixed
Example: =A2*$E$1 Tax rate fixed while copying formula.
39. How do you clean data in Excel?
Remove duplicates, TRIM extra spaces, Fix date formats, Handle blanks
Example: =TRIM(A2)
40. What are common Excel mistakes analysts make?
• Merged cells
• Hard-coded values
• No pivot tables
• Poor formatting
• No documentation
Double Tap ♥️ For Part-5
31. What are Pivot Tables?
Pivot tables summarize large datasets quickly.
Example: Rows → Product, Values → Sum of Sales
Result: Total sales per product in seconds.
32. Difference between VLOOKUP and XLOOKUP?
VLOOKUP works left to right only. XLOOKUP works both ways and handles missing values better.
Example: =XLOOKUP(A2, Products!A:A, Products!B:B)
Fetches product name using product ID.
33. What is conditional formatting?
Highlights data based on rules.
Example: Highlight sales > 10000 in green.
Helps spot top performers instantly.
34. What are COUNTIFS and SUMIFS?
They apply conditions while counting or summing.
Example: =SUMIFS(C:C, A:A, "East", B:B, "Laptop")
Total sales of laptops in East region.
35. What is data validation?
Restricts incorrect data entry.
Example: Create dropdown for Region (East, West, North).
Data → Data Validation → List.
36. How do you remove duplicates in Excel?
Select data, Data → Remove Duplicates
Example: Remove duplicate customer IDs.
37. What is IF formula used for?
Applies logical conditions.
Example: =IF(C2>5000,"High Sales","Low Sales")
38. Difference between relative and absolute reference?
Relative → A2 changes when copied
Absolute → $A$2 stays fixed
Example: =A2*$E$1 Tax rate fixed while copying formula.
39. How do you clean data in Excel?
Remove duplicates, TRIM extra spaces, Fix date formats, Handle blanks
Example: =TRIM(A2)
40. What are common Excel mistakes analysts make?
• Merged cells
• Hard-coded values
• No pivot tables
• Poor formatting
• No documentation
Double Tap ♥️ For Part-5
❤22👍5
✅ Data Analyst Interview Questions with Answers: Part-5
41. What is data cleaning?
Data cleaning is the process of fixing or removing incorrect, incomplete, or inconsistent data.
Example: Removing duplicate customer records, Fixing wrong date formats.
42. How do you handle missing data?
Common methods:
- Remove rows (if few missing)
- Replace with mean, median, or 0
- Use forward or backward fill
Example (SQL):
43. How do you treat outliers?
- Identify using sorting, box plots, or Z-score
- Remove or cap extreme values
Example: Sales = 10,000, 12,000, 15,000, 1,00,000 → outlier.
44. What is data normalization?
Scaling data between 0 and 1.
Example: Normalized value = (x - min) / (max - min)
Used in ML and comparisons.
45. What is data standardization?
Centers data around mean 0 with std dev 1.
Example: Z = (x - mean) / std
46. How do you check data quality?
- Accuracy
- Completeness
- Consistency
- Validity
- Timeliness
Example: Sales should never be negative.
47. What is duplicate data?
Same record appearing more than once.
Example: Same customer ID repeated multiple times.
48. How do you validate source data?
- Compare with source systems
- Check row counts
- Verify key metrics
Example: Total revenue in report = total revenue in database.
49. What is data transformation?
Converting data into usable format.
Examples:
- Converting dates
- Creating new columns
- Aggregating values
50. Why is data preparation important?
Clean data = correct insights. Poor data leads to wrong decisions.
Example: Wrong sales data → wrong business strategy.
Double Tap ♥️ For Part-6
41. What is data cleaning?
Data cleaning is the process of fixing or removing incorrect, incomplete, or inconsistent data.
Example: Removing duplicate customer records, Fixing wrong date formats.
42. How do you handle missing data?
Common methods:
- Remove rows (if few missing)
- Replace with mean, median, or 0
- Use forward or backward fill
Example (SQL):
SELECT COALESCE(sales, 0) AS sales FROM orders;43. How do you treat outliers?
- Identify using sorting, box plots, or Z-score
- Remove or cap extreme values
Example: Sales = 10,000, 12,000, 15,000, 1,00,000 → outlier.
44. What is data normalization?
Scaling data between 0 and 1.
Example: Normalized value = (x - min) / (max - min)
Used in ML and comparisons.
45. What is data standardization?
Centers data around mean 0 with std dev 1.
Example: Z = (x - mean) / std
46. How do you check data quality?
- Accuracy
- Completeness
- Consistency
- Validity
- Timeliness
Example: Sales should never be negative.
47. What is duplicate data?
Same record appearing more than once.
Example: Same customer ID repeated multiple times.
48. How do you validate source data?
- Compare with source systems
- Check row counts
- Verify key metrics
Example: Total revenue in report = total revenue in database.
49. What is data transformation?
Converting data into usable format.
Examples:
- Converting dates
- Creating new columns
- Aggregating values
50. Why is data preparation important?
Clean data = correct insights. Poor data leads to wrong decisions.
Example: Wrong sales data → wrong business strategy.
Double Tap ♥️ For Part-6
❤20
✅ Data Analyst Interview Questions with Answers: Part-6
51. Difference between mean and median?
Mean is the average. Median is the middle value.
Example: Salaries - 20k, 22k, 25k, 30k, 1,00k
Mean = 39.4k (skewed)
Median = 25k (better representative)
52. What is standard deviation?
It measures how spread out data is from the mean.
Example: Avg sales = ₹10,000
Std dev = ₹500 → stable
Std dev = ₹5,000 → volatile
53. What is variance?
Square of standard deviation. Shows data spread mathematically.
54. What is correlation?
Measures relationship between two variables. Range -1 to +1
Example: Ad spend vs sales = 0.8 → strong positive correlation.
55. Difference between correlation and causation?
Correlation does not mean one causes the other.
Example: Ice cream sales and drowning both increase in summer.
56. What is an outlier?
A value far from others.
Example: Order values - 500, 700, 800, 50,000
57. What is sampling?
Using a subset of data to represent full dataset.
Example: Survey 1,000 customers instead of 1 million.
58. What is distribution?
Pattern showing how data values are spread.
Example: Normal, skewed, uniform distributions.
59. What is skewness?
Measures asymmetry of data.
Example: Income data usually right-skewed.
60. When do you use median over mean?
When data has outliers.
Example: House prices, salaries.
Double Tap ♥️ For Part-7
51. Difference between mean and median?
Mean is the average. Median is the middle value.
Example: Salaries - 20k, 22k, 25k, 30k, 1,00k
Mean = 39.4k (skewed)
Median = 25k (better representative)
52. What is standard deviation?
It measures how spread out data is from the mean.
Example: Avg sales = ₹10,000
Std dev = ₹500 → stable
Std dev = ₹5,000 → volatile
53. What is variance?
Square of standard deviation. Shows data spread mathematically.
54. What is correlation?
Measures relationship between two variables. Range -1 to +1
Example: Ad spend vs sales = 0.8 → strong positive correlation.
55. Difference between correlation and causation?
Correlation does not mean one causes the other.
Example: Ice cream sales and drowning both increase in summer.
56. What is an outlier?
A value far from others.
Example: Order values - 500, 700, 800, 50,000
57. What is sampling?
Using a subset of data to represent full dataset.
Example: Survey 1,000 customers instead of 1 million.
58. What is distribution?
Pattern showing how data values are spread.
Example: Normal, skewed, uniform distributions.
59. What is skewness?
Measures asymmetry of data.
Example: Income data usually right-skewed.
60. When do you use median over mean?
When data has outliers.
Example: House prices, salaries.
Double Tap ♥️ For Part-7
❤25👍1