Quick SQL functions cheat sheet for beginners ✍
Aggregate Functions
COUNT(*): Counts rows.
SUM(column): Total sum.
AVG(column): Average value.
MAX(column): Maximum value.
MIN(column): Minimum value.
String Functions
CONCAT(a, b, …): Concatenates strings.
SUBSTRING(s, start, length): Extracts part of a string.
UPPER(s) / LOWER(s): Converts string case.
TRIM(s): Removes leading/trailing spaces.
Date Time Functions
CURRENT_DATE / CURRENT_TIME / CURRENT_TIMESTAMP: Current date/time.
EXTRACT(unit FROM date): Retrieves a date part (e.g., year, month).
DATE_ADD(date, INTERVAL n unit): Adds an interval to a date.
Numeric Functions
ROUND(num, decimals): Rounds to a specified decimal.
CEIL(num) / FLOOR(num): Rounds up/down.
ABS(num): Absolute value.
MOD(a, b): Returns the remainder.
Control Flow Functions
CASE: Conditional logic.
COALESCE(val1, val2, …): Returns the first non-null value.
Like for more free Cheatsheets ❤️
Share with credits: https://t.me/sqlspecialist
Hope it helps :)
Aggregate Functions
COUNT(*): Counts rows.
SUM(column): Total sum.
AVG(column): Average value.
MAX(column): Maximum value.
MIN(column): Minimum value.
String Functions
CONCAT(a, b, …): Concatenates strings.
SUBSTRING(s, start, length): Extracts part of a string.
UPPER(s) / LOWER(s): Converts string case.
TRIM(s): Removes leading/trailing spaces.
Date Time Functions
CURRENT_DATE / CURRENT_TIME / CURRENT_TIMESTAMP: Current date/time.
EXTRACT(unit FROM date): Retrieves a date part (e.g., year, month).
DATE_ADD(date, INTERVAL n unit): Adds an interval to a date.
Numeric Functions
ROUND(num, decimals): Rounds to a specified decimal.
CEIL(num) / FLOOR(num): Rounds up/down.
ABS(num): Absolute value.
MOD(a, b): Returns the remainder.
Control Flow Functions
CASE: Conditional logic.
COALESCE(val1, val2, …): Returns the first non-null value.
Like for more free Cheatsheets ❤️
Share with credits: https://t.me/sqlspecialist
Hope it helps :)
❤21
Don't Confuse to learn Python.
Learn This Concept to be proficient in Python.
𝗕𝗮𝘀𝗶𝗰𝘀 𝗼𝗳 𝗣𝘆𝘁𝗵𝗼𝗻:
- Python Syntax
- Data Types
- Variables
- Operators
- Control Structures:
if-elif-else
Loops
Break and Continue
try-except block
- Functions
- Modules and Packages
𝗢𝗯𝗷𝗲𝗰𝘁-𝗢𝗿𝗶𝗲𝗻𝘁𝗲𝗱 𝗣𝗿𝗼𝗴𝗿𝗮𝗺𝗺𝗶𝗻𝗴 𝗶𝗻 𝗣𝘆𝘁𝗵𝗼𝗻:
- Classes and Objects
- Inheritance
- Polymorphism
- Encapsulation
- Abstraction
𝗣𝘆𝘁𝗵𝗼𝗻 𝗟𝗶𝗯𝗿𝗮𝗿𝗶𝗲𝘀:
- Pandas
- Numpy
𝗣𝗮𝗻𝗱𝗮𝘀:
- What is Pandas?
- Installing Pandas
- Importing Pandas
- Pandas Data Structures (Series, DataFrame, Index)
𝗪𝗼𝗿𝗸𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗗𝗮𝘁𝗮𝗙𝗿𝗮𝗺𝗲𝘀:
- Creating DataFrames
- Accessing Data in DataFrames
- Filtering and Selecting Data
- Adding and Removing Columns
- Merging and Joining DataFrames
- Grouping and Aggregating Data
- Pivot Tables
𝗗𝗮𝘁𝗮 𝗖𝗹𝗲𝗮𝗻𝗶𝗻𝗴 𝗮𝗻𝗱 𝗣𝗿𝗲𝗽𝗮𝗿𝗮𝘁𝗶𝗼𝗻:
- Handling Missing Values
- Handling Duplicates
- Data Formatting
- Data Transformation
- Data Normalization
𝗔𝗱𝘃𝗮𝗻𝗰𝗲𝗱 𝗧𝗼𝗽𝗶𝗰𝘀:
- Handling Large Datasets with Dask
- Handling Categorical Data with Pandas
- Handling Text Data with Pandas
- Using Pandas with Scikit-learn
- Performance Optimization with Pandas
𝗗𝗮𝘁𝗮 𝗦𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲𝘀 𝗶𝗻 𝗣𝘆𝘁𝗵𝗼𝗻:
- Lists
- Tuples
- Dictionaries
- Sets
𝗙𝗶𝗹𝗲 𝗛𝗮𝗻𝗱𝗹𝗶𝗻𝗴 𝗶𝗻 𝗣𝘆𝘁𝗵𝗼𝗻:
- Reading and Writing Text Files
- Reading and Writing Binary Files
- Working with CSV Files
- Working with JSON Files
𝗡𝘂𝗺𝗽𝘆:
- What is NumPy?
- Installing NumPy
- Importing NumPy
- NumPy Arrays
𝗡𝘂𝗺𝗣𝘆 𝗔𝗿𝗿𝗮𝘆 𝗢𝗽𝗲𝗿𝗮𝘁𝗶𝗼𝗻𝘀:
- Creating Arrays
- Accessing Array Elements
- Slicing and Indexing
- Reshaping Arrays
- Combining Arrays
- Splitting Arrays
- Arithmetic Operations
- Broadcasting
𝗪𝗼𝗿𝗸𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗗𝗮𝘁𝗮 𝗶𝗻 𝗡𝘂𝗺𝗣𝘆:
- Reading and Writing Data with NumPy
- Filtering and Sorting Data
- Data Manipulation with NumPy
- Interpolation
- Fourier Transforms
- Window Functions
𝗣𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗡𝘂𝗺𝗣𝘆:
- Vectorization
- Memory Management
- Multithreading and Multiprocessing
- Parallel Computing
Like this post if you need more content like this 👍❤️
Learn This Concept to be proficient in Python.
𝗕𝗮𝘀𝗶𝗰𝘀 𝗼𝗳 𝗣𝘆𝘁𝗵𝗼𝗻:
- Python Syntax
- Data Types
- Variables
- Operators
- Control Structures:
if-elif-else
Loops
Break and Continue
try-except block
- Functions
- Modules and Packages
𝗢𝗯𝗷𝗲𝗰𝘁-𝗢𝗿𝗶𝗲𝗻𝘁𝗲𝗱 𝗣𝗿𝗼𝗴𝗿𝗮𝗺𝗺𝗶𝗻𝗴 𝗶𝗻 𝗣𝘆𝘁𝗵𝗼𝗻:
- Classes and Objects
- Inheritance
- Polymorphism
- Encapsulation
- Abstraction
𝗣𝘆𝘁𝗵𝗼𝗻 𝗟𝗶𝗯𝗿𝗮𝗿𝗶𝗲𝘀:
- Pandas
- Numpy
𝗣𝗮𝗻𝗱𝗮𝘀:
- What is Pandas?
- Installing Pandas
- Importing Pandas
- Pandas Data Structures (Series, DataFrame, Index)
𝗪𝗼𝗿𝗸𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗗𝗮𝘁𝗮𝗙𝗿𝗮𝗺𝗲𝘀:
- Creating DataFrames
- Accessing Data in DataFrames
- Filtering and Selecting Data
- Adding and Removing Columns
- Merging and Joining DataFrames
- Grouping and Aggregating Data
- Pivot Tables
𝗗𝗮𝘁𝗮 𝗖𝗹𝗲𝗮𝗻𝗶𝗻𝗴 𝗮𝗻𝗱 𝗣𝗿𝗲𝗽𝗮𝗿𝗮𝘁𝗶𝗼𝗻:
- Handling Missing Values
- Handling Duplicates
- Data Formatting
- Data Transformation
- Data Normalization
𝗔𝗱𝘃𝗮𝗻𝗰𝗲𝗱 𝗧𝗼𝗽𝗶𝗰𝘀:
- Handling Large Datasets with Dask
- Handling Categorical Data with Pandas
- Handling Text Data with Pandas
- Using Pandas with Scikit-learn
- Performance Optimization with Pandas
𝗗𝗮𝘁𝗮 𝗦𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲𝘀 𝗶𝗻 𝗣𝘆𝘁𝗵𝗼𝗻:
- Lists
- Tuples
- Dictionaries
- Sets
𝗙𝗶𝗹𝗲 𝗛𝗮𝗻𝗱𝗹𝗶𝗻𝗴 𝗶𝗻 𝗣𝘆𝘁𝗵𝗼𝗻:
- Reading and Writing Text Files
- Reading and Writing Binary Files
- Working with CSV Files
- Working with JSON Files
𝗡𝘂𝗺𝗽𝘆:
- What is NumPy?
- Installing NumPy
- Importing NumPy
- NumPy Arrays
𝗡𝘂𝗺𝗣𝘆 𝗔𝗿𝗿𝗮𝘆 𝗢𝗽𝗲𝗿𝗮𝘁𝗶𝗼𝗻𝘀:
- Creating Arrays
- Accessing Array Elements
- Slicing and Indexing
- Reshaping Arrays
- Combining Arrays
- Splitting Arrays
- Arithmetic Operations
- Broadcasting
𝗪𝗼𝗿𝗸𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝗗𝗮𝘁𝗮 𝗶𝗻 𝗡𝘂𝗺𝗣𝘆:
- Reading and Writing Data with NumPy
- Filtering and Sorting Data
- Data Manipulation with NumPy
- Interpolation
- Fourier Transforms
- Window Functions
𝗣𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗡𝘂𝗺𝗣𝘆:
- Vectorization
- Memory Management
- Multithreading and Multiprocessing
- Parallel Computing
Like this post if you need more content like this 👍❤️
❤32👍1👏1
Quick Excel Functions Cheat Sheet for Beginners 📊✍️
Excel offers powerful functions for data analysis, calculations, and automation—perfect for beginners handling spreadsheets.
▎Aggregation Functions
• SUM(range): Totals all values in a range, e.g., SUM(A1:A10).
• AVERAGE(range): Computes the mean of numbers, ignoring blanks.
• COUNT(range): Counts cells with numbers.
• COUNTA(range): Counts non-empty cells.
• MAX(range): Finds the highest value.
• MIN(range): Finds the lowest value.
▎Lookup Functions
• VLOOKUP(value, table, col_index, [range_lookup]): Searches vertically for a value and returns from specified column.
• HLOOKUP(value, table, row_index, [range_lookup]): Searches horizontally.
• INDEX(range, row_num, [column_num]): Returns value at specific position.
• MATCH(lookup_value, range, [match_type]): Finds position of a value.
▎Logical Functions
• IF(condition, true_value, false_value): Executes based on condition, e.g., IF(A1>10, "High", "Low").
• AND(condition1, condition2): True if all conditions met.
• OR(condition1, condition2): True if any condition met.
• NOT(logical): Reverses TRUE/FALSE.
▎Text Functions
• CONCATENATE(text1, text2): Joins text strings (or use operator).
• LEFT(text, num_chars): Extracts from start.
• RIGHT(text, num_chars): Extracts from end.
• LEN(text): Counts characters.
• TRIM(text): Removes extra spaces.
▎Date Time Functions
• TODAY(): Current date.
• NOW(): Current date and time.
• YEAR(date): Extracts year.
• MONTH(date): Extracts month.
• DATEDIF(start_date, end_date, unit): Calculates interval (Y/M/D).
▎Math Stats Functions
• ROUND(number, num_digits): Rounds to digits.
• SUMIF(range, criteria, sum_range): Sums based on condition.
• COUNTIF(range, criteria): Counts based on condition.
• ABS(number): Absolute value.
Excel Resources: https://whatsapp.com/channel/0029VaifY548qIzv0u1AHz3i
Double Tap ♥️ For More
Excel offers powerful functions for data analysis, calculations, and automation—perfect for beginners handling spreadsheets.
▎Aggregation Functions
• SUM(range): Totals all values in a range, e.g., SUM(A1:A10).
• AVERAGE(range): Computes the mean of numbers, ignoring blanks.
• COUNT(range): Counts cells with numbers.
• COUNTA(range): Counts non-empty cells.
• MAX(range): Finds the highest value.
• MIN(range): Finds the lowest value.
▎Lookup Functions
• VLOOKUP(value, table, col_index, [range_lookup]): Searches vertically for a value and returns from specified column.
• HLOOKUP(value, table, row_index, [range_lookup]): Searches horizontally.
• INDEX(range, row_num, [column_num]): Returns value at specific position.
• MATCH(lookup_value, range, [match_type]): Finds position of a value.
▎Logical Functions
• IF(condition, true_value, false_value): Executes based on condition, e.g., IF(A1>10, "High", "Low").
• AND(condition1, condition2): True if all conditions met.
• OR(condition1, condition2): True if any condition met.
• NOT(logical): Reverses TRUE/FALSE.
▎Text Functions
• CONCATENATE(text1, text2): Joins text strings (or use operator).
• LEFT(text, num_chars): Extracts from start.
• RIGHT(text, num_chars): Extracts from end.
• LEN(text): Counts characters.
• TRIM(text): Removes extra spaces.
▎Date Time Functions
• TODAY(): Current date.
• NOW(): Current date and time.
• YEAR(date): Extracts year.
• MONTH(date): Extracts month.
• DATEDIF(start_date, end_date, unit): Calculates interval (Y/M/D).
▎Math Stats Functions
• ROUND(number, num_digits): Rounds to digits.
• SUMIF(range, criteria, sum_range): Sums based on condition.
• COUNTIF(range, criteria): Counts based on condition.
• ABS(number): Absolute value.
Excel Resources: https://whatsapp.com/channel/0029VaifY548qIzv0u1AHz3i
Double Tap ♥️ For More
❤21👍2👏1
⚙️ Data Analytics Roadmap
📂 Excel/Google Sheets (VLOOKUP, Pivot Tables, Charts)
∟📂 SQL (SELECT, JOINs, GROUP BY, Window Functions)
∟📂 Python/R Basics (Pandas, Data Cleaning)
∟📂 Statistics (Descriptive, Inferential, Correlation)
∟📂 Data Visualization (Tableau, Power BI, Matplotlib)
∟📂 ETL Processes (Extract, Transform, Load)
∟📂 Dashboard Design (KPIs, Storytelling)
∟📂 Business Intelligence Tools (Looker, Metabase)
∟📂 Data Quality & Governance
∟📂 A/B Testing & Experimentation
∟📂 Advanced Analytics (Cohort Analysis, Funnel Analysis)
∟📂 Big Data Basics (Spark, Airflow)
∟📂 Communication (Reports, Presentations)
∟📂 Projects (Sales Dashboard, Customer Segmentation)
∟✅ Apply for Data Analyst / BI Analyst Roles
💬 Tap ❤️ for more!
📂 Excel/Google Sheets (VLOOKUP, Pivot Tables, Charts)
∟📂 SQL (SELECT, JOINs, GROUP BY, Window Functions)
∟📂 Python/R Basics (Pandas, Data Cleaning)
∟📂 Statistics (Descriptive, Inferential, Correlation)
∟📂 Data Visualization (Tableau, Power BI, Matplotlib)
∟📂 ETL Processes (Extract, Transform, Load)
∟📂 Dashboard Design (KPIs, Storytelling)
∟📂 Business Intelligence Tools (Looker, Metabase)
∟📂 Data Quality & Governance
∟📂 A/B Testing & Experimentation
∟📂 Advanced Analytics (Cohort Analysis, Funnel Analysis)
∟📂 Big Data Basics (Spark, Airflow)
∟📂 Communication (Reports, Presentations)
∟📂 Projects (Sales Dashboard, Customer Segmentation)
∟✅ Apply for Data Analyst / BI Analyst Roles
💬 Tap ❤️ for more!
❤50
Quick Python Cheat Sheet for Beginners 🐍✍️
Python is widely used for data analysis, automation, and AI—perfect for beginners starting their coding journey.
Aggregation Functions 📊
• sum(list) → Adds all values
👉 sum([1,2,3]) = 6
• len(list) → Counts total elements
👉 len([1,2,3]) = 3
• max(list) → Highest value
👉 max([4,7,2]) = 7
• min(list) → Lowest value
👉 min([4,7,2]) = 2
• sum(list)/len(list) → Average
👉 sum([10,20])/2 = 15
Lookup / Searching 🔍
• in → Check existence
👉 5 in [1,2,5] = True
• list.index(value) → Position of value
👉 [10,20,30].index(20) = 1
• Dictionary lookup
👉 data = {"name": "John", "age": 25} data["name"] # John
Logical Operations 🧠
• if condition: → Decision making
👉 if x > 10: print("High") else: print("Low")
• and → All conditions true
• or → Any condition true
• not → Reverse condition
Text (String) Functions 🔤
• len(text) → Length
👉 len("hello") = 5
• text.lower() → Lowercase
• text.upper() → Uppercase
• text.strip() → Remove spaces
👉 " hi ".strip() = "hi"
• text.replace(old, new)
👉 "hi".replace("h","H") = "Hi"
• String concatenation
👉 "Hello " + "World"
Date Time Functions 📅
• from datetime import datetime
• datetime.now() → Current date time
• Extract values:
now = datetime.now() now.year now.month now.day
Math Functions ➗
• import math
• math.sqrt(x) → Square root
• math.ceil(x) → Round up
• math.floor(x) → Round down
• abs(x) → Absolute value
Conditional Aggregation (Like Excel SUMIF) ⚡
• Using list comprehension
nums = [10, 20, 30, 40] sum(x for x in nums if x > 20) # 70
• Count condition
len([x for x in nums if x > 20]) # 2
Pro Tip for Data Analysts 💡
👉 For real-world work, use libraries: pandas & numpy
Example:
import pandas as pd df["salary"].mean()
Python Resources: https://whatsapp.com/channel/0029VaiM08SDuMRaGKd9Wv0L
Double Tap ♥️ For More
Python is widely used for data analysis, automation, and AI—perfect for beginners starting their coding journey.
Aggregation Functions 📊
• sum(list) → Adds all values
👉 sum([1,2,3]) = 6
• len(list) → Counts total elements
👉 len([1,2,3]) = 3
• max(list) → Highest value
👉 max([4,7,2]) = 7
• min(list) → Lowest value
👉 min([4,7,2]) = 2
• sum(list)/len(list) → Average
👉 sum([10,20])/2 = 15
Lookup / Searching 🔍
• in → Check existence
👉 5 in [1,2,5] = True
• list.index(value) → Position of value
👉 [10,20,30].index(20) = 1
• Dictionary lookup
👉 data = {"name": "John", "age": 25} data["name"] # John
Logical Operations 🧠
• if condition: → Decision making
👉 if x > 10: print("High") else: print("Low")
• and → All conditions true
• or → Any condition true
• not → Reverse condition
Text (String) Functions 🔤
• len(text) → Length
👉 len("hello") = 5
• text.lower() → Lowercase
• text.upper() → Uppercase
• text.strip() → Remove spaces
👉 " hi ".strip() = "hi"
• text.replace(old, new)
👉 "hi".replace("h","H") = "Hi"
• String concatenation
👉 "Hello " + "World"
Date Time Functions 📅
• from datetime import datetime
• datetime.now() → Current date time
• Extract values:
now = datetime.now() now.year now.month now.day
Math Functions ➗
• import math
• math.sqrt(x) → Square root
• math.ceil(x) → Round up
• math.floor(x) → Round down
• abs(x) → Absolute value
Conditional Aggregation (Like Excel SUMIF) ⚡
• Using list comprehension
nums = [10, 20, 30, 40] sum(x for x in nums if x > 20) # 70
• Count condition
len([x for x in nums if x > 20]) # 2
Pro Tip for Data Analysts 💡
👉 For real-world work, use libraries: pandas & numpy
Example:
import pandas as pd df["salary"].mean()
Python Resources: https://whatsapp.com/channel/0029VaiM08SDuMRaGKd9Wv0L
Double Tap ♥️ For More
❤12
✅ SQL Real-world Interview Questions with Answers 🖥️
📊 TABLE: employees
id | name | department | salary
1 | Rahul | IT | 50000
2 | Priya | IT | 70000
3 | Amit | HR | 60000
4 | Neha | HR | 70000
5 | Karan | IT | 80000
6 | Simran | HR | 60000
🎯 1️⃣ Find the 2nd highest salary
🧠 Logic: Get highest salary Then find max salary below that
✅ Query:
SELECT MAX(salary) FROM employees WHERE salary < ( SELECT MAX(salary) FROM employees );
🎯 2️⃣ Find employees earning more than average salary
🧠 Logic: Calculate overall average salary Compare each employee
✅ Query:
SELECT name, salary FROM employees WHERE salary > ( SELECT AVG(salary) FROM employees );
🎯 3️⃣ Find highest salary in each department
🧠 Logic: Group by department Use MAX
✅ Query:
SELECT department, MAX(salary) AS highest_salary FROM employees GROUP BY department;
🎯 4️⃣ Find top 2 highest salaries in each department
🧠 Logic: Use ROW_NUMBER Partition by department Filter top 2
✅ Query:
SELECT * FROM (
SELECT name, department, salary,
ROW_NUMBER() OVER( PARTITION BY department ORDER BY salary DESC ) r
FROM employees
) t WHERE r <= 2;
🎯 5️⃣ Find employees earning more than their department average
🧠 Logic: Use correlated subquery Compare with department avg
✅ Query:
SELECT e.name, e.department, e.salary
FROM employees e
WHERE e.salary > (
SELECT AVG(salary) FROM employees WHERE department = e.department
);
⭐ What Interviewer Checks Here
These 5 questions test:
✔ Subqueries
✔ GROUP BY
✔ Window functions
✔ Correlated queries
✔ Real business logic
SQL Resources: https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v
Double Tap ♥️ For More
📊 TABLE: employees
id | name | department | salary
1 | Rahul | IT | 50000
2 | Priya | IT | 70000
3 | Amit | HR | 60000
4 | Neha | HR | 70000
5 | Karan | IT | 80000
6 | Simran | HR | 60000
🎯 1️⃣ Find the 2nd highest salary
🧠 Logic: Get highest salary Then find max salary below that
✅ Query:
SELECT MAX(salary) FROM employees WHERE salary < ( SELECT MAX(salary) FROM employees );
🎯 2️⃣ Find employees earning more than average salary
🧠 Logic: Calculate overall average salary Compare each employee
✅ Query:
SELECT name, salary FROM employees WHERE salary > ( SELECT AVG(salary) FROM employees );
🎯 3️⃣ Find highest salary in each department
🧠 Logic: Group by department Use MAX
✅ Query:
SELECT department, MAX(salary) AS highest_salary FROM employees GROUP BY department;
🎯 4️⃣ Find top 2 highest salaries in each department
🧠 Logic: Use ROW_NUMBER Partition by department Filter top 2
✅ Query:
SELECT * FROM (
SELECT name, department, salary,
ROW_NUMBER() OVER( PARTITION BY department ORDER BY salary DESC ) r
FROM employees
) t WHERE r <= 2;
🎯 5️⃣ Find employees earning more than their department average
🧠 Logic: Use correlated subquery Compare with department avg
✅ Query:
SELECT e.name, e.department, e.salary
FROM employees e
WHERE e.salary > (
SELECT AVG(salary) FROM employees WHERE department = e.department
);
⭐ What Interviewer Checks Here
These 5 questions test:
✔ Subqueries
✔ GROUP BY
✔ Window functions
✔ Correlated queries
✔ Real business logic
SQL Resources: https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v
Double Tap ♥️ For More
❤14👏1
🎯 📊 DATA ANALYST MOCK INTERVIEW (WITH ANSWERS)
🧠 1️⃣ Tell me about yourself
✅ Sample Answer:
“I have around 3 years of experience working with data. My core skills include SQL, Excel, and Power BI. I regularly work with data cleaning, transformation, and building dashboards to generate business insights. Recently, I’ve also been strengthening my Python skills for data analysis. I enjoy solving business problems using data and presenting insights in a simple and actionable way.”
📊 2️⃣ What is the difference between WHERE and HAVING?
✅ Answer:
WHERE filters rows before aggregation
HAVING filters after aggregation
Example:
SELECT department, COUNT(*)
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;
🔗 3️⃣ Explain different types of JOINs
✅ Answer:
INNER JOIN → only matching records
LEFT JOIN → all left + matching right
RIGHT JOIN → all right + matching left
FULL JOIN → all records from both
👉 In analytics, LEFT JOIN is most used.
🧠 4️⃣ How do you find duplicate records in SQL?
✅ Answer:
SELECT column, COUNT(*)
FROM table
GROUP BY column
HAVING COUNT(*) > 1;
👉 Used for data cleaning.
📈 5️⃣ What are window functions?
✅ Answer:
“Window functions perform calculations across rows without reducing the number of rows. They are used for ranking, running totals, and comparisons.”
Example:
SELECT salary, RANK() OVER(ORDER BY salary DESC)
FROM employees;
📊 6️⃣ How do you handle missing data?
✅ Answer:
Remove rows (if small impact)
Replace with mean/median
Use default values
Use interpolation (advanced)
👉 Depends on business context.
📉 7️⃣ What is the difference between COUNT(_) and COUNT(column)?
✅ Answer:
COUNT(*) → counts all rows
COUNT(column) → ignores NULL values
📊 8️⃣ What is a KPI? Give example
✅ Answer:
“KPI (Key Performance Indicator) is a measurable value used to track performance.”
Examples: Revenue growth, Conversion rate, Customer retention
🧠 9️⃣ How would you find the 2nd highest salary?
✅ Answer:
SELECT MAX(salary)
FROM employees
WHERE salary < ( SELECT MAX(salary) FROM employees );
📊 🔟 Explain your dashboard project
✅ Strong Answer:
“I created a sales dashboard in Power BI where I analyzed revenue trends, top-performing products, and regional performance. I used DAX for calculations and added filters for better interactivity. This helped stakeholders identify key areas for growth.”
🔥 1️⃣1️⃣ What is normalization?
✅ Answer:
“Normalization is the process of organizing data to reduce redundancy and improve data integrity.”
📊 1️⃣2️⃣ Difference between INNER JOIN and LEFT JOIN?
✅ Answer:
INNER JOIN → only matching data
LEFT JOIN → keeps all left table data
👉 LEFT JOIN is preferred in analytics.
🧠 1️⃣3️⃣ What is a CTE?
✅ Answer:
“A CTE (Common Table Expression) is a temporary result set defined using WITH clause to improve readability.”
📈 1️⃣4️⃣ How do you explain insights to non-technical people?
✅ Answer:
“I focus on storytelling. Instead of technical terms, I explain insights in simple business language with visuals and examples.”
📊 1️⃣5️⃣ What tools have you used?
✅ Answer:
SQL, Excel, Power BI, Python (basic/advanced depending on you)
💼 1️⃣6️⃣ Behavioral Question: Tell me about a challenge
✅ Answer:
“While working on a dataset, I found inconsistencies in data. I cleaned and standardized it using SQL and Excel, ensuring accurate analysis. This improved the dashboard reliability.”
Double Tap ♥️ For More
🧠 1️⃣ Tell me about yourself
✅ Sample Answer:
“I have around 3 years of experience working with data. My core skills include SQL, Excel, and Power BI. I regularly work with data cleaning, transformation, and building dashboards to generate business insights. Recently, I’ve also been strengthening my Python skills for data analysis. I enjoy solving business problems using data and presenting insights in a simple and actionable way.”
📊 2️⃣ What is the difference between WHERE and HAVING?
✅ Answer:
WHERE filters rows before aggregation
HAVING filters after aggregation
Example:
SELECT department, COUNT(*)
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;
🔗 3️⃣ Explain different types of JOINs
✅ Answer:
INNER JOIN → only matching records
LEFT JOIN → all left + matching right
RIGHT JOIN → all right + matching left
FULL JOIN → all records from both
👉 In analytics, LEFT JOIN is most used.
🧠 4️⃣ How do you find duplicate records in SQL?
✅ Answer:
SELECT column, COUNT(*)
FROM table
GROUP BY column
HAVING COUNT(*) > 1;
👉 Used for data cleaning.
📈 5️⃣ What are window functions?
✅ Answer:
“Window functions perform calculations across rows without reducing the number of rows. They are used for ranking, running totals, and comparisons.”
Example:
SELECT salary, RANK() OVER(ORDER BY salary DESC)
FROM employees;
📊 6️⃣ How do you handle missing data?
✅ Answer:
Remove rows (if small impact)
Replace with mean/median
Use default values
Use interpolation (advanced)
👉 Depends on business context.
📉 7️⃣ What is the difference between COUNT(_) and COUNT(column)?
✅ Answer:
COUNT(*) → counts all rows
COUNT(column) → ignores NULL values
📊 8️⃣ What is a KPI? Give example
✅ Answer:
“KPI (Key Performance Indicator) is a measurable value used to track performance.”
Examples: Revenue growth, Conversion rate, Customer retention
🧠 9️⃣ How would you find the 2nd highest salary?
✅ Answer:
SELECT MAX(salary)
FROM employees
WHERE salary < ( SELECT MAX(salary) FROM employees );
📊 🔟 Explain your dashboard project
✅ Strong Answer:
“I created a sales dashboard in Power BI where I analyzed revenue trends, top-performing products, and regional performance. I used DAX for calculations and added filters for better interactivity. This helped stakeholders identify key areas for growth.”
🔥 1️⃣1️⃣ What is normalization?
✅ Answer:
“Normalization is the process of organizing data to reduce redundancy and improve data integrity.”
📊 1️⃣2️⃣ Difference between INNER JOIN and LEFT JOIN?
✅ Answer:
INNER JOIN → only matching data
LEFT JOIN → keeps all left table data
👉 LEFT JOIN is preferred in analytics.
🧠 1️⃣3️⃣ What is a CTE?
✅ Answer:
“A CTE (Common Table Expression) is a temporary result set defined using WITH clause to improve readability.”
📈 1️⃣4️⃣ How do you explain insights to non-technical people?
✅ Answer:
“I focus on storytelling. Instead of technical terms, I explain insights in simple business language with visuals and examples.”
📊 1️⃣5️⃣ What tools have you used?
✅ Answer:
SQL, Excel, Power BI, Python (basic/advanced depending on you)
💼 1️⃣6️⃣ Behavioral Question: Tell me about a challenge
✅ Answer:
“While working on a dataset, I found inconsistencies in data. I cleaned and standardized it using SQL and Excel, ensuring accurate analysis. This improved the dashboard reliability.”
Double Tap ♥️ For More
❤27🔥4👍1
🔰 Data Analyst Roadmap 2026
├── 🗃 Introduction to Data Analysis
│ ├── Role overview & career paths
│ ├── Key skills: SQL, Excel, storytelling
│ └── Tools ecosystem (Colab, Tableau Public)
├── 📊 Excel Mastery (Formulas, Pivots)
│ ├── VLOOKUP, INDEX-MATCH, XLOOKUP
│ ├── PivotTables, slicers, Power Query
│ ├── Charts & conditional formatting
│ └── ETL basics in spreadsheets
├── 🔍 SQL for Analytics (Joins, Aggregates)
│ ├── Advanced SELECT with WHERE, GROUP BY
│ ├── JOINS (INNER, LEFT, window functions)
│ └── Performance: indexes, EXPLAIN plans
├── 📈 Visualization Principles (Charts, Dashboards)
│ ├── Chart types (bar, line, heatmaps)
│ ├── Design rules (avoid chart junk)
│ └── Color theory & accessibility
├── 🐍 Python Basics (Pandas, NumPy)
│ ├── DataFrames: load, clean, merge
│ ├── Grouping, pivoting, NumPy arrays
│ └── Jupyter notebooks & stats intro
├── 🔢 Statistics Fundamentals (Averages, Tests)
│ ├── Descriptive (mean, median, distributions)
│ ├── Hypothesis testing (t-tests, chi-square)
│ └── A/B testing & confidence intervals
├── 🛠 Tableau/Power BI Essentials
│ ├── Tableau: calculated fields, LOD
│ ├── Power BI: DAX, data modeling
│ └── Interactive dashboards & storytelling
├── 🤖 AI Tools for Insights (Prompts, AutoML)
│ ├── Prompt engineering for SQL/viz
│ ├── Tableau Einstein, Power BI Copilot
│ └── AutoML basics (no-code modeling)
├── ☁️ Cloud Platforms (BigQuery Basics)
│ ├── BigQuery SQL & massive datasets
│ ├── AWS QuickSight, Snowflake intro
│ └── Free tier cost optimization
├── 📊 Data Storytelling Frameworks
│ ├── Pyramid Principle for reports
│ ├── KPI dashboards & executive summaries
│ └── Narrative structure (context-insight-action)
├── 🔗 ETL Pipelines Intro (dbt, Airflow)
│ ├── Data transformation with dbt
│ ├── Orchestration (Airflow basics)
│ └── No-code: Zapier automation
├── 💼 Portfolio & Interview Prep
│ ├── 3-5 projects (sales, churn analysis)
│ ├── Kaggle datasets & GitHub portfolio
│ └── STAR method, mock interviews
└── 🧪 Real-world Challenges (Kaggle, Cases)
├── E-commerce churn prediction
├── Marketing ROI analysis
├── Supply chain optimization
└── LeetCode SQL, case studies
Like for detailed explanation ❤️
├── 🗃 Introduction to Data Analysis
│ ├── Role overview & career paths
│ ├── Key skills: SQL, Excel, storytelling
│ └── Tools ecosystem (Colab, Tableau Public)
├── 📊 Excel Mastery (Formulas, Pivots)
│ ├── VLOOKUP, INDEX-MATCH, XLOOKUP
│ ├── PivotTables, slicers, Power Query
│ ├── Charts & conditional formatting
│ └── ETL basics in spreadsheets
├── 🔍 SQL for Analytics (Joins, Aggregates)
│ ├── Advanced SELECT with WHERE, GROUP BY
│ ├── JOINS (INNER, LEFT, window functions)
│ └── Performance: indexes, EXPLAIN plans
├── 📈 Visualization Principles (Charts, Dashboards)
│ ├── Chart types (bar, line, heatmaps)
│ ├── Design rules (avoid chart junk)
│ └── Color theory & accessibility
├── 🐍 Python Basics (Pandas, NumPy)
│ ├── DataFrames: load, clean, merge
│ ├── Grouping, pivoting, NumPy arrays
│ └── Jupyter notebooks & stats intro
├── 🔢 Statistics Fundamentals (Averages, Tests)
│ ├── Descriptive (mean, median, distributions)
│ ├── Hypothesis testing (t-tests, chi-square)
│ └── A/B testing & confidence intervals
├── 🛠 Tableau/Power BI Essentials
│ ├── Tableau: calculated fields, LOD
│ ├── Power BI: DAX, data modeling
│ └── Interactive dashboards & storytelling
├── 🤖 AI Tools for Insights (Prompts, AutoML)
│ ├── Prompt engineering for SQL/viz
│ ├── Tableau Einstein, Power BI Copilot
│ └── AutoML basics (no-code modeling)
├── ☁️ Cloud Platforms (BigQuery Basics)
│ ├── BigQuery SQL & massive datasets
│ ├── AWS QuickSight, Snowflake intro
│ └── Free tier cost optimization
├── 📊 Data Storytelling Frameworks
│ ├── Pyramid Principle for reports
│ ├── KPI dashboards & executive summaries
│ └── Narrative structure (context-insight-action)
├── 🔗 ETL Pipelines Intro (dbt, Airflow)
│ ├── Data transformation with dbt
│ ├── Orchestration (Airflow basics)
│ └── No-code: Zapier automation
├── 💼 Portfolio & Interview Prep
│ ├── 3-5 projects (sales, churn analysis)
│ ├── Kaggle datasets & GitHub portfolio
│ └── STAR method, mock interviews
└── 🧪 Real-world Challenges (Kaggle, Cases)
├── E-commerce churn prediction
├── Marketing ROI analysis
├── Supply chain optimization
└── LeetCode SQL, case studies
Like for detailed explanation ❤️
❤33👍5
🗃 Introduction to Data Analysis
This is the foundation of your entire data analyst journey. If you get this right, everything else becomes easier.
🎯 1. What Does a Data Analyst Actually Do?
A Data Analyst turns raw data into useful insights that help businesses make decisions.
👉 Simple Flow:
Raw Data → Clean → Analyze → Visualize → Tell Story → Decision
🔍 Real Example:
Imagine an e-commerce company:
Data Analyst checks: Why sales dropped last month?
Finds: Mobile users faced checkout issues
Suggests: Fix mobile UX
Result: Sales improve
👉 This is the real job — not just coding.
🧭 2. Career Paths in Data Analytics
You don’t have just one path. You can specialize based on your interest:
🔹 Business Analyst
Focus: Business decisions
Tools: Excel, Power BI
Work: Reports, KPIs, dashboards
🔹 Product Analyst
Focus: User behavior (apps/websites)
Tools: SQL, Python
Work: A/B testing, funnels
🔹 Data Analyst (Core)
Focus: Data querying reporting
Tools: SQL, Excel, Tableau
Work: Data cleaning, dashboards
🔹 Analytics Engineer (Advanced)
Focus: Data pipelines + modeling
Tools: SQL, dbt
Work: Clean data for analysts
🧠 3. Key Skills You MUST Build
🟢 1. SQL (Most Important Skill)
Used to extract data from databases
You’ll write queries like: SELECT, WHERE, GROUP BY, JOIN
🟡 2. Excel (Underrated but Powerful)
• Quick analysis tool
• Used everywhere in companies
Key things: Pivot Tables
Lookups (XLOOKUP)
Dashboards
🔵 3. Data Storytelling
This is what separates average vs high-paid analysts
👉 Anyone can analyze data
👉 Few can explain it simply
Example: Instead of saying:
> “Sales dropped by 20%”
Say:
“Sales dropped by 20% mainly due to mobile checkout issues, fixing this can recover revenue quickly.”
🧰 4. Tools Ecosystem (What You’ll Use)
🧪 Notebooks Practice
Google Colab
👉 Run Python in browser (no setup needed)
📊 Visualization Tools
Tableau Public
👉 Create dashboards portfolio
Microsoft Power BI
👉 Industry-level reporting tool
🧮 Data Sources (Where data lives)
• Databases (MySQL, PostgreSQL)
• Excel files
• APIs
⚡ 5. Types of Data You’ll Work With
📄 Structured Data
Tables (rows columns)
Example: Excel, SQL tables
🧾 Unstructured Data
Text, images, videos
Example: Reviews, tweets
📊 Semi-structured
JSON, XML
Used in APIs
🔁 6. Typical Data Analyst Workflow
Step-by-step:
1. Understand the problem
2. Collect data
3. Clean data (most time spent here!)
4. Analyze
5. Visualize
6. Communicate insights
👉 70% of work = cleaning + understanding data
👉 Only 30% = actual analysis
🚨 7. Beginner Mistakes to Avoid
❌ Learning too many tools at once
❌ Ignoring SQL
❌ Only watching tutorials (no practice)
❌ Not building projects
💡 Reality Check
👉 Data Analysis is NOT about coding
👉 It’s about thinking, problem-solving, and communication
Double Tap ❤️ For More
This is the foundation of your entire data analyst journey. If you get this right, everything else becomes easier.
🎯 1. What Does a Data Analyst Actually Do?
A Data Analyst turns raw data into useful insights that help businesses make decisions.
👉 Simple Flow:
Raw Data → Clean → Analyze → Visualize → Tell Story → Decision
🔍 Real Example:
Imagine an e-commerce company:
Data Analyst checks: Why sales dropped last month?
Finds: Mobile users faced checkout issues
Suggests: Fix mobile UX
Result: Sales improve
👉 This is the real job — not just coding.
🧭 2. Career Paths in Data Analytics
You don’t have just one path. You can specialize based on your interest:
🔹 Business Analyst
Focus: Business decisions
Tools: Excel, Power BI
Work: Reports, KPIs, dashboards
🔹 Product Analyst
Focus: User behavior (apps/websites)
Tools: SQL, Python
Work: A/B testing, funnels
🔹 Data Analyst (Core)
Focus: Data querying reporting
Tools: SQL, Excel, Tableau
Work: Data cleaning, dashboards
🔹 Analytics Engineer (Advanced)
Focus: Data pipelines + modeling
Tools: SQL, dbt
Work: Clean data for analysts
🧠 3. Key Skills You MUST Build
🟢 1. SQL (Most Important Skill)
Used to extract data from databases
You’ll write queries like: SELECT, WHERE, GROUP BY, JOIN
🟡 2. Excel (Underrated but Powerful)
• Quick analysis tool
• Used everywhere in companies
Key things: Pivot Tables
Lookups (XLOOKUP)
Dashboards
🔵 3. Data Storytelling
This is what separates average vs high-paid analysts
👉 Anyone can analyze data
👉 Few can explain it simply
Example: Instead of saying:
> “Sales dropped by 20%”
Say:
“Sales dropped by 20% mainly due to mobile checkout issues, fixing this can recover revenue quickly.”
🧰 4. Tools Ecosystem (What You’ll Use)
🧪 Notebooks Practice
Google Colab
👉 Run Python in browser (no setup needed)
📊 Visualization Tools
Tableau Public
👉 Create dashboards portfolio
Microsoft Power BI
👉 Industry-level reporting tool
🧮 Data Sources (Where data lives)
• Databases (MySQL, PostgreSQL)
• Excel files
• APIs
⚡ 5. Types of Data You’ll Work With
📄 Structured Data
Tables (rows columns)
Example: Excel, SQL tables
🧾 Unstructured Data
Text, images, videos
Example: Reviews, tweets
📊 Semi-structured
JSON, XML
Used in APIs
🔁 6. Typical Data Analyst Workflow
Step-by-step:
1. Understand the problem
2. Collect data
3. Clean data (most time spent here!)
4. Analyze
5. Visualize
6. Communicate insights
👉 70% of work = cleaning + understanding data
👉 Only 30% = actual analysis
🚨 7. Beginner Mistakes to Avoid
❌ Learning too many tools at once
❌ Ignoring SQL
❌ Only watching tutorials (no practice)
❌ Not building projects
💡 Reality Check
👉 Data Analysis is NOT about coding
👉 It’s about thinking, problem-solving, and communication
Double Tap ❤️ For More
❤34
What is the primary role of a Data Analyst?
Anonymous Quiz
3%
A. Build machine learning models
2%
B. Design websites
94%
C. Convert data into insights for decision-making
1%
D. Manage servers
❤2
Which skill is considered MOST essential for a Data Analyst?
Anonymous Quiz
3%
A. Graphic Design
94%
B. SQL
2%
C. Video Editing
1%
D. Networking
❤1
Which tool is commonly used for data visualization?
Anonymous Quiz
6%
A. MS Word
91%
B. Tableau
2%
C. Notepad
1%
D. Photoshop
What type of data is stored in rows and columns?
Anonymous Quiz
9%
A. Unstructured Data
9%
B. Semi-structured Data
78%
C. Structured Data
4%
D. Random Data
❤1
Which step usually takes the MOST time in data analysis?
Anonymous Quiz
19%
A. Visualization
72%
B. Data Cleaning
4%
C. Presentation
4%
D. Reporting
❤1
Scenario based Interview Questions & Answers for Data Analyst
1. Scenario: You are working on a SQL database that stores customer information. The database has a table called "Orders" that contains order details. Your task is to write a SQL query to retrieve the total number of orders placed by each customer.
Question:
- Write a SQL query to find the total number of orders placed by each customer.
Expected Answer:
SELECT CustomerID, COUNT(*) AS TotalOrders
FROM Orders
GROUP BY CustomerID;
2. Scenario: You are working on a SQL database that stores employee information. The database has a table called "Employees" that contains employee details. Your task is to write a SQL query to retrieve the names of all employees who have been with the company for more than 5 years.
Question:
- Write a SQL query to find the names of employees who have been with the company for more than 5 years.
Expected Answer:
SELECT Name
FROM Employees
WHERE DATEDIFF(year, HireDate, GETDATE()) > 5;
Power BI Scenario-Based Questions
1. Scenario: You have been given a dataset in Power BI that contains sales data for a company. Your task is to create a report that shows the total sales by product category and region.
Expected Answer:
- Load the dataset into Power BI.
- Create relationships if necessary.
- Use the "Fields" pane to select the necessary fields (Product Category, Region, Sales).
- Drag these fields into the "Values" area of a new visualization (e.g., a table or bar chart).
- Use the "Filters" pane to filter data as needed.
- Format the visualization to enhance clarity and readability.
2. Scenario: You have been asked to create a Power BI dashboard that displays real-time stock prices for a set of companies. The stock prices are available through an API.
Expected Answer:
- Use Power BI Desktop to connect to the API.
- Go to "Get Data" > "Web" and enter the API URL.
- Configure the data refresh settings to ensure real-time updates (e.g., setting up a scheduled refresh or using DirectQuery if supported).
- Create visualizations using the imported data.
- Publish the report to the Power BI service and set up a data gateway if needed for continuous refresh.
3. Scenario: You have been given a Power BI report that contains multiple visualizations. The report is taking a long time to load and is impacting the performance of the application.
Expected Answer:
- Analyze the current performance using Performance Analyzer.
- Optimize data model by reducing the number of columns and rows, and removing unnecessary calculations.
- Use aggregated tables to pre-compute results.
- Simplify DAX calculations.
- Optimize visualizations by reducing the number of visuals per page and avoiding complex custom visuals.
- Ensure proper indexing on the data source.
Free SQL Resources: https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v
Like if you need more similar content
Hope it helps :)
1. Scenario: You are working on a SQL database that stores customer information. The database has a table called "Orders" that contains order details. Your task is to write a SQL query to retrieve the total number of orders placed by each customer.
Question:
- Write a SQL query to find the total number of orders placed by each customer.
Expected Answer:
SELECT CustomerID, COUNT(*) AS TotalOrders
FROM Orders
GROUP BY CustomerID;
2. Scenario: You are working on a SQL database that stores employee information. The database has a table called "Employees" that contains employee details. Your task is to write a SQL query to retrieve the names of all employees who have been with the company for more than 5 years.
Question:
- Write a SQL query to find the names of employees who have been with the company for more than 5 years.
Expected Answer:
SELECT Name
FROM Employees
WHERE DATEDIFF(year, HireDate, GETDATE()) > 5;
Power BI Scenario-Based Questions
1. Scenario: You have been given a dataset in Power BI that contains sales data for a company. Your task is to create a report that shows the total sales by product category and region.
Expected Answer:
- Load the dataset into Power BI.
- Create relationships if necessary.
- Use the "Fields" pane to select the necessary fields (Product Category, Region, Sales).
- Drag these fields into the "Values" area of a new visualization (e.g., a table or bar chart).
- Use the "Filters" pane to filter data as needed.
- Format the visualization to enhance clarity and readability.
2. Scenario: You have been asked to create a Power BI dashboard that displays real-time stock prices for a set of companies. The stock prices are available through an API.
Expected Answer:
- Use Power BI Desktop to connect to the API.
- Go to "Get Data" > "Web" and enter the API URL.
- Configure the data refresh settings to ensure real-time updates (e.g., setting up a scheduled refresh or using DirectQuery if supported).
- Create visualizations using the imported data.
- Publish the report to the Power BI service and set up a data gateway if needed for continuous refresh.
3. Scenario: You have been given a Power BI report that contains multiple visualizations. The report is taking a long time to load and is impacting the performance of the application.
Expected Answer:
- Analyze the current performance using Performance Analyzer.
- Optimize data model by reducing the number of columns and rows, and removing unnecessary calculations.
- Use aggregated tables to pre-compute results.
- Simplify DAX calculations.
- Optimize visualizations by reducing the number of visuals per page and avoiding complex custom visuals.
- Ensure proper indexing on the data source.
Free SQL Resources: https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v
Like if you need more similar content
Hope it helps :)
❤14
Real-world SQL Questions with Answers 🔥
Let's dive into some real-world SQL questions with a mini dataset.
📊 Dataset: employees
1. Find average salary per department
2. Find employees earning above department average
3. Find highest salary in each department
4. Find employees who earn more than their manager
5. Count employees in each department
6. Find departments with more than 2 employees
7. Find second highest salary
8. Find employees without managers
9. Rank employees by salary
10. Find duplicate salaries
11. Top 2 highest salaries
Double Tap ❤️ For More
Let's dive into some real-world SQL questions with a mini dataset.
📊 Dataset: employees
id name department salary manager_id
1 Aditi HR 30000 5
2 Rahul IT 50000 6
3 Neha IT 60000 6
4 Aman Sales 40000 7
5 Kiran HR 70000 NULL
6 Mohit IT 80000 NULL
7 Suresh Sales 65000 NULL
8 Pooja HR 30000 5
1. Find average salary per department
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
2. Find employees earning above department average
SELECT name, department, salary
FROM employees e
WHERE salary > (
SELECT AVG(salary)
FROM employees
WHERE department = e.department
);
3. Find highest salary in each department
SELECT department, MAX(salary) AS max_salary
FROM employees
GROUP BY department;
4. Find employees who earn more than their manager
SELECT e.name
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
5. Count employees in each department
SELECT department, COUNT(*) AS total_employees
FROM employees
GROUP BY department;
6. Find departments with more than 2 employees
SELECT department, COUNT(*) AS total
FROM employees
GROUP BY department
HAVING COUNT(*) > 2;
7. Find second highest salary
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
8. Find employees without managers
SELECT name
FROM employees
WHERE manager_id IS NULL;
9. Rank employees by salary
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;
10. Find duplicate salaries
SELECT salary, COUNT(*)
FROM employees
GROUP BY salary
HAVING COUNT(*) > 1;
11. Top 2 highest salaries
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 2;
Double Tap ❤️ For More
❤40👍3👏1
Here's a concise cheat sheet to help you get started with Python for Data Analytics. This guide covers essential libraries and functions that you'll frequently use.
1. Python Basics
- Variables:
- Data Types:
- Integers:
- Control Structures:
-
- Loops:
- While loop:
2. Importing Libraries
- NumPy:
- Pandas:
- Matplotlib:
- Seaborn:
3. NumPy for Numerical Data
- Creating Arrays:
- Array Operations:
- Reshaping Arrays:
- Indexing and Slicing:
4. Pandas for Data Manipulation
- Creating DataFrames:
- Reading Data:
- Basic Operations:
- Selecting Columns:
- Filtering Data:
- Handling Missing Data:
- GroupBy:
5. Data Visualization
- Matplotlib:
- Seaborn:
6. Common Data Operations
- Merging DataFrames:
- Pivot Table:
- Applying Functions:
7. Basic Statistics
- Descriptive Stats:
- Correlation:
This cheat sheet should give you a solid foundation in Python for data analytics. As you get more comfortable, you can delve deeper into each library's documentation for more advanced features.
I have curated the best resources to learn Python 👇👇
https://whatsapp.com/channel/0029VaiM08SDuMRaGKd9Wv0L
Hope you'll like it
Like this post if you need more resources like this 👍❤️
1. Python Basics
- Variables:
x = 10 y = "Hello"
- Data Types:
- Integers:
x = 10
- Floats: y = 3.14
- Strings: name = "Alice"
- Lists: my_list = [1, 2, 3]
- Dictionaries: my_dict = {"key": "value"}
- Tuples: my_tuple = (1, 2, 3)
- Control Structures:
-
if, elif, else statements- Loops:
for i in range(5):
print(i)
- While loop:
while x < 5:
print(x)
x += 1
2. Importing Libraries
- NumPy:
import numpy as np
- Pandas:
import pandas as pd
- Matplotlib:
import matplotlib.pyplot as plt
- Seaborn:
import seaborn as sns
3. NumPy for Numerical Data
- Creating Arrays:
arr = np.array([1, 2, 3, 4])
- Array Operations:
arr.sum()
arr.mean()
- Reshaping Arrays:
arr.reshape((2, 2))
- Indexing and Slicing:
arr[0:2] # First two elements
4. Pandas for Data Manipulation
- Creating DataFrames:
df = pd.DataFrame({
'col1': [1, 2, 3],
'col2': ['A', 'B', 'C']
})
- Reading Data:
df = pd.read_csv('file.csv')
- Basic Operations:
df.head() # First 5 rows
df.describe() # Summary statistics
df.info() # DataFrame info
- Selecting Columns:
df['col1']
df[['col1', 'col2']]
- Filtering Data:
df[df['col1'] > 2]
- Handling Missing Data:
df.dropna() # Drop missing values
df.fillna(0) # Replace missing values
- GroupBy:
df.groupby('col2').mean()
5. Data Visualization
- Matplotlib:
plt.plot(df['col1'], df['col2'])
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Title')
plt.show()
- Seaborn:
sns.histplot(df['col1'])
sns.boxplot(x='col1', y='col2', data=df)
6. Common Data Operations
- Merging DataFrames:
pd.merge(df1, df2, on='key')
- Pivot Table:
df.pivot_table(index='col1', columns='col2', values='col3')
- Applying Functions:
df['col1'].apply(lambda x: x*2)
7. Basic Statistics
- Descriptive Stats:
df['col1'].mean()
df['col1'].median()
df['col1'].std()
- Correlation:
df.corr()
This cheat sheet should give you a solid foundation in Python for data analytics. As you get more comfortable, you can delve deeper into each library's documentation for more advanced features.
I have curated the best resources to learn Python 👇👇
https://whatsapp.com/channel/0029VaiM08SDuMRaGKd9Wv0L
Hope you'll like it
Like this post if you need more resources like this 👍❤️
❤16👍1🔥1
Real‑world Data Analytics Questions with Answers 🔥
Let’s practice end‑to‑end data thinking using this small dataset.
📊 Dataset: customer_orders
| order_id | customer_id | product | category | quantity | unit_price | order_date |
|----------|-------------|---------------|------------|----------|------------|-------------|
| 1 | 1001 | Laptop | Electronics| 2 | 75000 | 2025‑01‑10 |
| 2 | 1002 | Mouse | Electronics| 10 | 1500 | 2025‑01‑12 |
| 3 | 1003 | Chair | Furniture | 5 | 8000 | 2025‑01‑15 |
| 4 | 1001 | Keyboard | Electronics| 8 | 2500 | 2025‑01‑11 |
| 5 | 1004 | Desk | Furniture | 3 | 15000 | 2025‑01‑18 |
| 6 | 1002 | Monitor | Electronics| 4 | 25000 | 2025‑01‑20 |
| 7 | 1005 | Table | Furniture | 6 | 5000 | 2025‑01‑22 |
| 8 | 1003 | Webcam | Electronics| 12 | 3000 | 2025‑01‑14 |
1. Define what “data analysis” means in this context
• Data analysis means transforming raw orders into insights: what sells most, who the best customers are, and how revenue changes over time.
• You’d use SQL to query, Excel/Python to clean, and Power BI to visualize.
2. What are the key metrics you’d track for this business?
• Revenue = quantity × unit_price
• Order count and average order value (AOV)
• Top‑selling categories and best customers by revenue
3. Write a SQL query for total revenue by category
4. How would you find repeat customers?
• Customers with order_count > 1 are repeat buyers.
5. How would you detect “top customers”?
• Define “top” by total_spent or average order value:
– SUM(revenue) / COUNT(orders)
• Use Power BI/Excel to sort descending and highlight top 10%.
6. What would an outlier analysis look like?
• Compute min, max, average, standard deviation of revenue per order.
• Flag orders where:
– revenue > average + 2 * standard_deviation
• Check if such orders are errors or real big deals (e.g., enterprise purchase).
7. How would you report month‑on‑month growth?
• In SQL/Power BI:
– Group by YEAR(order_date) and MONTH(order_date)
– Compute revenue per month
– Then calculate:
▪ MoM % = (CurrentMonthRevenue − PreviousMonthRevenue) / PreviousMonthRevenue
8. How would you turn this into a dashboard?
• Page 1 – Overview: Cards for total revenue, total orders, AOV.
• Page 2 – Trends: Line chart for MoM revenue, bar chart for category split.
• Page 3 – Customers: Table for top 10 customers and repeat customers.
9. How would you handle dirty data (nulls, duplicates)?
• Pre‑check:
– COUNT(*) vs COUNT(customer_id) to spot missing customers.
• Clean:
– Drop or impute missing critical fields.
– Remove duplicate orders using DISTINCT or ROW_NUMBER().
10. How would you explain your findings to a non‑tech manager?
• Use simple language + visuals:
– “Our top product category is Electronics, contributing X% of revenue.”
– “N top customers account for M% of total sales.”
• Avoid formulas; focus on business impact: retention, profitability, growth.
Double Tap ❤️ For More!
Let’s practice end‑to‑end data thinking using this small dataset.
📊 Dataset: customer_orders
| order_id | customer_id | product | category | quantity | unit_price | order_date |
|----------|-------------|---------------|------------|----------|------------|-------------|
| 1 | 1001 | Laptop | Electronics| 2 | 75000 | 2025‑01‑10 |
| 2 | 1002 | Mouse | Electronics| 10 | 1500 | 2025‑01‑12 |
| 3 | 1003 | Chair | Furniture | 5 | 8000 | 2025‑01‑15 |
| 4 | 1001 | Keyboard | Electronics| 8 | 2500 | 2025‑01‑11 |
| 5 | 1004 | Desk | Furniture | 3 | 15000 | 2025‑01‑18 |
| 6 | 1002 | Monitor | Electronics| 4 | 25000 | 2025‑01‑20 |
| 7 | 1005 | Table | Furniture | 6 | 5000 | 2025‑01‑22 |
| 8 | 1003 | Webcam | Electronics| 12 | 3000 | 2025‑01‑14 |
1. Define what “data analysis” means in this context
• Data analysis means transforming raw orders into insights: what sells most, who the best customers are, and how revenue changes over time.
• You’d use SQL to query, Excel/Python to clean, and Power BI to visualize.
2. What are the key metrics you’d track for this business?
• Revenue = quantity × unit_price
• Order count and average order value (AOV)
• Top‑selling categories and best customers by revenue
3. Write a SQL query for total revenue by category
SELECT
category,
SUM(quantity * unit_price) AS total_revenue
FROM customer_orders
GROUP BY category;
4. How would you find repeat customers?
SELECT
customer_id,
COUNT(order_id) AS order_count,
SUM(quantity * unit_price) AS total_spent
FROM customer_orders
GROUP BY customer_id
HAVING COUNT(order_id) > 1;
• Customers with order_count > 1 are repeat buyers.
5. How would you detect “top customers”?
• Define “top” by total_spent or average order value:
– SUM(revenue) / COUNT(orders)
• Use Power BI/Excel to sort descending and highlight top 10%.
6. What would an outlier analysis look like?
• Compute min, max, average, standard deviation of revenue per order.
• Flag orders where:
– revenue > average + 2 * standard_deviation
• Check if such orders are errors or real big deals (e.g., enterprise purchase).
7. How would you report month‑on‑month growth?
• In SQL/Power BI:
– Group by YEAR(order_date) and MONTH(order_date)
– Compute revenue per month
– Then calculate:
▪ MoM % = (CurrentMonthRevenue − PreviousMonthRevenue) / PreviousMonthRevenue
8. How would you turn this into a dashboard?
• Page 1 – Overview: Cards for total revenue, total orders, AOV.
• Page 2 – Trends: Line chart for MoM revenue, bar chart for category split.
• Page 3 – Customers: Table for top 10 customers and repeat customers.
9. How would you handle dirty data (nulls, duplicates)?
• Pre‑check:
– COUNT(*) vs COUNT(customer_id) to spot missing customers.
• Clean:
– Drop or impute missing critical fields.
– Remove duplicate orders using DISTINCT or ROW_NUMBER().
10. How would you explain your findings to a non‑tech manager?
• Use simple language + visuals:
– “Our top product category is Electronics, contributing X% of revenue.”
– “N top customers account for M% of total sales.”
• Avoid formulas; focus on business impact: retention, profitability, growth.
Double Tap ❤️ For More!
❤10
SQL Cheatsheet 📝
This SQL cheatsheet is designed to be your quick reference guide for SQL programming. Whether you’re a beginner learning how to query databases or an experienced developer looking for a handy resource, this cheatsheet covers essential SQL topics.
1. Database Basics
-
-
2. Tables
- Create Table:
- Drop Table:
- Alter Table:
3. Insert Data
-
4. Select Queries
- Basic Select:
- Select Specific Columns:
- Select with Condition:
5. Update Data
-
6. Delete Data
-
7. Joins
- Inner Join:
- Left Join:
- Right Join:
8. Aggregations
- Count:
- Sum:
- Group By:
9. Sorting & Limiting
- Order By:
- Limit Results:
10. Indexes
- Create Index:
- Drop Index:
11. Subqueries
-
12. Views
- Create View:
- Drop View:
This SQL cheatsheet is designed to be your quick reference guide for SQL programming. Whether you’re a beginner learning how to query databases or an experienced developer looking for a handy resource, this cheatsheet covers essential SQL topics.
1. Database Basics
-
CREATE DATABASE db_name;-
USE db_name;2. Tables
- Create Table:
CREATE TABLE table_name (col1 datatype, col2 datatype);- Drop Table:
DROP TABLE table_name;- Alter Table:
ALTER TABLE table_name ADD column_name datatype;3. Insert Data
-
INSERT INTO table_name (col1, col2) VALUES (val1, val2);4. Select Queries
- Basic Select:
SELECT * FROM table_name;- Select Specific Columns:
SELECT col1, col2 FROM table_name;- Select with Condition:
SELECT * FROM table_name WHERE condition;5. Update Data
-
UPDATE table_name SET col1 = value1 WHERE condition;6. Delete Data
-
DELETE FROM table_name WHERE condition;7. Joins
- Inner Join:
SELECT * FROM table1 INNER JOIN table2 ON table1.col = table2.col;- Left Join:
SELECT * FROM table1 LEFT JOIN table2 ON table1.col = table2.col;- Right Join:
SELECT * FROM table1 RIGHT JOIN table2 ON table1.col = table2.col;8. Aggregations
- Count:
SELECT COUNT(*) FROM table_name;- Sum:
SELECT SUM(col) FROM table_name;- Group By:
SELECT col, COUNT(*) FROM table_name GROUP BY col;9. Sorting & Limiting
- Order By:
SELECT * FROM table_name ORDER BY col ASC|DESC;- Limit Results:
SELECT * FROM table_name LIMIT n;10. Indexes
- Create Index:
CREATE INDEX idx_name ON table_name (col);- Drop Index:
DROP INDEX idx_name;11. Subqueries
-
SELECT * FROM table_name WHERE col IN (SELECT col FROM other_table);12. Views
- Create View:
CREATE VIEW view_name AS SELECT * FROM table_name;- Drop View:
DROP VIEW view_name;❤15👍3
✅ Top Data Analyst Interview Questions
✅ SQL
1. What is a window function?
2. What is the difference between RANK() and ROW_NUMBER()?
3. How do you find the second highest salary?
4. What is a recursive CTE?
5. What is the difference between correlated and non-correlated subquery?
6. How do you remove duplicates without DISTINCT?
7. What is an INDEX and when do you use it?
8. Explain self-join with example.
9. What is the difference between DELETE, DROP, and TRUNCATE?
10. How do you pivot/unpivot data in SQL?
11. What is LAG() and LEAD()?
12. How do you handle NULL in aggregates?
13. What is the difference between VIEW and MATERIALIZED VIEW?
14. Explain ACID properties.
15. How do you optimize a slow query?
16. What is the difference between INNER JOIN and EXISTS?
17. What is a FULL OUTER JOIN?
18. How do you find duplicates across tables?
19. What are SQL constraints?
20. Explain GROUPING SETS.
✅ Python
1. How do you handle missing data in Pandas?
2. What is the difference between loc[] and iloc[]?
3. What are lambda functions in data analysis?
4. How do you remove duplicates from DataFrame?
5. Explain groupby() and agg().
6. How do you merge/join DataFrames?
7. What is vectorization?
8. How do you handle outliers using IQR method?
9. What is the difference between list, tuple, dict?
10. How do you pivot data with pivot_table()?
11. What libraries do you use for viz (Matplotlib/Seaborn)?
12. Explain apply() vs map() vs applymap().
13. How do you read CSV with chunks?
14. What is NumPy broadcasting?
15. How do you handle time-series with Pandas (resample, shift)?
16. How do you calculate correlation matrix?
17. How do you optimize Pandas performance (e.g., dtype)?
18. What is the difference between Series vs DataFrame?
19. How do you filter DataFrame conditionally?
20. How do you rename columns efficiently?
✅ Power BI
1. What is DAX?
2. What is the difference between Power Query and Power Pivot?
3. What is the difference between measure vs calculated column?
4. Explain CALCULATE() function.
5. What are relationships (1:M, M:M)?
6. How do you handle many-to-many?
7. What is row-level security (RLS)?
8. How do you setup incremental refresh?
9. What is the difference between filters vs slicers?
10. What is a data model?
11. How do you publish and share reports?
12. What is performance analyzer tool?
13. How do you create month-on-month growth DAX?
14. How do you use custom visuals?
15. What is gateway for refresh?
16. What is a .pbix file?
17. What are quick measures examples?
18. What is data blending in Power BI?
19. How do you optimize large datasets?
20. What is Get Data?
✅ Excel
1. What is INDEX-MATCH combo?
2. How do you use Power Query?
3. What are dynamic arrays (FILTER, UNIQUE)?
4. What is SUMIFS vs SUMPRODUCT?
5. What is Power Pivot?
6. What are Goal Seek and Solver?
7. What are conditional formatting rules?
8. How do you remove duplicates advanced?
9. How do you create sparklines?
10. How do you use INDIRECT function?
11. What are data validation lists?
12. What are array formulas (Ctrl+Shift+Enter)?
13. What are Macros/VBA basics?
14. What are pivot table slicers?
15. What is what-if analysis?
16. What are TEXTJOIN and CONCAT?
17. What are XLOOKUP advantages?
18. What is Flash Fill feature?
19. What are Table vs Range pros?
20. How do you import from SQL/Web?
✅ Double Tap ❤️ For More
✅ SQL
1. What is a window function?
2. What is the difference between RANK() and ROW_NUMBER()?
3. How do you find the second highest salary?
4. What is a recursive CTE?
5. What is the difference between correlated and non-correlated subquery?
6. How do you remove duplicates without DISTINCT?
7. What is an INDEX and when do you use it?
8. Explain self-join with example.
9. What is the difference between DELETE, DROP, and TRUNCATE?
10. How do you pivot/unpivot data in SQL?
11. What is LAG() and LEAD()?
12. How do you handle NULL in aggregates?
13. What is the difference between VIEW and MATERIALIZED VIEW?
14. Explain ACID properties.
15. How do you optimize a slow query?
16. What is the difference between INNER JOIN and EXISTS?
17. What is a FULL OUTER JOIN?
18. How do you find duplicates across tables?
19. What are SQL constraints?
20. Explain GROUPING SETS.
✅ Python
1. How do you handle missing data in Pandas?
2. What is the difference between loc[] and iloc[]?
3. What are lambda functions in data analysis?
4. How do you remove duplicates from DataFrame?
5. Explain groupby() and agg().
6. How do you merge/join DataFrames?
7. What is vectorization?
8. How do you handle outliers using IQR method?
9. What is the difference between list, tuple, dict?
10. How do you pivot data with pivot_table()?
11. What libraries do you use for viz (Matplotlib/Seaborn)?
12. Explain apply() vs map() vs applymap().
13. How do you read CSV with chunks?
14. What is NumPy broadcasting?
15. How do you handle time-series with Pandas (resample, shift)?
16. How do you calculate correlation matrix?
17. How do you optimize Pandas performance (e.g., dtype)?
18. What is the difference between Series vs DataFrame?
19. How do you filter DataFrame conditionally?
20. How do you rename columns efficiently?
✅ Power BI
1. What is DAX?
2. What is the difference between Power Query and Power Pivot?
3. What is the difference between measure vs calculated column?
4. Explain CALCULATE() function.
5. What are relationships (1:M, M:M)?
6. How do you handle many-to-many?
7. What is row-level security (RLS)?
8. How do you setup incremental refresh?
9. What is the difference between filters vs slicers?
10. What is a data model?
11. How do you publish and share reports?
12. What is performance analyzer tool?
13. How do you create month-on-month growth DAX?
14. How do you use custom visuals?
15. What is gateway for refresh?
16. What is a .pbix file?
17. What are quick measures examples?
18. What is data blending in Power BI?
19. How do you optimize large datasets?
20. What is Get Data?
✅ Excel
1. What is INDEX-MATCH combo?
2. How do you use Power Query?
3. What are dynamic arrays (FILTER, UNIQUE)?
4. What is SUMIFS vs SUMPRODUCT?
5. What is Power Pivot?
6. What are Goal Seek and Solver?
7. What are conditional formatting rules?
8. How do you remove duplicates advanced?
9. How do you create sparklines?
10. How do you use INDIRECT function?
11. What are data validation lists?
12. What are array formulas (Ctrl+Shift+Enter)?
13. What are Macros/VBA basics?
14. What are pivot table slicers?
15. What is what-if analysis?
16. What are TEXTJOIN and CONCAT?
17. What are XLOOKUP advantages?
18. What is Flash Fill feature?
19. What are Table vs Range pros?
20. How do you import from SQL/Web?
✅ Double Tap ❤️ For More
❤40