@Codingdidi
Top 10 Advanced SQL Queries for Data Mastery

1. Recursive CTE (Common Table Expression)
Use a recursive CTE to traverse hierarchical data, such as employees and their managers.

WITH RECURSIVE EmployeeHierarchy AS (
SELECT employee_id, employee_name, manager_id
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.employee_name, e.manager_id
FROM employees e
JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
)
SELECT *
FROM EmployeeHierarchy;
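
To try the recursive CTE without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the extra depth column are assumptions added for illustration; they are not part of the original query.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, employee_name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Asha", None), (2, "Ben", 1), (3, "Chen", 1), (4, "Dara", 2)],
)

hierarchy = conn.execute("""
    WITH RECURSIVE EmployeeHierarchy AS (
        -- Anchor member: top-level employees with no manager
        SELECT employee_id, employee_name, manager_id, 1 AS depth
        FROM employees
        WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: direct reports of the rows found so far
        SELECT e.employee_id, e.employee_name, e.manager_id, eh.depth + 1
        FROM employees e
        JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
    )
    SELECT employee_name, depth
    FROM EmployeeHierarchy
    ORDER BY depth, employee_id
""").fetchall()

# Print the org chart, indenting each employee by their depth.
for name, depth in hierarchy:
    print("  " * (depth - 1) + name)
```

Tracking depth like this is a common way to see how many levels sit between an employee and the top of the hierarchy.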


2. Pivoting Data
Turn row data into columns (e.g., show product categories as separate columns). PIVOT is dialect-specific; the query below uses Oracle syntax, where table aliases are written without AS.

SELECT *
FROM (
SELECT TO_CHAR(order_date, 'YYYY-MM') AS sales_month, product_category, sales_amount
FROM sales
)
PIVOT (
SUM(sales_amount)
FOR product_category IN ('Electronics' AS electronics, 'Clothing' AS clothing, 'Books' AS books)
);
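
PIVOT is not available in every database (PostgreSQL, MySQL and SQLite lack it). A portable alternative is conditional aggregation with CASE. The sketch below runs it in SQLite through Python; the sample rows are made up, and strftime stands in for TO_CHAR because that is SQLite's date formatter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (order_date TEXT, product_category TEXT, sales_amount INTEGER)"
)
# Hypothetical sample rows for illustration only.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01-05", "Electronics", 100),
        ("2024-01-20", "Books", 30),
        ("2024-01-25", "Electronics", 50),
        ("2024-02-02", "Clothing", 80),
    ],
)

pivoted = conn.execute("""
    SELECT
        strftime('%Y-%m', order_date) AS sales_month,
        -- One SUM(CASE ...) per category becomes one output column
        SUM(CASE WHEN product_category = 'Electronics' THEN sales_amount ELSE 0 END) AS electronics,
        SUM(CASE WHEN product_category = 'Clothing' THEN sales_amount ELSE 0 END) AS clothing,
        SUM(CASE WHEN product_category = 'Books' THEN sales_amount ELSE 0 END) AS books
    FROM sales
    GROUP BY sales_month
    ORDER BY sales_month
""").fetchall()

for row in pivoted:
    print(row)
```

Note that ELSE 0 reports empty month/category combinations as 0, while PIVOT would return NULL for them. Drop the ELSE branch if you prefer NULL.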


3. Window Functions
Calculate a running total of sales based on order date.

SELECT 
order_date,
sales_amount,
SUM(sales_amount) OVER (ORDER BY order_date) AS running_total
FROM sales;
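
One detail worth knowing: with ORDER BY and no explicit frame, the window defaults to RANGE, so rows that share the same order_date all get the same running total. The sketch below (SQLite via Python, made-up rows, and a hypothetical sale_id column used as a tie-breaker) compares that default with an explicit ROWS frame.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sale_id is a hypothetical column, used only to break ties on order_date.
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER, order_date TEXT, sales_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "2024-01-01", 10), (2, "2024-01-02", 20), (3, "2024-01-02", 5), (4, "2024-01-03", 7)],
)

running = conn.execute("""
    SELECT
        order_date,
        sales_amount,
        -- Default frame: rows with the same order_date are summed together
        SUM(sales_amount) OVER (ORDER BY order_date) AS running_by_date,
        -- Explicit ROWS frame: accumulates strictly one row at a time
        SUM(sales_amount) OVER (
            ORDER BY order_date, sale_id
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS running_by_row
    FROM sales
    ORDER BY order_date, sale_id
""").fetchall()

for row in running:
    print(row)
```

If several orders can share a date, pick the frame that matches what you want to report: a per-day cumulative total or a per-row one.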


4. Ranking with Window Functions
Rank employees’ salaries within each department.

SELECT 
department,
employee_name,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;
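
RANK() leaves gaps after ties, which sometimes surprises people. The sketch below (SQLite via Python, made-up rows) puts RANK(), DENSE_RANK() and ROW_NUMBER() side by side so the difference is visible. The employee_name tie-breaker for ROW_NUMBER() is an assumption added to keep the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, employee_name TEXT, salary INTEGER)")
# Hypothetical sample rows: Ann and Bob tie on salary.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Sales", "Ann", 90), ("Sales", "Bob", 90), ("Sales", "Cy", 70), ("IT", "Di", 80)],
)

ranked = conn.execute("""
    SELECT
        department,
        employee_name,
        -- Ties share a rank and the next rank is skipped (1, 1, 3)
        RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank,
        -- Ties share a rank and no rank is skipped (1, 1, 2)
        DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_salary_rank,
        -- Every row gets a distinct number (1, 2, 3)
        ROW_NUMBER() OVER (
            PARTITION BY department ORDER BY salary DESC, employee_name
        ) AS row_num
    FROM employees
    ORDER BY department, salary DESC, employee_name
""").fetchall()

for row in ranked:
    print(row)
```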


5. Finding Gaps in Sequences
Identify missing ranges in a sequential dataset (e.g., order numbers) by comparing each value with the next one.

WITH Sequences AS (
SELECT order_number, LEAD(order_number) OVER (ORDER BY order_number) AS next_number
FROM orders
)
-- A jump of more than 1 between neighbours marks a gap
SELECT order_number + 1 AS gap_start, next_number - 1 AS gap_end
FROM Sequences
WHERE next_number - order_number > 1;
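
One way to list every gap, not just the first, is to compare each order number with its successor using LEAD(). Here is a minimal sketch in SQLite via Python; the order numbers are made up so that two gaps exist (4 to 5 and 8 to 9).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_number INTEGER)")
# Hypothetical sequence with 4, 5, 8 and 9 missing.
conn.executemany(
    "INSERT INTO orders VALUES (?)", [(n,) for n in (1, 2, 3, 6, 7, 10)]
)

gaps = conn.execute("""
    WITH Sequences AS (
        -- Pair every order number with the next one in sorted order
        SELECT order_number,
               LEAD(order_number) OVER (ORDER BY order_number) AS next_number
        FROM orders
    )
    SELECT order_number + 1 AS gap_start, next_number - 1 AS gap_end
    FROM Sequences
    WHERE next_number - order_number > 1
    ORDER BY gap_start
""").fetchall()

for gap_start, gap_end in gaps:
    print(f"missing {gap_start} to {gap_end}")
```

The last row has no successor, so its next_number is NULL and the WHERE clause drops it automatically.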


6. Unpivoting Data
Convert columns into rows to simplify analysis of multiple attributes. The query below uses SQL Server's UNPIVOT, which requires the listed columns to share a data type.

SELECT 
product_id,
attribute_name,
attribute_value
FROM products
UNPIVOT (
attribute_value FOR attribute_name IN (color, size, weight)
) AS unpivoted_data;
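
Where UNPIVOT is not supported, UNION ALL gives the same shape. The sketch below runs in SQLite via Python with made-up product rows. Casting weight to text is an assumption so that all three attributes fit in one column, and the IS NOT NULL filters mimic UNPIVOT's default of skipping NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER, color TEXT, size TEXT, weight INTEGER)")
# Hypothetical sample rows; product 2 has no size.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [(1, "red", "M", 200), (2, "blue", None, 150)],
)

unpivoted = conn.execute("""
    -- One SELECT per source column, stacked with UNION ALL
    SELECT product_id, 'color' AS attribute_name, color AS attribute_value
    FROM products WHERE color IS NOT NULL
    UNION ALL
    SELECT product_id, 'size', size
    FROM products WHERE size IS NOT NULL
    UNION ALL
    SELECT product_id, 'weight', CAST(weight AS TEXT)
    FROM products WHERE weight IS NOT NULL
    ORDER BY product_id, attribute_name
""").fetchall()

for row in unpivoted:
    print(row)
```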


7. Finding Consecutive Events
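Find orders placed on consecutive days for the same product by comparing each order date with the previous one via LAG(). The date subtraction assumes order_date is a DATE column, as in PostgreSQL, where the difference is a whole number of days.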

WITH ConsecutiveOrders AS (
SELECT
product_id,
order_date,
LAG(order_date) OVER (PARTITION BY product_id ORDER BY order_date) AS prev_order_date
FROM orders
)
SELECT product_id, order_date, prev_order_date
FROM ConsecutiveOrders
WHERE order_date - prev_order_date = 1;
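
Date arithmetic differs between databases, so here is a sketch of the same idea in SQLite via Python. Dates are stored as ISO text, and the check "previous date plus one day equals this date" replaces the subtraction. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product_id INTEGER, order_date TEXT)")
# Hypothetical sample rows: only product 1 has back-to-back days.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2024-01-01"),
        (1, "2024-01-02"),
        (1, "2024-01-05"),
        (2, "2024-03-10"),
        (2, "2024-03-12"),
    ],
)

consecutive = conn.execute("""
    WITH ConsecutiveOrders AS (
        SELECT
            product_id,
            order_date,
            -- Previous order date for the same product (NULL for the first order)
            LAG(order_date) OVER (PARTITION BY product_id ORDER BY order_date) AS prev_order_date
        FROM orders
    )
    SELECT product_id, order_date, prev_order_date
    FROM ConsecutiveOrders
    -- SQLite's date() adds one day to the previous date for the comparison
    WHERE date(prev_order_date, '+1 day') = order_date
    ORDER BY product_id, order_date
""").fetchall()

for row in consecutive:
    print(row)
```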


8. Aggregation with the FILTER Clause
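Calculate selective aggregates in a single pass (e.g., the average salary for the Sales department alongside the overall average). FILTER is supported in PostgreSQL and SQLite, among others.

SELECT
AVG(salary) AS avg_salary_all,
AVG(salary) FILTER (WHERE department = 'Sales') AS avg_salary_sales
FROM employees;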
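
FILTER is not available in every database. A CASE expression inside the aggregate gives the same result, because AVG ignores the NULLs that CASE produces for non-matching rows. A minimal sketch in SQLite via Python, with made-up salaries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary INTEGER)")
# Hypothetical sample rows for illustration only.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Sales", 100), ("Sales", 200), ("IT", 300)],
)

avg_all, avg_sales = conn.execute("""
    SELECT
        AVG(salary) AS avg_salary_all,
        -- Non-Sales rows yield NULL here, and AVG skips NULLs
        AVG(CASE WHEN department = 'Sales' THEN salary END) AS avg_salary_sales
    FROM employees
""").fetchone()

print(avg_all, avg_sales)
```

Both averages come back from one scan of the table, which is the main reason to use a selective aggregate instead of two separate queries.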


9. JSON Data Extraction
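Extract values from JSON columns directly in SQL. The ->> operator shown here is PostgreSQL syntax and returns the value as text, which is why quantity is cast to INTEGER.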

SELECT 
order_id,
customer_id,
order_details ->> 'product' AS product_name,
CAST(order_details ->> 'quantity' AS INTEGER) AS quantity
FROM orders;
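
JSON syntax varies a lot between databases. As a rough equivalent, the sketch below uses SQLite's json_extract function through Python. It assumes your SQLite build includes JSON support (most recent builds do), and the sample order is made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_details TEXT)")
# Hypothetical sample row; quantity is stored as a JSON string on purpose.
conn.execute(
    "INSERT INTO orders VALUES (?, ?, ?)",
    (1, 10, json.dumps({"product": "Laptop", "quantity": "2"})),
)

extracted = conn.execute("""
    SELECT
        order_id,
        customer_id,
        -- '$.product' is a JSON path pointing at the top-level "product" key
        json_extract(order_details, '$.product') AS product_name,
        CAST(json_extract(order_details, '$.quantity') AS INTEGER) AS quantity
    FROM orders
""").fetchall()

for row in extracted:
    print(row)
```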


10. Using Temporary Tables
Create a temporary table for intermediate results, then join it with other tables.

-- Create a temporary table
CREATE TEMPORARY TABLE temp_product_sales AS
SELECT product_id, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY product_id;

-- Use the temp table
SELECT p.product_name, t.total_sales
FROM products p
JOIN temp_product_sales t ON p.product_id = t.product_id;
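
Here is the same two-step flow as a runnable sketch in SQLite via Python, with made-up products and sales. The temporary table lives only as long as the connection that created it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER, product_name TEXT)")
conn.execute("CREATE TABLE sales (product_id INTEGER, sales_amount INTEGER)")
# Hypothetical sample rows for illustration only.
conn.executemany("INSERT INTO products VALUES (?, ?)", [(1, "Keyboard"), (2, "Mouse")])
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 30), (1, 40), (2, 25)])

# Step 1: store the intermediate aggregate in a temporary table.
conn.execute("""
    CREATE TEMPORARY TABLE temp_product_sales AS
    SELECT product_id, SUM(sales_amount) AS total_sales
    FROM sales
    GROUP BY product_id
""")

# Step 2: join the temporary table like any other table.
totals = conn.execute("""
    SELECT p.product_name, t.total_sales
    FROM products p
    JOIN temp_product_sales t ON p.product_id = t.product_id
    ORDER BY p.product_name
""").fetchall()

for row in totals:
    print(row)
```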


Why These Matter
Advanced SQL queries let you handle complex data manipulation and analysis tasks with ease. From traversing hierarchical relationships to reshaping data (pivot/unpivot) and working with JSON, these techniques expand your ability to derive insights from relational databases.

Keep practicing these queries to solidify your SQL expertise and make more data-driven decisions!


Hope it helps :)
