Excel vs SQL vs Python (pandas):
1οΈβ£ Filtering Data
β³ Excel: =FILTER(A2:D100, B2:B100>50) (Excel 365 users)
β³ SQL: SELECT * FROM table WHERE column > 50;
β³ Python: df_filtered = df[df['column'] > 50]
2οΈβ£ Sorting Data
β³ Excel: Data β Sort (or =SORT(A2:A100, 1, TRUE))
β³ SQL: SELECT * FROM table ORDER BY column ASC;
β³ Python: df_sorted = df.sort_values(by="column")
3οΈβ£ Counting Rows
β³ Excel: =COUNTA(A:A)
β³ SQL: SELECT COUNT(*) FROM table;
β³ Python: row_count = len(df)
4οΈβ£ Removing Duplicates
β³ Excel: Data β Remove Duplicates
β³ SQL: SELECT DISTINCT * FROM table;
β³ Python: df_unique = df.drop_duplicates()
5οΈβ£ Joining Tables
β³ Excel: Power Query β Merge Queries (or VLOOKUP/XLOOKUP)
β³ SQL: SELECT * FROM table1 JOIN table2 ON table1.id = table2.id;
β³ Python: df_merged = pd.merge(df1, df2, on="id")
6οΈβ£ Ranking Data
β³ Excel: =RANK.EQ(A2, $A$2:$A$100)
β³ SQL: SELECT column, RANK() OVER (ORDER BY column DESC) AS rank FROM table;
β³ Python: df["rank"] = df["column"].rank(method="min", ascending=False)
7οΈβ£ Moving Average Calculation
β³ Excel: =AVERAGE(B2:B4) (manually for rolling window)
β³ SQL: SELECT date, AVG(value) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg FROM table;
β³ Python: df["moving_avg"] = df["value"].rolling(window=3).mean()
8οΈβ£ Running Total
β³ Excel: =SUM($B$2:B2) (drag down)
β³ SQL: SELECT date, SUM(value) OVER (ORDER BY date) AS running_total FROM table;
β³ Python: df["running_total"] = df["value"].cumsum()
1οΈβ£ Filtering Data
β³ Excel: =FILTER(A2:D100, B2:B100>50) (Excel 365 users)
β³ SQL: SELECT * FROM table WHERE column > 50;
β³ Python: df_filtered = df[df['column'] > 50]
2οΈβ£ Sorting Data
β³ Excel: Data β Sort (or =SORT(A2:A100, 1, TRUE))
β³ SQL: SELECT * FROM table ORDER BY column ASC;
β³ Python: df_sorted = df.sort_values(by="column")
3οΈβ£ Counting Rows
β³ Excel: =COUNTA(A:A)
β³ SQL: SELECT COUNT(*) FROM table;
β³ Python: row_count = len(df)
4οΈβ£ Removing Duplicates
β³ Excel: Data β Remove Duplicates
β³ SQL: SELECT DISTINCT * FROM table;
β³ Python: df_unique = df.drop_duplicates()
5οΈβ£ Joining Tables
β³ Excel: Power Query β Merge Queries (or VLOOKUP/XLOOKUP)
β³ SQL: SELECT * FROM table1 JOIN table2 ON table1.id = table2.id;
β³ Python: df_merged = pd.merge(df1, df2, on="id")
6οΈβ£ Ranking Data
β³ Excel: =RANK.EQ(A2, $A$2:$A$100)
β³ SQL: SELECT column, RANK() OVER (ORDER BY column DESC) AS rank FROM table;
β³ Python: df["rank"] = df["column"].rank(method="min", ascending=False)
7οΈβ£ Moving Average Calculation
β³ Excel: =AVERAGE(B2:B4) (manually for rolling window)
β³ SQL: SELECT date, AVG(value) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg FROM table;
β³ Python: df["moving_avg"] = df["value"].rolling(window=3).mean()
8οΈβ£ Running Total
β³ Excel: =SUM($B$2:B2) (drag down)
β³ SQL: SELECT date, SUM(value) OVER (ORDER BY date) AS running_total FROM table;
β³ Python: df["running_total"] = df["value"].cumsum()
β€3
Looks weird? You're not alone.
Python caches small integers between -5 and 256 for performance. So a and b point to the same object. But x and y are different objects, even though they have the same value.
This is part of Python's internal optimization. Itβs not about math. itβs about memory references under the hood.
Bottom Line?
Use == when you care about value, not identity.
Because sometimes
Python caches small integers between -5 and 256 for performance. So a and b point to the same object. But x and y are different objects, even though they have the same value.
This is part of Python's internal optimization. Itβs not about math. itβs about memory references under the hood.
Bottom Line?
Use == when you care about value, not identity.
print(x == y)
# True β
Because sometimes
is
isnβt what you think it is πβ€5