Data science/ML/AI

📚 Data Science Riddle

A data engineer complains that your model training job is failing in production due to a schema mismatch. What's the root fix?

Anonymous Quiz

12%

Cast data types in code

17%

Skip invalid rows

20%

Retrain with old schema

52%

Use a schema registry

127 voters
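The most-voted option points at a shared schema contract between producers and consumers. The sketch below is not a schema registry; it only illustrates the kind of check such a contract enables before training starts. `EXPECTED_SCHEMA`, its field names, and `schema_errors` are hypothetical names chosen for illustration.

```python
# Hypothetical contract: field name -> expected Python type.
# In a real pipeline this would come from a shared, versioned source.
EXPECTED_SCHEMA = {"user_id": int, "amount": float, "country": str}


def schema_errors(record, schema=EXPECTED_SCHEMA):
    """Return a list of mismatches between a record and the schema."""
    errors = []
    for name, expected_type in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            errors.append(
                f"{name}: expected {expected_type.__name__}, got {actual}"
            )
    # Fields the contract does not know about are reported as well.
    for name in sorted(record.keys() - schema.keys()):
        errors.append(f"unexpected field: {name}")
    return errors


good = {"user_id": 1, "amount": 9.5, "country": "DE"}
bad = {"user_id": "1", "amount": 9.5}

print(schema_errors(good))
print(schema_errors(bad))
```

Failing fast on a non-empty error list surfaces the mismatch at the pipeline boundary instead of deep inside the training job.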

Data science/ML/AI

K-Means Clustering


Data science/ML/AI

Covariance vs. Correlation: Same Family, Different Story

People use them interchangeably, but they measure different things.

Covariance tells you the direction of a linear relationship (positive or negative), but its magnitude depends on the units of the variables.
Correlation goes further: it tells you the strength, normalized between -1 and 1.

So while covariance can be 2345.67, correlation says 0.92: clear, interpretable, scale-free.

Covariance shows movement, correlation shows consistency.
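A minimal sketch of the scale point, using made-up height and weight values (the data and function names are illustrative, not from the post). Changing height from metres to centimetres multiplies the covariance by 100, while the correlation stays the same. Python 3.10+ also ships `statistics.covariance` and `statistics.correlation`; the hand-written versions here just keep the formulas visible.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Made-up example data.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [52.0, 58.0, 69.0, 74.0, 85.0]
height_cm = [h * 100 for h in height_m]  # same heights, different unit

print("cov (m): ", covariance(height_m, weight_kg))
print("cov (cm):", covariance(height_cm, weight_kg))
print("corr (m): ", correlation(height_m, weight_kg))
print("corr (cm):", correlation(height_cm, weight_kg))
```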


Data science/ML/AI

📚 Data Science Riddle

You're processing a dataset with frequent schema evolution. Which format handles it most gracefully?

Anonymous Quiz

152 voters

Data science/ML/AI

Eigenvalues & Eigenvectors — Why PCA Actually Works

You’ve heard of PCA. But what’s really happening underneath?

PCA finds the directions (vectors) where your data varies the most.

Those directions are the eigenvectors of the covariance matrix, and the eigenvalues tell you how much variance each one captures.

You’re basically rotating your data to find its “natural axes.”

PCA isn’t only compression; it’s discovering the axes along which your data naturally varies.
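The idea can be sketched for 2-D data, where the eigen-decomposition of the covariance matrix has a closed form. This is an illustration with made-up points and hypothetical helper names, not a general PCA implementation; for real data one would typically call a library routine such as `numpy.linalg.eigh` on the covariance matrix.

```python
from math import hypot, sqrt


def covariance_matrix_2d(points):
    """Return (sxx, sxy, syy), the entries of the 2x2 sample covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def eigen_symmetric_2x2(a, b, c):
    """Eigenpairs of [[a, b], [b, c]], largest eigenvalue first."""
    if b == 0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        pairs = [(a, (1.0, 0.0)), (c, (0.0, 1.0))]
        return sorted(pairs, key=lambda p: p[0], reverse=True)
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b ** 2)
    pairs = []
    for lam in (mean + radius, mean - radius):
        # (b, lam - a) solves (A - lam * I) v = 0 when b != 0.
        vx, vy = b, lam - a
        norm = hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


# Made-up points lying close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

sxx, sxy, syy = covariance_matrix_2d(points)
(lam1, v1), (lam2, v2) = eigen_symmetric_2x2(sxx, sxy, syy)

print("first principal direction:", v1)
print("share of variance on it:", lam1 / (lam1 + lam2))
```

For these points the first direction comes out close to the diagonal and carries nearly all of the variance, which is the "natural axis" the post describes.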


Data science/ML/AI

📚 Data Science Riddle

Your Spark job fails due to executor memory pressure. What is the most effective optimization?

Anonymous Quiz

More shuffle partitions

17%

Persist fewer objects

107 voters

Data science/ML/AI

Forwarded from Programming, data science, ML - free courses by Big Data Specialist

BigDataAnalytics-Lecture.pdf

10.2 MB

Notes on HDFS, MapReduce, YARN, Hadoop vs. traditional systems and much more... from Columbia University.


Data science/ML/AI

📚 Data Science Riddle

You fit a forecasting model and residuals show increasing variance. What is needed?

Anonymous Quiz

124 voters

Data science/ML/AI

4 Pillars of Data Science


Data science/ML/AI

AI vs Machine Learning vs Deep Learning vs Generative AI


Data science/ML/AI

📚 Data Science Riddle

A numeric feature has many repeated exact values with occasional jumps. What type of variable is this?

Anonymous Quiz

135 voters

Data science/ML/AI

Machine Learning Notes.pdf

226.8 KB

Stanford CS lecture notes covering supervised and unsupervised algorithms, neural networks, and SVMs, with math proofs and Python pseudocode.


Data science/ML/AI

Kafka 101


Data science/ML/AI

📚 Data Science Riddle

Two team members run the same notebook but get different results. What's the culprit?

Anonymous Quiz

143 voters

Data science/ML/AI

The Simplest Machine Learning Cheatsheet


Data science/ML/AI

📚 Data Science Riddle

A query runs slowly due to large table scans. What's the most targeted fix?

Anonymous Quiz

129 voters

Data science/ML/AI

Everything You Need to Know About Databricks


Data science/ML/AI

📚 Data Science Riddle

You want to detect extreme values visually in one plot. Which one is best?

Anonymous Quiz

173 voters

Data science/ML/AI

Mining of Massive Datasets (Leskovec, Stanford).pdf

2.9 MB

The Big Data bible from Stanford: MapReduce, Spark, recommendation systems, PageRank, locality-sensitive hashing, large-scale machine learning, and mining social networks and streams, all explained clearly with real algorithms you can code today. 500 pages of pure gold.


Data science/ML/AI

If you want to become a Data Scientist, this is the path to follow.


Data science/ML/AI

📚 Data Science Riddle

You want to prevent inconsistent data across environments. What helps most?

Anonymous Quiz

29%

Checkpoints

17%