Evals in Data Science
Building models is fun… but here's the real test: is your model actually any good, or just pretending?
Evaluations, or evals, are our model's report card. They tell us:
- For a spam filter: Do we catch all spam (recall) without misclassifying grandma's emails as junk (precision)?
- For price prediction: How close are our predictions on average (RMSE)?
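To make the two examples above concrete, here is a minimal sketch of the three metrics in plain Python. The labels and prices are made up for illustration, and the helper names `precision_recall` and `rmse` are hypothetical, not taken from any library.

```python
import math


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = spam, 0 = not spam)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # grandma's email flagged as junk
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam that slipped through
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative data only (assumed, not from a real model).
spam_true = [1, 1, 1, 0, 0, 0, 0, 1]
spam_pred = [1, 1, 0, 0, 1, 0, 0, 1]
precision, recall = precision_recall(spam_true, spam_pred)

price_true = [200.0, 150.0, 300.0]
price_pred = [210.0, 140.0, 320.0]
price_rmse = rmse(price_true, price_pred)

print(f"precision={precision:.2f} recall={recall:.2f} rmse={price_rmse:.2f}")
```

Because RMSE squares each error before averaging, a single large miss weighs more than several small ones, so it can read worse than the typical error would suggest.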
But evals aren't just about numbers - they influence trust, fairness, and real-world usefulness of our models.
Discussion prompts:
- What's your go-to evaluation metric and why?
- Seen a model that looked great on paper but flopped in reality?
- Should fairness & usability be considered first-class evaluation metrics alongside accuracy?
Free book to dive deeper:
- Fairness and Machine Learning: rigorous, practical guide to evaluating models for fairness: https://fairmlbook.org/
Drop your thoughts below ⬇️