Aspiring Data Science

#ml #shap #interpretability #robustness

You are working on explaining how features contribute to the outcome of some business process. You have trained a decent model (on the test set its predictions are significantly better than random guessing, aka DummyClassifier). You computed the Shapley values, sorted them by absolute value, and are about to present them in a paper, to management, or to a client. Then you rerun the training and see that with a new classifier instance the feature ranking has changed substantially: some features dropped down the list, others shot up.
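A minimal sketch of how such a reshuffle can show up, under loud assumptions: instead of a classifier and the shap package it uses a plain least-squares linear model, where the SHAP value of feature j on sample i is w_j * (x_ij - mean_j) if features are treated as independent, and "rerunning the training" is imitated by bootstrap resampling. The synthetic data and the names linear_shap and importance_ranking are made up for illustration.

```python
import numpy as np

def linear_shap(X, w):
    # For a linear model with features treated as independent,
    # the SHAP value of feature j on sample i is w_j * (x_ij - mean_j).
    return (X - X.mean(axis=0)) * w

def importance_ranking(X, y):
    # Ordinary least squares with an intercept column appended.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    phi = linear_shap(X, coef[:-1])
    importance = np.abs(phi).mean(axis=0)  # mean |SHAP| per feature
    # Feature indices sorted from most to least important.
    return tuple(int(j) for j in np.argsort(-importance))

# Synthetic data (assumption): x1 is a near-duplicate of x0.
rng = np.random.default_rng(0)
n = 300
x0 = rng.normal(size=n)
x1 = x0 + 0.05 * rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = x0 + x1 + 0.5 * x2 + rng.normal(size=n)

# "Retrain" 20 times on bootstrap resamples and collect the rankings.
rankings = []
for seed in range(20):
    idx = np.random.default_rng(seed).integers(0, n, size=n)
    rankings.append(importance_ranking(X[idx], y[idx]))

# Print each distinct ranking and how often it occurred.
for order in sorted(set(rankings)):
    print(order, rankings.count(order))
```

If more than one ranking gets printed, the order depends on the particular fit, so it seems safer to report rankings aggregated over several refits than the list from a single one.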


#ml #interpretability #shap #facet

An interesting new idea built on top of SHAP. As is well known, SHAP gives each feature a value for every sample, and that value shifts the prediction away from the mean. So why not treat a feature as a real-valued vector of these values across all samples? That lets us do math on the vectors/features: compute angles between them, for example, and thereby identify features that are synergistic, orthogonal, or redundant. This is nicely implemented in the Facet library (pip install gamma-facet). It just seems to me they overlooked that retraining the model yields different SHAP values (especially under multicollinearity), so you cannot rely on a single fit.
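A rough sketch of the bare geometric idea only: take an (n_samples, n_features) matrix of SHAP values, treat each column as a vector, and compute pairwise cosines and angles. This is not Facet's API and not its actual decomposition. The helper shap_vector_cosines and the toy matrix are made up for illustration.

```python
import numpy as np

def shap_vector_cosines(shap_values):
    # Each column is one feature's vector of per-sample SHAP values.
    norms = np.linalg.norm(shap_values, axis=0)
    # Guard against all-zero columns to avoid division by zero.
    unit = shap_values / np.where(norms == 0, 1.0, norms)
    # Entry (a, b) is the cosine of the angle between features a and b.
    return unit.T @ unit

# Toy SHAP matrix (assumption): feature 1 is feature 0 scaled by 2,
# feature 2 is orthogonal to feature 0.
phi = np.array([
    [ 1.0,  2.0,  1.0],
    [-1.0, -2.0,  1.0],
    [ 2.0,  4.0, -1.0],
    [-2.0, -4.0, -1.0],
])

cos = shap_vector_cosines(phi)
angles = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
print(np.round(cos, 3))
print(np.round(angles, 1))
```

On real data the phi matrix would come from a SHAP explainer. Given the caveat about refits, it may be worth computing these angles on several retrained models and checking how much they move before trusting them.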

The talk shows a real example from the construction industry with CV and HPT. Their simulation plots are very interesting and good-looking.

https://www.youtube.com/watch?v=fTQhfxZxavQ&ab_channel=PyData

YouTube

Jan Ittner & Mateusz Sokół - Exploring Feature Redundancy and Synergy with FACET 2.0

www.pydata.org

Understanding dependencies between features is crucial in the process of developing and interpreting black-box ML models. Mistreating or neglecting this aspect can lead to incorrect conclusions and, consequentially, sub-optimal or wrong decisions…



Forwarded from Artem Ryblov’s Data Science Weekly (Artem Ryblov)

Mindful Modeler by Christoph Molnar

The newsletter combines the best of two worlds: the performance mindset of machine learning and the mindfulness of statistical thinking.

Machine learning has become mainstream while falling short in the silliest ways: lack of interpretability, biased and missing data, wrong conclusions, … To statisticians, these shortcomings are often unsurprising. Statisticians are relentless in their quest to understand how the data came about. They make sure that their models reflect the data-generating process and interpret models accordingly.
In a sea of people who basically know how to model.fit() and model.predict() you can stand out by bringing statistical thinking to the arena.
Sign up for this newsletter to combine performance-driven machine learning with statistical thinking. Become a mindful modeller.

You'll learn about:
- Thinking like a statistician while performing like a machine learner
- Spotting non-obvious data problems
- Interpretable machine learning
- Other modelling mindsets such as causal inference and prompt engineering

Link
https://mindfulmodeler.substack.com/

Navigational hashtags: #armknowledgesharing #armnewsletters
General hashtags: #modelling #modeling #ml #machinelearning #statistics #modelinterpretation #data #interpretability #casualinference

@accelerated_learning

Substack

Mindful Modeler | Christoph Molnar | Substack

Better machine learning by thinking like a statistician. About model interpretation, paying attention to data, and always staying critical. Click to read Mindful Modeler, by Christoph Molnar, a Substack publication with tens of thousands of subscribers.


Anatoly Alekseev, 12:59
