Aspiring Data Science

#statistics #fitter

Nice to see a familiar name among the authors.

https://medium.com/the-researchers-guide/finding-the-best-distribution-that-fits-your-data-using-pythons-fitter-library-319a5a0972e9

https://fitter.readthedocs.io/en/latest/

Medium

Finding the Best Distribution that Fits Your Data using Python’s Fitter Library

Learn how to identify the best-fitted distribution

39 views, edited 15:18

Aspiring Data Science

#statistics #fitter

How can you compactly describe a 1D array of data of unknown nature in a reasonable amount of time? The fitter authors mention 80+ distributions from scipy (in practice, 109 already get fitted). Even on a modest array of length 1_000 this takes 2.5 minutes. Yet many of the distributions are obviously near twins, and a good half of them probably weren't worth the time. How could this be optimized? Find something like "reference distributions"... Best bang for your buck.

get_common_distributions() returns ["cauchy", "chi2", "expon", "exponpow", "gamma", "lognorm", "norm", "powerlaw", "rayleigh", "uniform"], but by what principle were they chosen?

I'd like to get the fullest possible picture of the data using as few distributions as possible. For example, what if the input data is a mix of 3 distributions?
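A minimal sketch of the "few candidates instead of all 109" idea, written against scipy directly so the cost of each fit is visible. It fits a short, hand-picked list of distributions and ranks them by the sum of squared errors between the histogram density and the fitted PDF, which is similar in spirit to how fitter ranks candidates as far as I understand it (treat that as an assumption). The function name `rank_distributions`, the candidate list, the bin count, and the synthetic gamma sample are all illustrative choices, not anything taken from fitter.

```python
import numpy as np
from scipy import stats


def rank_distributions(data, names, bins=50):
    """Fit each named scipy.stats distribution and rank by histogram SSE."""
    # Empirical density estimate that every candidate is compared against.
    density, edges = np.histogram(data, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2

    results = []
    for name in names:
        dist = getattr(stats, name)
        params = dist.fit(data)           # maximum-likelihood parameter estimates
        pdf = dist.pdf(centers, *params)  # fitted density at the bin centers
        sse = float(np.sum((density - pdf) ** 2))
        results.append((name, sse, params))

    # Smallest error first.
    return sorted(results, key=lambda r: r[1])


# Synthetic sample of length 1_000 (illustrative only).
rng = np.random.default_rng(0)
data = rng.gamma(shape=2.0, scale=1.5, size=1000)

# A hypothetical short list of candidates instead of every scipy distribution.
candidates = ["norm", "expon", "gamma", "lognorm", "uniform"]
ranking = rank_distributions(data, candidates)

for name, sse, _ in ranking:
    print(f"{name:8s} SSE={sse:.4f}")
```

With only five candidates this runs in well under a second on a sample of this size. It does not answer which candidates form a good "reference" set, and a single parametric family will not describe a mix of 3 distributions; that case calls for a mixture model rather than a longer candidate list.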

https://github.com/cokelaer/fitter/issues/81

39 views, edited 16:11

Aspiring Data Science

#statistics

https://www.youtube.com/watch?v=9jW9G8MO4PQ&ab_channel=CassieKozyrkov

YouTube

What is a p-value?

Demystifying one of the trickiest concepts from statistics: p-values!

This video is part 1. Find parts 2+3 here:
http://bit.ly/quaesita_randomplaylist

Same topic in blog form: http://bit.ly/quaesita_puppies

My other articles to help you understand some…

51 views, 20:15

Aspiring Data Science

#statistics #informationtheory #entropy #python #featureselection #featureengineering

Oh well, for now I'll just publish it on my personal blog; apparently the article can be attached to a publication later.

https://medium.com/@fingoldo/15819b261de0

Medium

How to distinguish between structured and random signals in Python

Distinguishing random from structured signals is a fundamental task in statistics, machine learning, and data science in general, as it…

51 views, edited 21:38

Aspiring Data Science

Forwarded from Artem Ryblov’s Data Science Weekly (Artem Ryblov)

Mindful Modeler by Christoph Molnar

The newsletter combines the best of two worlds: the performance mindset of machine learning and the mindfulness of statistical thinking.

Machine learning has become mainstream while falling short in the silliest ways: lack of interpretability, biased and missing data, wrong conclusions, … To statisticians, these shortcomings are often unsurprising. Statisticians are relentless in their quest to understand how the data came about. They make sure that their models reflect the data-generating process and interpret models accordingly.
In a sea of people who basically know how to model.fit() and model.predict() you can stand out by bringing statistical thinking to the arena.
Sign up for this newsletter to combine performance-driven machine learning with statistical thinking. Become a mindful modeller.

You'll learn about:
- Thinking like a statistician while performing like a machine learner
- Spotting non-obvious data problems
- Interpretable machine learning
- Other modelling mindsets such as causal inference and prompt engineering

Link
https://mindfulmodeler.substack.com/

Navigational hashtags: #armknowledgesharing #armnewsletters
General hashtags: #modelling #modeling #ml #machinelearning #statistics #modelinterpretation #data #interpretability #casualinference

@accelerated_learning

Substack

Mindful Modeler | Christoph Molnar | Substack

Better machine learning by thinking like a statistician. About model interpretation, paying attention to data, and always staying critical. Click to read Mindful Modeler, by Christoph Molnar, a Substack publication with tens of thousands of subscribers.

83 views, Anatoly Alekseev, 12:59

Aspiring Data Science

#chess #statistics #simulation #cheating #visualization

https://dorianquelle.github.io/blog/How-To-Catch-Smart-Cheaters/

Personal Website of Dorian Quelle

Titled Tuesday Cheating - Part 2 - Fantastic Cheaters and Where to Find Them

Fantastic Cheaters and Where to Find Them. Using the Elo system to detect cheating in Titled Tuesday tournaments.

83 views, Anatoly Alekseev, edited 17:43

Aspiring Data Science

#chess #statistics #simulation #cheating #visualization #stockfish

Beautiful chess visualizations from Dorian Quelle. The "average loss after novelty" metric is interesting; it looks like he came up with it himself, and it's a great idea.

"Yet, the standout performer is David Howell, with an average score of 0.9071. Despite his tendency to withdraw or join late to tournaments, this strategy seems to benefit him as he avoids facing high-performing players in later rounds. With a record of 120 wins, 14 draws, and only 6 losses, Howell clearly dominates the field."

"Yet, the left plot exposes a glaring inconsistency in Kramnik’s performance. Despite his high accuracy in playing engine-recommended moves, he registers a higher average loss per move than one would expect for a player of his Elo rating. This suggests that while Kramnik plays the engine move in most cases, he is also prone to significant blunders. Playing the fools mate because you’re suspecting a player of cheating doesn’t help either."

https://dorianquelle.github.io/blog/Cheating-In-Titled-Tuesday/

Personal Website of Dorian Quelle

Analysis of Cheating In Titled Tuesday

An in depth computational analysis of cheating in Chess.com Titled-Tuesday tournaments

93 views, Anatoly Alekseev, edited 17:45

Aspiring Data Science

Forwarded from Artem Ryblov’s Data Science Weekly (Artem Ryblov)

Thinking Clearly with Data: A Guide to Quantitative Reasoning and Analysis by Ethan Bueno de Mesquita, Anthony Fowler

An introduction to data science or statistics shouldn’t involve proving complex theorems or memorizing obscure terms and formulas, but that is exactly what most introductory quantitative textbooks emphasize. In contrast, Thinking Clearly with Data focuses, first and foremost, on critical thinking and conceptual understanding in order to teach students how to be better consumers and analysts of the kinds of quantitative information and arguments that they will encounter throughout their lives.

Among much else, the book teaches how to assess whether an observed relationship in data reflects a genuine relationship in the world and, if so, whether it is causal; how to make the most informative comparisons for answering questions; what questions to ask others who are making arguments using quantitative evidence; which statistics are particularly informative or misleading; how quantitative evidence should and shouldn’t influence decision-making; and how to make better decisions by using moral values as well as data.

- An ideal textbook for introductory quantitative methods courses in data science, statistics, political science, economics, psychology, sociology, public policy, and other fields
- Introduces the basic toolkit of data analysis―including sampling, hypothesis testing, Bayesian inference, regression, experiments, instrumental variables, differences in differences, and regression discontinuity
- Uses real-world examples and data from a wide variety of subjects
- Includes practice questions and data exercises

Link: https://www.amazon.com/Thinking-Clearly-Data-Quantitative-Reasoning/dp/0691214352

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #datascience #correlation #regression #causation #randomizedexperiments #statistics

@data_science_links

109 views, Anatoly Alekseev, 00:25

Aspiring Data Science

#statistics #paradoxes

https://www.youtube.com/watch?v=isrvncpdCTk

YouTube

Непараметрика и другие сюжеты статистики. Занятие 1. Самойленко И. А. (Nonparametrics and Other Topics in Statistics. Lesson 1. I. A. Samoylenko)

152 views, Anatoly Alekseev, edited 16:38

Aspiring Data Science

Forwarded from asisakov

77 views, Anatoly Alekseev, 19:59
