Aspiring Data Science

#kaggle #ml #competitions

An interesting target transformation: the fourth root. I hadn't heard of that one before. Also fair loss in xgboost, averaging of neural network weights (once training reaches equilibrium), and training SVR and kNN on subsamples. Correcting xgboost's predictions is just wild; the things these Kagglers get up to :)
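A minimal sketch of two of the ideas mentioned above, the fourth-root target transform and a fair-loss objective. This is not the code from the talk: the function names and the constant `c=2.0` are illustrative assumptions, and the fair loss is taken in its usual form c^2 * (|r|/c - log(1 + |r|/c)) with residual r = prediction - label.

```python
import numpy as np


def to_model_space(y):
    """Fourth-root target transform (assumes y >= 0)."""
    return np.power(y, 0.25)


def from_model_space(z):
    """Inverse of the fourth-root transform."""
    return np.power(z, 4.0)


def fair_objective(preds, labels, c=2.0):
    """Gradient and Hessian of the fair loss w.r.t. the predictions.

    Loss: c^2 * (|r|/c - log(1 + |r|/c)), r = preds - labels.
    The value of c is an illustrative assumption, not taken from the talk.
    """
    r = preds - labels
    denom = np.abs(r) + c
    grad = c * r / denom          # first derivative of the loss
    hess = c ** 2 / denom ** 2    # second derivative, always positive
    return grad, hess
```

With xgboost's native API, a thin wrapper taking `(preds, dtrain)` and calling `dtrain.get_label()` could presumably be passed as the `obj` argument of `xgb.train`; the model would then be trained on `to_model_space(y)` and its predictions mapped back with `from_model_space`.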

https://www.youtube.com/watch?v=p7ArDjMImiI

YouTube

Kaggle Allstate Claims Severity: predicting insurance claim severity, by Alexey Noskov

Alexey Noskov talks about the Kaggle Allstate Claims Severity competition (predicting the severity of an insurance claim), where he took 2nd place. From the video you will learn:
- Whether the target variable needs to be transformed before training models
- How new features can be built…

132 views · Anatoly Alekseev, edited 20:17

#ml #competitions #kaggle #dl #cnn

https://www.youtube.com/watch?v=G5UkWXehS_s

YouTube

Kaggle Planet Understanding: satellite image classification, by Roman Solovyov

Roman @zfturbo Solovyov recently came to one of our training sessions and talked about the satellite image classification task (Kaggle Planet: Understanding the Amazon from Space). Together with Stanislav Semenov, Roman took 3rd place in this competition. From the video…

119 views · Anatoly Alekseev, 14:48

#kaggle #tricks #ml #titericz #featureengineering

Before FE, calculate the correlation coefficient between each raw feature and the target; it's probably better to use only half of the dataset for this, so as not to overfit too badly. By the way, Diogenes can help a lot with estimating (nonlinear) correlations and "interactions".

Combine numerical features: log(A)*log(B), A*exp(B), Rank(A)+Rank(B), sin(A)+cos(B) etc;

Use binary flag for NAs;

Do N-way nested OOF Target Encoding;

Try aggregations of one feature by another;

Try extensive target transformations (TT), such as y^(1/2), y^(1/4), log(10+y), 10/y, etc.;

Try several clustering algos to create new categorical or numerical features based on cluster IDs or distances;

Use tree leaf indices as weak features for linear models (incl. factorization machines);

LOFO feature selection;

Adversarial Validation to tell train apart from test;
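A few of the tips above can be sketched with NumPy alone. This is only an illustration under stated assumptions, not code from the talk: all function names are hypothetical, and the target encoding shown is a plain single-level out-of-fold scheme rather than the N-way nested variant mentioned in the list.

```python
import numpy as np


def na_flag(x):
    """Binary indicator: 1 where the value is missing (NaN), else 0."""
    return np.isnan(x).astype(int)


def rank(x):
    """0-based ordinal rank of each value (ties broken by position)."""
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=int)
    ranks[order] = np.arange(len(x))
    return ranks


def group_mean(values, groups):
    """Aggregate one feature by another: mean of `values` within each group."""
    uniq, inv = np.unique(groups, return_inverse=True)
    sums = np.bincount(inv, weights=values, minlength=len(uniq))
    counts = np.bincount(inv, minlength=len(uniq))
    return (sums / counts)[inv]


def oof_target_encode(cat, y, n_folds=5, seed=0):
    """Single-level out-of-fold target encoding.

    Each row is encoded with the mean target of its category computed
    on the other folds only, so a row never sees its own target.
    """
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds
    enc = np.zeros(len(y), dtype=float)
    for k in range(n_folds):
        train, valid = folds != k, folds == k
        # Fallback for categories that do not occur in the training folds.
        enc[valid] = y[train].mean()
        for c in np.unique(cat[valid]):
            mask = train & (cat == c)
            if mask.any():
                enc[valid & (cat == c)] = y[mask].mean()
    return enc
```

A combined feature such as Rank(A)+Rank(B) would then be `rank(a) + rank(b)`, and the encoded column from `oof_target_encode` can be fed to the model in place of the raw category.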

https://www.youtube.com/watch?v=RtqtM1UJfZc

YouTube

Kaggle Tips for Feature Engineering and Selection | by Gilberto Titericz | Kaggle Days Meetup Madrid

Gilberto Titericz, Kaggle GrandMaster and top-1 in Kaggle Competitions Ranking for years, talks about two important topics in Machine Learning: Feature Engineering and Feature Selection

25 November 2019, Madrid - Part II

141 views · Anatoly Alekseev, edited 22:47

Forwarded from Artem Ryblov’s Data Science Weekly

How to Win a Kaggle Competition by Darek Kłeczek

Darek Kłeczek:

When I join a competition, I research winning solutions from past similar competitions. It takes a lot of time to read and digest them, but it's an incredible source of ideas and knowledge. But what if we could learn from all the competitions? We've been given a list of Kaggle writeups in this competition, but there are so many of them! If only we could find a way to extract some structured data and analyze it... Well, it turns out that large language models (LLMs) [1] can help us extract structured data from unstructured writeups.

In this essay, the author starts with a quick overview of the process he uses to collect data. He then presents several insights from analyzing the datasets. The focus is on understanding what the community has learned over the past 2 years of working and experimenting with Kaggle competitions. Finally, he mentions some ideas for future research.

Link: Kaggle

Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #kaggle #competitions

106 views · Anatoly Alekseev, 07:02

Forwarded from Artem Ryblov’s Data Science Weekly

The Kaggle Book by Konrad Banachewicz and Luca Massaron

Millions of data enthusiasts from around the world compete on Kaggle, the most famous data science competition platform of them all. Participating in Kaggle competitions is a surefire way to improve your data analysis skills, network with an amazing community of data scientists, and gain valuable experience to help grow your career.

The first book of its kind, The Kaggle Book assembles in one place the techniques and skills you'll need for success in competitions, data science projects, and beyond. Two Kaggle Grandmasters walk you through modeling strategies you won't easily find elsewhere, and the knowledge they've accumulated along the way. As well as Kaggle-specific tips, you'll learn more general techniques for approaching tasks based on image, tabular, textual data, and reinforcement learning. You'll design better validation schemes and work more comfortably with different evaluation metrics.

Whether you want to climb the ranks of Kaggle, build some more data science skills, or improve the accuracy of your existing models, this book is for you.

Link: Book

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #ml #machinelearning #featureengineering #kaggle #metrics #validation #hyperparameters #tabular #cv #nlp

@data_science_weekly

98 views · Anatoly Alekseev, 14:36