Data Science by ODS.ai 🦜
51.2K subscribers
359 photos
32 videos
7 files
1.51K links
First Telegram Data Science channel. Covering all technical and popular staff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of former. To reach editors contact: @haarrp
Download Telegram
Now it is official.

I've started writing a book on JAX. This seems to be the first book ever on this topic.

For those who don't know, JAX is an exceptionally cool numeric computations library from Google, a kind of NumPy on steroids, with autodiff, XLA compilation, and hardware acceleration on TPU/GPU. JAX also brings the functional programming paradigm to deep learning.

JAX is heavily used for deep learning and already pretends to be the deep learning framework #3. Some companies, like DeepMind, have already switched to JAX internally. There are rumors that Google also switches to JAX.

JAX ecosystem is constantly growing. There are a lot of high-quality deep learning-related modules. But JAX is not limited to deep learning. There are many exciting applications and libraries on top of JAX for physics, including molecular dynamics, fluid dynamics, rigid body simulation, quantum computing, astrophysics, ocean modeling, and so on. There are libraries for distributed matrix factorization, streaming data processing, protein folding, and chemical modeling, with other new applications emerging constantly.

Anyway, it's a perfect time to start learning JAX!

The book is available today as a part of the Manning Early Access Program (MEAP), so you can read the book as I write it 🙂 This is a very smart way of learning something new, you do not have to wait until the complete book is ready. You can start learning right away, and at the moment the book is published, you already know everything. Your feedback will also be very valuable, and you can influence how the book is made.

Here's a link to the book: http://mng.bz/QvAG

If you want a decent discount, use the discount code mlsapunov. It will provide you with 40% off, and it's valid through August 11th.
Frankfurt Data Science welcome you back to the onsite event in Frankfurt-am-Main! As usual special - a demo of the new #dalle2 model by statworx team, Sebastian Heinz and Fabian Müller, socialising time, and a cocktail bar with free cocktails. Frankfurt DS Community thanks SAS for supporting and TechQuartier for hosting! They are looking for more supporters.

Event details:
Date: 24th August 2022
Time: 6PM - 11PM
Location: TechQuartier, Frankfurt

Register here
Forwarded from Kali Novskaya (Tatiana Shavrina)
#nlp #про_nlp

Друзья, так как много из моих статей и статей коллег выходит на англ языке, я решила завести отдельную группу для общения про многоязычность в NLP: добавляйтесь!
В группе будут посты и обсуждения новых важных работ по теме — кажется, что это первый чат в tg такого рода.
https://t.me/multilingual_nlp

Multilingual NLP discussion group

I
've decided to create a group on #multilinguality so we could have a place in telegram to discuss important new papers and share cool results during conferences, meetups and public discussions!

Group is public, feel free to share the link!
https://t.me/multilingual_nlp
#opensource : RuLeanALBERT от Yandex Research
2.9B трансформер для русского, которая влезет в домашнюю ПеКарню ресерчера

Мало того, что это самая большая БЕРТ-подобная модель для русского языка, которая показывает крутые результаты в бенчмарках, так еще и с кодом для fine-tuning-а

GitHub

А в статье можете узнать, как обучалась эта модель (а-ля коллаборативное глубокое обучение) на фреймворке по децентрализованному обучению Hivemind
Kolesa Conf 2022 — Kolesa Group's conference for IT community

Kolesa Group will host its traditional conference on 8 October. Registration is open. Kolesa Conf is an annual Kazakhstan IT conference where leading experts from the best IT companies of the CIS take part.

One of the tracks is devoted to Data. There will be 12 speakers. Some of featured reports:

• «Methodology of construction of cloud CDW: case Air Astana».
• How to correctly estimate the sample size and time of the test. And what non-obvious difficulties can be encountered along the way.
• «Creation of a synthetic control group with the help of Propensity Score Matching».
• How the dynamic minimum cheque (surge pricing) hated by users was developed and implemented.
• «Analytics at the launch of a new product. An example of market place spare parts»

Conference website: https://bit.ly/3UmCoWC

#conference #it #data #analysis
Wanna have impromptu Data Science breakfast tomorrow in the center of Helsinki. Please dm @malev if you are interested
🔥Out of One, Many: Using Language Models to Simulate Human Samples

TLDR: GPT-3 has unexpected application — modelling of socialogical studies. Average responses of a certain groups can be to some algorithmical accuracy predicted by in silico modelling.

What this means: sociologists won’t have to conduct costly live researches and will be able to run experiments in simulations. Marketers and politicians are getting their hands on cheap solution for modelling their slogans and value propositions. This enables people to check more hypothesis faster and to manipulate society with more efficiency.

ArXiV: https://arxiv.org/abs/2209.06899

#gpt3 #psychohistory #nlu #sociology
Please open Telegram to view this post
VIEW IN TELEGRAM
CORL: Offline Reinforcement Learning Library

Offline RL is a fresh paradigm that makes RL similar to the supervised learning, thus making it better applicable to the real-world problems. There is a whole bunch of recently developed Offline RL aglorithms, however, there was nots many of reliable open-sourced implementations for such algorithms

Our friends from @tinkoffai do some research in this direction and they recently open-sourced their internal offline RL library.

The main features are:
- Single-file implementations
- SOTA algorithms (Decision Transformer, AWAC, BC, CQL, IQL, TD3+BC, SAC-N, EDAC)
- Benchmarked on widely used D4RL datasets (results match performances reported in the original papers)
- Wandb logs for all of the experiments

Hope you will like it and the whole new world of Offline RL!

Github: https://github.com/tinkoff-ai/CORL

#tinkoff #RL #offline_lib
State of AI Report 2022 - ONLINE.pdf
22.9 MB
State of AI Report 2022

TLDR: We are moving forward and effective international collaboration is the key to progress.

Major Themes:

* New independent research labs are rapidly open sourcing the closed source output of major labs
* Safety is gaining awareness among major AI research entities
* The China-US AI research gap has continued to widen
* AI-driven scientific research continues to lead to breakthroughs

Website: https://www.stateof.ai

#report #stateofai #AI
Just in case you didn’t know what to wear for Halloween party.
Amos: An Adam-style Optimizer with Adaptive Weight Decay towards Model-Oriented Scale

Amos is a new optimizer that we propose to pre-train large language models. It is more efficient and converges faster than AdamW: ≤ 51% memory for slot variables, and better valid loss within ≤ 70% training time!Amos is a new optimizer that we propose to pre-train large language models. It is more efficient and converges faster than AdamW: ≤ 51% memory for slot variables, and better valid loss within ≤ 70% training time!

ArXiV: https://arxiv.org/abs/2210.11693

#NLU #NLP #optimizer
Reinforcement learning course from MIPT.

The course consists of:
- Theoretical and practical material for beginners and advanced users
- Classical approaches based on utility functions and strategy gradient, as well as modern trends in improving the efficiency of the study of the environment, interaction with planning, using memory and hierarchical approaches.
- The best of David Silver's lectures, Sutton and Barto's book, and OpenAI, DeepMind works and articles from 2019-2022.

Materials:
- PDF slides and video lectures on each topic, Colab master classes and video lectures in Russian.

Course: https://clck.ru/32a3c9

If you are interested in an internship at MIPT in the areas of Reinforcement Learning, Computer Vision, Robotics or Self Driving Cars, you can apply here: https://cogmodel.mipt.ru/internship
(RU lang)
#events : ML-тренировка
Когда: 17 (четверг) ноября 2022, 19:00 - 21:30 (сбор с 18:00)
Место: офис Яндекса (Москва, улица Льва Толстого, 16) + онлайн
Язык - русский

В этот раз нас ждёт 3 доклада:
- призер только что завершившегося Yandex ML Cup,
- 2ое место хакатона AgroCode Hack по анализу спутниковых снимков для виноградников
- организатор ML соревнований в информационной безопасности

Подробная программа по ссылке ниже
Будем рады видеть всех очно и онлайн ;)
Регистрация обязательна
🔥Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding

TLDR: Scientists kinda learned how to read thoughts. Paper on the reconstruction of the visual stimuli based on fMRI readings.

Website: https://mind-vis.github.io
Github: https://github.com/zjc062/mind-vis

#fMRI #visualstimulireconstruction #mindreading #dl
Data Science by ODS.ai 🦜
Some stats to get the perspective of the development of #dalle «Used 1000 prompts in Dalle over the last 2 days, about 9 hours each day. Of those, saved ~300. 50 I like enough to share w/ socials. 12 enough to rework for future projects. 3 were perfect,…
Tips & Tricks on Image Generation

Generating images with AI tools is a skill, which can be improved and enhanced. So here is couple of articles, covering tips & tricks on how to generate better images with #midjourney. Most interesting one is #huggingface prompt generator, which uses #NLP model to generate sample prompts.

As an example, we tried to reproduce and improve our group avatar, following ideas in the articles. Prompt for an illustration to this post was generated with query ferrofluids in form of a brain, beautiful connections chaos, swirling black network --ar 3:4 --iw 9 --q 2 --s 1250

Midjourney Prompt Generator: https://huggingface.co/spaces/doevent/prompt-generator
List of Midjourney prompts: https://www.followchain.org/midjourney-prompts/
An advanced guide to writing prompts for Midjourney ( text-to-image): https://medium.com/mlearning-ai/an-advanced-guide-to-writing-prompts-for-midjourney-text-to-image-aa12a1e33b6

#visualization #gan #generation #generatinveart #aiart #artgentips