Data Science by ODS.ai 🦜
The first Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and their applications. To reach the editors, contact: @haarrp
Release of 27 pretrained models for NLP / NLU for PyTorch

Hugging Face has open-sourced a new library that contains up to 27 pretrained models for state-of-the-art NLP/NLU tasks.

Link: https://medium.com/dair-ai/pytorch-transformers-for-state-of-the-art-nlp-3348911ffa5b

#SOTA #NLP #NLU #PyTorch #opensource
A great collection of Data Science learning materials

The list includes free books and online courses on a range of DS-related disciplines:

Machine learning (#ML)
Deep Learning (#DL)
Reinforcement learning (#RL)
#NLP

Tutorials on #Keras, #Tensorflow, #Torch, #PyTorch, #Theano

It also lists notable researchers, papers and even #datasets. It is a great place to start reviewing your knowledge or learning something new.

Link: https://hackmd.io/@chanderA/aiguide

#wheretostart #entrylevel #novice #studycontent #studymaterials #books #MOOC #meta
PyTorch 1.3 released

- named tensors support
- general availability of Google Cloud TPU support
- Captum - SOTA tools for understanding how specific neurons and layers affect the predictions made by a model
- CrypTen - a new research tool for secure machine learning with PyTorch
- many other improvements

Official announcement: https://pytorch.org/blog/pytorch-1-dot-3-adds-mobile-privacy-quantization-and-named-tensors/
Captum website: https://www.captum.ai
CrypTen code: https://github.com/facebookresearch/CrypTen
#DL #PyTorch #TPU #GCP #Captum #CrypTen
🎓 Reinforcement Learning Course from OpenAI

Reinforcement Learning is becoming a significant part of the data scientist's toolbox.
OpenAI created and published one of the best courses in #RL. The algorithm implementations are written in #Tensorflow.
But if you are more comfortable with #PyTorch, we have found a #PyTorch implementation of these algorithms.

OpenAI Course: https://spinningup.openai.com/en/latest/
Tensorflow Code: https://github.com/openai/spinningup
PyTorch Code: https://github.com/kashif/firedup

#MOOC #edu #course #OpenAI
Neighbourhood Components Analysis
A PyTorch implementation of Neighbourhood Components Analysis

NCA learns a linear transformation of the dataset such that the expected leave-one-out performance of kNN in the transformed space is maximized.

The authors propose a novel method for learning a Mahalanobis distance measure to be used in the kNN classification algorithm. The algorithm directly maximizes a stochastic variant of the leave-one-out kNN score on the training set.

It can also learn a low-dimensional linear embedding of labeled data that can be used for data visualization and fast classification. Unlike other methods, this classification model is non-parametric, making no assumptions about the shape of the class distributions or the boundaries between them.

The performance of the method is demonstrated on several data sets, both for metric learning and linear dimensionality reduction.
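
To make the objective described above more concrete, here is a minimal pure-Python sketch of the stochastic leave-one-out kNN score for a fixed linear transformation A. Each point picks a neighbour with probability proportional to exp(-squared distance) in the transformed space, and the score is the expected number of points that pick a neighbour of their own class. The function name `nca_objective` and the toy data are hypothetical and are not taken from the linked repository; the sketch only evaluates the score and does not learn A.

```python
import math


def nca_objective(A, X, y):
    """Expected number of points classified correctly by stochastic
    leave-one-out nearest-neighbour selection in the space z = A x."""
    # Project every point with the linear map A (a list of rows).
    Z = [[sum(a * v for a, v in zip(row, x)) for row in A] for x in X]
    total = 0.0
    for i, zi in enumerate(Z):
        # Unnormalised neighbour weights exp(-||z_i - z_j||^2); a point
        # is never allowed to pick itself, so its own weight is 0.
        w = [
            0.0 if j == i
            else math.exp(-sum((p - q) ** 2 for p, q in zip(zi, zj)))
            for j, zj in enumerate(Z)
        ]
        # Probability that point i picks a neighbour with its own label.
        same_class = sum(wj for wj, yj in zip(w, y) if yj == y[i])
        total += same_class / sum(w)
    return total


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.0], [0.5, 3.0], [3.0, 0.0], [3.5, 3.0]]
y = [0, 0, 1, 1]

keep_signal = [[1.0, 0.0]]  # 1-D embedding that keeps the useful feature
keep_noise = [[0.0, 1.0]]   # 1-D embedding that keeps only the noise

print(nca_objective(keep_signal, X, y))
print(nca_objective(keep_noise, X, y))
```

The projection that keeps the informative feature scores much higher than the one that keeps the noise feature, which is the signal a gradient-based optimizer would follow when learning A.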

paper (only pdf): https://www.cs.toronto.edu/~hinton/absps/nca.pdf
github: https://github.com/kevinzakka/nca

#kNN #pca #nca #PyTorch
StarGAN v2 code release on GitHub

The better news is if you put a human into the animal model you do in fact get out a feline version of the human, and it's even wearing a suit.

GitHub: https://github.com/clovaai/stargan-v2
arXiv: https://arxiv.org/abs/1912.01865
YouTube: https://www.youtube.com/watch?v=0EVh5Ki4dIY&feature=youtu.be

#GAN #StarGAN #PyTorch
Live U-Net implementation online session today

Abhishek Thakur, the first 4x Grandmaster on Kaggle, is going to show you how to implement the original U-Net with #PyTorch.

The session starts 4 hours from now (at 6PM CET / 9.30PM IST). Make sure you turn notifications on if you are interested.

YouTube Link: https://www.youtube.com/watch?v=u1loyDCoGbE

#Livecoding #Unet
ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network

Based on an analysis of representational bottlenecks, the authors propose a set of design principles that significantly improve model performance.

The authors argue that commonly used architectures have a representational bottleneck and try to fix it by expanding channel sizes, using more expand layers, and choosing better activation functions. These changes improve model performance on ImageNet and give good transfer learning results on classification and object detection.
The authors hope that their design ideas can be used by NAS to create even better models.


Paper: https://arxiv.org/abs/2007.00992
Code: https://github.com/clovaai/rexnet

#deeplearning #pretraining #transferlearning #computervision #pytorch
Funnel Activation for Visual Recognition

The authors propose a new activation function for image recognition tasks, called Funnel activation (FReLU), which extends ReLU and PReLU to a 2D activation by adding a spatial condition at negligible overhead.

Extensive experiments on COCO, ImageNet and Cityscapes show significant improvement and robustness.
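
As a rough illustration of the idea, the sketch below applies a funnel-style activation max(x, T(x)) to a single channel in pure Python, where T(x) is a weighted sum over each pixel's k x k neighbourhood (the "spatial condition"). The function name `funnel_activation`, the zero padding, and the fixed averaging kernel are assumptions made for this example; in the paper the window weights are learned per channel and followed by batch normalization, which is omitted here. The linked repository, not this sketch, is the reference implementation.

```python
def funnel_activation(x, kernel, bias=0.0):
    """Apply max(x, T(x)) to one channel given as a list of rows.

    T(x) is a k x k weighted sum around each pixel, with zero padding
    at the borders. The kernel stands in for learned weights.
    """
    h, w = len(x), len(x[0])
    k = len(kernel)
    pad = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            t = bias
            for di in range(k):
                for dj in range(k):
                    ii, jj = i + di - pad, j + dj - pad
                    # Positions outside the image count as zeros.
                    if 0 <= ii < h and 0 <= jj < w:
                        t += kernel[di][dj] * x[ii][jj]
            # The spatial condition replaces the fixed zero used by ReLU.
            row.append(max(x[i][j], t))
        out.append(row)
    return out


# Hypothetical input: one negative pixel surrounded by positive ones.
image = [
    [1.0, 1.0, 1.0],
    [1.0, -2.0, 1.0],
    [1.0, 1.0, 1.0],
]
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

for row in funnel_activation(image, mean_kernel):
    print(row)
```

With an all-zero kernel and zero bias, T(x) is 0 everywhere and the function reduces to plain ReLU, which is one way to see FReLU as an extension of it. With the averaging kernel, the negative centre pixel is replaced by a value that depends on its neighbours rather than being clipped to zero.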


Paper: https://arxiv.org/abs/2007.11824
Code: https://github.com/megvii-model/FunnelAct

#deeplearning #activationfunction #computervision #pytorch
🏥 Self-supervised Learning for Medical Images

Due to standard imaging procedures, medical images (X-ray, CT scans, etc.) are usually well aligned.
This paper uses that alignment to automatically pair similar images for training.

GitHub: https://github.com/fhaghighi/TransVW
arXiv: https://arxiv.org/abs/2102.10680

#biolearning #medical #dl #pytorch #keras
🦜 Hi!

We are the first Telegram Data Science channel.


The channel was started as a collection of notable papers, news and releases shared with the members of the Open Data Science (ODS) community. Over the years of just keeping the thing going, we grew into an independent online media outlet supporting the principles of free and open access to information related to Data Science.


Ultimate Posts

* Where to start learning more about Data Science. https://github.com/open-data-science/ultimate_posts/tree/master/where_to_start
* @opendatascience channel audience research. https://github.com/open-data-science/ods_channel_stats_eda


Open Data Science

ODS.ai is an international community of people connected to Data Science in any way.

Website: https://ods.ai



Hashtags

Over the years we have accumulated a big collection of materials, most of them accompanied by hashtags.

#deeplearning #DL - posts about deep neural networks (> 1 layer)
#cv - posts related to Computer Vision. Pictures and videos
#nlp #nlu - Natural Language Processing and Natural Language Understanding. Texts and sequences
#audiolearning #speechrecognition - related to audio information processing
#ar - augmented reality related content
#rl - Reinforcement Learning (agents, bots and neural networks capable of playing games)
#gan #generation #generatinveart #neuralart - about neural art and image generation
#transformer #vqgan #vae #bert #clip #StyleGAN2 #Unet #resnet #keras #Pytorch #GPT3 #GPT2 - related to specific architectures or frameworks
#coding #CS - content related to software engineering
#OpenAI #microsoft #Github #DeepMind #Yandex #Google #Facebook #huggingface - hashtags related to certain companies
#productionml #sota #recommendation #embeddings #selfdriving #dataset #opensource #analytics #statistics #attention #machine #translation #visualization


Chats

- Data Science Chat https://t.me/datascience_chat
- ODS Slack, via the invite form on the website

ODS resources

* Main website: https://ods.ai
* ODS Community Telegram Channel (in Russian): @ods_ru
* ML trainings Telegram Channel: @mltrainings
* ODS Community Twitter: https://twitter.com/ods_ai

Feedback and Contacts

You are welcome to reach the administrators through the Telegram bot: @opendatasciencebot
Segment Anything

The Segment Anything project aims to democratize image segmentation in computer vision, a core task used across various applications such as scientific imagery analysis and photo editing. Traditionally, accurate segmentation models require specialized expertise, AI training infrastructure, and large amounts of annotated data. This project introduces a new task, dataset, and model for image segmentation to overcome these challenges and make segmentation more accessible.

The researchers are releasing the Segment Anything Model (SAM) and the Segment Anything 1-Billion mask dataset (SA-1B), the largest segmentation dataset to date. These resources will enable a wide range of applications and further research into foundation models for computer vision. The SA-1B dataset is available for research purposes, while SAM is provided under the permissive Apache 2.0 open license. Users can explore the demo to try SAM with their own images.

Paper link: https://arxiv.org/abs/2304.02643

Code link: https://github.com/facebookresearch/segment-anything

Demo link: https://segment-anything.com/demo

Blogpost link: https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/

Dataset link: https://ai.facebook.com/datasets/segment-anything/

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-sam

#deeplearning #cv #pytorch #imagesegmentation #dataset