Forwarded from Denis Sexy IT 🤖
An English-language GPT-3 model with 2.7 billion parameters is now available online.
It is a replica trained by enthusiasts at eleuther.ai, not by OpenAI; the original GPT-3, with ~175 billion parameters, is sold as a commercial product and licensed exclusively through Microsoft.
All in all, the open-source community is great as always: they also released a smaller model, and everything is at the links below.
Code | Colab
The amount of generated text on the internet has just doubled 🌚
https://arxiv.org/pdf/1912.13318.pdf
LayoutLM: Pre-training of Text and Layout for Document Image Understanding #paper
Russian corpus collection
https://natasha.github.io/corus/
https://github.com/natasha/corus
https://nbviewer.jupyter.org/github/natasha/corus/blob/master/docs.ipynb
#Corpus #Dataset
natasha.github.io
Corus: a collection of Russian-language NLP datasets
Links to public Russian-language datasets, plus a Python package with loader functions
https://github.com/fbdesignpro/sweetviz In-depth EDA (target analysis, comparison, feature analysis, correlation) #Library #Tools
example of use: http://cooltiming.com/SV/SWEETVIZ_REPORT_COMPARED.html
GitHub
GitHub - fbdesignpro/sweetviz: Visualize and compare datasets, target values and associations, with one line of code.
https://github.com/jessevig/bertviz
BertViz is a tool for visualizing attention in the Transformer model, supporting most models from the transformers library (BERT, GPT-2, XLNet, RoBERTa, XLM, CTRL, MarianMT, etc.). #Library #Tools
GitHub
GitHub - jessevig/bertviz: BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
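For context, here is a minimal sketch of the quantity an attention visualizer displays: the per-token attention weights of a single head, computed as a softmax over scaled dot products. This is not BertViz's API; the function names and the toy vectors are made up for illustration, and a real model would supply learned query and key projections.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    # One row per query token; each row is a distribution over key tokens.
    dim = len(keys[0])
    rows = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        rows.append(softmax(scores))
    return rows

# Hypothetical 2-dimensional query/key vectors for three tokens.
tokens = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
weights = attention_weights(vectors, vectors)

for token, row in zip(tokens, weights):
    print(token, [round(w, 3) for w in row])
```

Each printed row sums to 1, which is what the line thickness or color intensity in an attention plot encodes.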
https://github.com/calculatedcontent/weightwatcher
WeightWatcher (WW) is an open-source diagnostic tool for analyzing Deep Neural Networks (DNN) without needing access to training or even test data.
#Diagnostic #Tool
GitHub
GitHub - CalculatedContent/WeightWatcher: The WeightWatcher tool for predicting the accuracy of Deep Neural Networks
https://github.com/tensorflow/lucid
Lucid is a collection of infrastructure and tools for research in neural network interpretability. #Framework
https://github.com/greentfrapp/lucent
GitHub
GitHub - tensorflow/lucid: A collection of infrastructure and tools for research in neural network interpretability.
https://umap-learn.readthedocs.io/en/latest/ Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction. #Framework #Tools
https://elastiknn.com/ Elasticsearch Plugin for Nearest Neighbor Search on dense vectors
#Tools #Library
Elastiknn
Home
Elasticsearch Plugin for Nearest Neighbor Search
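As a rough illustration of what nearest neighbor search on dense vectors means, here is a brute-force baseline in plain Python that ranks stored vectors by cosine similarity to a query. It does not use Elastiknn or Elasticsearch; the document IDs, vectors, and function names are hypothetical, and a plugin of this kind exists precisely because a linear scan like this does not scale to large indexes.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the two vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, vectors, k=2):
    # Score every stored vector, then keep the k most similar IDs.
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy index of dense vectors keyed by document ID.
vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [0.0, 0.0, 1.0],
}

top = nearest_neighbors([1.0, 0.0, 0.0], vectors, k=2)
print(top)
```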
https://arxiv.org/pdf/2103.14030.pdf #paper
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
https://shap.readthedocs.io/en/latest/index.html
SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions.
#Framework
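To make the "classic Shapley values" part concrete, here is a minimal sketch that computes exact Shapley values for a tiny model by averaging each feature's marginal contribution over all feature orderings. It does not call the shap library; the toy model, the all-zeros baseline for absent features, and every name below are assumptions for illustration. The library relies on faster approximations because this enumeration grows factorially with the number of features.

```python
from itertools import permutations

def shapley_values(value_fn, n_features):
    # Average marginal contribution of each feature over every ordering.
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        coalition = set()
        previous = value_fn(frozenset(coalition))
        for feature in order:
            coalition.add(feature)
            current = value_fn(frozenset(coalition))
            totals[feature] += current - previous
            previous = current
    return [t / len(orderings) for t in totals]

# Hypothetical toy model with one interaction term, explained at X.
X = (1.0, 1.0, 1.0)

def toy_model(x):
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]

def value_fn(coalition):
    # Assumption: features outside the coalition are replaced by a 0.0 baseline.
    masked = [X[i] if i in coalition else 0.0 for i in range(len(X))]
    return toy_model(masked)

phi = shapley_values(value_fn, len(X))
print(phi)
```

The attributions add up to the difference between the model output at `X` and at the baseline, and the interaction term `x[0] * x[2]` is split evenly between the two features involved.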
https://docs.determined.ai/latest/index.html#
#Framework distributed training, hyperparameter tuning
https://www.determined.ai/blog/data-version-control-determined
Determined AI
Managing ML Training Data with DVC and Determined
Tracking machine learning data sets made easy with Data Version Control (DVC) and Determined.