Deep Learning

https://elastiknn.com/ Elasticsearch Plugin for Nearest Neighbor Search on dense vectors
#Tools #Library

Elastiknn

Home

Elasticsearch Plugin for Nearest Neighbor Search

85 viewsVadim, 20:54

Deep Learning

https://github.com/EthicalML/awesome-production-machine-learning #Library #Frameworks #Tools

GitHub

GitHub - EthicalML/awesome-production-machine-learning: A curated list of awesome open source libraries to deploy, monitor, version…

A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning - EthicalML/awesome-production-machine-learning

92 viewsVadim, 11:39

Deep Learning

https://arxiv.org/pdf/2103.14030.pdf #paper
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

75 viewsVadim, 05:25

Deep Learning

https://shap.readthedocs.io/en/latest/index.html
SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions.
#Framework

76 viewsVadim, 23:43

Deep Learning

https://www.datagym.ai/ #annotation #tools

81 viewsVadim, 02:25

Deep Learning

https://www.quantamagazine.org/computer-scientists-prove-why-bigger-neural-networks-do-better-20220210/

Quanta Magazine

Computer Scientists Prove Why Bigger Neural Networks Do Better

Two researchers show that for neural networks to be able to remember better, they need far more parameters than previously thought.

77 viewsVadim, 05:33

Deep Learning

https://docs.determined.ai/latest/index.html#
#Framework distributed training, hyperparameter tuning

https://www.determined.ai/blog/data-version-control-determined

Determined AI

Managing ML Training Data with DVC and Determined

Tracking machine learning data sets made easy with Data Version Control (DVC) and Determined.

62 viewsVadim, edited 03:18

Deep Learning

https://keras.io/keras_tuner/
#Framework KerasTuner is an easy-to-use, scalable hyperparameter optimization framework that solves the pain points of hyperparameter search.

keras.io

Keras documentation: KerasTuner

❤1

67 viewsVadim, 03:24

Deep Learning

https://www.microsoft.com/en-us/research/project/document-ai/
Microsoft Document AI (Intelligent Document Processing) #Framework

61 viewsVadim, 11:11

Deep Learning

https://www.linkedin.com/posts/smasis_machinelearning-math-datascience-activity-6951137542079467520-4Bb8/
grid search vs Bayesian Optimization for hyperparameter tuning

It irks me to see that grid search is still the most popular 𝗵𝘆𝗽𝗲𝗿𝗽𝗮𝗿𝗮𝗺𝗲𝘁𝗲𝗿 - Serg Masís on LinkedIn | 96 comments

It irks me to see that grid search is still the most popular 𝗵𝘆𝗽𝗲𝗿𝗽𝗮𝗿𝗮𝗺𝗲𝘁𝗲𝗿 𝘁𝘂𝗻𝗶𝗻𝗴 method despite being usually the most inefficient... 96 comments on LinkedIn

58 viewsVadim, 12:38

Deep Learning

https://github.com/yzhao062/pyod outlier/anomaly detection - many methods in one library #framework #library

GitHub

GitHub - yzhao062/pyod: A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques

A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques - yzhao062/pyod

62 viewsVadim, 11:28

Deep Learning

https://faker.readthedocs.io/en/stable/index.html
https://sdv.dev/
https://gretel.ai/synthetics

Synthetic Data Generators! #Frameworks #Library

sdv.dev

The Synthetic Data Vault. Put synthetic data to work!

The Synthetic Data Vault (SDV) enables end users to easily generate synthetic data for different data modalities, including single table, relational and time series data.

76 viewsVadim, 19:43

Deep Learning

https://medmnist.com/
MedMNIST: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification
#Dataset

75 viewsVadim, 05:06

Deep Learning

https://arxiv.org/pdf/2208.07339.pdf
https://huggingface.co/blog/hf-bitsandbytes-integration
#Performance

huggingface.co

A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

79 viewsVadim, edited 23:46

Deep Learning

https://landing.ai/tips-for-a-data-centric-ai-approach/ #tips Andrew Ng

Landing AI

The Data-Centric AI Approach: Tips and Data Labeling Examples

AI systems are built using code and data. Read about applying the data-centric AI approach with principles, tips, and data labeling examples at Landing AI.

100 viewsVadim, 17:50

Deep Learning

https://github.com/clovaai/donut #Frameworks #Model Document Understanding Transformer

GitHub

GitHub - clovaai/donut: Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator…

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022 - clovaai/donut

👍1

92 viewsVadim ☯️, edited 16:07

Deep Learning

High-Performance Large-Scale Image Recognition Without Normalization
https://arxiv.org/pdf/2102.06171.pdf #Paper

92 viewsVadim ☯️, 20:57

Deep Learning

https://github.com/jsbroks/coco-annotator #Frameworks #Tool Image segmentation tool

GitHub

GitHub - jsbroks/coco-annotator: :pencil2: Web-based image segmentation tool for object detection, localization, and keypoints

:pencil2: Web-based image segmentation tool for object detection, localization, and keypoints - jsbroks/coco-annotator

99 viewsVadim ☯️, edited 00:45

Deep Learning

https://tf-explain.readthedocs.io/en/latest/index.html
tf-explain offers interpretability methods for Tensorflow 2.0 to ease neural network’s understanding.
#Frameworks

84 viewsVadim ✨🪁, 21:50

Deep Learning

#Tips Efficient Training Large Models on Multiple GPUs, Main Concepts (from https://huggingface.co/docs/transformers/perf_train_gpu_many):

DataParallel (DP) - the same setup is replicated multiple times, and each being fed a slice of the data. The processing is done in parallel and all setups are synchronized at the end of each training step.
TensorParallel (TP) - each tensor is split up into multiple chunks, so instead of having the whole tensor reside on a single gpu, each shard of the tensor resides on its designated gpu. During processing each shard gets processed separately and in parallel on different GPUs and the results are synced at the end of the step. This is what one may call horizontal parallelism, as the splitting happens on horizontal level.
PipelineParallel (PP) - the model is split up vertically (layer-level) across multiple GPUs, so that only one or several layers of the model are places on a single gpu. Each gpu processes in parallel different stages of the pipeline and working on a small chunk of the batch.
Zero Redundancy Optimizer (ZeRO) - Also performs sharding of the tensors somewhat similar to TP, except the whole tensor gets reconstructed in time for a forward or backward computation, therefore the model doesn’t need to be modified. It also supports various offloading techniques to compensate for limited GPU memory.
Sharded DDP - is another name for the foundational ZeRO concept as used by various other implementations of ZeRO.

#Frameworks :
https://www.deepspeed.ai/
https://fairscale.readthedocs.io/en/latest/
https://github.com/tunib-ai/oslo
https://github.com/microsoft/varuna

huggingface.co

Parallelism methods

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

111 viewsVadim ✨🪁, 15:54

Deep Learning

[2302.14045] Language Is Not All You Need: Aligning Perception with Language Models
https://arxiv.org/abs/2302.14045
#paper New generation

107 viewsVadim ✨🪁, edited 17:34

About

Blog

Apps

Platform