Deep Learning – Telegram

Deep Learning

@deep_learning_prog

37 subscribers

5 photos

136 links

Deep Learning: programming, tools & resources.
#DeepLearning #Python

Download Telegram

About

Blog

Apps

Platform

https://arxiv.org/abs/2207.06881 #Paper Recurrent Memory Transformer. Scaling transformer architecture to long sequences.

116 viewsVadim ✨, 19:40

https://www.deepspeed.ai/training/ Large scale training #Frameworks

Training Overview and Features

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

❤1

112 viewsVadim ✨, 18:26

https://arxiv.org/pdf/2307.02486.pdf Scaling Transformers to
1,000,000,000 Tokens #Paper

113 viewsVadim ✨, 15:42

https://www.deepmind.com/blog/rt-2-new-model-translates-vision-and-language-into-action
New model translates vision and language into action based on LLM

Google DeepMind

RT-2: New model translates vision and language into action

Introducing Robotic Transformer 2 (RT-2), a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for...

119 viewsVadim ✨, 16:01

https://huyenchip.com/2023/08/16/llm-research-open-challenges.html

Open challenges in LLM research

[LinkedIn discussion, Twitter thread]

116 viewsArt 🐲 Zaborskiy, 07:48

https://www.anyscale.com/blog/continuous-batching-llm-inference

LLM inference acceleration #Frameworks

Achieve 23x LLM Inference Throughput & Reduce p50 Latency

In this blog, we discuss continuous batching, a critical systems-level optimization that improves both throughput and latency under load for LLMs.

❤1

81 viewsVadim ✨, 16:42

Goodbye databases, it’s time to embrace Vector Databases!

The AI revolution is reshaping industries, promising remarkable innovations while introducing new challenges. In this transformative…

https://codemaker2016.medium.com/goodbye-databases-its-time-to-embrace-vector-databases-0ffa7879980e
#Tips

🔥1

86 viewsVadim ✨, 13:18

https://llava-vl.github.io/blog/2024-01-30-llava-next/

#Frameworks #Models

LLaVA-NeXT: Improved reasoning, OCR, and world knowledge

LLaVA team presents LLaVA-NeXT, with improved reasoning, OCR, and world knowledge. LLaVA-NeXT even exceeds Gemini Pro on several benchmarks.

75 viewsVadim ✨, 18:50

https://huggingface.co/collections/osanseviero/model-merging-65097893623330a3a51ead66

Model Merging: papers
#Paper

Model Merging - a osanseviero Collection

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it!

🔥1

82 viewsVadim ✨, 05:34

https://github.com/arcee-ai/mergekit
Model Merging: toolkit
#Frameworks

GitHub - arcee-ai/mergekit: Tools for merging pretrained large language models.

Tools for merging pretrained large language models. - arcee-ai/mergekit

82 viewsVadim ✨, 05:34

https://arxiv.org/pdf/2404.19756 Kolmogorov-Arnold Networks #paper

76 viewsVadim ✨, 16:43

https://kindxiaoming.github.io/pykan/ Kolmogorov Arnold Network #framework #library

81 viewsVadim ✨, 16:44

https://github.com/SeldonIO/alibi-detect Algorithms for outlier, adversarial and drift detection
https://github.com/SeldonIO/alibi Algorithms for explaining machine learning models
#Frameworks #library #anomaly #drift

GitHub - SeldonIO/alibi-detect: Algorithms for outlier, adversarial and drift detection

Algorithms for outlier, adversarial and drift detection - SeldonIO/alibi-detect

72 viewsVadim ✨, edited 00:40

Grokking:
1. Fist paper: https://arxiv.org/abs/2201.02177
2. Transformers: https://arxiv.org/pdf/2405.15071
3. Simple framework: https://arxiv.org/pdf/2405.20233

Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

In this paper we propose to study generalization of neural networks on small algorithmically generated datasets. In this setting, questions about data efficiency, memorization, generalization, and...

78 viewsVadim ✨, 04:43

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and many other libraries.
https://mars-project.readthedocs.io/
#Frameworks

75 viewsVadim, edited 02:00

Tutorial: Scalable and Distributed ML Workflows with DVC and Ray
Part 1: https://dvc.ai/blog/dvc-ray
Part 2: https://dvc.ai/blog/dvc-ray-part-2

Tutorial: Scalable and Distributed ML Workflows with DVC and Ray (Part 1)

This tutorial introduces you to integrating DVC (Data Version Control) with Ray, turning them into your go-to toolkit for creating automated, scalable, and distributed ML pipelines.

87 viewsVadim, 02:37

https://web.stanford.edu/~jurafsky/slp3/
#books Speech and Language Processing (3rd ed. draft)
Dan Jurafsky and James H. Martin

77 viewsVadim, 14:31

RAG #Frameworks
https://github.com/infiniflow/ragflow

GitHub - infiniflow/ragflow: RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge…

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs - infiniflow/ragflow

75 viewsVadim, 23:18

NVidia monitoring #Tools :
1. GPUStat https://github.com/wookayin/gpustat
2. Nvtop https://github.com/Syllo/nvtop
3. NVITOP https://github.com/XuehaiPan/nvitop

82 viewsVadim, 01:52

SOTA in unsupervised semantic segmentation:
1. STEGO: Unsupervised Semantic Segmentation by Distilling Feature Correspondences - 2022 https://arxiv.org/abs/2203.08414
2. HP: Leveraging Hidden Positives for Unsupervised Semantic Segmentation -2023 https://arxiv.org/abs/2303.15014
3. CAUSE: Causal Unsupervised Semantic Segmentation - 2023 https://arxiv.org/abs/2310.07379
#Paper

Unsupervised Semantic Segmentation by Distilling Feature Correspondences

Unsupervised semantic segmentation aims to discover and localize semantically meaningful categories within image corpora without any form of annotation. To solve this task, algorithms must produce...

🔥1

85 viewsVadim, 19:57

https://arxiv.org/pdf/2408.04840v1
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
#Paper

67 viewsVadim, 04:48