ML Research Hub
32.9K subscribers
4.45K photos
273 videos
23 files
4.81K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
🌟 MG-LLaVA - multimodal LLM with advanced capabilities for working with visual information

Researchers from Shanghai University recently released MG-LLaVA, a multimodal LLM (MLLM) that expands visual-information processing with dedicated components for handling low- and high-resolution inputs.

MG-LLaVA integrates an additional high-resolution visual encoder to capture fine details, which are then fused with the base visual features through a Conv-Gate network.

Trained exclusively on publicly available multimodal data, MG-LLaVA achieves excellent results.
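The fusion step can be illustrated with a toy scalar sketch (not the paper's actual module, which applies a convolutional gate over whole feature maps): a learned gate in (0, 1) decides, per channel, how much high-resolution detail to mix into the base low-resolution feature.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv_gate_fusion(base_feats, hires_feats, w=0.5, b=0.0):
    """Toy per-channel gated fusion: the gate controls how much
    high-resolution detail is added to each base (low-res) feature."""
    fused = []
    for lo, hi in zip(base_feats, hires_feats):
        gate = sigmoid(w * (lo + hi) + b)  # gate is always in (0, 1)
        fused.append(lo + gate * hi)
    return fused

print(conv_gate_fusion([0.2, -0.1, 1.0], [0.5, 0.3, -0.2]))
```

Here `w` and `b` stand in for the learned gate parameters; in the real model they are convolution weights trained end-to-end.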

🟡 MG-LLaVA page
🖥 GitHub

https://t.me/DataScienceT ✅
🖥 Unstructured - Python library for raw data preprocessing

pip install "unstructured[all-docs]"

Unstructured provides components for preprocessing images and text documents; supports many formats: PDF, HTML, Word docs, etc.

Run the library in a container:
docker run -dt --name unstructured downloads.unstructured.io/unstructured-io/unstructured:latest
docker exec -it unstructured bash
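A minimal usage sketch, assuming the pip install above: `partition` is Unstructured's generic entry point, which routes a file to the right format-specific parser and returns typed elements. The fallback branch is only there so the snippet degrades gracefully if the library is not installed.

```python
import tempfile, os

try:
    # Unstructured's generic loader: detects the file type and parses it
    from unstructured.partition.auto import partition

    def extract_texts(path):
        return [el.text for el in partition(filename=path)]
except ImportError:
    # Fallback so the sketch still runs without the library:
    # a naive blank-line paragraph split
    def extract_texts(path):
        with open(path) as fh:
            return [p.strip() for p in fh.read().split("\n\n") if p.strip()]

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("An Example Title\n\nA first paragraph of body text.")
    path = fh.name

elements = extract_texts(path)
os.unlink(path)
print(elements)
```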


🖥 GitHub
🟡 Docs

https://t.me/DataScienceT ✅
Forwarded from Data Science Premium (Books & Courses)
🟢 We present a list of our paid services:

✅ Paid channel (books): includes a huge encyclopedia of free channels and a large collection of important and rare books.

💵 Price: $7, one-time payment

✅ Paid channel (courses): includes a complete set of courses downloaded from Udemy, Coursera, and other learning platforms.

💵 Price: $20, one-time payment

✅ Paid bot 1: a bot containing ten million books and articles across programming, data science, life sciences, and medicine, with daily updates and additions.

💵 Price: $10, one-time payment

✅ Paid bot 2: a bot containing more than 30 million books and 10 million scientific articles, with your own account, your own control panel, and other important features.

💵 Price: $17, one-time payment

✅ Coursera scholarship: gives you free access to all Coursera courses.

💵 Price: $30, one-time payment

🔥 Special offer: get all four services for only $50

💠💠 Available payment methods:
PayPal - Payeer - Crypto - USDT
MasterCard - Credit Card

To request a subscription:
t.me/Hussein_Sheikho
🆒 EAGLE is a method that allows you to generate LLM responses faster

Is it possible to generate an LLM response on two RTX 3060s faster than on an A100 (which costs 16+ times more)?
Yes, it is possible with EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency), and response accuracy is preserved.

EAGLE extrapolates the context feature vectors from the second-to-top layer of the LLM, which greatly improves generation efficiency.

EAGLE is 2x faster than Lookahead (13B) and 1.6x faster than Medusa (13B).
EAGLE can also be combined with other acceleration techniques such as vLLM, DeepSpeed, Mamba, FlashAttention, quantization, and hardware optimization.
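EAGLE's specific contribution is drafting from second-to-top-layer features, but the speed-up mechanism underneath is the standard draft-and-verify (speculative decoding) loop, sketched here with toy deterministic "models". A cheap draft model proposes several tokens; the target model verifies them, so a run of accepted tokens costs one verification pass instead of one pass per token, and the output is provably identical to plain greedy decoding:

```python
import random

def target_next(ctx):
    # toy stand-in for the big LLM's greedy next-token function
    return (sum(ctx) * 31 + 7) % 100

def make_draft(p_agree=0.8, seed=0):
    rng = random.Random(seed)
    def draft_next(ctx):
        # cheap draft head: agrees with the target most of the time
        return target_next(ctx) if rng.random() < p_agree else rng.randrange(100)
    return draft_next

def speculative_generate(prompt, n_tokens, draft_next, k=4):
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1) draft k tokens with the cheap model
        draft, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2) verify against the target: keep the longest agreeing prefix,
        #    then emit the target's own token at the first mismatch
        ctx = list(out)
        for t in draft:
            if target_next(ctx) == t:
                out.append(t)
                ctx.append(t)
            else:
                out.append(target_next(ctx))
                break
    return out[len(prompt):][:n_tokens]

def greedy_generate(prompt, n_tokens):
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(target_next(out))
    return out[len(prompt):]

prompt = [1, 2, 3]
print(speculative_generate(prompt, 8, make_draft()) == greedy_generate(prompt, 8))  # True
```

The better the draft model agrees with the target (in EAGLE's case, because it extrapolates the target's own hidden features), the more tokens each verification pass accepts.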

🤗 Hugging Face
💻 GitHub
Forwarded from Data Science Premium (Books & Courses)
Building Transformer Models with Attention (2023)

The number one book for learning transformers, from beginner to professional, with an engaging explanation of how transformers work. This offer is available to the first twenty people only.

Price: 4$

PayPal / Credit Card: https://www.patreon.com/DataScienceBooks/shop/building-transformer-models-with-2023-253971

Crypto Payment: http://t.me/send?start=IVq3O4aPlSWF
๐Ÿ‘2
๐Ÿ  MedMNIST-C: benchmark dataset based on the MedMNIST+ collection covering 12 2D datasets and 9 imaging modalities.

pip install medmnistc

🖥 Github: https://github.com/francescodisalvo05/medmnistc-api

📕 Paper: https://arxiv.org/abs/2406.17536v2

🔥 Dataset: https://paperswithcode.com/dataset/imagenet-c

https://t.me/DataScienceT ✅
🚨 Alert: the new open model from the Shanghai AI Lab

Welcome, InternLM 2.5:
❤️‍🔥 a high-quality 7B model
🚀 up to a 1M-token context window
⚙️ tool-use capabilities

We look forward to trying it out.

https://huggingface.co/collections/internlm/internlm25-66853f32717072d17581bc13

https://t.me/DataScienceT ✅
Hey community!

We're sharing insights on how rewarded ads can boost your revenue and retain users. Our guide covers special promotion days and seamless integration.

Join our new community on Discord and get the ultimate monetization guide: https://discord.gg/2FMh2ZjE8w
๐Ÿ‘3โค1
🔥 ESPNet XEUS is the new speech recognition SoTA.

A multilingual speech recognition and translation model from Carnegie Mellon University, trained on over 4,000 languages! 🔥

> MIT license
> 577 million parameters
> Outperforms MMS 1B and w2v-BERT v2 2.0
> E-Branchformer architecture
> Dataset: 8,900 hours of audio recordings in over 4,023 languages

git lfs install
git clone https://huggingface.co/espnet/XEUS

โ–ช๏ธ HF: https://huggingface.co/espnet/xeus
โ–ช๏ธ Dataset: https://huggingface.co/datasets/espnet/mms_ulab_v2

https://t.me/DataScienceT ✅
🌟 MoMA is an open-source model from ByteDance for generating images from a reference.

MoMA requires no training and can quickly generate images with high detail accuracy and identity preservation.
MoMA's speed comes from an optimized attention mechanism that transfers features from the original image to the diffusion model.
The model acts as a universal adapter and can be applied to various models without modification.
MoMA currently outperforms similar existing methods in synthetic tests, producing images that closely follow the prompt while preserving the style of the reference image as much as possible.

โœ๏ธ Recommended parameters for optimizing VRAM consumption:

22 GB or more GPU memory:
args.load_8bit, args.load_4bit = False, False

18 GB or more GPU memory:
args.load_8bit, args.load_4bit = True, False

14 GB or more GPU memory:
args.load_8bit, args.load_4bit = False, True
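The thresholds above can be wrapped in a tiny helper that picks the flags for the available VRAM. This is a hypothetical convenience function for illustration, not part of the MoMA repo:

```python
def quantization_flags(vram_gb):
    """Return (load_8bit, load_4bit) per the VRAM thresholds above."""
    if vram_gb >= 22:
        return False, False   # full precision
    if vram_gb >= 18:
        return True, False    # 8-bit weights
    if vram_gb >= 14:
        return False, True    # 4-bit weights
    raise ValueError("at least 14 GB of GPU memory is recommended")

# e.g. args.load_8bit, args.load_4bit = quantization_flags(18)
print(quantization_flags(24), quantization_flags(18), quantization_flags(14))
```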


🟡 MoMA page
🖥 GitHub
🤗 Hugging Face
🟡 Demo

https://t.me/DataScienceT ✅
๐ŸŽ Lisa has given away over $100,000 in the last 30 days. Every single one of her subscribers is making money.

She is a professional trader and broadcasts her way of making money trading on her channel EVERY subscriber she has helped, and she will help you.

๐Ÿง  Do this and she will help you earn :

1. Subscribe to her channel
2. Write โ€œGIFTโ€ to her private messages
3. Follow her channel and trade with her.
Repeat transactions after her = earn a lot of money.

Subscribe ๐Ÿ‘‡๐Ÿป
https://t.me/+DqIxOkOWtVw3ZjYx
๐Ÿ‘3
โšก๏ธ InternLM introduced XComposer-2.5 - a multi-modal 7B VLM with increased context for input and output.

InternLM-XComposer-2.5 handles text description of images with complex composition, approaching the capabilities of GPT-4V. Trained on interleaved image-text contexts of 24K tokens, it can extend to 96K-token contexts via RoPE extrapolation.
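The context-extension idea behind RoPE extrapolation can be sketched as position scaling (a position-interpolation-style illustration, not necessarily XComposer's exact scheme): dividing positions by a scale factor maps a 96K-token context back into the 24K range the rotary embeddings were trained on.

```python
def rope_angles(pos, dim=64, base=10000.0, scale=1.0):
    # rotary-embedding angles for one token position;
    # scale > 1 compresses positions into the trained range
    return [(pos / scale) / base ** (2 * i / dim) for i in range(dim // 2)]

# scaling positions by 4 makes token 96_000 produce the same angles
# as trained position 24_000
assert rope_angles(96_000, scale=4.0) == rope_angles(24_000)
```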

Compared to the previous version 2.0, InternLM-XComposer-2.5 has three major improvements:
- ultra-high-resolution image understanding;
- detailed video understanding;
- multi-image reasoning within a single dialogue.

With additional LoRA modules, XComposer-2.5 can perform complex tasks:
- web page creation;
- high-quality text articles with images.

XComposer-2.5 was evaluated on 28 benchmarks, outperforming existing state-of-the-art open-source models on 16 of them. It also closely competes with GPT-4V and Gemini Pro on 16 key tasks.

🖥 GitHub
🟡 Arxiv
🟡 Model
🟡 Demo
📺 Demo video

https://t.me/DataScienceT ✅