Artificial Intelligence
πŸ”’ Welcome to the Artificial Intelligence Channel

πŸ”₯ EasyControl is a framework (set of tools and methods) designed to add control signals (conditions) to Diffusion Transformer (DiT)-based image generation models.

In essence, it is an attempt to create an analogue of the popular ControlNet (which is mainly used with U-Net architectures) for the new generation of diffusion models built on transformers. Its goal is to make controlling generation in DiT models as flexible, efficient, and easily pluggable as possible.

How does EasyControl work?

EasyControl solves the problems of control signal integration in DiT by using a combination of several key ideas:

β–ͺ️ Condition Injection LoRA: Instead of retraining the entire huge DiT model or creating bulky copies of its parts for each new condition (e.g. poses, contours, depth), EasyControl uses LoRA (Low-Rank Adaptation), a technique that "injects" additional information (the control signal) into an existing model by training only a small number of extra parameters. This makes adding new control types very resource-efficient and preserves the original knowledge and style of the base DiT model (style-lossless).
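The low-rank idea behind this can be sketched in a few lines. This is a generic illustration of LoRA, not EasyControl's actual code; all names and sizes here are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 4

W = rng.normal(size=(d_out, d_in))          # frozen base weight (never updated)
A = rng.normal(size=(rank, d_in)) * 0.01    # trainable low-rank factor
B = np.zeros((d_out, rank))                 # trainable, initialized to zero

def lora_forward(x):
    # base path plus low-rank update; with B = 0 the output is unchanged,
    # so the base model's behavior is exactly preserved at initialization
    return W @ x + B @ (A @ x)

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x), W @ x)  # identity at init

# trainable parameters: rank * (d_in + d_out) instead of d_in * d_out,
# which is why adding a new condition type is cheap
assert rank * (d_in + d_out) < d_in * d_out
```

Only A and B are stored per condition type, so each new control (pose, depth, contour) ships as a small adapter on top of the same frozen base model.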

β–ͺ️ Position-Aware Training Paradigm: Transformers (as in DiT) treat an image as a sequence of patches. To ensure that the control signal (e.g. a pose map) correctly influences the corresponding patches of the generated image, EasyControl uses a special training approach that helps the model learn the spatial correspondence between the control signal and the generated content.
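One way to picture position-aware conditioning: condition tokens get explicit 2D position IDs that stay row-aligned with the image patches they should influence, while occupying a separate positional region. This is a toy illustration of the general idea, not EasyControl's actual scheme:

```python
def patch_positions(height, width, col_offset=0):
    """(row, col) position IDs for a grid of patches.

    col_offset shifts the condition grid sideways so condition tokens
    occupy their own positional region while keeping row/column
    alignment with the matching image patches.
    """
    return [(r, c + col_offset) for r in range(height) for c in range(width)]

# a 4x4 grid of image patches at positions (0..3, 0..3)
image_pos = patch_positions(4, 4)
# condition patches placed at a column offset so positions never collide
cond_pos = patch_positions(4, 4, col_offset=4)

# each condition token keeps the same row as its matching image patch,
# so attention can learn the spatial correspondence between the two
assert all(ip[0] == cp[0] for ip, cp in zip(image_pos, cond_pos))
assert set(image_pos).isdisjoint(cond_pos)
```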

β–ͺ️ Attention Optimization and Caching (Causal Attention + KV Cache): To improve efficiency at inference time, EasyControl applies transformer-specific optimizations. Causal Attention combined with a KV Cache (caching the keys and values in the attention mechanism) speeds up generation, especially when working with long sequences of patches and additional condition modules.
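The KV-cache trick itself is simple to demonstrate: instead of recomputing keys and values for the whole sequence at every step, each new key/value pair is appended to a cache. A toy single-head sketch (illustration only; the projections are stand-ins):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attend(q, K, V):
    # single-head attention for one query over the cached keys/values
    return softmax(q @ K.T / np.sqrt(q.shape[-1])) @ V

rng = np.random.default_rng(0)
d = 8
xs = rng.normal(size=(5, d))    # token representations, in order

K_cache, V_cache = [], []
incremental = []
for x in xs:
    K_cache.append(x)           # stand-in for the key projection
    V_cache.append(x)           # stand-in for the value projection
    incremental.append(attend(x, np.array(K_cache), np.array(V_cache)))

# recomputing everything from scratch at the last step gives the same
# answer as the cached run, just with redundant work
full = attend(xs[-1], xs, xs)
assert np.allclose(incremental[-1], full)
```

Because condition tokens don't change between diffusion steps, their cached keys/values can be reused, which is where the inference speedup comes from.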

πŸ”— Github
πŸ”— Paper
πŸ‘29❀9πŸ”₯4πŸ‘1
🐬 Dolphin is an improved and extended version of Whisper, optimized for recognizing a large number of Asian languages and Chinese dialects; it outperforms other open models and is available for community use.
What is it based on?

Goal: Support a wider range of languages, with a special focus on 40 Eastern languages (East Asia, South Asia, Southeast Asia, Middle East) and 22 Chinese dialects.

How was it trained? A combination of proprietary and open-source datasets was used for training and optimization.

Results: Experiments show that Dolphin significantly outperforms existing best open source models in recognition quality for many languages.

Availability: The developers have made the trained models and inference code publicly available to promote reproducibility and community growth.

🟑 Model:
https://huggingface.co/DataoceanAI/dolphin-base
https://huggingface.co/DataoceanAI/dolphin-small
🟑 Paper:
https://huggingface.co/papers/2503.20212
πŸ‘19❀8πŸ”₯1
Python library for finetuning Gemma 3! πŸ”₯

Includes papers on finetuning, sharding, LoRA, PEFT, multimodality, and tokenization in LLMs.

100% open source.

pip install gemma

πŸ“Œ Documentation
πŸ‘52❀19πŸ”₯9πŸ₯°4
What is torch.nn really?

When I started working with PyTorch, my biggest question was: "What is torch.nn?".


This article explains it quite well.

πŸ“Œ Read
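In short, torch.nn provides the building blocks (Module, Parameter, layers, losses) that handle parameter bookkeeping so your training loop can find everything to optimize. A rough pure-Python sketch of the core idea, illustrative only and not PyTorch's actual implementation:

```python
class Parameter:
    """A value flagged as trainable, like torch.nn.Parameter."""
    def __init__(self, data):
        self.data = data

class Module:
    """Auto-registers Parameter attributes, like torch.nn.Module."""
    def __init__(self):
        self._params = {}

    def __setattr__(self, name, value):
        # any attribute that is a Parameter gets registered automatically
        if isinstance(value, Parameter):
            self.__dict__.setdefault("_params", {})[name] = value
        object.__setattr__(self, name, value)

    def parameters(self):
        return list(self._params.values())

class Linear(Module):
    """A zero-initialized linear layer: y = W x + b."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.weight = Parameter([[0.0] * n_in for _ in range(n_out)])
        self.bias = Parameter([0.0] * n_out)

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weight.data, self.bias.data)]

layer = Linear(3, 2)
assert len(layer.parameters()) == 2            # weight and bias auto-registered
assert layer.forward([1.0, 2.0, 3.0]) == [0.0, 0.0]
```

The real torch.nn adds autograd, GPU tensors, and dozens of layer types on top, but the registration pattern is the heart of it.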
πŸ‘14❀7πŸ‘Ž2πŸ”₯2
Generative AI in Data Analytics βœ…
πŸ‘13❀6πŸ”₯4😁1
Have you ever seen a drone working under a waterway?
πŸ’‘ Tasks of a DevOps Engineer
πŸ‘26❀12😁2
Machine Learning Roadmap πŸ‘†
Maths for Machine Learning πŸ‘†
🧠 Build your own ChatGPT

Build an LLM app with a Mixture of AI Agents, using small open-source LLMs that can beat GPT-4o, in just 40 lines of Python code


⬇️ step-by-step instructions ⬇️
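The core pattern behind such an app can be sketched without any API keys. This is a toy illustration of the mixture-of-agents idea; the agents here are stubs standing in for real LLM calls:

```python
# Proposer agents: in a real app each would call a different small
# open-source LLM. Here they are hard-coded stand-ins.
def agent_a(question):
    return "Paris is the capital of France."

def agent_b(question):
    return "The capital of France is Paris."

def agent_c(question):
    return "France's capital city is Paris."

PROPOSERS = [agent_a, agent_b, agent_c]

def mixture_of_agents(question):
    # 1) collect independent proposals from every agent
    proposals = [agent(question) for agent in PROPOSERS]
    # 2) build the aggregation prompt; a real system feeds this to an
    #    aggregator LLM that synthesizes one final answer
    return "\n".join(
        ["Synthesize one answer from these responses:"]
        + [f"{i + 1}. {p}" for i, p in enumerate(proposals)]
    )

out = mixture_of_agents("What is the capital of France?")
assert "Synthesize" in out and out.count("Paris") == 3
```

Swap the stubs for API calls to small open models and add one aggregator call, and you have the whole pattern.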
1. Install the necessary Python Libraries

Run the following commands from your terminal to install the required libraries:
πŸ‘10❀2