ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
🤖🧠 MiniMax-M2: The Open-Source Revolution Powering Coding and Agentic Intelligence

🗓️ 30 Oct 2025
📚 AI News & Trends

Artificial intelligence is evolving faster than ever, but not every innovation needs to be enormous to make an impact. MiniMax-M2, the latest release from MiniMax-AI, demonstrates that efficiency and power can coexist within a streamlined framework. MiniMax-M2 is an open-source Mixture of Experts (MoE) model designed for coding tasks, multi-agent collaboration and automation workflows. With ...

#MiniMaxM2 #OpenSource #MachineLearning #CodingAI #AgenticIntelligence #MixtureOfExperts
Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs

📝 Summary:
MoE LLMs have suboptimal routers that cause significant performance gaps. Routing Manifold Alignment (RoMA) aligns routing weights with task embeddings using a regularization term during lightweight finetuning of the routers. This improves generalization by encouraging similar samples to share expert c...
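
A minimal sketch of how such a routing-alignment regularizer could look, assuming per-sample router logits and precomputed task embeddings are available; the tensor names, shapes, and weighting are illustrative assumptions, not the paper's implementation:

```python
# Illustrative only: samples whose task embeddings are similar are pushed to
# have similar router weight vectors during lightweight router finetuning.
import torch
import torch.nn.functional as F

def routing_alignment_loss(router_logits, task_emb):
    """router_logits: (batch, n_experts) raw router scores for one MoE layer.
       task_emb:      (batch, d) per-sample task embeddings (assumed given)."""
    routes = F.softmax(router_logits, dim=-1)                       # routing weights
    sim = F.cosine_similarity(task_emb.unsqueeze(1),
                              task_emb.unsqueeze(0), dim=-1).clamp(min=0)  # (batch, batch)
    dist = torch.cdist(routes, routes, p=2) ** 2                    # pairwise routing distance
    return (sim * dist).mean()        # similar samples should share expert choices

# During router finetuning this would be added to the task loss, e.g.:
# loss = task_loss + lambda_align * routing_alignment_loss(logits, emb)
```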

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07419
• PDF: https://arxiv.org/pdf/2511.07419
• Github: https://github.com/tianyi-lab/RoMA

==================================

For more data science resources:
https://t.me/DataScienceT

#LLMs #MixtureOfExperts #DeepLearning #AI #MachineLearning
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data

📝 Summary:
Uni-MoE-2.0-Omni is an open-source omnimodal large model that improves multimodal understanding, reasoning, and generation. It uses dynamic MoE and progressive training to achieve state-of-the-art results across 85 benchmarks, outperforming leading models such as Qwen2.5-Omni.

🔹 Publication Date: Published on Nov 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12609
• PDF: https://arxiv.org/pdf/2511.12609
• Project Page: https://idealistxy.github.io/Uni-MoE-v2.github.io/
• Github: https://github.com/HITsz-TMG/Uni-MoE

🔹 Models citing this paper:
https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Omni
https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Base
https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Image

==================================

For more data science resources:
https://t.me/DataScienceT

#OmnimodalAI #LLMs #MixtureOfExperts #MultimodalLearning #AIResearch
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE

📝 Summary:
UniMoE-Audio unifies speech and music generation with a novel Dynamic-Capacity Mixture-of-Experts (MoE) framework. It addresses data imbalance and task conflicts through a hybrid expert design and three-stage training, achieving state-of-the-art performance and synergistic cross-domain learning.
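
For intuition, here is a hedged sketch of the general dynamic-capacity idea (not UniMoE-Audio's code): each token keeps however many experts it needs to reach a probability-mass threshold, capped at max_k; the threshold and cap values below are made up.

```python
# Illustrative only: variable per-token expert count via a probability-mass cutoff.
import torch
import torch.nn.functional as F

def dynamic_capacity_routing(router_logits, mass=0.7, max_k=4):
    probs = F.softmax(router_logits, dim=-1)                  # (tokens, n_experts)
    sorted_p, sorted_idx = probs.sort(dim=-1, descending=True)
    cum = sorted_p.cumsum(dim=-1)
    keep = (cum - sorted_p) < mass        # experts needed to reach the mass (>= 1 per token)
    keep[..., max_k:] = False             # cap each token's capacity
    mask = torch.zeros_like(probs).scatter(-1, sorted_idx, keep.float()).bool()
    weights = probs * mask
    return weights / weights.sum(dim=-1, keepdim=True)        # renormalized routing weights
```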

🔹 Publication Date: Published on Oct 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.13344
• PDF: https://arxiv.org/pdf/2510.13344
• Project Page: https://mukioxun.github.io/Uni-MoE-site/home.html
• Github: https://github.com/HITsz-TMG/Uni-MoE/blob/master/UniMoE-Audio

🔹 Models citing this paper:
https://huggingface.co/HIT-TMG/UniMoE-Audio-Preview

==================================

For more data science resources:
https://t.me/DataScienceT

#SpeechGeneration #MusicGeneration #MixtureOfExperts #GenerativeAI #DeepLearning
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

📝 Summary:
Uni-MoE introduces a sparse multimodal Mixture-of-Experts LLM that efficiently handles diverse data types. It uses modality-specific encoders and a progressive training strategy, reducing performance bias and improving collaboration across modalities.
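
An illustrative PyTorch sketch of the overall pattern (modality-specific encoders feeding a shared sparse-MoE backbone); the dimensions, expert count, and module choices are placeholders, not Uni-MoE's actual architecture.

```python
# Illustrative only: a shared sparse-MoE layer fed by per-modality projections.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d=512, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d, n_experts)
        self.experts = nn.ModuleList([nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(),
                                                    nn.Linear(4 * d, d))
                                      for _ in range(n_experts)])
        self.k = k

    def forward(self, x):                                # x: (tokens, d)
        gates = F.softmax(self.router(x), dim=-1)
        topv, topi = gates.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                       # dispatch each token to its top-k experts
            for e, expert in enumerate(self.experts):
                sel = topi[:, slot] == e
                if sel.any():
                    out[sel] += topv[sel, slot].unsqueeze(-1) * expert(x[sel])
        return out

# Placeholder modality-specific encoders projecting into the shared token space:
encoders = nn.ModuleDict({"image": nn.Linear(768, 512),
                          "audio": nn.Linear(128, 512),
                          "text": nn.Linear(512, 512)})
```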

🔹 Publication Date: Published on May 18, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2405.11273
• PDF: https://arxiv.org/pdf/2405.11273
• Github: https://github.com/hitsz-tmg/umoe-scaling-unified-multimodal-llms

==================================

For more data science resources:
https://t.me/DataScienceT

#MultimodalAI #LLMs #MixtureOfExperts #DeepLearning #AIResearch
YOLO Meets Mixture-of-Experts: Adaptive Expert Routing for Robust Object Detection

📝 Summary:
A new Mixture-of-Experts framework uses adaptive routing among multiple YOLOv9-T experts. This improves object detection performance, achieving higher mAP (mean average precision) and AR (average recall).
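
A rough sketch of the general idea, with placeholder modules standing in for the YOLOv9-T experts: a lightweight gating network scores the image features and mixes the experts' predictions. This is an assumption-laden illustration, not the paper's architecture.

```python
# Illustrative only: a gating network mixes predictions from detector "experts".
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetectorMoE(nn.Module):
    def __init__(self, experts, feat_dim=256):
        super().__init__()
        self.experts = nn.ModuleList(experts)            # placeholders for YOLOv9-T heads
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(feat_dim, len(experts)))

    def forward(self, feats):                            # feats: (B, feat_dim, H, W)
        w = F.softmax(self.gate(feats), dim=-1)          # (B, n_experts) adaptive routing weights
        preds = torch.stack([e(feats) for e in self.experts], dim=1)
        w = w.view(w.shape[0], w.shape[1], *([1] * (preds.dim() - 2)))
        return (w * preds).sum(dim=1)                    # expert-weighted prediction
```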

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13344
• PDF: https://arxiv.org/pdf/2511.13344

==================================

For more data science resources:
https://t.me/DataScienceT

#ObjectDetection #YOLO #MixtureOfExperts #DeepLearning #ComputerVision
A Theoretical Framework for Auxiliary-Loss-Free Load Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models

📝 Summary:
This paper provides a theoretical framework for Auxiliary-Loss-Free Load Balancing (ALF-LB) in sparse Mixture-of-Experts (s-MoE) layers. It analyzes ALF-LB as a primal-dual method, proving approximate-balancing guarantees and logarithmic regret for efficient expert utilization.
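
For context, auxiliary-loss-free load balancing is usually described as adding a per-expert bias that only affects expert selection and is nudged toward underused experts after each step. The sketch below is a generic illustration of that scheme, with an assumed update rule and step size, not the paper's exact formulation.

```python
# Illustrative only: bias-based balancing with no auxiliary loss term.
import torch
import torch.nn.functional as F

def route_with_bias(router_logits, bias, k=2):
    # the bias affects only which experts are selected, not the mixing weights
    _, topi = (router_logits + bias).topk(k, dim=-1)
    weights = F.softmax(router_logits, dim=-1).gather(-1, topi)
    return topi, weights / weights.sum(-1, keepdim=True)

def update_bias(bias, topi, n_experts, step=1e-3):
    load = torch.bincount(topi.flatten(), minlength=n_experts).float()
    # lower the bias of overloaded experts, raise it for underloaded ones
    return bias - step * torch.sign(load - load.mean())
```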

🔹 Publication Date: Published on Dec 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03915
• PDF: https://arxiv.org/pdf/2512.03915

==================================

For more data science resources:
https://t.me/DataScienceT

#MixtureOfExperts #LoadBalancing #LargeScaleAI #DeepLearning #AIResearch
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

📝 Summary:
The Expert-Router Coupling (ERC) loss aligns MoE router decisions with expert capabilities. It uses proxy tokens and activation constraints to ensure experts specialize, improving performance and computational efficiency. ERC also allows tracking expert specialization during training.
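
The exact construction is best taken from the paper; purely as an illustration, one simple way to couple a router to its experts with proxy tokens is to learn one proxy token per expert and train the router to send proxy i to expert i. This is an assumption for the example, not the ERC loss itself.

```python
# Illustrative only: a toy router-expert coupling term built on proxy tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_experts, d = 8, 512
proxy_tokens = nn.Parameter(torch.randn(n_experts, d) * 0.02)   # one learnable proxy per expert
router = nn.Linear(d, n_experts)

def coupling_loss():
    logits = router(proxy_tokens)                 # (n_experts, n_experts)
    targets = torch.arange(n_experts)             # proxy i should be routed to expert i
    return F.cross_entropy(logits, targets)
```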

🔹 Publication Date: Published on Dec 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23447
• PDF: https://arxiv.org/pdf/2512.23447

==================================

For more data science resources:
https://t.me/DataScienceT

#MixtureOfExperts #DeepLearning #MachineLearning #AI #NeuralNetworks
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection

📝 Summary:
YOLO-Master proposes an Efficient Sparse Mixture-of-Experts (ES-MoE) block for real-time object detection. It adaptively allocates computational resources based on scene complexity using a dynamic routing network, overcoming the limits of static computation. This improves accuracy and speed, especially on...
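
As a hedged illustration of complexity-conditioned routing (not YOLO-Master's implementation): a small network estimates scene complexity from pooled features, and the number of active experts per image grows with that score; the thresholds and dimensions below are made up.

```python
# Illustrative only: per-image expert count scales with an estimated complexity score.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComplexityGate(nn.Module):
    def __init__(self, feat_dim=256, n_experts=4):
        super().__init__()
        self.score = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                   nn.Linear(feat_dim, 1), nn.Sigmoid())
        self.router = nn.Linear(feat_dim, n_experts)
        self.n_experts = n_experts

    def forward(self, feats):                                 # feats: (B, C, H, W)
        c = self.score(feats).squeeze(-1)                     # (B,) scene complexity in [0, 1]
        k = (1 + (c * (self.n_experts - 1)).round()).long()   # simple scenes -> fewer experts
        gates = F.softmax(self.router(feats.mean(dim=(2, 3))), dim=-1)
        return gates, k     # downstream code would activate the top-k[i] experts per image
```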

🔹 Publication Date: Published on Dec 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23273
• PDF: https://arxiv.org/pdf/2512.23273

==================================

For more data science resources:
https://t.me/DataScienceT

#ObjectDetection #YOLO #MixtureOfExperts #Transformers #RealTimeAI