ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Less is More: Recursive Reasoning with Tiny Networks

📝 Summary:
The Tiny Recursive Model (TRM) uses a simple two-layer network for recursive reasoning. It significantly outperforms much larger language models on complex puzzle tasks such as ARC-AGI, achieving strong generalization with vastly fewer parameters and minimal compute.
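
🔹 Code Sketch: a minimal sketch of the recursive-refinement idea behind TRM, not the paper's exact architecture; layer sizes, the update rule, and the step count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Illustrative only: one tiny network reused across reasoning steps."""
    def __init__(self, dim=128):
        super().__init__()
        # Two-layer core, shared across every recursion step
        self.core = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, x, steps=6):
        z = torch.zeros_like(x)  # latent reasoning state
        y = torch.zeros_like(x)  # current answer embedding
        for _ in range(steps):   # recursion: same weights, refined state
            z = self.core(torch.cat([x, y, z], dim=-1))
            y = y + z            # update the answer from the new latent
        return y
```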

🔹 Publication Date: Published on Oct 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.04871
• PDF: https://arxiv.org/pdf/2510.04871
• Project Page: https://alexiajm.github.io/2025/09/29/tiny_recursive_models.html
• Github: https://github.com/SamsungSAILMontreal/TinyRecursiveModels

🔹 Models citing this paper:
https://huggingface.co/wtfmahe/Samsung-TRM
https://huggingface.co/ordlibrary/X402

Datasets citing this paper:
https://huggingface.co/datasets/emiliocantuc/sudoku-extreme-1k-aug-1000

==================================

For more data science resources:
https://t.me/DataScienceT

#RecursiveReasoning #TinyAI #EfficientAI #AIResearch #MachineLearning
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

📝 Summary:
VibeThinker-1.5B, a 1.5B-parameter model, uses the Spectrum-to-Signal Principle to achieve superior reasoning. It outperforms much larger models on math and coding benchmarks, demonstrating that small models can deliver advanced reasoning at low cost.
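
🔹 Code Sketch: a rough sketch of the spectrum-then-signal recipe (diverse sampling followed by verifier-guided selection); `model.generate` and `verifier` are assumed interfaces, not the project's actual API.

```python
def spectrum_phase(model, prompt, n=8, temperature=1.0):
    # Spectrum: sample many diverse candidate solutions.
    return [model.generate(prompt, temperature=temperature) for _ in range(n)]

def signal_phase(candidates, verifier):
    # Signal: keep the verifier-preferred candidate; an RL stage would
    # then reinforce the policy toward such verified solutions.
    return max(candidates, key=verifier)
```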

🔹 Publication Date: Published on Nov 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06221
• PDF: https://arxiv.org/pdf/2511.06221
• Github: https://github.com/WeiboAI/VibeThinker

🔹 Models citing this paper:
https://huggingface.co/WeiboAI/VibeThinker-1.5B
https://huggingface.co/Mungert/VibeThinker-1.5B-GGUF

==================================

For more data science resources:
https://t.me/DataScienceT

#SLM #AIReasoning #ModelOptimization #MachineLearning #EfficientAI
Motif 2 12.7B technical report

📝 Summary:
Motif-2-12.7B is an efficient LLM combining Grouped Differential Attention and system-level optimizations. It achieves competitive performance across diverse benchmarks with a smaller model size.
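
🔹 Code Sketch: the differential-attention mechanism at the core of Grouped Differential Attention, sketched only; head grouping and the learned lambda schedule are omitted, and shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def differential_attention(q1, k1, q2, k2, v, lam=0.5):
    # Two attention maps over the same values; subtracting the second
    # cancels common-mode attention noise (sketch of the mechanism).
    d = q1.shape[-1] ** 0.5
    a1 = F.softmax(q1 @ k1.transpose(-2, -1) / d, dim=-1)
    a2 = F.softmax(q2 @ k2.transpose(-2, -1) / d, dim=-1)
    return (a1 - lam * a2) @ v
```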

🔹 Publication Date: Published on Nov 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07464
• PDF: https://arxiv.org/pdf/2511.07464

🔹 Models citing this paper:
https://huggingface.co/Motif-Technologies/optimizer
https://huggingface.co/Motif-Technologies/Motif-2-12.7B-Instruct
https://huggingface.co/Motif-Technologies/Motif-2-12.7B-Base

==================================

For more data science resources:
https://t.me/DataScienceT

#LLM #AI #DeepLearning #EfficientAI #AttentionMechanisms
Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs

📝 Summary:
Nemotron Elastic embeds multiple submodels within a single large language model, reducing training costs by 360x compared to training each model separately. The framework allows zero-shot extraction of optimized submodels for various deployment budgets without additional training or fine-tuning.
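
🔹 Code Sketch: the slicing mechanics behind zero-shot submodel extraction; the real method relies on elastic training and importance-ordered weights, so the first-channels heuristic here is purely an assumption.

```python
import torch
import torch.nn as nn

def extract_sublayer(layer: nn.Linear, keep_frac: float) -> nn.Linear:
    # Carve a narrower layer out of a trained one without retraining.
    out = max(1, int(layer.out_features * keep_frac))
    sub = nn.Linear(layer.in_features, out, bias=layer.bias is not None)
    with torch.no_grad():
        sub.weight.copy_(layer.weight[:out])
        if layer.bias is not None:
            sub.bias.copy_(layer.bias[:out])
    return sub
```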

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16664
• PDF: https://arxiv.org/pdf/2511.16664
• Project Page: https://huggingface.co/nvidia/Nemotron-Elastic-12B

==================================

For more data science resources:
https://t.me/DataScienceT

#LLM #AI #MachineLearning #DeepLearning #EfficientAI
Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models

📝 Summary:
Downscaling multimodal models disproportionately harms visual capabilities such as perception, more than it harms language abilities. This paper introduces visual extraction tuning, combined with step-by-step reasoning, to improve smaller models' efficiency and performance.
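
🔹 Code Sketch: a hypothetical prompt template in the spirit of visual extraction tuning plus step-by-step reasoning; the paper's actual tuning format is not reproduced here.

```python
def extraction_then_reasoning_prompt(question: str) -> str:
    # Stage 1 elicits explicit visual facts; stage 2 reasons over them.
    return (
        "Step 1: List the objects, attributes, and relations in the image.\n"
        "Step 2: Using only those extracted facts, reason step by step.\n"
        f"Question: {question}\nAnswer:"
    )
```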

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17487
• PDF: https://arxiv.org/pdf/2511.17487
• Project Page: https://web.stanford.edu/~markendo/projects/downscaling_intelligence
• Github: https://github.com/markendo/downscaling_intelligence

==================================

For more data science resources:
https://t.me/DataScienceT

#MultimodalAI #SmallModels #ComputerVision #EfficientAI #AIResearch
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

📝 Summary:
Z-Image is an efficient 6B-parameter diffusion transformer achieving state-of-the-art image generation with significantly reduced computational cost. It enables sub-second inference and consumer hardware compatibility, challenging the scale-at-all-costs paradigm.
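
🔹 Code Sketch: a miniature single-stream DiT block, where text and image tokens share one transformer stream instead of separate streams with cross-attention; dimensions and block layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SingleStreamBlock(nn.Module):
    """Sketch: one joint stream for text + image tokens (illustrative)."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.n1, self.n2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, txt, img):
        x = torch.cat([txt, img], dim=1)        # one shared token stream
        h = self.n1(x)
        x = x + self.attn(h, h, h)[0]
        x = x + self.mlp(self.n2(x))
        return x[:, :txt.shape[1]], x[:, txt.shape[1]:]
```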

🔹 Publication Date: Published on Nov 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.22699
• PDF: https://arxiv.org/pdf/2511.22699
• Project Page: https://tongyi-mai.github.io/Z-Image-blog/
• Github: https://github.com/Tongyi-MAI/Z-Image

==================================

For more data science resources:
https://t.me/DataScienceT

#ImageGeneration #DiffusionModels #EfficientAI #FoundationModels #MachineLearning
SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minimal Overhead

📝 Summary:
SwiftVLA enhances compact VLA models with efficient 4D understanding. It uses a 4D geometry transformer, Fusion Tokens, and a mask-and-reconstruct strategy. This rivals larger models while drastically improving speed and memory efficiency.
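
🔹 Code Sketch: the mask-and-reconstruct objective in schematic form; `encoder` and `decoder` are assumed modules and the masking scheme is simplified relative to the paper.

```python
import torch

def mask_and_reconstruct_loss(tokens, encoder, decoder, mask_ratio=0.5):
    # Hide a random subset of spatiotemporal tokens, then train the model
    # to rebuild the full sequence from the visible ones (schematic).
    b, n, d = tokens.shape
    keep = torch.randperm(n)[: int(n * (1 - mask_ratio))]
    latent = encoder(tokens[:, keep])
    recon = decoder(latent)            # assumed to emit (b, n, d)
    return ((recon - tokens) ** 2).mean()
```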

🔹 Publication Date: Published on Nov 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.00903
• PDF: https://arxiv.org/pdf/2512.00903
• Project Page: https://swiftvla.github.io/

==================================

For more data science resources:
https://t.me/DataScienceT

#SwiftVLA #VLAModels #SpatiotemporalAI #EfficientAI #Transformers
AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition

📝 Summary:
AdaptVision is an efficient VLM that adaptively acquires visual tokens through a coarse-to-fine approach, using a bounding box tool. Trained with reinforcement learning to balance accuracy and efficiency, it achieves superior VQA performance using fewer visual tokens.
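
🔹 Code Sketch: the coarse-to-fine acquisition loop in pseudocode-style Python; `vlm.answer`, the confidence flag, and the crop interface are assumed, not AdaptVision's real API.

```python
def adaptive_acquisition(vlm, image, question, max_steps=3):
    view = image                     # start from a coarse, low-res view
    answer = None
    for _ in range(max_steps):
        answer, confident, box = vlm.answer(view, question)
        if confident or box is None:
            break                    # enough visual evidence gathered
        view = image.crop(box)       # acquire finer tokens only where needed
    return answer
```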

🔹 Publication Date: Published on Dec 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03794
• PDF: https://arxiv.org/pdf/2512.03794
• Project Page: https://adaptvision.github.io/
• Github: https://github.com/AdaptVision/AdaptVision

🔹 Models citing this paper:
https://huggingface.co/AdaptVision/AdaptVision-7B

==================================

For more data science resources:
https://t.me/DataScienceT

#VisionLanguageModels #ReinforcementLearning #ComputerVision #AIResearch #EfficientAI
AutoNeural: Co-Designing Vision-Language Models for NPU Inference

📝 Summary:
AutoNeural is an NPU-native VLM co-designed for efficient edge inference. It uses a MobileNetV5-style vision backbone for stable integer quantization and a hybrid SSM-Transformer language backbone. This design reduces quantization errors and latency, improving real-time performance on edge devices.
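
🔹 Code Sketch: symmetric per-tensor int8 quantization, the kind of integer-friendly scheme an NPU-native backbone is built to tolerate; this is a generic recipe, not AutoNeural's pipeline.

```python
import torch

def quantize_int8(w: torch.Tensor):
    # Map float weights to int8 with a single scale (symmetric scheme).
    scale = w.abs().max().clamp(min=1e-8) / 127.0
    q = (w / scale).round().clamp(-127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Reconstruction; (w - dequantize(q, scale)) is the quantization error.
    return q.float() * scale
```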

🔹 Publication Date: Published on Dec 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02924
• PDF: https://arxiv.org/pdf/2512.02924

🔹 Models citing this paper:
https://huggingface.co/NexaAI/AutoNeural

==================================

For more data science resources:
https://t.me/DataScienceT

#AutoNeural #VisionLanguageModels #EdgeAI #AIHardware #EfficientAI
EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

📝 Summary:
EMMA is an efficient unified architecture for multimodal tasks like understanding, generation, and editing. It uses novel components including an autoencoder, channel-wise concatenation, and mixture-of-experts. EMMA achieves superior performance and efficiency over state-of-the-art unified models.
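
🔹 Code Sketch: channel-wise concatenation as a fusion step, merging modalities along the feature axis so the token sequence stays short; shapes and the projection are illustrative assumptions.

```python
import torch
import torch.nn as nn

def channel_concat_fuse(vis, txt, proj: nn.Linear):
    # vis, txt: (B, n, d) after aligning lengths; fuse on the channel
    # axis instead of appending tokens, keeping sequence length fixed.
    n = min(vis.shape[1], txt.shape[1])
    fused = torch.cat([vis[:, :n], txt[:, :n]], dim=-1)  # (B, n, 2d)
    return proj(fused)                                   # proj: 2d -> d
```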

🔹 Publication Date: Published on Dec 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04810
• PDF: https://arxiv.org/pdf/2512.04810
• Project Page: https://emma-umm.github.io/emma/

==================================

For more data science resources:
https://t.me/DataScienceT

#MultimodalAI #GenerativeAI #DeepLearning #AIArchitecture #EfficientAI
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

📝 Summary:
HyperVL is an efficient multimodal large language model for edge devices. It uses image tiling, a Visual Resolution Compressor, and Dual Consistency Learning to reduce memory, latency, and power. HyperVL maintains performance, making it practical for on-device inference.
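
🔹 Code Sketch: the image-tiling step, splitting a high-resolution input into fixed-size tiles for the vision encoder; the tile size is illustrative and `img` is assumed to be a PIL image.

```python
def tile_image(img, tile=448):
    # Split the image into a grid of tiles processed independently.
    w, h = img.size
    return [
        img.crop((x, y, min(x + tile, w), min(y + tile, h)))
        for y in range(0, h, tile)
        for x in range(0, w, tile)
    ]
```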

🔹 Publication Date: Published on Dec 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.14052
• PDF: https://arxiv.org/pdf/2512.14052

==================================

For more data science resources:
https://t.me/DataScienceT

#HyperVL #MLLM #EdgeAI #EfficientAI #OnDeviceAI
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

📝 Summary:
CASA enhances cross-attention for vision-language models by adding local text-to-text interaction. This approach substantially reduces the performance gap with costly token-insertion methods on detailed visual tasks, while maintaining efficiency and scalability for long-context multimodal applications.
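
🔹 Code Sketch: a sketch of the CASA mechanism, in which text queries attend jointly over image keys and a local window of text keys; the window size and shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def casa_style_attention(txt_q, txt_kv, img_kv, window=16):
    # Keys/values: recent text tokens (local text-to-text interaction)
    # concatenated with image tokens (sketch of the mechanism only).
    kv = torch.cat([txt_kv[:, -window:], img_kv], dim=1)
    att = F.softmax(txt_q @ kv.transpose(-2, -1) / txt_q.shape[-1] ** 0.5,
                    dim=-1)
    return att @ kv
```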

🔹 Publication Date: Published on Dec 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.19535
• PDF: https://arxiv.org/pdf/2512.19535
• Project Page: https://kyutai.org/casa
• Github: https://github.com/kyutai-labs/casa

🔹 Models citing this paper:
https://huggingface.co/kyutai/CASA-Helium1-VL-2B

Spaces citing this paper:
https://huggingface.co/spaces/kyutai/casa-samples

==================================

For more data science resources:
https://t.me/DataScienceT

#VisionLanguage #MultimodalAI #AttentionMechanisms #EfficientAI #DeepLearning
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

📝 Summary:
DLCM shifts computation from individual tokens to a compressed concept space, enabling more efficient reasoning. This hierarchical approach learns semantic boundaries end-to-end and improves performance on benchmarks by reallocating compute.
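
🔹 Code Sketch: pooling token states into concept vectors between segment boundaries, so later layers compute over far fewer units; boundary prediction is learned end-to-end in the paper and simply given here.

```python
import torch

def pool_to_concepts(tokens, boundaries):
    # tokens: (B, n, d); boundaries: e.g. [4, 9, 15, n] marking segment ends.
    concepts, start = [], 0
    for end in boundaries:
        concepts.append(tokens[:, start:end].mean(dim=1))
        start = end
    return torch.stack(concepts, dim=1)   # (B, n_concepts, d)
```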

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24617
• PDF: https://arxiv.org/pdf/2512.24617

==================================

For more data science resources:
https://t.me/DataScienceT

#AI #MachineLearning #LargeModels #RepresentationLearning #EfficientAI