ML Research Hub

✨Beyond Binary Preference: Aligning Diffusion Models to Fine-grained Criteria by Decoupling Attributes

📝 Summary:
Current diffusion model alignment struggles to capture complex, fine-grained human expertise because it relies on simplified preferences. This paper proposes a framework with hierarchical criteria and Complex Preference Optimization (CPO), maximizing positive and minimizing negative attributes to improve generation qu...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04300
• PDF: https://arxiv.org/pdf/2601.04300

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#DiffusionModels #AIAlignment #MachineLearning #GenerativeAI #PreferenceLearning

163 views · 09:03

✨ Explore Data Science 📝 Write your paper

✨Towards Open-Vocabulary Industrial Defect Understanding with a Large-Scale Multimodal Dataset

📝 Summary:
This paper introduces IMDD-1M, a large dataset of 1 million industrial defect image-text pairs. It enables training a vision-language foundation model tailored for industrial use. This model achieves comparable performance with less data for specialized tasks, promoting data-efficient quality ins...

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24160
• PDF: https://arxiv.org/pdf/2512.24160

#IndustrialAI #VisionLanguageModel #DefectDetection #MultimodalAI #ComputerVision

195 views · 09:04

✨AgentDevel: Reframing Self-Evolving LLM Agents as Release Engineering

📝 Summary:
AgentDevel reframes LLM agent improvement as release engineering, treating agents as shippable software. It emphasizes stable, auditable improvements through an externalized pipeline that prioritizes non-regression, leading to more reliable and traceable agent development.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04620
• PDF: https://arxiv.org/pdf/2601.04620

#LLMAgents #ReleaseEngineering #SoftwareDevelopment #AIResearch #MLOps

219 views · 10:04

✨VERSE: Visual Embedding Reduction and Space Exploration. Clustering-Guided Insights for Training Data Enhancement in Visually-Rich Document Understanding

📝 Summary:
VERSE analyzes Vision-Language Models by visualizing latent representations to find error-prone clusters. It guides synthetic data generation to boost performance in these areas. This significantly improves F1 scores, allowing on-premise models to match or exceed top SaaS solutions.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05125
• PDF: https://arxiv.org/pdf/2601.05125
• Project Page: https://huggingface.co/spaces/de-Rodrigo/Embeddings
• Github: https://github.com/nachoDRT/VrDU-Doctor

#VisionLanguageModels #DeepLearning #EmbeddingVisualization #SyntheticData #DocumentUnderstanding

161 views · 11:04

✨ProFuse: Efficient Cross-View Context Fusion for Open-Vocabulary 3D Gaussian Splatting

📝 Summary:
ProFuse enhances open-vocabulary 3DGS understanding via an efficient, context-aware framework. It uses a pre-registration phase to fuse semantic features onto Gaussians for cross-view coherence, completing semantic attachment twice as fast as SOTA.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04754
• PDF: https://arxiv.org/pdf/2601.04754
• Project Page: https://chiou1203.github.io/ProFuse/
• Github: https://chiou1203.github.io/ProFuse/

#3DGaussianSplatting #ComputerVision #OpenVocabulary #3DReconstruction #DeepLearning

187 views · 11:04

✨Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models

📝 Summary:
Targeting high-entropy tokens in vision-language models causes significant semantic degradation with reduced budgets. This attack strategy reveals critical transferable safety risks across different VLM architectures.
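
As a rough illustration of what "high-entropy tokens" means here, the sketch below ranks positions by the Shannon entropy of their next-token distribution and keeps the top k. It covers only the selection step, not any attack; the function names and the toy distributions are hypothetical and not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def top_entropy_positions(distributions, k):
    """Return the indices of the k highest-entropy positions, in order."""
    ranked = sorted(
        range(len(distributions)),
        key=lambda i: entropy(distributions[i]),
        reverse=True,
    )
    return sorted(ranked[:k])


# Toy next-token distributions for four positions (made-up numbers).
dists = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform, maximum entropy
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.30, 0.20, 0.10],
]
selected = top_entropy_positions(dists, k=2)
print(selected)
```

Positions where the model is least certain are the ones such a ranking surfaces first.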

🔹 Publication Date: Published on Dec 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21815
• PDF: https://arxiv.org/pdf/2512.21815

#VisionLanguageModels #AdversarialAI #AIsecurity #MachineLearning #DeepLearning

185 views · 12:05

✨Multi-Agent Software Development through Cross-Team Collaboration

📝 Summary:
Existing multi-agent LLM software development yields a single solution, missing better alternatives. We introduce Cross-Team Collaboration (CTC), a framework where multiple agent teams propose and communicate diverse decisions. This significantly improves software quality and generalizes well.

🔹 Publication Date: Published on Jun 13, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2406.08979
• PDF: https://arxiv.org/pdf/2406.08979
• Github: https://github.com/OpenBMB/ChatDev

✨ Spaces citing this paper:
• https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList

#MultiAgentSystems #LLMAgents #SoftwareDevelopment #AICollaboration #AIResearch

222 views · 12:05

✨CoV: Chain-of-View Prompting for Spatial Reasoning

📝 Summary:
Chain-of-View (CoV) prompting enhances spatial reasoning in 3D embodied question answering for vision-language models. It actively explores environments by selecting question-aligned views and iteratively adjusting camera positions to gather context, leading to significant performance gains across ...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05172
• PDF: https://arxiv.org/pdf/2601.05172

#SpatialReasoning #VisionLanguageModels #EmbodiedAI #Prompting #AI

259 views · 13:05

✨One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

📝 Summary:
This paper demonstrates extreme data efficiency in RL for LLMs. A single, carefully designed training sample, called polymath learning, significantly enhances multidisciplinary reasoning, outperforming traditional methods that rely on large datasets. The findings suggest sample quality and design...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03111
• PDF: https://arxiv.org/pdf/2601.03111

#ReinforcementLearning #LLMs #DataEfficiency #AI #DeepLearning

❤1

232 views · 14:06

✨LEMAS: A 150K-Hour Large-scale Extensible Multilingual Audio Suite with Generative Speech Models

📝 Summary:
LEMAS introduces the largest open-source 150K-hour multilingual speech dataset with word-level timestamps. Models trained on this dataset, LEMAS-TTS and LEMAS-Edit, achieve high-quality zero-shot speech synthesis and seamless speech editing.

🔹 Publication Date: Published on Jan 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04233
• PDF: https://arxiv.org/pdf/2601.04233
• Project Page: https://huggingface.co/spaces/LEMAS-Project/LEMAS-Edit

🔹 Models citing this paper:
• https://huggingface.co/LEMAS-Project/LEMAS-TTS

✨ Datasets citing this paper:
• https://huggingface.co/datasets/LEMAS-Project/LEMAS-Dataset-train
• https://huggingface.co/datasets/LEMAS-Project/LEMAS-Dataset-eval

✨ Spaces citing this paper:
• https://huggingface.co/spaces/LEMAS-Project/LEMAS-TTS
• https://huggingface.co/spaces/LEMAS-Project/LEMAS-Edit
• https://huggingface.co/spaces/Kaiden423/LEMAS-TTS

#AI #DataScience #MachineLearning #HuggingFace #Research

240 views · 15:06

✨Multi-Scale Local Speculative Decoding for Image Generation

📝 Summary:
Multi-Scale Local Speculative Decoding accelerates autoregressive image generation through multi-resolution drafting and spatially informed verification while maintaining semantic quality and perceptu...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05149
• PDF: https://arxiv.org/pdf/2601.05149
• Project Page: https://qualcomm-ai-research.github.io/mulo-sd-webpage/
• Github: https://qualcomm-ai-research.github.io/mulo-sd-webpage

#AI #DataScience #MachineLearning #HuggingFace #Research

211 views · 16:06

✨Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing

📝 Summary:
Behavior cloning demonstrates improved performance and causal reasoning through scaling model size and training data, achieving human-level gameplay in 3D video games.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04575
• PDF: https://arxiv.org/pdf/2601.04575
• Project Page: https://elefant-ai.github.io/open-p2p/
• Github: https://github.com/elefant-ai/open-p2p

🔹 Models citing this paper:
• https://huggingface.co/elefantai/open-p2p

✨ Datasets citing this paper:
• https://huggingface.co/datasets/elefantai/p2p-toy-examples
• https://huggingface.co/datasets/elefantai/p2p-full-data

#AI #DataScience #MachineLearning #HuggingFace #Research

259 views · 16:06

✨Scaling Large-Language-Model-based Multi-Agent Collaboration

📝 Summary:
This paper introduces MacNet for multi-agent collaboration using DAGs for reasoning, outperforming baselines and scaling to many agents. It unveils a collaborative scaling law where emergent abilities appear much earlier than neural emergence.

🔹 Publication Date: Published on Jun 11, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2406.07155
• PDF: https://arxiv.org/pdf/2406.07155
• Project Page: https://github.com/OpenBMB/ChatDev/tree/macnet
• Github: https://github.com/OpenBMB/ChatDev/tree/macnet

✨ Spaces citing this paper:
• https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList

#AI #DataScience #MachineLearning #HuggingFace #Research

267 views · 17:06

✨PyramidalWan: On Making Pretrained Video Model Pyramidal for Efficient Inference

📝 Summary:
Pyramidal diffusion models offer efficient inference by varying resolution based on noise. This paper presents a low-cost finetuning pipeline to convert pretrained diffusion models into pyramidal ones, maintaining output quality. They also explore step distillation for enhanced efficiency.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04792
• PDF: https://arxiv.org/pdf/2601.04792
• Project Page: https://qualcomm-ai-research.github.io/PyramidalWan

#AI #DataScience #MachineLearning #HuggingFace #Research

235 views · 20:07

✨ReHyAt: Recurrent Hybrid Attention for Video Diffusion Transformers

📝 Summary:
ReHyAt introduces a recurrent hybrid attention mechanism that combines softmax and linear attention benefits, enabling efficient video generation with reduced computational costs and improved scalabil...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04342
• PDF: https://arxiv.org/pdf/2601.04342
• Project Page: https://qualcomm-ai-research.github.io/rehyat

#AI #DataScience #MachineLearning #HuggingFace #Research

287 views · 20:07

✨Learning User Preferences Through Interaction for Long-Term Collaboration

📝 Summary:
The MultiSessionCollab benchmark evaluates agents' ability to learn and adapt to user preferences through persistent memory systems that enhance long-term collaboration quality.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02702
• PDF: https://arxiv.org/pdf/2601.02702

#AI #DataScience #MachineLearning #HuggingFace #Research

415 views · 20:07

✨GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

📝 Summary:
GRPO in multi-reward RL suffers from reward normalization collapse, hindering training. GDPO resolves this by decoupling individual reward normalization, improving stability and accuracy. GDPO consistently outperforms GRPO across various reasoning tasks.
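
To make the "decoupling" idea concrete, here is a minimal sketch comparing two ways of turning several rewards into a per-rollout advantage within one group: summing the rewards and then normalizing, versus normalizing each reward on its own and then summing. This is an illustration under assumptions, not the paper's exact formulation; the function names, the z-score normalization, and the toy rewards are all hypothetical.

```python
from statistics import mean, pstdev


def zscore(values, eps=1e-8):
    """Standardize values within a group (eps guards against zero spread)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / (s + eps) for v in values]


def summed_then_normalized(rewards_per_rollout):
    """Coupled variant: add the rewards first, then normalize the totals."""
    totals = [sum(r) for r in rewards_per_rollout]
    return zscore(totals)


def normalized_then_summed(rewards_per_rollout):
    """Decoupled variant: normalize each reward separately, then add."""
    columns = zip(*rewards_per_rollout)
    normalized = [zscore(list(col)) for col in columns]
    return [sum(vals) for vals in zip(*normalized)]


# Four rollouts, two binary rewards each (made-up values).
group = [(1.0, 0.0), (0.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
coupled = summed_then_normalized(group)
decoupled = normalized_then_summed(group)
print(coupled)
print(decoupled)
```

In this toy group the first three rollouts have the same reward total, so the coupled variant gives them identical advantages even though the first one earned the rarer reward. The decoupled variant keeps that distinction.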

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05242
• PDF: https://arxiv.org/pdf/2601.05242
• Project Page: https://nvlabs.github.io/GDPO/
• Github: https://github.com/NVlabs/GDPO

#ReinforcementLearning #MultiRewardRL #PolicyOptimization #MachineLearning #AI

200 views · 09:34

✨Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

📝 Summary:
Learnable multipliers address suboptimal weight norms caused by weight decay in large language models. They free the scale of weight matrices using a learnable scalar multiplier, then per-row and per-column multipliers, outperforming baselines with reduced overhead.
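
One way to picture per-row and per-column multipliers is a linear layer whose effective weight is `row_mult[i] * weight[i][j] * col_mult[j]`. The sketch below shows only that forward computation in plain Python; it assumes this multiplicative form, and the function name and values are hypothetical rather than the paper's implementation. In training, the multipliers would be learned parameters.

```python
def scaled_linear(x, weight, row_mult, col_mult):
    """Compute y = diag(row_mult) @ weight @ diag(col_mult) @ x."""
    out = []
    for i, row in enumerate(weight):
        # Column multipliers rescale each input feature's contribution.
        acc = sum(w * col_mult[j] * x[j] for j, w in enumerate(row))
        # Row multipliers rescale each output feature.
        out.append(row_mult[i] * acc)
    return out


weight = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]

# With all multipliers at 1 this is an ordinary matrix-vector product.
plain = scaled_linear(x, weight, [1.0, 1.0], [1.0, 1.0])
# Non-trivial multipliers change the effective scale of rows and columns.
scaled = scaled_linear(x, weight, [2.0, 1.0], [1.0, 0.5])
print(plain, scaled)
```

A single scalar multiplier is the special case where every row multiplier takes the same value and the column multipliers stay at 1.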

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04890
• PDF: https://arxiv.org/pdf/2601.04890
• Project Page: https://tiiuae.github.io/Falcon-H1/
• Github: https://github.com/tiiuae/falcon-h1

#LLM #DeepLearning #MachineLearning #AI #Optimization

103 views · 09:34
