ML Research Hub

✨Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

📝 Summary:
Learnable multipliers are introduced to address weight decay-induced normalization artifacts in large language model training, outperforming traditional methods while reducing computational overhead. ...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04890
• PDF: https://arxiv.org/pdf/2601.04890
• Project Page: https://tiiuae.github.io/Falcon-H1/
• Github: https://github.com/tiiuae/falcon-h1

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

❤1

125 views · 06:02

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing

📝 Summary:
Re-Align addresses the gap between understanding and generation in in-context image generation and editing through structured reasoning-guided alignment and reinforcement learning training.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05124
• PDF: https://arxiv.org/pdf/2601.05124
• Project Page: https://hrz2000.github.io/realign/
• Github: https://github.com/hrz2000/realign

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

164 views · 06:02

ML Research Hub

✨Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views

📝 Summary:
HairGuard is a framework designed to recover fine-grained soft boundary details in 3D vision tasks. It refines depth around these ambiguous regions and synthesizes novel views, achieving state-of-the-art performance for delicate structures like hair.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03362
• PDF: https://arxiv.org/pdf/2601.03362

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

160 views · 07:02

ML Research Hub

✨Enhancing Object Detection with Privileged Information: A Model-Agnostic Teacher-Student Approach

📝 Summary:
The Learning Using Privileged Information (LUPI) paradigm enhances object detection accuracy by integrating additional training-time information through teacher-student architectures without increasing inference...

🔹 Publication Date: Published on Jan 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02016
• PDF: https://arxiv.org/pdf/2601.02016

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

184 views · 07:03

ML Research Hub

✨AT²PO: Agentic Turn-based Policy Optimization via Tree Search

📝 Summary:
AT²PO is a unified framework for multi-turn agentic reinforcement learning that improves exploration diversity, credit assignment, and policy optimization through tree search and turn-level learning o...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04767
• PDF: https://arxiv.org/pdf/2601.04767
• Github: https://github.com/zzfoutofspace/ATPO

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

190 views · 07:03

ML Research Hub

✨RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes

📝 Summary:
RL-AWB is a novel framework combining statistical methods with deep reinforcement learning for improved nighttime auto white balance. It is the first RL approach for color constancy, mimicking expert tuning. This method shows superior generalization across various lighting conditions, and a new m...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05249
• PDF: https://arxiv.org/pdf/2601.05249
• Project Page: https://ntuneillee.github.io/research/rl-awb/

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#ReinforcementLearning #ComputerVision #ImageProcessing #AutoWhiteBalance #LowLightImaging

❤2

196 views · 08:03

ML Research Hub

✨Beyond Binary Preference: Aligning Diffusion Models to Fine-grained Criteria by Decoupling Attributes

📝 Summary:
Current diffusion model alignment struggles with complex, fine-grained human expertise due to simplified preferences. This paper proposes a framework with hierarchical criteria and Complex Preference Optimization (CPO), maximizing positive and minimizing negative attributes to improve generation qu...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04300
• PDF: https://arxiv.org/pdf/2601.04300

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#DiffusionModels #AIAlignment #MachineLearning #GenerativeAI #PreferenceLearning

163 views · 09:03

ML Research Hub

✨Towards Open-Vocabulary Industrial Defect Understanding with a Large-Scale Multimodal Dataset

📝 Summary:
This paper introduces IMDD-1M, a large dataset of 1 million industrial defect image-text pairs. It enables training a vision-language foundation model tailored for industrial use. This model achieves comparable performance with less data for specialized tasks, promoting data-efficient quality ins...

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24160
• PDF: https://arxiv.org/pdf/2512.24160

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#IndustrialAI #VisionLanguageModel #DefectDetection #MultimodalAI #ComputerVision

193 views · 09:04

ML Research Hub

✨AgentDevel: Reframing Self-Evolving LLM Agents as Release Engineering

📝 Summary:
AgentDevel reframes LLM agent improvement as release engineering, treating agents as shippable software. It emphasizes stable, auditable improvements through an externalized pipeline that prioritizes non-regression, leading to more reliable and traceable agent development.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04620
• PDF: https://arxiv.org/pdf/2601.04620

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#LLMAgents #ReleaseEngineering #SoftwareDevelopment #AIResearch #MLOps

217 views · 10:04

ML Research Hub

✨VERSE: Visual Embedding Reduction and Space Exploration. Clustering-Guided Insights for Training Data Enhancement in Visually-Rich Document Understanding

📝 Summary:
VERSE analyzes Vision-Language Models by visualizing latent representations to find error-prone clusters. It guides synthetic data generation to boost performance in these areas. This significantly improves F1 scores, allowing on-premise models to match or exceed top SaaS solutions.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05125
• PDF: https://arxiv.org/pdf/2601.05125
• Project Page: https://huggingface.co/spaces/de-Rodrigo/Embeddings
• Github: https://github.com/nachoDRT/VrDU-Doctor

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#VisionLanguageModels #DeepLearning #EmbeddingVisualization #SyntheticData #DocumentUnderstanding

160 views · 11:04

ML Research Hub

✨ProFuse: Efficient Cross-View Context Fusion for Open-Vocabulary 3D Gaussian Splatting

📝 Summary:
ProFuse enhances open-vocabulary 3DGS understanding via an efficient, context-aware framework. It uses a pre-registration phase to fuse semantic features onto Gaussians for cross-view coherence, completing semantic attachment twice as fast as SOTA.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04754
• PDF: https://arxiv.org/pdf/2601.04754
• Project Page: https://chiou1203.github.io/ProFuse/
• Github: https://chiou1203.github.io/ProFuse/

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#3DGaussianSplatting #ComputerVision #OpenVocabulary #3DReconstruction #DeepLearning

186 views · 11:04

ML Research Hub

✨Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models

📝 Summary:
Targeting high-entropy tokens in vision-language models causes significant semantic degradation with reduced budgets. This attack strategy reveals critical transferable safety risks across different VLM architectures.

🔹 Publication Date: Published on Dec 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21815
• PDF: https://arxiv.org/pdf/2512.21815

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#VisionLanguageModels #AdversarialAI #AIsecurity #MachineLearning #DeepLearning

184 views · 12:05

ML Research Hub

✨Multi-Agent Software Development through Cross-Team Collaboration

📝 Summary:
Existing multi-agent LLM software development yields a single solution, missing better alternatives. We introduce Cross-Team Collaboration (CTC), a framework where multiple agent teams propose and communicate diverse decisions. This significantly improves software quality and generalizes well.

🔹 Publication Date: Published on Jun 13, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2406.08979
• PDF: https://arxiv.org/pdf/2406.08979
• Github: https://github.com/OpenBMB/ChatDev

✨ Spaces citing this paper:
• https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#MultiAgentSystems #LLMAgents #SoftwareDevelopment #AICollaboration #AIResearch

220 views · 12:05

ML Research Hub

✨CoV: Chain-of-View Prompting for Spatial Reasoning

📝 Summary:
Chain-of-View (CoV) prompting enhances spatial reasoning in 3D embodied question answering for vision-language models. It actively explores environments by selecting question-aligned views and iteratively adjusting camera positions to gather context, leading to significant performance gains across ...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05172
• PDF: https://arxiv.org/pdf/2601.05172

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#SpatialReasoning #VisionLanguageModels #EmbodiedAI #Prompting #AI

254 views · 13:05

ML Research Hub

✨One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

📝 Summary:
This paper demonstrates extreme data efficiency in RL for LLMs. Training on a single, carefully designed sample, an approach called polymath learning, significantly enhances multidisciplinary reasoning, outperforming traditional methods that rely on large datasets. The findings suggest sample quality and design...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03111
• PDF: https://arxiv.org/pdf/2601.03111

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#ReinforcementLearning #LLMs #DataEfficiency #AI #DeepLearning

❤1

227 views · 14:06

ML Research Hub

✨LEMAS: A 150K-Hour Large-scale Extensible Multilingual Audio Suite with Generative Speech Models

📝 Summary:
LEMAS introduces the largest open-source 150K-hour multilingual speech dataset with word-level timestamps. Models trained on this dataset, LEMAS-TTS and LEMAS-Edit, achieve high-quality zero-shot speech synthesis and seamless speech editing.

🔹 Publication Date: Published on Jan 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04233
• PDF: https://arxiv.org/pdf/2601.04233
• Project Page: https://huggingface.co/spaces/LEMAS-Project/LEMAS-Edit

🔹 Models citing this paper:
• https://huggingface.co/LEMAS-Project/LEMAS-TTS

✨ Datasets citing this paper:
• https://huggingface.co/datasets/LEMAS-Project/LEMAS-Dataset-train
• https://huggingface.co/datasets/LEMAS-Project/LEMAS-Dataset-eval

✨ Spaces citing this paper:
• https://huggingface.co/spaces/LEMAS-Project/LEMAS-TTS
• https://huggingface.co/spaces/LEMAS-Project/LEMAS-Edit
• https://huggingface.co/spaces/Kaiden423/LEMAS-TTS

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

236 views · 15:06

ML Research Hub

✨Multi-Scale Local Speculative Decoding for Image Generation

📝 Summary:
Multi-Scale Local Speculative Decoding accelerates autoregressive image generation through multi-resolution drafting and spatially informed verification while maintaining semantic quality and perceptu...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05149
• PDF: https://arxiv.org/pdf/2601.05149
• Project Page: https://qualcomm-ai-research.github.io/mulo-sd-webpage/
• Github: https://qualcomm-ai-research.github.io/mulo-sd-webpage

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

199 views · 16:06

ML Research Hub

✨Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing

📝 Summary:
Behavior cloning demonstrates improved performance and causal reasoning through scaling model size and training data, achieving human-level gameplay in 3D video games.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04575
• PDF: https://arxiv.org/pdf/2601.04575
• Project Page: https://elefant-ai.github.io/open-p2p/
• Github: https://github.com/elefant-ai/open-p2p

🔹 Models citing this paper:
• https://huggingface.co/elefantai/open-p2p

✨ Datasets citing this paper:
• https://huggingface.co/datasets/elefantai/p2p-toy-examples
• https://huggingface.co/datasets/elefantai/p2p-full-data

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

253 views · 16:06

ML Research Hub

✨Scaling Large-Language-Model-based Multi-Agent Collaboration

📝 Summary:
This paper introduces MacNet for multi-agent collaboration using DAGs for reasoning, outperforming baselines and scaling to many agents. It unveils a collaborative scaling law where emergent abilities appear much earlier than neural emergence.

🔹 Publication Date: Published on Jun 11, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2406.07155
• PDF: https://arxiv.org/pdf/2406.07155
• Project Page: https://github.com/OpenBMB/ChatDev/tree/macnet
• Github: https://github.com/OpenBMB/ChatDev/tree/macnet

✨ Spaces citing this paper:
• https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

262 views · 17:06

ML Research Hub

✨PyramidalWan: On Making Pretrained Video Model Pyramidal for Efficient Inference

📝 Summary:
Pyramidal diffusion models offer efficient inference by varying resolution based on noise. This paper presents a low-cost finetuning pipeline to convert pretrained diffusion models into pyramidal ones, maintaining output quality. The authors also explore step distillation for enhanced efficiency.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04792
• PDF: https://arxiv.org/pdf/2601.04792
• Project Page: https://qualcomm-ai-research.github.io/PyramidalWan

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

229 views · 20:07
