ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

📝 Summary:
UniCorn is a self-improvement framework enhancing multimodal model generation. It uses self-play and cognitive reconstruction, without external data or supervision. UniCorn achieves state-of-the-art text-to-image generation.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03193
• PDF: https://arxiv.org/pdf/2601.03193

==================================

For more data science resources:
https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
The Sonar Moment: Benchmarking Audio-Language Models in Audio Geo-Localization

📝 Summary:
AGL1K, an audio geo-localization benchmark, is introduced to advance the geospatial reasoning of audio-language models through curated audio clips and evaluation across multiple models.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03227
• PDF: https://arxiv.org/pdf/2601.03227
• Github: https://github.com/Rising0321/AGL1K

Spaces citing this paper:
https://huggingface.co/spaces/RisingZhang/AudioGeoLoc

==================================

SOP: A Scalable Online Post-Training System for Vision-Language-Action Models

📝 Summary:
SOP is a scalable online post-training system for VLA models that enables real-world robot policy adaptation. It uses a robot fleet to continuously learn from interaction, improving task proficiency while maintaining generality. SOP significantly boosts VLA model performance within hours.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03044
• PDF: https://arxiv.org/pdf/2601.03044

==================================

U-Net-Like Spiking Neural Networks for Single Image Dehazing

📝 Summary:
DehazeSNN introduces a U-Net-like Spiking Neural Network with an Orthogonal Leaky-Integrate-and-Fire Block for efficient image dehazing. It achieves competitive performance with reduced computational resources and a smaller model size.

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23950
• PDF: https://arxiv.org/pdf/2512.23950
• Github: https://github.com/HaoranLiu507/DehazeSNN

==================================

Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners

📝 Summary:
Large reasoning models show multilingual latent reasoning, stronger in resource-rich languages but weaker in low-resource ones. Despite varying strength, their internal prediction evolution is consistent across languages, suggesting an English-centered latent reasoning pathway.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02996
• PDF: https://arxiv.org/pdf/2601.02996
• Github: https://github.com/cisnlp/multilingual-latent-reasoner

==================================

UniVideo: Unified Understanding, Generation, and Editing for Videos

📝 Summary:
UniVideo, a dual-stream framework combining a Multimodal Large Language Model and a Multimodal DiT, extends unified modeling to video generation and editing, achieving state-of-the-art performance and...

🔹 Publication Date: Published on Oct 9, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.08377
• PDF: https://arxiv.org/pdf/2510.08377
• Project Page: https://congwei1230.github.io/UniVideo/
• Github: https://github.com/KwaiVGI/UniVideo

==================================

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

📝 Summary:
MindWatcher is a tool-integrated reasoning agent using interleaved thinking and multimodal chain-of-thought. It autonomously coordinates diverse tools for complex tasks without human prompts. It outperforms larger models and provides agent training insights.

🔹 Publication Date: Published on Dec 29, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23412
• PDF: https://arxiv.org/pdf/2512.23412
• Github: https://github.com/TIMMY-CHAN/MindWatcher

==================================

MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics

📝 Summary:
MDAgent2 enables automated molecular dynamics code generation and question answering through domain-adapted language models and a multi-agent runtime system.

🔹 Publication Date: Published on Jan 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02075
• PDF: https://arxiv.org/pdf/2601.02075
• Github: https://github.com/FredericVAN/PKU_MDAgent2

==================================

Choreographing a World of Dynamic Objects

📝 Summary:
CHORD is a universal generative framework that extracts Lagrangian motion information from Eulerian video representations to synthesize diverse 4D dynamic scenes without requiring category-specific ru...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04194
• PDF: https://arxiv.org/pdf/2601.04194
• Project Page: https://yanzhelyu.github.io/chord/

==================================

EpiQAL: Benchmarking Large Language Models in Epidemiological Question Answering for Enhanced Alignment and Reasoning

📝 Summary:
EpiQAL presents a novel benchmark for evaluating epidemiological reasoning in language models through three distinct subsets measuring factual recall, multi-step inference, and conclusion reconstructi...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03471
• PDF: https://arxiv.org/pdf/2601.03471

==================================

E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

📝 Summary:
E-GRPO is an entropy-aware policy optimization method for reinforcement learning in flow matching models that improves exploration through SDE and ODE sampling strategies.

🔹 Publication Date: Published on Jan 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00423
• PDF: https://arxiv.org/pdf/2601.00423
• Github: https://github.com/shengjun-zhang/VisualGRPO

==================================

Agentic Rubrics as Contextual Verifiers for SWE Agents

📝 Summary:
Agentic Rubrics enable efficient and scalable verification for software engineering agents by creating context-aware checklists that outperform traditional methods while maintaining interpretability. ...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04171
• PDF: https://arxiv.org/pdf/2601.04171

==================================

Klear: Unified Multi-Task Audio-Video Joint Generation

📝 Summary:
Klear addresses audio-video joint generation challenges through a unified model architecture, progressive multitask training, and large-scale dense-caption data construction, achieving superior alignm...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04151
• PDF: https://arxiv.org/pdf/2601.04151

==================================

RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models

📝 Summary:
RedBench presents a unified dataset with standardized risk categorization for evaluating LLM vulnerabilities across multiple domains and attack types.

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03699
• PDF: https://arxiv.org/pdf/2601.03699
• Github: https://github.com/knoveleng/redeval

==================================

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

📝 Summary:
Supervised fine-tuning causes catastrophic forgetting due to "Confident Conflicts." Entropy-Adaptive Fine-Tuning (EAFT) uses token-level entropy to distinguish uncertainty from knowledge conflict. EAFT suppresses conflicting gradients, mitigating forgetting while matching performance.

🔹 Publication Date: Published on Jan 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02151
• PDF: https://arxiv.org/pdf/2601.02151
• Project Page: https://ymxyll.github.io/EAFT/
• Github: https://ymxyll.github.io/EAFT/
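
As a rough illustration of the idea in the summary above, here is a minimal sketch, not the paper's implementation. It assumes each token's cross-entropy is scaled by the model's normalized predictive entropy, so a token where the model is confident but disagrees with the label (low entropy, high loss) contributes little. The function names and the exact weighting rule are assumptions made for this example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(vocab size), so the result lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def entropy_weighted_loss(token_logits, target_ids):
    """Mean cross-entropy with each token's term scaled by normalized entropy.

    Assumption: the weight is treated as a constant (no gradient flows
    through it). Low entropy means a confident prediction, so a confident
    but wrong token gets a weight near zero.
    """
    total = 0.0
    for logits, target in zip(token_logits, target_ids):
        probs = softmax(logits)
        cross_entropy = -math.log(probs[target])
        weight = normalized_entropy(probs)
        total += weight * cross_entropy
    return total / len(target_ids)
```

With logits `[10.0, 0.0, 0.0]` and target index 1, the unweighted cross-entropy is about 10, while the weighted term shrinks to a small fraction of that. A uniform prediction keeps its full loss because its weight is 1.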

==================================

Benchmark^2: Systematic Evaluation of LLM Benchmarks

📝 Summary:
Researchers developed Benchmark^2, a framework with three metrics to evaluate benchmark quality for large language models, revealing significant variations in existing benchmarks and enabling more eff...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03986
• PDF: https://arxiv.org/pdf/2601.03986

==================================

ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

📝 Summary:
ThinkRL-Edit enhances reasoning-centric image editing through reinforcement learning by expanding visual reasoning exploration beyond denoising stochasticity and using unbiased reward strategies.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03467
• PDF: https://arxiv.org/pdf/2601.03467

==================================

Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

📝 Summary:
ATLAS is a dual-path framework that dynamically selects optimal model-tool combinations for complex reasoning. It uses cluster-based routing for domain-specific tasks and RL-based multi-step routing for generalization. ATLAS outperforms GPT-4o and other methods on diverse benchmarks.

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03872
• PDF: https://arxiv.org/pdf/2601.03872
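
The cluster-based routing path mentioned in the summary can be pictured with a small sketch. This is an illustration under stated assumptions, not the ATLAS code: queries are assumed to be embedded as vectors, each cluster is represented by a centroid, and a lookup table stores the model-tool pair that worked best for that cluster. All names, vectors, and pairs below are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def route(query_vec, centroids, best_combo):
    """Pick the cluster whose centroid is most similar to the query,
    then return that cluster and its stored model-tool pair."""
    cluster = max(centroids, key=lambda name: cosine(query_vec, centroids[name]))
    return cluster, best_combo[cluster]

# Hypothetical 2-D centroids and per-cluster choices, for illustration only.
centroids = {"math": [1.0, 0.0], "code": [0.0, 1.0]}
best_combo = {
    "math": ("model_a", "calculator"),
    "code": ("model_b", "python_sandbox"),
}

cluster, combo = route([0.9, 0.1], centroids, best_combo)
```

Here the query vector `[0.9, 0.1]` is closest to the `math` centroid, so the router returns the pair stored for that cluster. The RL-based multi-step path described in the summary is not covered by this sketch.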

==================================

Pearmut: Human Evaluation of Translation Made Trivial

📝 Summary:
Pearmut is a lightweight platform that simplifies complex human evaluation for multilingual NLP, particularly machine translation. It removes setup barriers by supporting various protocols, document context, and learning strategies. This makes reliable human evaluation a routine and practical par...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02933
• PDF: https://arxiv.org/pdf/2601.02933
• Github: https://github.com/zouharvi/pearmut

Datasets citing this paper:
https://huggingface.co/datasets/zouharvi/hearing2translate-humeval

==================================

ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation

📝 Summary:
A novel 1D visual tokenizer called Residual Tokenizer is introduced that incorporates hierarchical residuals to improve autoregressive image generation by leveraging vision-specific design principles ...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03955
• PDF: https://arxiv.org/pdf/2601.03955

==================================
