ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Benchmark^2: Systematic Evaluation of LLM Benchmarks

📝 Summary:
Researchers developed Benchmark^2, a framework with three metrics to evaluate benchmark quality for large language models, revealing significant variations in existing benchmarks and enabling more eff...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03986
• PDF: https://arxiv.org/pdf/2601.03986

==================================

For more data science resources:
https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

📝 Summary:
ThinkRL-Edit enhances reasoning-centric image editing through reinforcement learning by expanding visual reasoning exploration beyond denoising stochasticity and using unbiased reward strategies. AI-g...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03467
• PDF: https://arxiv.org/pdf/2601.03467

#AI #DataScience #MachineLearning #HuggingFace #Research
Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

📝 Summary:
ATLAS is a dual-path framework that dynamically selects optimal model-tool combinations for complex reasoning. It uses cluster-based routing for domain-specific tasks and RL-based multi-step routing for generalization. ATLAS outperforms GPT-4o and other methods on diverse benchmarks.

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03872
• PDF: https://arxiv.org/pdf/2601.03872

#AI #DataScience #MachineLearning #HuggingFace #Research
Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts

📝 Summary:
A case study of four attempts by LLM agents to autonomously generate ML research papers reveals six recurring failure modes. Most attempts failed, though one was accepted to a special AI-first author venue. Based on these outcomes, the authors propose design principles for future AI-scientist systems.

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03315
• PDF: https://arxiv.org/pdf/2601.03315

#LLMs #AIResearch #MachineLearning #AIAgents #AutonomousSystems
Evolving Programmatic Skill Networks

📝 Summary:
The Programmatic Skill Network (PSN) enables continual skill acquisition through executable symbolic programs that evolve via reflection, progressive optimization, and structural refactoring. This framework demonstrates robust skill reuse, rapid adaptation, and strong generalization in open-ended e...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03509
• PDF: https://arxiv.org/pdf/2601.03509

#ProgrammaticAI #SkillAcquisition #EvolutionaryAI #MachineLearning #AIResearch
Pearmut: Human Evaluation of Translation Made Trivial

📝 Summary:
Pearmut is a lightweight platform that simplifies complex human evaluation for multilingual NLP, particularly machine translation. It removes setup barriers by supporting various protocols, document context, and learning strategies. This makes reliable human evaluation a routine and practical par...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02933
• PDF: https://arxiv.org/pdf/2601.02933
• Github: https://github.com/zouharvi/pearmut

Datasets citing this paper:
https://huggingface.co/datasets/zouharvi/hearing2translate-humeval

#AI #DataScience #MachineLearning #HuggingFace #Research
ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation

📝 Summary:
A novel 1D visual tokenizer, the Residual Tokenizer, incorporates hierarchical residuals to improve autoregressive image generation by leveraging vision-specific design principles ...

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03955
• PDF: https://arxiv.org/pdf/2601.03955

#AI #DataScience #MachineLearning #HuggingFace #Research
ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition

📝 Summary:
ROI-Reasoning enables large language models to strategically allocate computation under strict token budgets. It uses meta-cognition to predict costs and utilities, optimizing sequential decisions with reinforcement learning. This improves performance and reduces regret on budgeted reasoning tasks.
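
To make the idea of budgeted allocation concrete, here is a minimal sketch. It is not the paper's method: the summary describes a policy trained with reinforcement learning, whereas this example swaps in a simple greedy rule that ranks tasks by predicted utility per token. The `Task` class, its fields, and `select_tasks` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    """Hypothetical record: a problem plus the model's own cost/utility guesses."""
    name: str
    predicted_cost: int       # predicted tokens needed to solve it (must be > 0)
    predicted_utility: float  # predicted payoff if solved


def select_tasks(tasks: List[Task], budget: int) -> List[str]:
    """Greedy stand-in for a learned policy: take tasks in order of
    predicted utility per token, skipping any that no longer fit."""
    chosen: List[str] = []
    remaining = budget
    ranked = sorted(
        tasks,
        key=lambda t: t.predicted_utility / t.predicted_cost,
        reverse=True,
    )
    for task in ranked:
        if task.predicted_cost <= remaining:
            chosen.append(task.name)
            remaining -= task.predicted_cost
    return chosen


tasks = [
    Task("easy", predicted_cost=100, predicted_utility=0.9),
    Task("medium", predicted_cost=300, predicted_utility=0.6),
    Task("hard", predicted_cost=800, predicted_utility=0.7),
]
print(select_tasks(tasks, budget=500))
```

With a 500-token budget the sketch picks the two cheaper tasks and skips the expensive one, which is the kind of trade-off the summary refers to.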

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03822
• PDF: https://arxiv.org/pdf/2601.03822

#AI #DataScience #MachineLearning #HuggingFace #Research
RelayLLM: Efficient Reasoning via Collaborative Decoding

📝 Summary:
RelayLLM enables efficient collaborative reasoning by having a small language model dynamically invoke a large language model only for critical tokens. This token-level collaboration achieves high accuracy with minimal computational overhead. It reduces LLM invocation to just 1.07% of tokens, lea...
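
A rough sketch of token-level collaboration follows. The summary does not say how the small model decides which tokens are critical, so this example assumes a confidence threshold as the trigger; that rule, the callable interfaces, and names such as `relay_decode`, `toy_small`, and `toy_large` are hypothetical and not taken from the paper.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces: the small model returns (next_token, confidence),
# the large model returns just the next token for the same prefix.
SmallModel = Callable[[List[str]], Tuple[str, float]]
LargeModel = Callable[[List[str]], str]


def relay_decode(
    small: SmallModel,
    large: LargeModel,
    prompt: List[str],
    max_tokens: int,
    threshold: float = 0.5,  # assumed trigger, not from the paper
    eos: str = "<eos>",
) -> Tuple[List[str], int]:
    """Decode with the small model, deferring to the large model only
    when the small model's confidence falls below `threshold`."""
    tokens = list(prompt)
    large_calls = 0
    for _ in range(max_tokens):
        token, confidence = small(tokens)
        if confidence < threshold:
            token = large(tokens)  # hand this single token to the large model
            large_calls += 1
        if token == eos:
            break
        tokens.append(token)
    return tokens[len(prompt):], large_calls


def toy_small(prefix: List[str]) -> Tuple[str, float]:
    # Toy model: unsure only when the prefix has exactly two tokens.
    n = len(prefix)
    return f"s{n}", (0.1 if n == 2 else 0.9)


def toy_large(prefix: List[str]) -> str:
    return f"L{len(prefix)}"


generated, calls = relay_decode(toy_small, toy_large, ["a", "b"], max_tokens=4)
print(generated, calls)
```

In the toy run only one of the four generated tokens comes from the large model, so the ratio `calls / len(generated)` plays the role of the invocation rate the summary reports.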

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05167
• PDF: https://arxiv.org/pdf/2601.05167

#AI #DataScience #MachineLearning #HuggingFace #Research
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice

📝 Summary:
The VideoAuto-R1 framework employs a reason-when-necessary strategy for video understanding, using a Thinking Once, Answering Twice training paradigm with verifiable rewards and confidence-based reasoning...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05175
• PDF: https://arxiv.org/pdf/2601.05175
• Project Page: https://ivul-kaust.github.io/projects/videoauto-r1/

#AI #DataScience #MachineLearning #HuggingFace #Research