ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
SmartSearch: Process Reward-Guided Query Refinement for Search Agents

📝 Summary:
SmartSearch enhances LLM-based search agents through process rewards and query refinement mechanisms that improve intermediate search query quality via a three-stage curriculum learning approach.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04888
• PDF: https://arxiv.org/pdf/2601.04888
• Github: https://github.com/MYVAE/SmartSearch?tab=readme-ov-file

==================================

For more data science resources:
https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Router-Suggest: Dynamic Routing for Multimodal Auto-Completion in Visually-Grounded Dialogs

📝 Summary:
Multimodal auto-completion leverages visual and textual context to improve real-time prediction accuracy in conversational interfaces, with a router framework enabling efficient model selection based ...

🔹 Publication Date: Published on Jan 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05851
• PDF: https://arxiv.org/pdf/2601.05851

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AgentOCR: Reimagining Agent History via Optical Self-Compression

📝 Summary:
AgentOCR reimagines agent history as visual tokens to reduce token consumption and memory in agentic systems. It leverages optical caching and adaptive self-compression. This framework maintains strong performance while significantly cutting token usage and boosting efficiency.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04786
• PDF: https://arxiv.org/pdf/2601.04786

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MMFormalizer: Multimodal Autoformalization in the Wild

📝 Summary:
MMFormalizer enables multimodal autoformalization by integrating visual perception with formal mathematical reasoning, supporting complex physical domains from classical mechanics to quantum mechanics...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03017
• PDF: https://arxiv.org/pdf/2601.03017
• Project Page: https://mmformalizer.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

📝 Summary:
The Qwen3-VL-Embedding and Qwen3-VL-Reranker models form an end-to-end multimodal search pipeline, leveraging multi-stage training and cross-attention mechanisms to achieve high-precision retrieval ac...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://www.arxiv.org/abs/2601.04720
• PDF: https://arxiv.org/pdf/2601.04720
• Github: https://github.com/QwenLM/Qwen3-VL-Embedding

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AnyDepth: Depth Estimation Made Easy

📝 Summary:
A lightweight monocular depth estimation framework uses DINOv3 as visual encoder and a compact transformer decoder to achieve higher accuracy with reduced computational overhead and improved data qual...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02760
• PDF: https://arxiv.org/pdf/2601.02760
• Project Page: https://aigeeksgroup.github.io/AnyDepth
• Github: https://aigeeksgroup.github.io/AnyDepth

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
CaricatureGS: Exaggerating 3D Gaussian Splatting Faces With Gaussian Curvature

📝 Summary:
CaricatureGS introduces a 3D caricaturization framework combining Gaussian curvature-based exaggeration with 3D Gaussian Splatting for photorealistic, controllable face avatars. It uses a unique training scheme with synthesized supervision to achieve high fidelity, real-time deformation, and cont...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03319
• PDF: https://arxiv.org/pdf/2601.03319
• Project Page: https://c4ricaturegs.github.io/
• Github: https://c4ricaturegs.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning

📝 Summary:
CompassMem is an event-centric memory framework that organizes experiences into an Event Graph to enable structured memory navigation and long-horizon reasoning beyond traditional retrieval methods.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04726
• PDF: https://arxiv.org/pdf/2601.04726

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning

📝 Summary:
FAPO improves reinforcement learning for LLMs by penalizing flawed-positive rollouts that reinforce unreliable reasoning. It uses these flaws for initial gains while shifting optimization toward reliable reasoning, enhancing correctness and stability.
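
The penalty idea in this summary can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `shaped_reward`, the ±1 outcome reward, and the default `penalty` value are all assumptions chosen for illustration.

```python
def shaped_reward(is_correct: bool, is_flawed: bool, penalty: float = 1.0) -> float:
    """Outcome reward with a penalty for flawed-positive rollouts.

    Hypothetical reward values: +1 for a correct final answer, -1 otherwise.
    A rollout that is correct but reached through flawed reasoning
    (a "flawed positive") has `penalty` subtracted from its reward.
    """
    base = 1.0 if is_correct else -1.0
    if is_correct and is_flawed:
        # Flawed positives are pushed below clean positives.
        return base - penalty
    return base


# Three rollouts for the same prompt: clean-correct, flawed-correct, wrong.
rewards = [
    shaped_reward(True, False),
    shaped_reward(True, True),
    shaped_reward(False, True),
]
```

With the assumed defaults, a flawed-positive rollout scores 0.0, which keeps it above an incorrect rollout (-1.0) but below a clean correct one (1.0).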

🔹 Publication Date: Published on Oct 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.22543
• PDF: https://arxiv.org/pdf/2510.22543
• Project Page: https://fapo-rl.github.io/
• Github: https://fapo-rl.github.io

🔹 Models citing this paper:
https://huggingface.co/dyyyyyyyy/FAPO-GenRM-4B
https://huggingface.co/dyyyyyyyy/FAPO-32B

🔹 Datasets citing this paper:
https://huggingface.co/datasets/dyyyyyyyy/FAPO-Reasoning-Dataset
https://huggingface.co/datasets/dyyyyyyyy/FAPO-Critic

==================================

#ReinforcementLearning #LLMs #AI #MachineLearning #Reasoning
Distilling Feedback into Memory-as-a-Tool

📝 Summary:
This framework converts transient critiques into retrievable guidelines using a file-based memory system and agent tools. It enables LLMs to achieve test-time refinement performance with significantly reduced inference costs.
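
A file-based memory exposed as agent tools might look roughly like the sketch below. This is an assumption-laden illustration, not the code from the linked repository: the class `GuidelineMemory`, its `write` and `search` methods, and the one-guideline-per-line `.md` layout are all hypothetical.

```python
from pathlib import Path
from typing import List


class GuidelineMemory:
    """Hypothetical file-backed store for guidelines distilled from critiques."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def write(self, topic: str, guideline: str) -> Path:
        """Tool 1: append one guideline line to the file for `topic`."""
        path = self.root / f"{topic}.md"
        with path.open("a", encoding="utf-8") as handle:
            handle.write(guideline.strip() + "\n")
        return path

    def search(self, keyword: str) -> List[str]:
        """Tool 2: return stored guideline lines containing `keyword`."""
        needle = keyword.lower()
        hits: List[str] = []
        for path in sorted(self.root.glob("*.md")):
            for line in path.read_text(encoding="utf-8").splitlines():
                if needle in line.lower():
                    hits.append(line)
        return hits
```

The point of the design is that a critique is paid for once, written down, and later retrieved with a cheap lookup instead of rerunning a critique-and-refine loop at inference time.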

🔹 Publication Date: Published on Jan 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05960
• PDF: https://arxiv.org/pdf/2601.05960
• Github: https://github.com/vicgalle/feedback-memory-as-a-tool

==================================

#LLMs #AIAgents #MemorySystems #AIResearch #MachineLearning
Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection

📝 Summary:
A new benchmark, mfmdscen, evaluates behavioral biases in large language models for multilingual financial misinformation detection. It uses complex economic scenarios and a multilingual dataset, revealing significant biases across 22 mainstream LLMs.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05403
• PDF: https://arxiv.org/pdf/2601.05403

==================================

#LLM #AIbias #FinancialAI #MisinformationDetection #MultilingualAI
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

📝 Summary:
OmniFlatten is a GPT-based model for real-time, natural full-duplex spoken dialogue. It uses a multi-stage post-training method to adapt a text LLM for speech and text generation without altering its architecture, enabling low-latency conversations.

🔹 Publication Date: Published on Oct 23, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2410.17799
• PDF: https://arxiv.org/pdf/2410.17799
• Github: https://github.com/karpathy/nanogpt

==================================

#GPT #VoiceAI #LLM #RealTimeAI #NLP
TCAndon-Router: Adaptive Reasoning Router for Multi-Agent Collaboration

📝 Summary:
TCAndon-Router (TCAR) is an adaptive reasoning router for multi-agent systems. It overcomes limitations of existing task routers by supporting dynamic agent onboarding and generating natural language reasoning chains to select agents. TCAR significantly improves routing accuracy, reduces conflicts,...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04544
• PDF: https://arxiv.org/pdf/2601.04544
• Github: https://github.com/Tencent/TCAndon-Router

🔹 Models citing this paper:
https://huggingface.co/tencent/TCAndon-Router

==================================

#MultiAgentSystems #AI #NLP #AdaptiveSystems #AIResearch
NitroGen: An Open Foundation Model for Generalist Gaming Agents

📝 Summary:
NitroGen is a vision-action foundation model trained on extensive gameplay data that demonstrates strong cross-game generalization and effective transfer learning capabilities.

🔹 Publication Date: Published on Jan 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02427
• PDF: https://arxiv.org/pdf/2601.02427
• Project Page: https://nitrogen.minedojo.org/
• Github: https://github.com/MineDojo/NitroGen

🔹 Models citing this paper:
https://huggingface.co/nvidia/NitroGen

🔹 Datasets citing this paper:
https://huggingface.co/datasets/nvidia/NitroGen

🔹 Spaces citing this paper:
https://huggingface.co/spaces/dennny123/NitroGen-SuperstarSaga
https://huggingface.co/spaces/blanchon/NitroGen-Pokemon

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Plenoptic Video Generation

📝 Summary:
PlenopticDreamer enables consistent multi-view video re-rendering through synchronized generative hallucinations, leveraging camera-guided retrieval and progressive training mechanisms for improved te...

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05239
• PDF: https://arxiv.org/pdf/2601.05239
• Project Page: https://research.nvidia.com/labs/dir/plenopticdreamer/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ViTNT-FIQA: Training-Free Face Image Quality Assessment with Vision Transformers

📝 Summary:
ViTNT-FIQA is a training-free method for face image quality assessment using Vision Transformers. It measures the stability of patch embeddings across intermediate blocks with a single forward pass. High-quality images show stable feature evolution, achieving competitive results efficiently.
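
The stability idea in this summary can be sketched in plain Python, assuming the per-block patch embeddings have already been collected from a single forward pass. The distance metric (Euclidean distance between L2-normalized embeddings) and the mapping from mean shift to a score are assumptions for illustration, not necessarily the paper's exact formulation; `stability_score` is a hypothetical name.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def _normalize(vec: Vector) -> List[float]:
    """Scale a vector to unit L2 norm (an all-zero vector stays all-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def _distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def stability_score(blocks: Sequence[Sequence[Vector]]) -> float:
    """Score how little patch embeddings move between consecutive blocks.

    blocks[b][p] is the embedding of patch p after transformer block b.
    Returns a value in (0, 1]; higher means more stable (assumed mapping).
    """
    if len(blocks) < 2:
        raise ValueError("need embeddings from at least two blocks")
    total, count = 0.0, 0
    for prev, curr in zip(blocks, blocks[1:]):
        for p_prev, p_curr in zip(prev, curr):
            total += _distance(_normalize(p_prev), _normalize(p_curr))
            count += 1
    mean_shift = total / count
    return 1.0 / (1.0 + mean_shift)
```

Under this sketch, an image whose patch embeddings keep the same direction from block to block scores 1.0, and larger block-to-block drift lowers the score.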

🔹 Publication Date: Published on Jan 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05741
• PDF: https://arxiv.org/pdf/2601.05741
• Github: https://github.com/gurayozgur/ViTNT-FIQA

==================================

#VisionTransformers #FaceQuality #ComputerVision #DeepLearning #AI
Afri-MCQA: Multimodal Cultural Question Answering for African Languages

📝 Summary:
Afri-MCQA is the first multimodal cultural QA benchmark for 15 African languages. It shows open-weight LLMs perform poorly, particularly with native language speech and cultural contexts. This highlights the need for speech-first, culturally grounded AI development.

🔹 Publication Date: Published on Jan 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05699
• PDF: https://arxiv.org/pdf/2601.05699

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Atnafu/Afri-MCQA

==================================

#AfricanLanguages #MultimodalAI #LLMs #CulturalAI #SpeechAI
Legal Alignment for Safe and Ethical AI

📝 Summary:
Legal alignment explores leveraging legal principles and methods to guide AI system design for safety, ethics, and compliance. This field focuses on AI compliance with legal rules, adapting legal interpretation for AI reasoning, and using legal concepts as a blueprint for AI reliability and trust.

🔹 Publication Date: Published on Jan 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04175
• PDF: https://arxiv.org/pdf/2601.04175

==================================

#EthicalAI #LegalAI #AIRegulation #ResponsibleAI #AISafety
An Empirical Study on Preference Tuning Generalization and Diversity Under Domain Shift

📝 Summary:
Preference tuning performance degrades under domain shift. This study found pseudo-labeling adaptation strategies effectively reduce performance degradation in summarization and question-answering tasks across various alignment objectives.

🔹 Publication Date: Published on Jan 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05882
• PDF: https://arxiv.org/pdf/2601.05882
• Github: https://github.com/ckarouzos/prefadap

==================================

#PreferenceTuning #DomainAdaptation #NLP #MachineLearning #AIResearch
The Persona Paradox: Medical Personas as Behavioral Priors in Clinical Language Models

📝 Summary:
Medical personas in clinical language models show context-dependent effects, improving performance in critical care but degrading it in primary care. They act as behavioral priors, introducing trade-offs rather than guaranteeing expertise or safety.

🔹 Publication Date: Published on Jan 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05376
• PDF: https://arxiv.org/pdf/2601.05376

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research