ML Research Hub

✨UFO^3: Weaving the Digital Agent Galaxy

📝 Summary:
UFO^3 unifies diverse digital devices into a single orchestration fabric, enabling AI agents to collaborate seamlessly across platforms. It models tasks dynamically for asynchronous execution, achieving efficient, resilient, and accurate cross-device task orchestration with improved parallelism a...

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11332
• PDF: https://arxiv.org/pdf/2511.11332
• Project Page: https://microsoft.github.io/UFO/
• Github: https://github.com/microsoft/UFO/

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AIAgents #TaskOrchestration #DistributedSystems #EdgeAI #MultiAgentSystems

✨Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution

📝 Summary:
Orion is a visual agent framework that orchestrates specialized computer vision tools to execute complex visual workflows. It achieves competitive performance on benchmarks and enables autonomous, tool-driven visual reasoning.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14210
• PDF: https://arxiv.org/pdf/2511.14210

==================================

#ComputerVision #AIagents #VisualReasoning #MultimodalAI #DeepLearning

✨Agent READMEs: An Empirical Study of Context Files for Agentic Coding

📝 Summary:
This study analyzed 2303 agent context files, finding them complex and evolving like config code. Developers prioritize functional details but rarely specify non-functional requirements like security or performance. This suggests a gap in guardrails for agent-written code quality.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12884
• PDF: https://arxiv.org/pdf/2511.12884

==================================

#AIAgents #SoftwareEngineering #CodeQuality #LLMs #AIResearch

✨OmniParser for Pure Vision Based GUI Agent

📝 Summary:
OmniParser enhances GPT-4V's ability to act as a GUI agent by improving screen parsing. It identifies interactable icons and understands element semantics using specialized models. This significantly boosts GPT-4V's performance on benchmarks like ScreenSpot, Mind2Web, and AITW.

🔹 Publication Date: Published on Aug 1, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2408.00203
• PDF: https://arxiv.org/pdf/2408.00203
• Github: https://github.com/microsoft/omniparser

🔹 Models citing this paper:
• https://huggingface.co/microsoft/OmniParser
• https://huggingface.co/microsoft/OmniParser-v2.0
• https://huggingface.co/banao-tech/OmniParser

✨ Datasets citing this paper:
• https://huggingface.co/datasets/mlfoundations/Click-100k

✨ Spaces citing this paper:
• https://huggingface.co/spaces/callmeumer/OmniParser-v2
• https://huggingface.co/spaces/nofl/OmniParser-v2
• https://huggingface.co/spaces/SheldonLe/OmniParser-v2

==================================

#GUIagents #ComputerVision #GPT4V #AIagents #DeepLearning

✨What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity

📝 Summary:
Ideation diversity significantly enhances AI research agent performance. Higher ideation diversity leads to stronger results on the MLE-bench benchmark across different models and scaffolds. This finding holds across various performance metrics.

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15593
• PDF: https://arxiv.org/pdf/2511.15593

==================================

#AIResearch #IdeationDiversity #MachineLearning #AIagents #AIPerformance

✨GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

📝 Summary:
GeoVista is a new agentic model for geolocalization that integrates tool invocation and reinforcement learning. It achieves high performance on the new GeoBench benchmark, surpassing open-source models and matching closed-source models.

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15705
• PDF: https://arxiv.org/pdf/2511.15705
• Project Page: https://ekonwang.github.io/geo-vista/
• Github: https://github.com/ekonwang/GeoVista

==================================

#Geolocalization #AI #ReinforcementLearning #ComputerVision #AIAgents

✨Computer-Use Agents as Judges for Generative User Interface

📝 Summary:
This paper introduces a framework where Computer-Use Agents (CUA) act as judges for coding language models (Coder) to automatically design GUIs. The goal is to optimize interfaces for CUA efficiency and task solvability, rather than human aesthetics, using a new benchmark called AUI-Gym.

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15567
• PDF: https://arxiv.org/pdf/2511.15567
• Project Page: https://showlab.github.io/AUI/
• Github: https://github.com/showlab/AUI/

==================================

#AIAgents #GUIDesign #GenerativeAI #AIevaluation #LanguageModels

✨Budget-Aware Tool-Use Enables Effective Agent Scaling

📝 Summary:
Tool-augmented agents struggle to scale with more tool calls due to a lack of budget awareness. This paper introduces Budget Tracker for continuous budget awareness and BATS for adaptive planning, dynamically adjusting strategy based on remaining resources. These methods significantly improve cos...

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17006
• PDF: https://arxiv.org/pdf/2511.17006

==================================

#AIAgents #ToolUse #ResourceManagement #AgentScaling #AIResearch
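
To make the budget-awareness idea in the summary above concrete, here is a minimal sketch of what tracking a tool-call budget and adapting strategy to the remaining resources could look like. It is not the paper's implementation: the `BudgetTracker` class shown here, the `choose_strategy` function, the strategy names, and the 50% / 20% thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BudgetTracker:
    """Hypothetical helper that counts tool calls against a fixed limit."""

    max_calls: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.max_calls - self.used

    def spend(self, calls: int = 1) -> None:
        # Refuse to go over budget instead of silently overspending.
        if calls > self.remaining:
            raise RuntimeError("tool-call budget exhausted")
        self.used += calls

    def status_line(self) -> str:
        # Text an agent could be shown at every step so it stays budget-aware.
        return (
            f"Tool budget: {self.used}/{self.max_calls} used, "
            f"{self.remaining} remaining"
        )


def choose_strategy(tracker: BudgetTracker) -> str:
    """Pick a coarse strategy from the remaining budget.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if tracker.max_calls <= 0:
        return "answer"
    fraction_left = tracker.remaining / tracker.max_calls
    if fraction_left > 0.5:
        return "explore"  # plenty of budget: try broader searches
    if fraction_left > 0.2:
        return "verify"   # getting tight: check the best candidate
    return "answer"       # nearly out: commit to an answer
```

A caller would create `BudgetTracker(max_calls=10)`, call `spend()` after each tool invocation, and consult `choose_strategy(tracker)` before planning the next step.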

✨PRInTS: Reward Modeling for Long-Horizon Information Seeking

📝 Summary:
PRInTS is a generative process reward model that improves AI agents' information-seeking. It provides dense scoring on step quality and summarizes long trajectories to manage context. PRInTS enhances agent performance, matching or surpassing frontier models with a smaller backbone.

🔹 Publication Date: Published on Nov 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19314
• PDF: https://arxiv.org/pdf/2511.19314

==================================

#RewardModeling #InformationSeeking #AIagents #GenerativeAI #MachineLearning

✨PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

📝 Summary:
PC-Agent is a hierarchical multi-agent framework improving MLLM-based GUI agents for complex PC tasks. It uses an Active Perception Module and a hierarchical decision-making architecture with Manager, Progress, and Decision agents. A Reflection agent provides feedback. It achieved a 32% task succ...

🔹 Publication Date: Published on Feb 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.14282
• PDF: https://arxiv.org/pdf/2502.14282
• Github: https://github.com/X-PLUG/MobileAgent/tree/main/PC-Agent

✨ Spaces citing this paper:
• https://huggingface.co/spaces/junyangwang0410/PC-Agent

==================================

#MultiAgentSystems #AIAgents #MLLMs #PCAutomation #DeepLearning

✨Fara-7B: An Efficient Agentic Model for Computer Use

📝 Summary:
FaraGen creates synthetic datasets for computer use agents, solving a data scarcity problem. This data trains Fara-7B, a small on-device model that perceives computers via screenshots and outperforms larger models on diverse web tasks.

🔹 Publication Date: Published on Nov 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19663
• PDF: https://arxiv.org/pdf/2511.19663

==================================

#AIAgents #OnDeviceAI #SyntheticData #MachineLearning #ComputerVision

✨Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

📝 Summary:
Agent0-VL is a self-evolving vision-language agent that integrates tool usage into both reasoning and self-evaluation. It uses a Solver and Verifier in a self-evolving cycle for continuous improvement without human annotation or external rewards, achieving a 12.5% performance gain.

🔹 Publication Date: Published on Nov 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19900
• PDF: https://arxiv.org/pdf/2511.19900

==================================

#AIAgents #VisionLanguage #SelfEvolvingAI #ToolAugmentedAI #AIResearch

✨Latent Collaboration in Multi-Agent Systems

📝 Summary:
LatentMAS enables LLM agents to collaborate directly in latent space, surpassing text-based communication. This boosts reasoning quality, accuracy, and efficiency (speed, tokens) without extra training.

🔹 Publication Date: Published on Nov 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20639
• PDF: https://arxiv.org/pdf/2511.20639
• Github: https://github.com/Gen-Verse/LatentMAS

==================================

#LLM #MultiAgentSystems #LatentSpace #AIAgents #ArtificialIntelligence
