✨Parallel Context-of-Experts Decoding for Retrieval Augmented Generation
📝 Summary:
Parallel Context-of-Experts Decoding (Pced) is a training-free framework for multi-document RAG that avoids prefill bottlenecks. It treats documents as isolated experts, using a retrieval-aware contrastive decoding rule to synchronize predictions and recover cross-document reasoning.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08670
• PDF: https://arxiv.org/pdf/2601.08670
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#RAG #LLM #NLP #AI #Decoding
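To make the "documents as isolated experts" idea above more concrete, here is a toy sketch of a single decoding step. It is not the paper's decoding rule: the function name `pced_step`, the weights `alpha` and `beta`, and the max-over-experts aggregation are all assumptions chosen for illustration, so check the arXiv page for the actual formulation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a plain list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def pced_step(expert_logits, prior_logits, retrieval_scores, alpha=1.0, beta=1.0):
    """Pick the next token id from per-document 'expert' predictions.

    expert_logits    : one logit list per retrieved document
                       (the model run with only that document in context)
    prior_logits     : logits from the same model with no document in context
    retrieval_scores : positive relevance score per document
    alpha, beta      : hypothetical weights for the contrastive and retrieval terms
    """
    prior = log_softmax(prior_logits)
    best_token, best_score = None, -math.inf
    for logits, relevance in zip(expert_logits, retrieval_scores):
        expert = log_softmax(logits)
        for token_id, (e, p) in enumerate(zip(expert, prior)):
            # Contrastive term: reward tokens this document makes more
            # likely than the context-free prior does.
            score = e + alpha * (e - p) + beta * math.log(relevance)
            if score > best_score:
                best_token, best_score = token_id, score
    return best_token


# Toy 3-token vocabulary: the prior favors token 0, document B favors token 1.
prior = [2.0, 0.0, 0.0]
doc_a = [2.0, 0.1, 0.0]   # adds little beyond the prior
doc_b = [0.0, 3.0, 0.0]   # strongly supports token 1
next_token = pced_step([doc_a, doc_b], prior, [0.9, 0.8])
```

Because each document is scored separately, the per-document forward passes can run in parallel; the contrast against the prior is what lets an informative document override the model's default guess in this sketch.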
✨Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
📝 Summary:
The Engram module introduces conditional memory as a new sparsity axis for Transformers, improving knowledge lookup and reasoning. It outperforms MoE, boosting performance across domains by offloading static knowledge and enhancing efficiency.
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07372
• PDF: https://arxiv.org/pdf/2601.07372
• Github: https://github.com/deepseek-ai/Engram
#LLM #AI #MachineLearning #Transformers #Sparsity
✨FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
📝 Summary:
FunAudioLLM enhances natural voice interactions with LLMs by integrating SenseVoice for multilingual speech recognition and CosyVoice for natural, multi-style speech generation. This enables applications like speech-to-speech translation and emotional voice chat.
🔹 Publication Date: Published on Jul 4, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2407.04051
• PDF: https://arxiv.org/pdf/2407.04051
• Github: https://github.com/FunAudioLLM
#LLM #VoiceAI #SpeechRecognition #SpeechSynthesis #MultimodalAI
✨3AM: Segment Anything with Geometric Consistency in Videos
📝 Summary:
3AM enhances video object segmentation by integrating 3D-aware features from MUSt3R into SAM2. This improves viewpoint consistency and geometric recognition using only RGB input at inference, significantly outperforming prior methods on challenging datasets.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08831
• PDF: https://arxiv.org/pdf/2601.08831
• Project Page: https://jayisaking.github.io/3AM-Page/
#VideoSegmentation #ComputerVision #DeepLearning #GeometricAI #AI
✨ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios
📝 Summary:
ViDoRe V3 is a new multimodal RAG benchmark for complex queries over visually rich, multilingual documents. It shows that visual retrievers and late-interaction models improve performance, though models still struggle with non-textual elements and visual grounding.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08620
• PDF: https://arxiv.org/pdf/2601.08620
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/vidore/vidore_v3_physics
• https://huggingface.co/datasets/vidore/vidore_v3_computer_science
• https://huggingface.co/datasets/vidore/vidore_v3_finance_en
#RAG #MultimodalAI #AIResearch #NLP #ComputerVision
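The summary above credits late-interaction models with better retrieval. For readers unfamiliar with the term, here is a minimal sketch of ColBERT-style MaxSim scoring, the usual late-interaction scoring rule. The function name `maxsim_score` and the toy 2-d vectors are illustrative assumptions; nothing here is taken from the ViDoRe V3 code or data.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction relevance score.

    Each query token embedding is matched to its single best document
    token (or image patch) embedding, and the best matches are summed.
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy example: two query token embeddings, two candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.0]]   # covers only the first query token

scores = {"doc_a": maxsim_score(query, doc_a), "doc_b": maxsim_score(query, doc_b)}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Unlike single-vector retrieval, this keeps one embedding per token or patch, so a document is rewarded for covering every part of the query rather than for matching on average.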
✨VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding
📝 Summary:
VideoLoom is a unified video large language model that achieves state-of-the-art performance in spatial-temporal video understanding through a specialized dataset and benchmark.
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07290
• PDF: https://arxiv.org/pdf/2601.07290
• Github: https://github.com/JPShi12/VideoLoom
🔹 Models citing this paper:
• https://huggingface.co/JPShi/VideoLoom-4B
• https://huggingface.co/JPShi/VideoLoom-8B
#AI #DataScience #MachineLearning #HuggingFace #Research
✨UM-Text: A Unified Multimodal Model for Image Understanding
📝 Summary:
A unified multimodal model for visual text editing that understands natural language instructions and maintains stylistic consistency with reference images through visual language modeling and context...
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08321
• PDF: https://arxiv.org/pdf/2601.08321
#AI #DataScience #MachineLearning #HuggingFace #Research
✨GeoMotionGPT: Geometry-Aligned Motion Understanding with Large Language Models
📝 Summary:
GeoMotionGPT introduces a framework aligning motion token geometry with language model embeddings using orthogonal constraints and sparse projection. This unified geometric basis enhances LLM motion reasoning, achieving a 20% performance improvement on HumanML3D.
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07632
• PDF: https://arxiv.org/pdf/2601.07632
• Project Page: https://huggingface.co/papers?q=sparse%20projection
• Github: https://github.com/JYe16/GeoMotionGPT
🔹 Models citing this paper:
• https://huggingface.co/zy22b/GeoMotionGPT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨The Agent's First Day: Benchmarking Learning, Exploration, and Scheduling in the Workplace Scenarios
📝 Summary:
EvoEnv is a new dynamic evaluation environment for MLLMs. It assesses agent robustness in real-world tasks, focusing on context-aware scheduling, active exploration, and continuous learning. Current MLLMs show significant deficiencies in these dynamic scenarios.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08173
• PDF: https://arxiv.org/pdf/2601.08173
• Github: https://github.com/KnowledgeXLab/EvoEnv
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
📝 Summary:
Fast-ThinkAct is an efficient vision-language-action framework that reduces inference latency by 89.3% through compact latent reasoning while maintaining long-horizon planning and few-shot adaptation ...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09708
• PDF: https://arxiv.org/pdf/2601.09708
• Project Page: https://jasper0314-huang.github.io/fast-thinkact/
• Github: https://jasper0314-huang.github.io/fast-thinkact/
#AI #DataScience #MachineLearning #HuggingFace #Research
✨A^3-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation
📝 Summary:
Scientific reasoning relies not only on logical inference but also on activating prior knowledge and experiential structures. Memory can efficiently reuse knowledge and enhance reasoning consistency a...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09274
• PDF: https://arxiv.org/pdf/2601.09274
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MAXS: Meta-Adaptive Exploration with LLM Agents
📝 Summary:
MAXS is a meta-adaptive reasoning framework for LLM agents that improves multi-tool reasoning through lookahead strategies and trajectory convergence mechanisms, balancing global effectiveness and com...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09259
• PDF: https://arxiv.org/pdf/2601.09259
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Controlled Self-Evolution for Algorithmic Code Optimization
📝 Summary:
The Controlled Self-Evolution method improves code generation through diversified initialization, feedback-guided genetic evolution, and hierarchical memory to enhance exploration efficiency and solution ...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07348
• PDF: https://arxiv.org/pdf/2601.07348
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SkinFlow: Efficient Information Transmission for Open Dermatological Diagnosis via Dynamic Visual Encoding and Staged RL
📝 Summary:
SkinFlow optimizes dermatological diagnosis by enhancing visual information transmission efficiency, addressing 'diffuse attention' in large models. It uses a Dynamic Vision Encoder and two-stage RL to significantly outperform massive general-purpose models, proving efficiency beats raw parameter...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09136
• PDF: https://arxiv.org/pdf/2601.09136
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Are LLMs Vulnerable to Preference-Undermining Attacks (PUA)? A Factorial Analysis Methodology for Diagnosing the Trade-off between Preference Alignment and Real-World Validity
📝 Summary:
Research examines how large language models can be manipulated through preference-undermining attacks that exploit alignment objectives, revealing model vulnerabilities and proposing a factorial evalu...
🔹 Publication Date: Published on Jan 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06596
• PDF: https://arxiv.org/pdf/2601.06596
#AI #DataScience #MachineLearning #HuggingFace #Research
✨FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection
📝 Summary:
FocusUI is an efficient UI grounding framework that reduces computational overhead by selecting relevant visual tokens while preserving positional continuity through a novel PosPad strategy.
🔹 Publication Date: Published on Jan 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03928
• PDF: https://arxiv.org/pdf/2601.03928
• Github: https://github.com/showlab/FocusUI
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Efficient Camera-Controlled Video Generation of Static Scenes via Sparse Diffusion and 3D Rendering
📝 Summary:
Diffusion-based video generation is made more efficient through keyframe-based 3D reconstruction and rendering, enabling faster synthesis with maintained visual quality.
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09697
• PDF: https://arxiv.org/pdf/2601.09697
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation
📝 Summary:
DeepResearchEval presents an automated framework for creating complex research tasks and evaluating them through agent-based methods that adapt to task specifics and verify facts without relying on ci...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09688
• PDF: https://arxiv.org/pdf/2601.09688
• Github: https://github.com/Infinity-AILab/DeepResearchEval
#AI #DataScience #MachineLearning #HuggingFace #Research
✨TranslateGemma Technical Report
📝 Summary:
TranslateGemma enhances Gemma 3's multilingual capabilities through two-stage fine-tuning with synthetic and human-translated data, achieving superior translation quality with improved efficiency.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09012
• PDF: https://arxiv.org/pdf/2601.09012
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding
📝 Summary:
OpenVoxel enables open-vocabulary 3D scene understanding through training-free grouping and captioning of sparse voxels using Vision Language Models and Multi-modal Large Language Models.
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09575
• PDF: https://arxiv.org/pdf/2601.09575
#AI #DataScience #MachineLearning #HuggingFace #Research
✨EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines
📝 Summary:
EvoFSM is a structured self-evolving framework for LLM agents that uses finite state machines to improve adaptability while maintaining control through constrained optimization and memory mechanisms.
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09465
• PDF: https://arxiv.org/pdf/2601.09465
#AI #DataScience #MachineLearning #HuggingFace #Research