✨Structured Episodic Event Memory
📝 Summary:
Structured Episodic Event Memory (SEEM) enhances LLMs with hierarchical memory architecture combining graph and episodic layers for improved narrative coherence and reasoning. AI-generated summary Cur...
🔹 Publication Date: Published on Jan 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06411
• PDF: https://arxiv.org/pdf/2601.06411
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Lost in the Noise: How Reasoning Models Fail with Contextual Distractors
📝 Summary:
NoisyBench benchmark reveals significant performance degradation in state-of-the-art models when exposed to noisy contextual information, with agentic workflows amplifying errors and attention mechani...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07226
• PDF: https://arxiv.org/pdf/2601.07226
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
📝 Summary:
Code LLMs trained on fully synthetic data using a feature-based synthesis pipeline achieve superior performance on competitive programming benchmarks while reducing dependence on real-world coding dat...
🔹 Publication Date: Published on Jan 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06953
• PDF: https://arxiv.org/pdf/2601.06953
• Github: https://github.com/JieWu02/X-Coder
🔹 Models citing this paper:
• https://huggingface.co/IIGroup/X-Coder-SFT-Qwen3-8B
• https://huggingface.co/IIGroup/X-Coder-SFT-Qwen2.5-7B
• https://huggingface.co/IIGroup/X-Coder-RL-Qwen2.5-7B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/IIGroup/X-Coder-SFT-376k
• https://huggingface.co/datasets/IIGroup/X-Coder-RL-40k
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ShowUI-Aloha: Human-Taught GUI Agent
📝 Summary:
ShowUI-Aloha presents a pipeline that converts unstructured human screen recordings into structured GUI tasks through recording, semantic interpretation, planning, and execution components. AI-generat...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07181
• PDF: https://arxiv.org/pdf/2601.07181
• Project Page: https://showlab.github.io/Aloha_Page/
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SketchJudge: A Diagnostic Benchmark for Grading Hand-drawn Diagrams with Multimodal Large Language Models
📝 Summary:
SketchJudge benchmark evaluates multimodal large language models' ability to grade hand-drawn STEM diagrams, revealing significant limitations in visual understanding compared to human performance. AI...
🔹 Publication Date: Published on Jan 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06944
• PDF: https://arxiv.org/pdf/2601.06944
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨BabyVision: Visual Reasoning Beyond Language
📝 Summary:
Current multimodal large language models exhibit significant gaps in fundamental visual understanding compared to human children, as demonstrated by the BabyVision benchmark. AI-generated summary Whil...
🔹 Publication Date: Published on Jan 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06521
• PDF: https://arxiv.org/pdf/2601.06521
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence
📝 Summary:
3D CoCa v2 enhances 3D captioning by combining contrastive vision-language learning with spatially-aware 3D scene encoding and test-time search for improved generalization across diverse environments....
🔹 Publication Date: Published on Jan 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06496
• PDF: https://arxiv.org/pdf/2601.06496
• Github: https://github.com/AIGeeksGroup/3DCoCav2
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨e5-omni: Explicit Cross-modal Alignment for Omni-modal Embeddings
📝 Summary:
Omni-modal embedding models face challenges with modality-dependent similarity scaling, ineffective in-batch negatives, and mismatched statistics across modalities, which are addressed through explici...
🔹 Publication Date: Published on Jan 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03666
• Collection: https://huggingface.co/collections/Haon-Chen/e5-omni
• PDF: https://arxiv.org/pdf/2601.03666
🔹 Models citing this paper:
• https://huggingface.co/Haon-Chen/e5-omni-3B
• https://huggingface.co/Haon-Chen/e5-omni-7B
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era
📝 Summary:
MegaFlow is a distributed orchestration system for large-scale AI agent training and evaluation. It addresses the lack of open-source infrastructure by providing efficient scheduling, resource allocation, and task management through modular services. MegaFlow successfully handles tens of thousand...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07526
• PDF: https://arxiv.org/pdf/2601.07526
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Dr. Zero: Self-Evolving Search Agents without Training Data
📝 Summary:
A data-free self-evolution framework enables large language models to autonomously improve reasoning capabilities through iterative question generation and solving, achieving performance comparable to...
🔹 Publication Date: Published on Jan 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07055
• PDF: https://arxiv.org/pdf/2601.07055
• Github: https://github.com/facebookresearch/drzero
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts
📝 Summary:
Large reasoning models' inference latency can be reduced by routing reasoning steps to larger models based on the entropy of their first token, enabling efficient collaborative inference without addit...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05110
• PDF: https://arxiv.org/pdf/2601.05110
• Github: https://github.com/Zengwh02/GlimpRouter
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
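The routing idea in the GlimpRouter summary above can be illustrated with a minimal sketch. It assumes the small model exposes a probability distribution over the first token of a reasoning step; the function names, the example distributions, and the threshold of 1.0 nats are hypothetical and are not taken from the paper or its repository.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def route_step(first_token_probs: Sequence[float], threshold: float = 1.0) -> str:
    """Pick which model handles a reasoning step (hypothetical policy).

    High entropy on the first token is treated as a sign that the small
    model is unsure, so the step is handed to the larger model.
    """
    return "large" if token_entropy(first_token_probs) > threshold else "small"


# Toy distributions over a four-token vocabulary (illustrative values only).
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]

print(route_step(confident), route_step(uncertain))
```

In practice the threshold would have to be tuned on held-out data; the value here only separates the two toy cases.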
✨OpenTinker: Separating Concerns in Agentic Reinforcement Learning
📝 Summary:
OpenTinker provides a modular infrastructure for reinforcement learning of large language model agents with separated components and managed execution runtime. AI-generated summary We introduce OpenTi...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07376
• PDF: https://arxiv.org/pdf/2601.07376
• Project Page: https://open-tinker.github.io/opentinker-page/
• Github: https://github.com/open-tinker/OpenTinker
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨On the Fallacy of Global Token Perplexity in Spoken Language Model Evaluation
📝 Summary:
Speech models trained on raw audio can generate appropriate content while maintaining speaker and emotion attributes, but traditional text-based evaluation methods underestimate speech characteristics...
🔹 Publication Date: Published on Jan 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06329
• PDF: https://arxiv.org/pdf/2601.06329
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Are LLM Decisions Faithful to Verbal Confidence?
📝 Summary:
Large language models exhibit a disconnect between their expressed uncertainty and strategic decision-making under varying penalty conditions, failing to adjust abstention policies even when optimal. ...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07767
• PDF: https://arxiv.org/pdf/2601.07767
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
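To make "optimal abstention under a penalty" concrete, here is a small worked sketch. It assumes a scoring rule that is not stated in the summary: +1 for a correct answer, minus the penalty for a wrong one, and 0 for abstaining. Under that assumption, answering has positive expected score only when confidence p satisfies p - (1 - p) * penalty > 0, which gives p > penalty / (1 + penalty). The function names are hypothetical.

```python
def answer_threshold(penalty: float) -> float:
    """Minimum confidence at which answering beats abstaining.

    Assumed scoring: +1 if correct, -penalty if wrong, 0 if abstaining.
    Expected score of answering is p - (1 - p) * penalty, which is
    positive exactly when p > penalty / (1 + penalty).
    """
    return penalty / (1.0 + penalty)


def should_answer(confidence: float, penalty: float) -> bool:
    """Expected-score-maximizing decision under the assumed scoring rule."""
    return confidence > answer_threshold(penalty)


# The same stated confidence should lead to different decisions
# as the penalty for a wrong answer grows.
for penalty in (0.0, 1.0, 3.0):
    print(penalty, answer_threshold(penalty), should_answer(0.6, penalty))
```

A model whose decisions were faithful to a stated confidence of 0.6 would answer at a penalty of 1 and abstain at a penalty of 3 under this rule.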
✨Codified Foreshadowing-Payoff Text Generation
📝 Summary:
Large language models struggle with maintaining long-range narrative dependencies, but a new framework called CFPG addresses this by structuring narrative continuity through executable causal predicat...
🔹 Publication Date: Published on Jan 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07033
• PDF: https://arxiv.org/pdf/2601.07033
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction
📝 Summary:
This paper presents SteeM, a framework for dynamically regulating memory reliance in LLM agents. It allows users to balance innovation with historical fidelity, overcoming the all-or-nothing problem of memory use. This approach outperforms conventional methods for personalized human-agent interac...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05107
• PDF: https://arxiv.org/pdf/2601.05107
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#LLM #AI #HumanAgentInteraction #Memory #MachineLearning
✨DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving
📝 Summary:
DrivingGen is the first comprehensive benchmark for generative driving world models, addressing prior evaluation gaps. It uses diverse datasets and new metrics to assess visual realism, trajectory plausibility, temporal coherence, and controllability. Benchmarking reveals trade-offs between visua...
🔹 Publication Date: Published on Jan 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01528
• PDF: https://arxiv.org/pdf/2601.01528
• Project Page: https://drivinggen-bench.github.io/
• Github: https://github.com/youngzhou1999/DrivingGen
✨ Datasets citing this paper:
• https://huggingface.co/datasets/yangzhou99/DrivingGen
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AutonomousDriving #GenerativeAI #WorldModels #AIResearch #Benchmarking
✨"TODO: Fix the Mess Gemini Created": Towards Understanding GenAI-Induced Self-Admitted Technical Debt
📝 Summary:
Developers record GenAI-induced self-admitted technical debt (GIST) in AI-assisted code, often due to postponed testing, incomplete adaptation, and limited understanding. This debt emerges when developers incorporate AI-generated code despite being uncertain about its behavior or correctness.
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07786
• PDF: https://arxiv.org/pdf/2601.07786
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent
📝 Summary:
OS-Symphony is a framework enhancing computer-using agents with robustness and generalization. It features a Reflection-Memory Agent for self-correction and a Multimodal Searcher for visually aligned tutorials. This achieved state-of-the-art results on online benchmarks, including 65.84% on OSWorld.
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07779
• PDF: https://arxiv.org/pdf/2601.07779
• Project Page: https://os-copilot.github.io/OS-Symphony
• Github: https://github.com/OS-Copilot/OS-Symphony
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
📝 Summary:
Multi-Head Linear Attention addresses the performance degradation in linear attention by preserving representational diversity through head-wise token dimension computation, maintaining linear complex...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07832
• PDF: https://arxiv.org/pdf/2601.07832
• Project Page: https://dagroup-pku.github.io/MHLA/
• Github: https://github.com/DAGroup-PKU/MHLA
🔹 Models citing this paper:
• https://huggingface.co/DAGroup-PKU/MHLA
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
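For readers unfamiliar with the baseline that MHLA builds on, the sketch below shows plain (non-causal) linear attention, not the MHLA method itself. It assumes the common elu(x) + 1 feature map and uses hypothetical function names. The point is that the key-value summary is computed once and reused by every query, so cost grows linearly with sequence length instead of quadratically.

```python
import math
from typing import List

Matrix = List[List[float]]


def phi(x: float) -> float:
    """Positive feature map elu(x) + 1 (an assumed, commonly used choice)."""
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(Q: Matrix, K: Matrix, V: Matrix) -> Matrix:
    """Non-causal linear attention: one pass over keys, one pass over queries."""
    d, d_v = len(K[0]), len(V[0])
    # Shared summaries: S = sum_j phi(k_j) v_j^T and z = sum_j phi(k_j).
    S = [[0.0] * d_v for _ in range(d)]
    z = [0.0] * d
    for k, v in zip(K, V):
        fk = [phi(x) for x in k]
        for a in range(d):
            z[a] += fk[a]
            for b in range(d_v):
                S[a][b] += fk[a] * v[b]
    out = []
    for q in Q:
        fq = [phi(x) for x in q]
        norm = sum(fq[a] * z[a] for a in range(d))
        out.append([sum(fq[a] * S[a][b] for a in range(d)) / norm
                    for b in range(d_v)])
    return out


def quadratic_reference(Q: Matrix, K: Matrix, V: Matrix) -> Matrix:
    """Same output computed with explicit N x N weights, for comparison."""
    out = []
    for q in Q:
        fq = [phi(x) for x in q]
        w = [sum(a * phi(b) for a, b in zip(fq, k)) for k in K]
        total = sum(w)
        out.append([sum(wj * v[b] for wj, v in zip(w, V)) / total
                    for b in range(len(V[0]))])
    return out


# Toy inputs: 3 tokens, key/query dimension 2, value dimension 2.
Q = [[0.5, -1.0], [1.0, 0.2], [-0.3, 0.8]]
K = [[0.1, 0.4], [-0.7, 1.2], [0.9, -0.5]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
fast, slow = linear_attention(Q, K, V), quadratic_reference(Q, K, V)
print(all(abs(a - b) < 1e-9 for r1, r2 in zip(fast, slow) for a, b in zip(r1, r2)))
```

The two functions agree because the normalized sum over keys can be regrouped; only the order of operations differs.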
✨Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
📝 Summary:
EvoToken-DLM introduces a diffusion-based language modeling approach that uses soft token distributions and continuous trajectory supervision to enable revisable decoding and outperforms existing base...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07351
• PDF: https://arxiv.org/pdf/2601.07351
• Project Page: https://aim-uofa.github.io/EvoTokenDLM/
• Github: https://github.com/aim-uofa/EvoTokenDLM
🔹 Models citing this paper:
• https://huggingface.co/zhongzero/EvoToken_LLaDA_Instruct_8B_Lora
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research