ML Research Hub

✨PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

📝 Summary:
PaddleOCR-VL, a vision-language model combining NaViT-style dynamic resolution and ERNIE, achieves state-of-the-art performance in document parsing and element recognition with high efficiency.

🔹 Publication Date: Published on Oct 16, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.14528
• PDF: https://arxiv.org/pdf/2510.14528
• Github: https://github.com/PaddlePaddle/PaddleOCR

🔹 Models citing this paper:
• https://huggingface.co/PaddlePaddle/PaddleOCR-VL
• https://huggingface.co/PaddlePaddle/PP-DocLayoutV2
• https://huggingface.co/unsloth/PaddleOCR-VL

✨ Spaces citing this paper:
• https://huggingface.co/spaces/PaddlePaddle/PaddleOCR-VL_Online_Demo
• https://huggingface.co/spaces/seanpedrickcase/document_redaction
• https://huggingface.co/spaces/markobinario/PaddleOCR-VL_Online_Demo

==================================

For more data science resources:
✓ https://t.me/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

✨VibeVoice Technical Report

📝 Summary:
VibeVoice synthesizes long-form multi-speaker speech using next-token diffusion and a highly efficient continuous speech tokenizer, achieving superior performance and fidelity.

🔹 Publication Date: Published on Aug 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19205
• PDF: https://arxiv.org/pdf/2508.19205
• Project Page: https://microsoft.github.io/VibeVoice/
• Hugging Face Collection: https://huggingface.co/collections/microsoft/vibevoice

🔹 Models citing this paper:
• https://huggingface.co/microsoft/VibeVoice-1.5B
• https://huggingface.co/microsoft/VibeVoice-Realtime-0.5B
• https://huggingface.co/aoi-ot/VibeVoice-Large

✨ Spaces citing this paper:
• https://huggingface.co/spaces/ChaitanyaChandra/VibeVoice
• https://huggingface.co/spaces/lths/VibeVoice-Demo
• https://huggingface.co/spaces/yasserrmd/VibeVoice

==================================

✨Efficient Memory Management for Large Language Model Serving with PagedAttention

📝 Summary:
The PagedAttention algorithm and the vLLM system enhance the throughput of large language models by efficiently managing memory and reducing waste in the key-value cache.

🔹 Publication Date: Published on Sep 12, 2023

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2309.06180
• PDF: https://arxiv.org/pdf/2309.06180
• Github: https://github.com/vllm-project/vllm

🔹 Models citing this paper:
• https://huggingface.co/theonlyengine/Flash-attention1

✨ Datasets citing this paper:
• https://huggingface.co/datasets/TheBlueScrubs/TheBlueScrubs-v1
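
To make the summary above concrete, here is a minimal sketch of paged key-value cache bookkeeping: each sequence gets fixed-size blocks on demand instead of one large contiguous reservation, so at most one partially filled block per sequence is wasted. This is an illustration only, not vLLM's actual code; the class name `PagedKVCache` and its methods are hypothetical, and the sketch tracks block indices only, without storing any tensors.

```python
class PagedKVCache:
    """Toy block allocator (hypothetical, not the vLLM API).

    Maps each sequence's logical token positions to fixed-size
    physical blocks that are handed out only when needed.
    """

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # physical block ids
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.seq_lens = {}      # seq_id -> number of cached tokens

    def append_token(self, seq_id: str) -> tuple:
        """Reserve a slot for one new token; return (physical_block, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.seq_lens.get(seq_id, 0)
        # A new block is needed only when the last one is full (or none exists).
        if length % self.block_size == 0:
            if not self.free_blocks:
                raise MemoryError("no free KV-cache blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return table[-1], length % self.block_size

    def free(self, seq_id: str) -> None:
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

    def wasted_slots(self) -> int:
        """Allocated but unused token slots across all sequences."""
        allocated = sum(len(t) for t in self.block_tables.values()) * self.block_size
        return allocated - sum(self.seq_lens.values())


cache = PagedKVCache(num_blocks=4, block_size=2)
for _ in range(3):
    cache.append_token("request-a")
print(cache.block_tables["request-a"], cache.wasted_slots())
```

In this toy setup, three tokens occupy two blocks of size two, so a single slot is unused; a contiguous reservation sized for the maximum possible sequence length would typically waste far more.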

==================================

✨MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

📝 Summary:
MinerU2.5, a 1.2B-parameter document parsing vision-language model, achieves state-of-the-art recognition accuracy with computational efficiency through a coarse-to-fine parsing strategy.

🔹 Publication Date: Published on Sep 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.22186
• PDF: https://arxiv.org/pdf/2509.22186
• Project Page: https://opendatalab.github.io/MinerU/
• Github: https://github.com/opendatalab/MinerU

🔹 Models citing this paper:
• https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B
• https://huggingface.co/freakynit/MinerU2.5-2509-1.2B
• https://huggingface.co/Mungert/MinerU2.5-2509-1.2B-GGUF

✨ Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/pzp5700/Paper2Any

==================================

✨HunyuanVideo 1.5 Technical Report

📝 Summary:
HunyuanVideo 1.5 is a lightweight video generation model with state-of-the-art visual quality and motion coherence, using a DiT architecture with SSTA and an efficient video super-resolution network.

🔹 Publication Date: Published on Nov 24, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18870
• PDF: https://arxiv.org/pdf/2511.18870
• Github: https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5

🔹 Models citing this paper:
• https://huggingface.co/tencent/HunyuanVideo-1.5
• https://huggingface.co/EvanEternal/Astra

✨ Spaces citing this paper:
• https://huggingface.co/spaces/gagndeep/HF-Worldplay
• https://huggingface.co/spaces/akhaliq/anycoder-355bd392
• https://huggingface.co/spaces/Xenurox/tencent-HunyuanVideo-1.5

==================================

✨UniVideo: Unified Understanding, Generation, and Editing for Videos

📝 Summary:
UniVideo, a dual-stream framework combining a Multimodal Large Language Model and a Multimodal DiT, extends unified modeling to video generation and editing, achieving state-of-the-art performance and...

🔹 Publication Date: Published on Oct 9, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.08377
• PDF: https://arxiv.org/pdf/2510.08377
• Project Page: https://congwei1230.github.io/UniVideo/
• Github: https://github.com/KwaiVGI/UniVideo

🔹 Models citing this paper:
• https://huggingface.co/KlingTeam/UniVideo

✨ Spaces citing this paper:
• https://huggingface.co/spaces/Harryji168/univideo-studio

==================================

✨MinerU: An Open-Source Solution for Precise Document Content Extraction

📝 Summary:
MinerU is an open-source tool that enhances document content extraction using fine-tuned models and pre/postprocessing rules across diverse document types.

🔹 Publication Date: Published on Sep 27, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2409.18839
• PDF: https://arxiv.org/pdf/2409.18839
• Related Space: https://huggingface.co/spaces/Echo9k/PDF_reader
• Github: https://github.com/opendatalab/MinerU

🔹 Models citing this paper:
• https://huggingface.co/jiaxianustc/BioMiner-MinerU-Model

✨ Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/Hunter0000/MinerU

==================================

✨TradingAgents: Multi-Agents LLM Financial Trading Framework

📝 Summary:
A multi-agent framework using large language models for stock trading simulates real-world trading firms, improving performance metrics like cumulative returns and Sharpe ratio.

🔹 Publication Date: Published on Dec 28, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.20138
• PDF: https://arxiv.org/pdf/2412.20138
• Github: https://github.com/tauricresearch/tradingagents

✨ Spaces citing this paper:
• https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList
• https://huggingface.co/spaces/Ervin2077/qiu
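
For readers unfamiliar with the two metrics named in the summary, the sketch below shows how cumulative return and an annualized Sharpe ratio are commonly computed from a series of per-period returns. It is a generic textbook version, not the paper's evaluation code; the function names, the 252-trading-day annualization, the zero default risk-free rate, and the use of the sample standard deviation are all assumptions.

```python
import math


def cumulative_return(returns):
    """Compound per-period simple returns into one total return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample std.

    risk_free_rate is an annual rate (assumed 0 by default) and is
    spread evenly across periods_per_year (assumed 252 trading days).
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    std = math.sqrt(variance)
    if std == 0.0:
        raise ValueError("returns have zero variance")
    return mean / std * math.sqrt(periods_per_year)


daily = [0.01, -0.005, 0.007, 0.002, -0.003]  # made-up illustrative returns
print(cumulative_return(daily), sharpe_ratio(daily))
```

The input values here are invented for illustration and say nothing about the results reported in the paper.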

==================================

✨DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

📝 Summary:
A novel video face swapping framework combines image face swapping techniques with diffusion transformers and curriculum learning to achieve superior identity preservation and visual realism.

🔹 Publication Date: Published on Jan 4, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01425
• PDF: https://arxiv.org/pdf/2601.01425
• Project Page: https://guoxu1233.github.io/DreamID-V/

🔹 Models citing this paper:
• https://huggingface.co/XuGuo699/DreamID-V

==================================

✨Self-Supervised Prompt Optimization

📝 Summary:
A self-supervised framework optimizes prompts for both closed and open-ended tasks by evaluating LLM outputs without external references, reducing costs and required data.

🔹 Publication Date: Published on Feb 7, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.06855
• PDF: https://arxiv.org/pdf/2502.06855
• Github: https://github.com/geekan/metagpt

✨ Spaces citing this paper:
• https://huggingface.co/spaces/XiangJinYu/SPO
• https://huggingface.co/spaces/tang-x/SPO
• https://huggingface.co/spaces/ositamiles/SPO

==================================

✨Recursive Language Models

📝 Summary:
We study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference strategy...

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24601
• Arxivlens Analysis: https://arxivlens.com/PaperView/Details/recursive-language-models-6610-16b3d94b
• PDF: https://arxiv.org/pdf/2512.24601
• Project Page: https://alexzhang13.github.io/blog/2025/rlm/
• Github: https://github.com/alexzhang13/rlm/tree/main

✨ Spaces citing this paper:
• https://huggingface.co/spaces/sergiopaniego/repl

==================================

✨Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

📝 Summary:
Youtu-LLM is a lightweight language model optimized for computational efficiency and agentic intelligence through a compact architecture, STEM-focused training curriculum, and scalable mid-training st...

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24618
• Arxivlens Analysis: https://arxivlens.com/PaperView/Details/youtu-llm-unlocking-the-native-agentic-potential-for-lightweight-large-language-models-8640-ff62768a
• PDF: https://arxiv.org/pdf/2512.24618
• Project Page: https://youtu-tip.com/#llm
• Github: https://github.com/TencentCloudADP/youtu-tip

🔹 Models citing this paper:
• https://huggingface.co/tencent/Youtu-LLM-2B
• https://huggingface.co/tencent/Youtu-LLM-2B-Base
• https://huggingface.co/tencent/Youtu-LLM-2B-GGUF

==================================

✨NitroGen: An Open Foundation Model for Generalist Gaming Agents

📝 Summary:
NitroGen is a vision-action foundation model trained on extensive gameplay data that demonstrates strong cross-game generalization and effective transfer learning capabilities.

🔹 Publication Date: Published on Jan 4, 2026

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02427
• PDF: https://arxiv.org/pdf/2601.02427
• Project Page: https://nitrogen.minedojo.org/
• Github: https://github.com/MineDojo/NitroGen

🔹 Models citing this paper:
• https://huggingface.co/nvidia/NitroGen

✨ Datasets citing this paper:
• https://huggingface.co/datasets/nvidia/NitroGen

✨ Spaces citing this paper:
• https://huggingface.co/spaces/dennny123/NitroGen-SuperstarSaga
• https://huggingface.co/spaces/blanchon/NitroGen-Pokemon

==================================

✨Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

📝 Summary:
Mem0, a memory-centric architecture with graph-based memory, enhances long-term conversational coherence in LLMs by efficiently extracting, consolidating, and retrieving information, outperforming exi...

🔹 Publication Date: Published on Apr 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.19413
• PDF: https://arxiv.org/pdf/2504.19413
• Github: https://github.com/mem0ai/mem0

==================================

✨Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

📝 Summary:
The study reveals that in text-to-image generation, CFG Augmentation is the primary driver of few-step distillation in Distribution Matching Distillation (DMD), while the distribution matching term ac...

🔹 Publication Date: Published on Nov 27, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.22677
• PDF: https://arxiv.org/pdf/2511.22677
• Project Page: https://tongyi-mai.github.io/Z-Image-blog/
• Github: https://github.com/Tongyi-MAI/Z-Image/tree/main

🔹 Models citing this paper:
• https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
• https://huggingface.co/unsloth/Z-Image-Turbo-GGUF
• https://huggingface.co/tsqn/Z-Image-Turbo_fp32-fp16-bf16_full_and_ema-only

✨ Spaces citing this paper:
• https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo
• https://huggingface.co/spaces/mrfakename/Z-Image-Turbo
• https://huggingface.co/spaces/linoyts/open-image-generation

==================================

✨Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

📝 Summary:
Z-Image, a 6B-parameter Scalable Single-Stream Diffusion Transformer (S3-DiT) model, achieves high-performance image generation with reduced computational cost, offering sub-second inference and compa...

🔹 Publication Date: Published on Nov 27, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.22699
• Arxivlens Analysis: https://arxivlens.com/PaperView/Details/z-image-an-efficient-image-generation-foundation-model-with-single-stream-diffusion-transformer-9846-b5faf99f
• PDF: https://arxiv.org/pdf/2511.22699
• Project Page: https://tongyi-mai.github.io/Z-Image-blog/
• Github: https://github.com/Tongyi-MAI/Z-Image

🔹 Models citing this paper:
• https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
• https://huggingface.co/unsloth/Z-Image-Turbo-GGUF
• https://huggingface.co/tsqn/Z-Image-Turbo_fp32-fp16-bf16_full_and_ema-only

✨ Spaces citing this paper:
• https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo
• https://huggingface.co/spaces/mrfakename/Z-Image-Turbo
• https://huggingface.co/spaces/linoyts/open-image-generation

==================================

✨DeepCode: Open Agentic Coding

📝 Summary:
DeepCode, a fully autonomous framework, addresses the challenges of document-to-codebase synthesis by optimizing information flow through source compression, structured indexing, knowledge injection, ...

🔹 Publication Date: Published on Dec 8, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07921
• PDF: https://arxiv.org/pdf/2512.07921
• Project Page: https://huggingface.co/papers/2511.03404
• Github: https://github.com/HKUDS/DeepCode

==================================
