✨DocDancer: Towards Agentic Document-Grounded Information Seeking
📝 Summary:
DocDancer is an end-to-end trained open-source document question answering agent that formulates the task as an information-seeking problem and uses a tool-driven framework with exploration and synthe...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05163
• PDF: https://arxiv.org/pdf/2601.05163
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Multi-Scale Local Speculative Decoding for Image Generation
📝 Summary:
Multi-Scale Local Speculative Decoding accelerates autoregressive image generation through multi-resolution drafting and spatially informed verification while maintaining semantic quality and perceptu...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05149
• PDF: https://arxiv.org/pdf/2601.05149
• Project Page: https://qualcomm-ai-research.github.io/mulo-sd-webpage/
• Github: https://qualcomm-ai-research.github.io/mulo-sd-webpage
==================================
✨PyramidalWan: On Making Pretrained Video Model Pyramidal for Efficient Inference
📝 Summary:
Pyramidal diffusion models reduce computational cost through hierarchical resolution processing, with pretrained models converted via low-cost fine-tuning maintaining output quality while enabling eff...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04792
• PDF: https://arxiv.org/pdf/2601.04792
• Project Page: https://qualcomm-ai-research.github.io/PyramidalWan
==================================
✨ProFuse: Efficient Cross-View Context Fusion for Open-Vocabulary 3D Gaussian Splatting
📝 Summary:
ProFuse enhances 3D scene understanding by integrating semantic information into 3D Gaussian Splatting through efficient context-aware processing and pre-registration phases.
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04754
• PDF: https://arxiv.org/pdf/2601.04754
• Project Page: https://chiou1203.github.io/ProFuse/
• Github: https://chiou1203.github.io/ProFuse/
==================================
✨Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing
📝 Summary:
Behavior cloning demonstrates improved performance and causal reasoning through scaling model size and training data, achieving human-level gameplay in 3D video games.
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04575
• PDF: https://arxiv.org/pdf/2601.04575
• Project Page: https://elefant-ai.github.io/open-p2p/
• Github: https://github.com/elefant-ai/open-p2p
🔹 Models citing this paper:
• https://huggingface.co/elefantai/open-p2p
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/elefantai/p2p-full-data
• https://huggingface.co/datasets/elefantai/p2p-toy-examples
==================================
✨ReHyAt: Recurrent Hybrid Attention for Video Diffusion Transformers
📝 Summary:
ReHyAt presents a recurrent hybrid attention mechanism, merging softmax fidelity with linear efficiency. This enables scalable, high-quality video generation by reducing computational cost from quadratic to linear, with significantly lower training costs.
🔹 Publication Date: Published on Jan 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04342
• PDF: https://arxiv.org/pdf/2601.04342
• Project Page: https://qualcomm-ai-research.github.io/rehyat
==================================
✨Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views
📝 Summary:
HairGuard is a framework for recovering fine-grained soft boundary details in 3D vision tasks through specialized depth refinement and view synthesis techniques.
🔹 Publication Date: Published on Jan 6
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03362
• PDF: https://arxiv.org/pdf/2601.03362
==================================
✨Towards Open-Vocabulary Industrial Defect Understanding with a Large-Scale Multimodal Dataset
📝 Summary:
A large-scale industrial multimodal defect dataset with 1 million image-text pairs enables efficient foundation model adaptation for manufacturing quality inspection and generation tasks.
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24160
• PDF: https://arxiv.org/pdf/2512.24160
• Project Page: https://ninaneon.github.io/projectpage/
• Github: https://github.com/NinaNeon/IMDD-1M-Towards-Open-Vocabulary-Industrial-Defect-
==================================
✨Memorization in 3D Shape Generation: An Empirical Study
📝 Summary:
Researchers develop a framework to measure memorization in 3D generative models and identify factors affecting it, finding that data modality and model design parameters influence how much training da...
🔹 Publication Date: Published on Dec 29, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23628
• PDF: https://arxiv.org/pdf/2512.23628
• Github: https://github.com/zlab-princeton/3d-gen-mem
🔹 Models citing this paper:
• https://huggingface.co/pudashi/3DGenMem
==================================
✨AgentDevel: Reframing Self-Evolving LLM Agents as Release Engineering
📝 Summary:
AgentDevel presents a release engineering approach for large language model agents that treats them as shippable artifacts and emphasizes stable, auditable improvements through externalized testing an...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04620
• PDF: https://arxiv.org/pdf/2601.04620
• Project Page: https://trotsky1997.github.io/agentdevel-dashboard/
==================================
✨Beyond Binary Preference: Aligning Diffusion Models to Fine-grained Criteria by Decoupling Attributes
📝 Summary:
A two-stage framework for diffusion model alignment using hierarchical evaluation criteria and complex preference optimization demonstrates improved generation quality and expert alignment.
🔹 Publication Date: Published on Jan 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04300
• PDF: https://arxiv.org/pdf/2601.04300
==================================
✨Learning User Preferences Through Interaction for Long-Term Collaboration
📝 Summary:
The MultiSessionCollab benchmark evaluates agents' ability to learn and adapt to user preferences through persistent memory systems that enhance long-term collaboration quality.
🔹 Publication Date: Published on Jan 6
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02702
• PDF: https://arxiv.org/pdf/2601.02702
==================================
✨Enhancing Object Detection with Privileged Information: A Model-Agnostic Teacher-Student Approach
📝 Summary:
Learning Using Privileged Information paradigm enhances object detection accuracy by integrating additional training-time information through teacher-student architectures without increasing inference...
🔹 Publication Date: Published on Jan 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02016
• PDF: https://arxiv.org/pdf/2601.02016
• Github: https://github.com/mbar0075/lupi-for-object-detection
==================================
✨LEMAS: A 150K-Hour Large-scale Extensible Multilingual Audio Suite with Generative Speech Models
📝 Summary:
The LEMAS-Dataset enables high-quality multilingual speech synthesis and editing through specialized models leveraging flow-matching and autoregressive architectures with novel training techniques.
🔹 Publication Date: Published on Jan 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04233
• PDF: https://arxiv.org/pdf/2601.04233
• Project Page: https://huggingface.co/spaces/LEMAS-Project/LEMAS-Edit
🔹 Models citing this paper:
• https://huggingface.co/LEMAS-Project/LEMAS-TTS
• https://huggingface.co/LEMAS-Project/LEMAS-Edit
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/LEMAS-Project/LEMAS-Dataset-train
• https://huggingface.co/datasets/LEMAS-Project/LEMAS-Dataset-eval
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/LEMAS-Project/LEMAS-TTS
• https://huggingface.co/spaces/LEMAS-Project/LEMAS-Edit
• https://huggingface.co/spaces/Kaiden423/LEMAS-TTS
==================================
✨VERSE: Visual Embedding Reduction and Space Exploration. Clustering-Guided Insights for Training Data Enhancement in Visually-Rich Document Understanding
📝 Summary:
VERSE is a methodology for analyzing and improving Vision-Language Models in document understanding by visualizing latent representations and generating synthetic data to enhance performance in error-...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05125
• PDF: https://arxiv.org/pdf/2601.05125
• Project Page: https://huggingface.co/spaces/de-Rodrigo/Embeddings
• Github: https://github.com/nachoDRT/VrDU-Doctor
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/de-Rodrigo/merit
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/de-Rodrigo/Embeddings
• https://huggingface.co/spaces/de-Rodrigo/saliencies
==================================
✨Safety at One Shot: Patching Fine-Tuned LLMs with A Single Instance
📝 Summary:
Safety alignment of large language models can be fully recovered with a single safety example, maintaining utility and achieving convergence in few epochs through identified low-rank gradient structur...
🔹 Publication Date: Published on Jan 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01887
• PDF: https://arxiv.org/pdf/2601.01887
==================================
✨MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
📝 Summary:
We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that only scale up model size or co...
🔹 Publication Date: Published on Nov 14, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11793
• PDF: https://arxiv.org/pdf/2511.11793
• Project Page: https://dr.miromind.ai/
• Github: https://github.com/MiroMindAI/MiroThinker
🔹 Models citing this paper:
• https://huggingface.co/miromind-ai/MiroThinker-v1.5-235B
• https://huggingface.co/miromind-ai/MiroThinker-v1.5-30B
• https://huggingface.co/miromind-ai/MiroThinker-v1.0-72B
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/miromind-ai/MiroVerse-v0.1
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/zoom-ai/hle-leaderboard
• https://huggingface.co/spaces/miromind-ai/MiroMind-Open-Source-Deep-Research
==================================
✨LTX-2: Efficient Joint Audio-Visual Foundation Model
📝 Summary:
LTX-2 is an open-source audiovisual diffusion model that generates synchronized video and audio content using a dual-stream transformer architecture with cross-modal attention and classifier-free guid...
🔹 Publication Date: Published on Jan 6
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03233
• PDF: https://arxiv.org/pdf/2601.03233
• Project Page: https://huggingface.co/papers/2511.12072
• Github: https://github.com/Lightricks/LTX-2
🔹 Models citing this paper:
• https://huggingface.co/Lightricks/LTX-2
• https://huggingface.co/unsloth/LTX-2-GGUF
• https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Canny-Control
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/Lightricks/ltx-2-distilled
• https://huggingface.co/spaces/Lightricks/ltx-2
• https://huggingface.co/spaces/alexnasa/ltx-2-TURBO
==================================