ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
How Much 3D Do Video Foundation Models Encode?

📝 Summary:
A new framework quantifies 3D understanding in Video Foundation Models (VidFMs). Although trained only on video, VidFMs show strong 3D awareness, often surpassing expert 3D models, offering insights for building 3D-aware AI systems.

🔹 Publication Date: Published on Dec 23, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.19949
• PDF: https://arxiv.org/pdf/2512.19949
• Project Page: https://vidfm-3d-probe.github.io/

==================================

For more data science resources:
https://t.me/DataScienceT

#VideoFoundationModels #3DUnderstanding #ComputerVision #AIResearch #DeepLearning
2
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

📝 Summary:
Fast3R is a Transformer-based method for efficient and scalable multi-view 3D reconstruction. It processes many images in parallel in a single forward pass, improving speed and accuracy over pairwise approaches like DUSt3R.
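
The scaling argument behind single-pass reconstruction can be sketched with a toy pass count (an illustration only, not Fast3R's actual code):

```python
# Pairwise methods like DUSt3R run one network pass per image pair,
# while a Fast3R-style model ingests all view tokens in a single pass.

def pairwise_passes(n_views: int) -> int:
    """Forward passes needed when every image pair is processed separately."""
    return n_views * (n_views - 1) // 2

def single_pass(n_views: int) -> int:
    """A Fast3R-style model concatenates all views and runs once."""
    return 1

for n in (10, 100, 1000):
    print(n, pairwise_passes(n), single_pass(n))
```

At 1000 images the pairwise count is nearly half a million passes, which is why a single joint forward pass changes what is feasible.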

🔹 Publication Date: Published on Jan 23, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2501.13928
• PDF: https://arxiv.org/pdf/2501.13928
• Github: https://github.com/naver/dust3r/pull/16

🔹 Models citing this paper:
https://huggingface.co/jedyang97/Fast3R_ViT_Large_512

==================================

For more data science resources:
https://t.me/DataScienceT

#3DReconstruction #ComputerVision #Transformers #Fast3R #DeepLearning
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion

📝 Summary:
InsertAnywhere is a framework for realistic video object insertion. It uses 4D-aware mask generation for geometric consistency and an extended diffusion model for appearance-faithful synthesis, outperforming existing methods.

🔹 Publication Date: Published on Dec 19, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17504
• PDF: https://arxiv.org/pdf/2512.17504
• Project Page: https://myyzzzoooo.github.io/InsertAnywhere/
• Github: https://github.com/myyzzzoooo/InsertAnywhere

==================================

For more data science resources:
https://t.me/DataScienceT

#VideoEditing #DiffusionModels #ComputerVision #DeepLearning #GenerativeAI
See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning

📝 Summary:
Bi-directional Perceptual Shaping (BiPS) improves vision-language models by using question-conditioned masked views to shape perception during training. It employs two constraints to ensure complete coverage of relevant pixels and enforce fine-grained visual reliance, preventing text-only shortcuts...

🔹 Publication Date: Published on Dec 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22120
• PDF: https://arxiv.org/pdf/2512.22120
• Github: https://github.com/zss02/BiPS

==================================

For more data science resources:
https://t.me/DataScienceT

#MultimodalAI #VisionLanguageModels #MachineLearning #AIResearch #DeepLearning
TimeBill: Time-Budgeted Inference for Large Language Models

📝 Summary:
TimeBill is a framework for LLMs in time-critical systems. It predicts execution time and adaptively adjusts KV cache eviction to balance inference efficiency and response performance within given time budgets, improving task completion rates.
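
The budgeting idea can be sketched as follows; the linear cost model, constants, and function names are illustrative assumptions, not the paper's actual predictor:

```python
# Hypothetical sketch of time-budgeted KV-cache sizing: predict per-step
# decode time from cache length, then pick the largest cache that lets the
# remaining tokens finish within the time budget. Integer microseconds
# keep the arithmetic exact.

def predict_step_us(kv_len: int, base_us: int = 5000, per_kv_us: int = 10) -> int:
    """Assumed linear cost model: each cached entry adds attention work."""
    return base_us + per_kv_us * kv_len

def kv_budget(remaining_tokens: int, budget_us: int,
              base_us: int = 5000, per_kv_us: int = 10) -> int:
    """Largest cache length whose predicted decode time fits the budget."""
    per_token_allow = budget_us // max(remaining_tokens, 1)
    return max((per_token_allow - base_us) // per_kv_us, 0)
```

Under this toy model, 100 remaining tokens with a 1-second budget allow a 500-entry cache; a real system would refine the prediction as decoding progresses.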

🔹 Publication Date: Published on Dec 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21859
• PDF: https://arxiv.org/pdf/2512.21859

==================================

For more data science resources:
https://t.me/DataScienceT

#LLM #AI #RealTimeAI #InferenceOptimization #DeepLearning
UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

📝 Summary:
UniPercept-Bench provides a unified framework and datasets for perceptual image understanding (aesthetics, quality, structure, and texture). The UniPercept model, trained with DAPT and T-ARL, outperforms MLLMs, generalizes across VR and VQA, and acts as a text-to-image reward model.

🔹 Publication Date: Published on Dec 25, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21675
• PDF: https://arxiv.org/pdf/2512.21675
• Project Page: https://thunderbolt215.github.io/Unipercept-project/
• Github: https://github.com/thunderbolt215/UniPercept

🔹 Models citing this paper:
https://huggingface.co/Thunderbolt215215/UniPercept

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Thunderbolt215215/UniPercept-Bench

==================================

For more data science resources:
https://t.me/DataScienceT

#ImageUnderstanding #ComputerVision #AIResearch #PerceptualAI #DeepLearning
Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding

📝 Summary:
Omni-Weather is a new multimodal foundation model that unifies weather generation and understanding in a single architecture. It uses shared self-attention and a Chain-of-Thought dataset for interpretable, high-quality outputs, achieving state-of-the-art performance.

🔹 Publication Date: Published on Dec 25, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21643
• PDF: https://arxiv.org/pdf/2512.21643

==================================

For more data science resources:
https://t.me/DataScienceT

#WeatherGeneration #FoundationModels #MultimodalAI #AIResearch #DeepLearning
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

📝 Summary:
The Expert-Router Coupling (ERC) loss aligns MoE router decisions with expert capabilities. It uses proxy tokens and activation constraints to ensure experts specialize, improving performance and computational efficiency. ERC also allows tracking expert specialization during training.
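
The exact ERC formulation is in the paper; a generic coupling loss in that spirit penalizes divergence between the router's distribution over experts and a distribution derived from each expert's affinity scores (this KL-based stand-in is my assumption):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def coupling_loss(router_logits, expert_affinity):
    """KL(router || affinity): zero when the router's distribution
    matches the experts' measured capabilities, positive otherwise."""
    p = softmax(router_logits)
    q = softmax(expert_affinity)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Minimizing such a term couples routing to capability: an expert only receives tokens it is measurably good at, which is also what makes specialization trackable during training.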

🔹 Publication Date: Published on Dec 29, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23447
• PDF: https://arxiv.org/pdf/2512.23447

==================================

For more data science resources:
https://t.me/DataScienceT

#MixtureOfExperts #DeepLearning #MachineLearning #AI #NeuralNetworks
Yume-1.5: A Text-Controlled Interactive World Generation Model

📝 Summary:
Yume-1.5 is a novel framework that generates realistic, interactive, and continuous worlds from a single image or text prompt. It overcomes prior limitations in real-time performance and text control by using unified context compression, streaming acceleration, and text-controlled world events.

🔹 Publication Date: Published on Dec 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22096
• PDF: https://arxiv.org/pdf/2512.22096
• Project Page: https://stdstu12.github.io/YUME-Project/
• Github: https://github.com/stdstu12/YUME

🔹 Models citing this paper:
https://huggingface.co/stdstu123/Yume-5B-720P

==================================

For more data science resources:
https://t.me/DataScienceT

#AI #GenerativeAI #WorldGeneration #ComputerGraphics #DeepLearning
SpotEdit: Selective Region Editing in Diffusion Transformers

📝 Summary:
SpotEdit is a training-free framework for selective image editing in diffusion transformers. It avoids reprocessing stable regions by reusing their features, combining them with edited areas. This reduces computation and preserves unchanged regions, enhancing efficiency and precision.
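
The selective-recomputation idea can be sketched per feature position (a simplification of the method; names and the per-position select are my assumptions):

```python
# Features outside the edit mask are reused from a cache computed on the
# original image; only masked positions get a fresh forward computation.

def blend_features(cached, fresh, mask):
    """Per-position select: mask == 1 takes the freshly computed feature,
    mask == 0 reuses the cached one."""
    return [f if m else c for c, f, m in zip(cached, fresh, mask)]

def recompute_fraction(mask):
    """Fraction of positions that actually need new computation."""
    return sum(mask) / len(mask)
```

When the edit covers a small region, the recompute fraction is small, which is where the efficiency gain comes from; reusing cached features also guarantees unchanged regions stay pixel-stable.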

🔹 Publication Date: Published on Dec 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22323
• PDF: https://arxiv.org/pdf/2512.22323
• Project Page: https://biangbiang0321.github.io/SpotEdit.github.io

==================================

For more data science resources:
https://t.me/DataScienceT

#ImageEditing #DiffusionModels #ComputerVision #AIResearch #DeepLearning
Evaluating Parameter Efficient Methods for RLVR

📝 Summary:
This work evaluates 12 PEFT methods for RLVR in mathematical reasoning, challenging LoRA's default use. It finds that structural variants like DoRA outperform LoRA, while SVD-informed methods fail and extreme parameter reduction bottlenecks reasoning.

🔹 Publication Date: Published on Dec 29, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23165
• PDF: https://arxiv.org/pdf/2512.23165

==================================

For more data science resources:
https://t.me/DataScienceT

#PEFT #RLVR #MathematicalReasoning #LoRA #DeepLearning
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement

📝 Summary:
UltraShape 1.0 is a 3D diffusion framework that generates high-fidelity shapes using a two-stage process: coarse then refined geometry. It includes a novel data pipeline improving dataset quality, enabling strong geometric results on public data.

🔹 Publication Date: Published on Dec 24, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21185
• PDF: https://arxiv.org/pdf/2512.21185
• Project Page: https://pku-yuangroup.github.io/UltraShape-1.0/

🔹 Models citing this paper:
https://huggingface.co/infinith/UltraShape

==================================

For more data science resources:
https://t.me/DataScienceT

#3DGeneration #DiffusionModels #GenerativeAI #ComputerGraphics #DeepLearning
CosineGate: Semantic Dynamic Routing via Cosine Incompatibility in Residual Networks

📝 Summary:
CosineGate enables dynamic routing in residual networks using cosine incompatibility to skip redundant blocks. This reduces computation by up to 28.5 percent while matching or exceeding ResNet-20 accuracy, without auxiliary supervision.
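
A sketch of the gating idea; the exact rule is in the paper, and thresholding on plain cosine similarity is my assumption about how "incompatibility" is used:

```python
import math

# If a residual block's output is nearly parallel to its input, the block
# adds little new direction and can be skipped at inference time.

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def gated_residual(x, block, skip_threshold=0.95):
    """Apply a residual block only when its output is 'incompatible'
    enough with the input; otherwise pass the input through unchanged."""
    fx = block(x)
    if cosine(x, fx) >= skip_threshold:   # redundant direction: skip
        return x
    return [a + b for a, b in zip(x, fx)]
```

Skipped blocks cost only the gate evaluation in this toy version; in practice the gate itself must be cheap relative to the block for the 28.5% compute saving to materialize.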

🔹 Publication Date: Published on Dec 21, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22206
• PDF: https://arxiv.org/pdf/2512.22206
• Github: https://github.com/thotayogeswarreddy/CosineGate

==================================

For more data science resources:
https://t.me/DataScienceT

#DeepLearning #NeuralNetworks #DynamicRouting #ModelEfficiency #AIResearch
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

📝 Summary:
Youtu-LLM is a lightweight 1.96B LLM, pre-trained from scratch with a compact architecture and a multi-stage curriculum focused on commonsense, STEM, and agentic tasks. It achieves state-of-the-art performance for sub-2B models, demonstrating strong intrinsic agentic capabilities.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24618
• PDF: https://arxiv.org/pdf/2512.24618

==================================

For more data science resources:
https://t.me/DataScienceT

#LLM #AI #AgenticAI #LightweightLLM #DeepLearning
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

📝 Summary:
SpaceTimePilot is a video diffusion model for dynamic scene rendering, offering independent control over spatial viewpoint and temporal motion. It achieves precise space-time disentanglement via a time-embedding, temporal-warping training, and a synthetic dataset.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.25075
• PDF: https://arxiv.org/pdf/2512.25075
• Project Page: https://zheninghuang.github.io/Space-Time-Pilot/

==================================

For more data science resources:
https://t.me/DataScienceT

#VideoDiffusion #GenerativeAI #DynamicScenes #ComputerGraphics #DeepLearning
Geometry-Aware Optimization for Respiratory Sound Classification: Enhancing Sensitivity with SAM-Optimized Audio Spectrogram Transformers

📝 Summary:
This paper improves respiratory sound classification using AST enhanced with SAM. It optimizes loss surface geometry for flatter minima, yielding state-of-the-art 68.10% score and crucial 68.31% sensitivity on ICBHI 2017.
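
A generic one-dimensional SAM update illustrates the optimization trick (this is textbook SAM, not the paper's AST training code; learning rate and radius are placeholder values):

```python
# Sharpness-Aware Minimization: first perturb the weight toward the
# worst-case nearby point (radius rho), then descend using the gradient
# taken at that perturbed point, which favors flat minima.

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    g = grad_fn(w)
    eps = rho * (1.0 if g >= 0 else -1.0)   # rho * g / |g| in 1-D
    g_adv = grad_fn(w + eps)                # gradient at perturbed point
    return w - lr * g_adv
```

For f(w) = w², grad_fn is `lambda w: 2 * w`, and a step from w = 1.0 lands at 1.0 − 0.1·2·1.05 = 0.79: slightly more aggressive than plain gradient descent because the perturbed gradient is steeper.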

🔹 Publication Date: Published on Dec 27, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22564
• PDF: https://arxiv.org/pdf/2512.22564

==================================

For more data science resources:
https://t.me/DataScienceT

#RespiratoryHealth #MedicalAI #DeepLearning #SoundClassification #AIHealthcare
mHC: Manifold-Constrained Hyper-Connections

📝 Summary:
Manifold-Constrained Hyper-Connections (mHC) resolve the training instability and scalability issues of Hyper-Connections (HC). mHC restores identity mapping via manifold projection and infrastructure optimization, enabling effective large-scale training with improved performance.

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24880
• PDF: https://arxiv.org/pdf/2512.24880

==================================

For more data science resources:
https://t.me/DataScienceT

#MachineLearning #DeepLearning #NeuralNetworks #ManifoldLearning #AI
Kronos: A Foundation Model for the Language of Financial Markets

📝 Summary:
Kronos is a novel foundation model for financial K-line (candlestick) data. It uses a specialized tokenizer and autoregressive pre-training on a vast dataset to significantly outperform existing models in price and volatility forecasting, and synthetic data generation, establishing it as a versatile tool for f...

🔹 Publication Date: Published on Aug 2, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02739
• PDF: https://arxiv.org/pdf/2508.02739
• Github: https://github.com/shiyu-coder/Kronos

🔹 Models citing this paper:
https://huggingface.co/NeoQuasar/Kronos-base
https://huggingface.co/NeoQuasar/Kronos-Tokenizer-base
https://huggingface.co/NeoQuasar/Kronos-mini

🔹 Spaces citing this paper:
https://huggingface.co/spaces/ByronWang2005/Kronos-CS2-Skins-Forecast-Demo
https://huggingface.co/spaces/yangyang158/kronos
https://huggingface.co/spaces/heyunfei/crypt

==================================

For more data science resources:
https://t.me/DataScienceT

#FoundationModel #FinancialAI #DeepLearning #QuantitativeFinance #Forecasting
Guiding a Diffusion Transformer with the Internal Dynamics of Itself

📝 Summary:
This paper introduces Internal Guidance (IG) for diffusion models, which adds auxiliary supervision to intermediate layers during training and extrapolates outputs during sampling. This simple strategy significantly improves training efficiency and generation quality. IG achieves state-of-the-art F...
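
One way to read "extrapolates outputs during sampling" is a combination rule analogous to classifier-free guidance; the formula below is my assumption, not the paper's exact scheme:

```python
# Push the final prediction away from an intermediate-layer prediction:
# out = intermediate + w * (final - intermediate).
# w = 1 recovers the final output; w > 1 extrapolates past it.

def guided_output(intermediate, final, w=1.5):
    """Element-wise guidance extrapolation over two prediction vectors."""
    return [i + w * (f - i) for i, f in zip(intermediate, final)]
```

The appeal of such a scheme is that the "weak" predictor comes from the model's own intermediate layers, so no second network or extra conditioning pass is needed.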

🔹 Publication Date: Published on Dec 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24176
• PDF: https://arxiv.org/pdf/2512.24176
• Project Page: https://zhouxingyu13.github.io/Internal-Guidance/
• Github: https://github.com/CVL-UESTC/Internal-Guidance

==================================

For more data science resources:
https://t.me/DataScienceT

#DiffusionModels #AI #DeepLearning #GenerativeAI #ComputerVision
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation

📝 Summary:
FlowBlending optimizes video generation by adapting model capacity to each stage. It uses large models for critical early and late timesteps, and small models for intermediate ones. This achieves faster inference and fewer FLOPs with no loss in large model fidelity.
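
A toy stage-aware router captures the scheduling idea (the window fractions and the two-model split are illustrative assumptions, not the paper's tuned configuration):

```python
# Early and late denoising steps, where structure and fine detail are
# decided, go to the large model; intermediate steps go to the small one.

def route_model(step: int, total_steps: int,
                early_frac: float = 0.2, late_frac: float = 0.2) -> str:
    """Return which model handles a given sampling step."""
    early_end = int(early_frac * total_steps)
    late_start = total_steps - int(late_frac * total_steps)
    return "large" if step < early_end or step >= late_start else "small"

schedule = [route_model(t, 50) for t in range(50)]
```

With these fractions, 30 of 50 steps run on the small model, which is where the FLOP and latency savings come from while the large model still anchors the fidelity-critical stages.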

🔹 Publication Date: Published on Dec 31, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24724
• PDF: https://arxiv.org/pdf/2512.24724

==================================

For more data science resources:
https://t.me/DataScienceT

#VideoGeneration #GenerativeAI #DeepLearning #AIResearch #ModelOptimization