✨UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture
📝 Summary:
UniPercept-Bench provides a unified framework and datasets for perceptual-level image understanding across aesthetics, quality, structure, and texture. The UniPercept model, trained with DAPT and T-ARL, outperforms existing MLLMs, generalizes across visual rating (VR) and visual question answering (VQA) tasks, and can serve as a reward model for text-to-image generation.
🔹 Publication Date: Published on Dec 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21675
• PDF: https://arxiv.org/pdf/2512.21675
• Project Page: https://thunderbolt215.github.io/Unipercept-project/
• Github: https://github.com/thunderbolt215/UniPercept
🔹 Models citing this paper:
• https://huggingface.co/Thunderbolt215215/UniPercept
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/Thunderbolt215215/UniPercept-Bench
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#ImageUnderstanding #ComputerVision #AIResearch #PerceptualAI #DeepLearning