ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions

📝 Summary:
UniAVGen generates audio and video jointly with two Diffusion Transformers (DiTs), one per modality, linked by an Asymmetric Cross-Modal Interaction mechanism. The design targets precise spatiotemporal synchronization and semantic consistency between the two streams. The authors report better audio-video synchronization and consistency than existing methods while training on far fewer samples.
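
To make "temporal synchronization" concrete, here is a minimal sketch of how a video frame index can be mapped to the audio tokens that cover the same time span. It is not taken from the UniAVGen paper: the function name, the frame rate, the audio token rate, and the optional context window are all hypothetical and chosen only for illustration.

```python
def audio_window_for_frame(frame_idx, video_fps, audio_tokens_per_sec, context_frames=0):
    """Return the half-open range [start, end) of audio token indices that
    overlap video frame `frame_idx` in time.

    Hypothetical helper for illustration only. `context_frames` widens the
    window by that many neighboring frames on each side, so a frame can also
    see slightly earlier and later audio.
    """
    first_frame = max(frame_idx - context_frames, 0)
    last_frame_exclusive = frame_idx + context_frames + 1
    # Floor for the start and ceiling for the end, so partially overlapping
    # audio tokens are included. Integer math avoids float rounding issues.
    start = (first_frame * audio_tokens_per_sec) // video_fps
    end = -((-last_frame_exclusive * audio_tokens_per_sec) // video_fps)
    return start, end


# Assumed rates: 25 video frames and 50 audio tokens per second.
print(audio_window_for_frame(3, 25, 50))
print(audio_window_for_frame(3, 25, 50, context_frames=1))
```

A cross-modal attention layer could use such a window to restrict which audio tokens each frame attends to. Whether UniAVGen does this, and with which window sizes, is for the paper to confirm.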

🔹 Publication Date: Nov 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.03334
• PDF: https://arxiv.org/pdf/2511.03334
• Project Page: https://mcg-nju.github.io/UniAVGen/
• Github: https://mcg-nju.github.io/UniAVGen/

==================================

For more data science resources:
https://t.me/DataScienceT

#GenerativeAI #AudioVideoGeneration #DiffusionModels #CrossModalAI #DeepLearning
Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation

📝 Summary:
Ovi is a unified audio-video generation model built from twin DiT (Diffusion Transformer) modules, one for each modality, joined by blockwise cross-modal fusion. The design aims for natural audio-video synchronization and high-quality multimodal output, and it simplifies earlier multi-stage pipelines.
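
The idea of blockwise fusion between two backbones can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Ovi's implementation: it has no learned weights, no normalization, and no diffusion timestep, it assumes both modalities share the same token width, and the names `attend`, `fused_block`, and `twin_backbone` are hypothetical.

```python
import math


def attend(queries, keys_values):
    """Single-head scaled dot-product attention without learned projections."""
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys_values]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return out


def add(xs, ys):
    """Element-wise residual addition of two token lists of equal shape."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def fused_block(video, audio):
    """One block: each stream attends to itself, then to the other stream."""
    video = add(video, attend(video, video))
    audio = add(audio, attend(audio, audio))
    # Both directions read the states from before the exchange.
    new_video = add(video, attend(video, audio))
    new_audio = add(audio, attend(audio, video))
    return new_video, new_audio


def twin_backbone(video, audio, num_blocks=2):
    """Run both streams through the same number of fused blocks."""
    for _ in range(num_blocks):
        video, audio = fused_block(video, audio)
    return video, audio


video_tokens = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]       # 3 video tokens, width 2
audio_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
video_out, audio_out = twin_backbone(video_tokens, audio_tokens)
print(len(video_out), len(audio_out))
```

Because the exchange happens inside every block instead of once at the end, each stream can adjust to the other at every depth. That is the general motivation for blockwise fusion over a separate post-hoc alignment stage.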

🔹 Publication Date: Sep 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.01284
• PDF: https://arxiv.org/pdf/2510.01284
• Project Page: https://aaxwaz.github.io/Ovi
• Github: https://github.com/character-ai/Ovi

🔹 Models citing this paper:
• https://huggingface.co/chetwinlow1/Ovi
• https://huggingface.co/rkfg/Ovi-fp8_quantized

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/akhaliq/Ovi
• https://huggingface.co/spaces/deddytoyota/Ovi
• https://huggingface.co/spaces/alexnasa/Ovi-ZEROGPU

==================================


#AudioVideoGeneration #MultimodalAI #DeepLearning #CrossModalFusion #AIResearch