TianxingWu/FreeInit
FreeInit: Bridging Initialization Gap in Video Diffusion Models
Language: Python
#aigc #text_to_video #video_diffusion_model #video_generation
Stars: 162 Issues: 4 Forks: 7
https://github.com/TianxingWu/FreeInit
FreeInit: Bridging Initialization Gap in Video Diffusion Models
Language: Python
#aigc #text_to_video #video_diffusion_model #video_generation
Stars: 162 Issues: 4 Forks: 7
https://github.com/TianxingWu/FreeInit
GitHub
GitHub - TianxingWu/FreeInit: [ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models
[ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models - TianxingWu/FreeInit
ali-vilab/dreamtalk
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
Language: Python
#audio_visual_learning #face_animation #talking_head #video_generation
Stars: 217 Issues: 7 Forks: 20
https://github.com/ali-vilab/dreamtalk
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
Language: Python
#audio_visual_learning #face_animation #talking_head #video_generation
Stars: 217 Issues: 7 Forks: 20
https://github.com/ali-vilab/dreamtalk
GitHub
GitHub - ali-vilab/dreamtalk: Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion…
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models - ali-vilab/dreamtalk
mayuelala/FollowYourClick
[arXiv 2024] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"
#image_animation #image_to_video_generation #video_generation
Stars: 445 Issues: 0 Forks: 10
https://github.com/mayuelala/FollowYourClick
[arXiv 2024] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"
#image_animation #image_to_video_generation #video_generation
Stars: 445 Issues: 0 Forks: 10
https://github.com/mayuelala/FollowYourClick
GitHub
GitHub - mayuelala/FollowYourClick: [AAAI 2025] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click:…
[AAAI 2025] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts" - GitHub - mayuelala/Foll...
PKU-YuanGroup/MagicTime
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Language: Python
#diffusion_models #long_video_generation #metamorphic_video_generation #open_sora_plan #text_to_video #time_lapse #time_lapse_dataset #video_generation
Stars: 281 Issues: 4 Forks: 16
https://github.com/PKU-YuanGroup/MagicTime
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Language: Python
#diffusion_models #long_video_generation #metamorphic_video_generation #open_sora_plan #text_to_video #time_lapse #time_lapse_dataset #video_generation
Stars: 281 Issues: 4 Forks: 16
https://github.com/PKU-YuanGroup/MagicTime
GitHub
GitHub - PKU-YuanGroup/MagicTime: [TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators - PKU-YuanGroup/MagicTime
BradyFU/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Language: Python
#large_language_models #large_vision_language_models #mme #multimodal_large_language_models #video #video_mme
Stars: 182 Issues: 1 Forks: 6
https://github.com/BradyFU/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Language: Python
#large_language_models #large_vision_language_models #mme #multimodal_large_language_models #video #video_mme
Stars: 182 Issues: 1 Forks: 6
https://github.com/BradyFU/Video-MME
GitHub
GitHub - BradyFU/Video-MME: ✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video…
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis - BradyFU/Video-MME
fudan-generative-vision/hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Language: Python
#face_animation #image_animation #video_animation
Stars: 653 Issues: 5 Forks: 102
https://github.com/fudan-generative-vision/hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Language: Python
#face_animation #image_animation #video_animation
Stars: 653 Issues: 5 Forks: 102
https://github.com/fudan-generative-vision/hallo
GitHub
GitHub - fudan-generative-vision/hallo: Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation - fudan-generative-vision/hallo
SuperViz/superviz
SuperViz provides programmable low-code Collaboration and Communication components for web applications.
Language: TypeScript
#autodesk #autodesk_forge #collaboration #comments #crdt #matterport #multiplayer #presence #react #reactflow #real_time #superviz #three #video_conferencing #webrtc #websockets #yjs #yjs_provider
Stars: 198 Issues: 5 Forks: 0
https://github.com/SuperViz/superviz
SuperViz provides programmable low-code Collaboration and Communication components for web applications.
Language: TypeScript
#autodesk #autodesk_forge #collaboration #comments #crdt #matterport #multiplayer #presence #react #reactflow #real_time #superviz #three #video_conferencing #webrtc #websockets #yjs #yjs_provider
Stars: 198 Issues: 5 Forks: 0
https://github.com/SuperViz/superviz
GitHub
GitHub - SuperViz/superviz: SuperViz provides powerful SDKs and APIs that enable developers to easily integrate real-time features…
SuperViz provides powerful SDKs and APIs that enable developers to easily integrate real-time features into web applications. Our platform accelerates development across various industries with rob...
jy0205/Pyramid-Flow
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Language: Python
#diffusion_models #flow_matching #video_generation
Stars: 613 Issues: 10 Forks: 47
https://github.com/jy0205/Pyramid-Flow
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Language: Python
#diffusion_models #flow_matching #video_generation
Stars: 613 Issues: 10 Forks: 47
https://github.com/jy0205/Pyramid-Flow
GitHub
GitHub - jy0205/Pyramid-Flow: [ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling - jy0205/Pyramid-Flow
jiah-cloud/Align3R
[arXiv'24] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
Language: Python
#3d_reconstruction #depth_estimation #point_cloud_reconstruction #pose_estimation #video_depth
Stars: 140 Issues: 3 Forks: 3
https://github.com/jiah-cloud/Align3R
[arXiv'24] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
Language: Python
#3d_reconstruction #depth_estimation #point_cloud_reconstruction #pose_estimation #video_depth
Stars: 140 Issues: 3 Forks: 3
https://github.com/jiah-cloud/Align3R
GitHub
GitHub - jiah-cloud/Align3R: [CVPR 2025 Highlight] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
[CVPR 2025 Highlight] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos - jiah-cloud/Align3R
GeekyWizKid/video_processing_service
Video Processing Service is an automated video processing service that supports extracting audio from videos, generating subtitles, and embedding subtitles into the video.
Language: Python
#llm #python #video_processing
Stars: 157 Issues: 0 Forks: 28
https://github.com/GeekyWizKid/video_processing_service
Video Processing Service is an automated video processing service that supports extracting audio from videos, generating subtitles, and embedding subtitles into the video.
Language: Python
#llm #python #video_processing
Stars: 157 Issues: 0 Forks: 28
https://github.com/GeekyWizKid/video_processing_service
GitHub
GitHub - GeekyWizKid/video_processing_service: Video Processing Service is an automated video processing service that supports…
Video Processing Service is an automated video processing service that supports extracting audio from videos, generating subtitles, and embedding subtitles into the video. - GitHub - GeekyWizKid/v...
baaivision/NOVA
NOVA: Autoregressive Video Generation without Vector Quantization
Language: Python
#autoregressive_models #diffusion_models #image_generation #video_generation
Stars: 145 Issues: 1 Forks: 2
https://github.com/baaivision/NOVA
NOVA: Autoregressive Video Generation without Vector Quantization
Language: Python
#autoregressive_models #diffusion_models #image_generation #video_generation
Stars: 145 Issues: 1 Forks: 2
https://github.com/baaivision/NOVA
GitHub
GitHub - baaivision/NOVA: [ICLR 2025] Autoregressive Video Generation without Vector Quantization
[ICLR 2025] Autoregressive Video Generation without Vector Quantization - baaivision/NOVA
ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini
GitHub
GitHub - ictnlp/LLaVA-Mini: LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images,…
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner. - GitHub - ictnlp/LLaVA-Mini: LLaVA-Mi...
DepthAnything/Video-Depth-Anything
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Language: Python
#depth_estimation #monocular_depth_estimation #transformer #video_depth
Stars: 234 Issues: 2 Forks: 8
https://github.com/DepthAnything/Video-Depth-Anything
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Language: Python
#depth_estimation #monocular_depth_estimation #transformer #video_depth
Stars: 234 Issues: 2 Forks: 8
https://github.com/DepthAnything/Video-Depth-Anything
GitHub
GitHub - DepthAnything/Video-Depth-Anything: [CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super…
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos - DepthAnything/Video-Depth-Anything
umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, realtime-OCR, translation, word lookup, and more!
Language: C#
#asr #csharp #flyleaf #language_learning #media_player #ocr #player #tesseract #video #video_player #whisper #wpf #yt_dlp
Stars: 253 Issues: 5 Forks: 4
https://github.com/umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, realtime-OCR, translation, word lookup, and more!
Language: C#
#asr #csharp #flyleaf #language_learning #media_player #ocr #player #tesseract #video #video_player #whisper #wpf #yt_dlp
Stars: 253 Issues: 5 Forks: 4
https://github.com/umlx5h/LLPlayer
GitHub
GitHub - umlx5h/LLPlayer: The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation…
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more! - umlx5h/LLPlayer