GitHub repos

BradyFU/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Language: Python
#large_language_models #large_vision_language_models #mme #multimodal_large_language_models #video #video_mme
Stars: 182 Issues: 1 Forks: 6
https://github.com/BradyFU/Video-MME

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis - BradyFU/Video-MME

2.09K views, 04:00

GitHub repos

fudan-generative-vision/hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Language: Python
#face_animation #image_animation #video_animation
Stars: 653 Issues: 5 Forks: 102
https://github.com/fudan-generative-vision/hallo

3.23K views, 10:00

GitHub repos

SuperViz/superviz
SuperViz provides programmable low-code Collaboration and Communication components for web applications.
Language: TypeScript
#autodesk #autodesk_forge #collaboration #comments #crdt #matterport #multiplayer #presence #react #reactflow #real_time #superviz #three #video_conferencing #webrtc #websockets #yjs #yjs_provider
Stars: 198 Issues: 5 Forks: 0
https://github.com/SuperViz/superviz

SuperViz provides powerful SDKs and APIs that enable developers to easily integrate real-time features into web applications. Our platform accelerates development across various industries with rob...

1.95K views, 22:00

GitHub repos

jy0205/Pyramid-Flow
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Language: Python
#diffusion_models #flow_matching #video_generation
Stars: 613 Issues: 10 Forks: 47
https://github.com/jy0205/Pyramid-Flow

[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling - jy0205/Pyramid-Flow

1.9K views, 10:00

GitHub repos

jiah-cloud/Align3R
[arXiv'24] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
Language: Python
#3d_reconstruction #depth_estimation #point_cloud_reconstruction #pose_estimation #video_depth
Stars: 140 Issues: 3 Forks: 3
https://github.com/jiah-cloud/Align3R

[CVPR 2025 Highlight] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos - jiah-cloud/Align3R

1.62K views, 05:00

GitHub repos

GeekyWizKid/video_processing_service
Video Processing Service is an automated video processing service that supports extracting audio from videos, generating subtitles, and embedding subtitles into the video.
Language: Python
#llm #python #video_processing
Stars: 157 Issues: 0 Forks: 28
https://github.com/GeekyWizKid/video_processing_service

1.8K views, 23:00

GitHub repos

baaivision/NOVA
NOVA: Autoregressive Video Generation without Vector Quantization
Language: Python
#autoregressive_models #diffusion_models #image_generation #video_generation
Stars: 145 Issues: 1 Forks: 2
https://github.com/baaivision/NOVA

[ICLR 2025] Autoregressive Video Generation without Vector Quantization - baaivision/NOVA

1.86K views, 23:00

GitHub repos

ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini

1.88K views, 23:00

GitHub repos

DepthAnything/Video-Depth-Anything
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Language: Python
#depth_estimation #monocular_depth_estimation #transformer #video_depth
Stars: 234 Issues: 2 Forks: 8
https://github.com/DepthAnything/Video-Depth-Anything

[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos - DepthAnything/Video-Depth-Anything

1.76K views, 11:00

GitHub repos

umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, realtime-OCR, translation, word lookup, and more!
Language: C#
#asr #csharp #flyleaf #language_learning #media_player #ocr #player #tesseract #video #video_player #whisper #wpf #yt_dlp
Stars: 253 Issues: 5 Forks: 4
https://github.com/umlx5h/LLPlayer

1.87K views, 23:00
