GitHub repos

ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini

DepthAnything/Video-Depth-Anything
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Language: Python
#depth_estimation #monocular_depth_estimation #transformer #video_depth
Stars: 234 Issues: 2 Forks: 8
https://github.com/DepthAnything/Video-Depth-Anything

umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, realtime-OCR, translation, word lookup, and more!
Language: C#
#asr #csharp #flyleaf #language_learning #media_player #ocr #player #tesseract #video #video_player #whisper #wpf #yt_dlp
Stars: 253 Issues: 5 Forks: 4
https://github.com/umlx5h/LLPlayer

FoundationVision/FlashVideo
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
Language: Python
#efficient_generative_model #text_to_video #video_generation
Stars: 195 Issues: 5 Forks: 3
https://github.com/FoundationVision/FlashVideo

SkyworkAI/SkyReels-V1
SkyReels V1: the first and most advanced open-source human-centric video foundation model
Language: Python
#i2v #t2v #video_diffusion_transformers
Stars: 348 Issues: 5 Forks: 20
https://github.com/SkyworkAI/SkyReels-V1

liuff19/Video-T1
Official Implementation of Video-T1: Test-Time Scaling for Video Generation
Language: Python
#aigc #chain_of_thought #test_time_scaling #video #video_generation
Stars: 187 Issues: 2 Forks: 12
https://github.com/liuff19/Video-T1

TencentARC/GeometryCrafter
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
Language: Python
#depth_estimation #video_to_4d
Stars: 173 Issues: 0 Forks: 3
https://github.com/TencentARC/GeometryCrafter

hanyang-21/VideoScene
[CVPR 2025 Highlight] VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step
Language: Python
#3d_reconstruction #video #video_generation
Stars: 154 Issues: 4 Forks: 3
https://github.com/hanyang-21/VideoScene

ali-vilab/UniAnimate-DiT
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
Language: Python
#human_image_animation #video_diffusion_transformers #video_generation
Stars: 225 Issues: 5 Forks: 17
https://github.com/ali-vilab/UniAnimate-DiT

SandAI-org/MAGI-1
MAGI-1: Autoregressive Video Generation at Scale
Language: Python
#autoregressive #diffusion_models #video_generation
Stars: 911 Issues: 7 Forks: 32
https://github.com/SandAI-org/MAGI-1
