ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding

📝 Summary:
ReDirector presents a camera-controlled video retake generation method built on Rotary Camera Encoding (RoCE). This novel camera-conditioned RoPE phase shift improves dynamic object localization and static background preservation across variable-length videos and diverse camera trajectories.
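The core idea, shifting RoPE rotation angles by a camera-dependent phase offset, can be sketched as follows. This is a minimal illustration, not the paper's formulation: the function names and the mapping from camera trajectory to phase are assumptions.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0):
    """Standard RoPE: one rotation angle per (position, frequency) pair."""
    freqs = base ** (-np.arange(0, dim, 2) / dim)   # (dim/2,)
    return np.outer(positions, freqs)               # (T, dim/2)

def apply_rotation(x, angles):
    """Rotate consecutive feature pairs of x by the given angles."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def camera_conditioned_rope(x, positions, cam_phase):
    """RoCE-style idea: add a per-frame, camera-derived phase offset
    to the usual position-dependent RoPE angles."""
    angles = rope_angles(positions, x.shape[-1])
    return apply_rotation(x, angles + cam_phase[:, None])

# Toy usage: 4 frames, 8-dim features, phases derived from a hypothetical yaw.
T, D = 4, 8
x = np.random.default_rng(0).normal(size=(T, D))
yaw = np.linspace(0.0, 0.3, T)
y = camera_conditioned_rope(x, np.arange(T), yaw)
# Rotations preserve the norm of each feature vector.
assert np.allclose(np.linalg.norm(y, axis=-1), np.linalg.norm(x, axis=-1))
```

Because the offset enters additively inside the rotation, it composes with ordinary positional phases and remains norm-preserving, which is why such a shift can encode camera motion without destabilizing attention.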

🔹 Publication Date: Published on Nov 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19827
• PDF: https://arxiv.org/pdf/2511.19827
• Project Page: https://byeongjun-park.github.io/ReDirector/

==================================

For more data science resources:
https://t.me/DataScienceT

#VideoGeneration #ComputerVision #AIResearch #CameraControl #VideoEditing
Generative Video Motion Editing with 3D Point Tracks

📝 Summary:
This paper presents a track-conditioned video-to-video framework for precise joint editing of camera and object motion. It uses 3D point tracks to maintain spatiotemporal coherence and handles occlusions through explicit depth cues, enabling diverse motion edits.
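How depth cues from 3D tracks can resolve occlusions is easy to illustrate with a pinhole projection and a nearest-point visibility test. This is a generic sketch of the idea, not the paper's pipeline; the intrinsics and the visibility rule are assumptions.

```python
import numpy as np

def project_tracks(tracks_3d, fx=100.0, fy=100.0, cx=64.0, cy=64.0):
    """Pinhole projection of 3D track points to pixels, keeping depth
    so occlusions can be resolved explicitly."""
    x, y, z = tracks_3d[..., 0], tracks_3d[..., 1], tracks_3d[..., 2]
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=-1), z

def visibility(uv, depth, radius=2.0):
    """A point is occluded if another point projects nearby at smaller depth."""
    n = uv.shape[0]
    vis = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if j != i and depth[j] < depth[i] and np.linalg.norm(uv[i] - uv[j]) < radius:
                vis[i] = False
    return vis

pts = np.array([[0.0, 0.0, 2.0],    # nearer point
                [0.01, 0.0, 4.0]])  # farther point, projects to nearly the same pixel
uv, z = project_tracks(pts)
vis = visibility(uv, z)
# The nearer point stays visible; the farther one is occluded.
assert vis[0] and not vis[1]
```

Conditioning a generator on such visibility-aware tracks is what lets an edit move an object behind a foreground surface without the model hallucinating it through the occluder.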

🔹 Publication Date: Published on Dec 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02015
• PDF: https://arxiv.org/pdf/2512.02015
• Project Page: https://edit-by-track.github.io/

==================================


#VideoEditing #GenerativeAI #ComputerVision #3DTracking #DeepLearning
ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning

📝 Summary:
The ReViSE framework enables reason-informed video editing by bridging the gap between a model's reasoning and its editing capabilities. It uses a self-reflective learning mechanism in which an internal VLM provides intrinsic feedback, significantly enhancing editing accuracy and visual fidelity.
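The self-reflective loop, an editor whose retries are driven by an internal critic's feedback, can be sketched generically. The function names and toy editor/critic below are stand-ins, not ReViSE's actual components.

```python
def self_reflective_edit(edit_fn, critic_fn, prompt, video, max_rounds=3, threshold=0.9):
    """Reason-informed editing loop: a critic (a stand-in for an internal VLM)
    scores each edit, and its textual feedback conditions the retry."""
    feedback = ""
    result, score = video, 0.0
    for _ in range(max_rounds):
        result = edit_fn(prompt, video, feedback)
        score, feedback = critic_fn(prompt, result)
        if score >= threshold:
            break
    return result, score

# Toy stand-ins: the "editor" strengthens its edit once it receives feedback.
def toy_editor(prompt, video, feedback):
    return video + (1 if feedback else 0)

def toy_critic(prompt, result):
    score = min(float(result), 1.0)
    return score, "" if score >= 0.9 else "increase edit strength"

out, score = self_reflective_edit(toy_editor, toy_critic, "brighten", 0)
assert score >= 0.9
```

The key design point is that the reward signal is intrinsic (produced by the model's own critic) rather than requiring human-labelled edit quality, which is what makes the loop self-reflective.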

🔹 Publication Date: Published on Dec 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.09924
• PDF: https://arxiv.org/pdf/2512.09924
• Github: https://github.com/Liuxinyv/ReViSE

==================================


#VideoEditing #AI #MachineLearning #VLM #SelfReflectiveLearning
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties

📝 Summary:
V-RGBX is an end-to-end framework for intrinsic-aware video editing. It combines video inverse rendering with photorealistic synthesis and keyframe editing of intrinsic properties. This allows consistent, physically plausible video manipulation, like relighting or object appearance changes.
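The intrinsic-editing idea, decompose a frame into intrinsic channels, edit one, and re-render, can be sketched with the classic multiplicative image-formation model I = A · S. This is a simplified illustration of intrinsic recomposition, not V-RGBX's learned inverse renderer.

```python
import numpy as np

def decompose(image, albedo):
    """Inverse-rendering stand-in: recover shading from image and albedo
    under the simple image-formation model I = A * S."""
    return image / np.clip(albedo, 1e-6, None)

def relight(albedo, shading, gain=1.5):
    """Edit one intrinsic channel (here: scale the shading) and re-render."""
    return albedo * np.clip(shading * gain, 0.0, 1.0)

rng = np.random.default_rng(1)
albedo = rng.uniform(0.2, 0.8, size=(4, 4, 3))
shading = rng.uniform(0.1, 0.6, size=(4, 4, 1))
frame = albedo * shading
edited = relight(albedo, decompose(frame, albedo))
# Relighting only rescales shading, so object colour (albedo) is untouched.
assert np.allclose(edited, albedo * np.clip(shading * 1.5, 0.0, 1.0))
```

Editing in intrinsic space is what makes the manipulation physically plausible: a relight changes shading everywhere consistently, while an appearance change edits albedo without disturbing the illumination.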

🔹 Publication Date: Published on Dec 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.11799
• PDF: https://arxiv.org/pdf/2512.11799
• Project Page: https://aleafy.github.io/vrgbx/
• Github: https://github.com/Aleafy/V-RGBX

==================================


#VideoEditing #ComputerVision #InverseRendering #NeuralRendering #Graphics
EasyV2V: A High-quality Instruction-based Video Editing Framework

📝 Summary:
EasyV2V is an instruction-based video editing framework that combines diverse data sources, leverages pretrained text-to-video models with LoRA fine-tuning, and applies unified spatiotemporal control. This approach achieves state-of-the-art video editing results.
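LoRA fine-tuning, the adaptation mechanism the summary mentions, replaces full-weight updates with a trainable low-rank correction to each frozen weight matrix. A minimal sketch of the mechanism (not EasyV2V's training code; shapes and init are the standard LoRA convention):

```python
import numpy as np

class LoRALinear:
    """Frozen base weight plus a trainable low-rank update:
    effective weight = W + (alpha / r) * B @ A."""
    def __init__(self, w, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.w = w                                          # frozen, (out, in)
        self.a = rng.normal(0, 0.01, (rank, w.shape[1]))    # trainable
        self.b = np.zeros((w.shape[0], rank))               # trainable, zero-init
        self.scale = alpha / rank

    def __call__(self, x):
        return x @ (self.w + self.scale * self.b @ self.a).T

w = np.eye(3)
layer = LoRALinear(w, rank=2)
x = np.array([[1.0, 2.0, 3.0]])
# With B zero-initialised, the adapter starts as an exact no-op,
# so fine-tuning begins from the pretrained model's behaviour.
assert np.allclose(layer(x), x @ w.T)
```

Because only A and B are trained, the pretrained text-to-video backbone stays intact and the adapter adds far fewer parameters than full fine-tuning.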

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16920
• PDF: https://arxiv.org/pdf/2512.16920
• Project Page: https://snap-research.github.io/easyv2v/

==================================


#VideoEditing #AI #DeepLearning #ComputerVision #TextToVideo
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion

📝 Summary:
InsertAnywhere is a framework for realistic video object insertion. It uses 4D-aware mask generation for geometric consistency and an extended diffusion model for appearance-faithful synthesis, outperforming existing methods.
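The geometry-consistency step, turning a moving 3D footprint into per-frame insertion masks, can be illustrated with a simple rasterisation. This sketch is an assumption about the general idea, not the paper's 4D-aware mask generator.

```python
import numpy as np

def insertion_mask(points_uv, h=8, w=8):
    """Rasterise the projected footprint of an inserted object into a
    binary per-frame mask for geometry-consistent compositing."""
    mask = np.zeros((h, w), dtype=bool)
    for u, v in points_uv:
        ui, vi = int(round(u)), int(round(v))
        if 0 <= vi < h and 0 <= ui < w:
            mask[vi, ui] = True
    return mask

# Two frames of a hypothetical object footprint drifting one pixel right.
frames = [np.array([[2.0, 3.0], [3.0, 3.0]]),
          np.array([[3.0, 3.0], [4.0, 3.0]])]
masks = [insertion_mask(f) for f in frames]
assert masks[0][3, 2] and masks[1][3, 4]
assert masks[0].sum() == 2 and masks[1].sum() == 2
```

Deriving the mask from shared scene geometry, rather than per-frame segmentation, is what keeps the inserted object's footprint consistent as the camera and scene move.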

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17504
• PDF: https://arxiv.org/pdf/2512.17504
• Project Page: https://myyzzzoooo.github.io/InsertAnywhere/
• Github: https://github.com/myyzzzoooo/InsertAnywhere

==================================


#VideoEditing #DiffusionModels #ComputerVision #DeepLearning #GenerativeAI