🌹 4D Neural Templates 🌹
👉#Stanford unveils Neural Templates, generating HQ temporal object intrinsics for several natural phenomena and enabling the sampling and controllable rendering of these dynamic objects from any viewpoint, at any time in their lifespan. A novel task in vision is born💙
👉Review https://t.ly/ka_Qf
👉Paper https://arxiv.org/pdf/2412.05278
👉Project https://chen-geng.com/rose4d#toi
🐕 Gaze-LLE: Neural Gaze 🐕
👉Gaze-LLE: a novel transformer framework that streamlines gaze target estimation by leveraging features from a frozen DINOv2 encoder. Code & models under MIT 💙
👉Review https://t.ly/SadoF
👉Paper arxiv.org/pdf/2412.09586
👉Repo github.com/fkryan/gazelle
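The frozen-backbone design described above (a heavy pretrained encoder kept fixed, with only a lightweight task head trained on top) can be sketched generically. This is an illustrative pattern, not Gaze-LLE's actual code: a fixed random projection stands in for the frozen DINOv2 features, and the "gaze target" is a synthetic scalar.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Frozen encoder": a fixed projection standing in for DINOv2 features.
# As in the frozen-backbone recipe, these weights are never updated.
W_frozen = rng.normal(size=(64, 16))

def encode(x):
    return np.tanh(x @ W_frozen)  # frozen: W_frozen receives no gradient

# Tiny synthetic task: regress a scalar "gaze target" from inputs.
X = rng.normal(size=(200, 64))
y = X[:, 0] * 0.5 + rng.normal(scale=0.01, size=200)

# Only the lightweight head is fit, on top of the frozen features.
F = encode(X)
head, *_ = np.linalg.lstsq(F, y, rcond=None)

pred = encode(X) @ head
print(pred.shape)  # (200,)
```

The design choice this illustrates: since the backbone is fixed, its features can be cached once per image, and training touches only a small head.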
🫶 Dynamic Cam-4D Hands 🫶
👉Imperial College London unveils Dyn-HaMR, the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild. Code announced under MIT💙
👉Review https://t.ly/h5vV7
👉Paper arxiv.org/pdf/2412.12861
👉Project dyn-hamr.github.io/
👉Repo github.com/ZhengdiYu/Dyn-HaMR
🍄 Open-MLLMs Self-Driving 🍄
👉OpenEMMA: a novel open-source end-to-end driving framework based on MLLMs (via Chain-of-Thought reasoning), showing effectiveness, generalizability, and robustness across a variety of challenging driving scenarios. Code released under Apache 2.0💙
👉Review https://t.ly/waLZI
👉Paper https://arxiv.org/pdf/2412.15208
👉Code https://github.com/taco-group/OpenEMMA
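Chain-of-Thought prompting for driving means asking the MLLM to reason through the scene before committing to a plan. A hypothetical sketch of assembling such a prompt follows; the field names and wording are my own, not OpenEMMA's actual templates.

```python
def build_cot_prompt(scene_desc: str, speed_mps: float) -> str:
    """Assemble a step-by-step driving prompt (illustrative only)."""
    steps = [
        "1. Describe the critical objects in the scene.",
        "2. Explain how each object constrains the ego vehicle.",
        "3. Decide on a high-level maneuver (keep lane, slow, stop, turn).",
        "4. Output a short trajectory plan.",
    ]
    return (
        f"You are a driving assistant. Current speed: {speed_mps:.1f} m/s.\n"
        f"Scene: {scene_desc}\n"
        "Reason step by step:\n" + "\n".join(steps)
    )

prompt = build_cot_prompt("pedestrian at crosswalk, green light", 8.0)
print(prompt.splitlines()[0])
```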
🔄️ Orient Anything in 3D 🔄️
👉Orient Anything is a novel robust image-based object orientation estimation model. By training on 2M rendered labeled images, it achieves strong zero-shot generalization in the wild. Code released💙
👉Review https://t.ly/ro5ep
👉Paper arxiv.org/pdf/2412.18605
👉Project orient-anything.github.io/
👉Code https://lnkd.in/d_3k6Nxz
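Orientation estimators typically predict a distribution over angle bins rather than a single value, since orientation is circular. A generic numpy sketch (my illustration, not the Orient Anything architecture) of collapsing an azimuth distribution to its expected angle via the circular mean:

```python
import numpy as np

def circular_mean_deg(probs):
    """Expected azimuth (degrees) of a distribution over equal angle bins."""
    bins = np.linspace(0.0, 360.0, num=len(probs), endpoint=False)
    rad = np.deg2rad(bins)
    # Average on the unit circle to handle wrap-around at 0/360.
    x = np.sum(probs * np.cos(rad))
    y = np.sum(probs * np.sin(rad))
    return float(np.rad2deg(np.arctan2(y, x)) % 360.0)

probs = np.zeros(360)
probs[85:96] = 1.0 / 11.0  # mass centered on 90 degrees
print(circular_mean_deg(probs))  # ~90.0
```

Averaging on the unit circle avoids the naive-mean failure mode where mass near 1° and 359° averages to 180°.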
⭐TOP 10 Papers you loved - 2024⭐
👉Here's the list of my posts you liked the most in 2024. Thank you all 💙
𝐏𝐚𝐩𝐞𝐫𝐬:
⭐"Look Ma, no markers"
⭐T-Rex 2 Detector
⭐Models at Any Resolution
👉The full list with links: https://t.ly/GvQVy
🌳 HD Video Object Insertion 🌳
👉VideoAnydoor is a novel zero-shot video object insertion #AI with high-fidelity detail preservation and precise motion control. All-in-one: video VTON, face swapping, logo insertion, multi-region editing, etc.
👉Review https://t.ly/hyvRq
👉Paper arxiv.org/pdf/2501.01427
👉Project videoanydoor.github.io/
👉Repo TBA
⭐ Poll Alert!! ⭐
[EDIT] see below
What is your favorite source for the AI updates?
Final Results
LinkedIn: 32%
Instagram: 4%
Reddit: 3%
Telegram: 52%
Others: 9% (comment here: https://t.ly/chQWq)
AI with Papers - Artificial Intelligence & Deep Learning pinned «What is your favorite source for the AI updates?»
🥮 SOTA probabilistic tracking🥮
👉ProTracker is a novel framework for robust and accurate long-term dense tracking of arbitrary points in videos. Code released under CC Attribution-NonCommercial💙
👉Review https://t.ly/YY_PH
👉Paper https://arxiv.org/pdf/2501.03220
👉Project michaelszj.github.io/protracker/
👉Code github.com/Michaelszj/pro-tracker
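Probabilistic trackers commonly fuse several noisy estimates of the same point (e.g., from optical flow and from long-term feature matching) by weighting each with its inverse variance. A generic 2D sketch under that assumption, not the authors' exact formulation:

```python
import numpy as np

def fuse(points, variances):
    """Inverse-variance fusion of 2D point estimates
    (product of isotropic Gaussians)."""
    pts = np.asarray(points, dtype=float)         # (N, 2)
    w = 1.0 / np.asarray(variances, dtype=float)  # (N,) precisions
    fused = (w[:, None] * pts).sum(axis=0) / w.sum()
    fused_var = 1.0 / w.sum()
    return fused, fused_var

# A confident estimate (var 1) and a noisy one (var 9):
p, v = fuse([(0.0, 0.0), (10.0, 0.0)], [1.0, 9.0])
print(p, v)  # pulled toward the confident estimate; variance shrinks
```

Note that the fused variance (0.9) is smaller than either input's, which is what makes accumulating observations over a long video worthwhile.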
🧤World-Space Ego 3D Hands🧤
👉Imperial College London unveils HaWoR, a novel world-space 3D hand motion estimation method for egocentric videos. The new SOTA on both camera pose estimation & hand motion reconstruction. Code under Attribution-NC-ND 4.0 Int.💙
👉Review https://t.ly/ozJn7
👉Paper arxiv.org/pdf/2501.02973
👉Project hawor-project.github.io/
👉Code github.com/ThunderVVV/HaWoR
🔥 "Nuclear" AI vs. Hyper-Cheap Inference 🔥
⭐ What do you expect in 2025 after the #Nvidia announcements at CES 2025? Feel free to comment :)
Anonymous Poll
🤲 Portable Training Workstation: 23%
⚛️ Nuclear energy for AI training: 35%
🖲️ Cheaper inference-only devices: 33%
💰 Cloud-intensive inference-only: 9%
⚽ FIFA 3D Human Pose ⚽
👉#FIFA WorldPose is a novel dataset for multi-person global pose estimation in the wild, featuring footage from the 2022 World Cup. 2.5M+ annotations, released 💙
👉Review https://t.ly/kvGVQ
👉Paper arxiv.org/pdf/2501.02771
👉Project https://lnkd.in/d5hFWpY2
👉Dataset https://lnkd.in/dAphJ9WA
🔥 Depth Any Camera (SOTA) 🔥
👉DAC is a novel and powerful zero-shot metric depth estimation framework that extends a perspective-trained model to effectively handle cameras with varying FoVs (including large fisheye & 360°). Code announced (not available yet)💙
👉Review https://t.ly/1qz4F
👉Paper arxiv.org/pdf/2501.02464
👉Project yuliangguo.github.io/depth-any-camera/
👉Repo github.com/yuliangguo/depth_any_camera
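Handling arbitrary FoVs usually comes down to working in a canonical ray space instead of pixel space, so that fisheye and 360° inputs look alike to the network. A minimal numpy sketch (my own, assuming an equirectangular 360° image with the center looking down +z) mapping a pixel to its unit viewing ray:

```python
import numpy as np

def erp_pixel_to_ray(u, v, width, height):
    """Unit viewing ray for pixel (u, v) in an equirectangular image.
    Convention (assumed here): longitude spans [-pi, pi) across the width,
    latitude spans [-pi/2, pi/2] down the height, image center looks at +z."""
    lon = (u / width - 0.5) * 2.0 * np.pi
    lat = (0.5 - v / height) * np.pi
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.array([x, y, z])

ray = erp_pixel_to_ray(512, 256, 1024, 512)  # image center
print(ray)  # ~[0, 0, 1]
```

Once every camera model is reduced to per-pixel rays like this, a single depth network can be trained and evaluated across wildly different lenses.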
❤️🔥 Uncommon object in #3D ❤️🔥
👉#META releases uCO3D, a new object-centric dataset for 3D AI: the largest publicly available collection of HD videos of objects with 3D annotations, ensuring full 360° coverage. Code & data under CCA 4.0💙
👉Review https://t.ly/Z_tvA
👉Paper https://arxiv.org/pdf/2501.07574
👉Project https://uco3d.github.io/
👉Repo github.com/facebookresearch/uco3d
🏆Universal Detector-Free Match🏆
👉MatchAnything: a novel detector-free universal matcher across unseen real-world single- and cross-modality domains. Same weights for everything. Code announced, to be released 💙
👉Review https://t.ly/sx92L
👉Paper https://lnkd.in/dWwRwGyY
👉Project https://lnkd.in/dCwb2Yte
👉Repo https://lnkd.in/dnUXYzQ5
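Whatever the backbone, a matcher ultimately turns two sets of descriptors into correspondences. A generic mutual-nearest-neighbour sketch in numpy, illustrative only and not MatchAnything's actual matching head:

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Index pairs (i, j) where a_i and b_j are each other's nearest neighbour."""
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    sim = a @ b.T               # cosine similarity matrix
    nn_ab = sim.argmax(axis=1)  # best b for each a
    nn_ba = sim.argmax(axis=0)  # best a for each b
    return [(i, int(j)) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[0.9, 0.1], [0.1, 0.9]])
print(mutual_nn_matches(a, b))  # [(0, 0), (1, 1)]
```

The mutual check discards one-sided matches, a cheap way to suppress outliers before any geometric verification.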
🆘 Help: Looking for Outstanding Speakers 🆘
👉Who would you suggest as a speaker for your ideal conference on AI (CV, LLM, RAG, ML, HW Optimization, AI & Space, etc.)? Only “hardcore” technical talks, nothing commercial at all. Please comment here with name, topic, and affiliation (e.g.: Paul Gascoigne, Computer Vision & Football, Scotland Team).
⭐Guaranteed tickets & more for the suggestions that will become invited speakers ;)
🧞♂️Omni-RGPT: SOTA MLLM Understanding🧞♂️
👉 #NVIDIA presents Omni-RGPT, an MLLM for region-level comprehension of both images & videos. New SOTA on image/video-based commonsense reasoning.
👉Review https://t.ly/KHnQ7
👉Paper arxiv.org/pdf/2501.08326
👉Project miranheo.github.io/omni-rgpt/
👉Repo TBA soon
🔥 GAGA: Group Any Gaussians 🔥
👉GAGA is a framework that reconstructs and segments open-world 3D scenes by leveraging inconsistent 2D masks predicted by zero-shot segmentation models. Code available, recently updated💙
👉Review https://t.ly/Nk_jT
👉Paper www.gaga.gallery/static/pdf/Gaga.pdf
👉Project www.gaga.gallery/
👉Repo github.com/weijielyu/Gaga
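Grouping inconsistent per-view masks requires deciding when two masks refer to the same object. A generic IoU-based association sketch follows; this is my illustration of the general idea, whereas GAGA's actual criterion reasons through the shared 3D Gaussians behind the masks.

```python
import numpy as np

def mask_iou(m1, m2):
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(m1, m2).sum()
    union = np.logical_or(m1, m2).sum()
    return inter / union if union else 0.0

def associate(masks_a, masks_b, thr=0.5):
    """Greedy one-to-one association of masks across two views by IoU."""
    pairs, used = [], set()
    for i, ma in enumerate(masks_a):
        best_j, best_iou = None, thr
        for j, mb in enumerate(masks_b):
            if j in used:
                continue
            iou = mask_iou(ma, mb)
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j is not None:
            pairs.append((i, best_j))
            used.add(best_j)
    return pairs

m1 = np.zeros((8, 8), bool); m1[:4] = True
m2 = np.zeros((8, 8), bool); m2[:5] = True
print(associate([m1], [m2]))  # [(0, 0)]
```

Pixel-space IoU breaks down when views differ widely, which is exactly why a 3D-aware association (as in the paper) is needed.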