🪆SOTA Points Segmentation🪆
👉VGG Oxford unveils a novel loss to segment objects in videos based on their motion and NO other form of supervision! The network is trained with long-term point trajectories as a supervisory signal that complements optical flow. New SOTA!
👉Review https://t.ly/8Bsbt
👉Paper https://arxiv.org/pdf/2501.12392
👉Code https://github.com/karazijal/lrtl
👉Project www.robots.ox.ac.uk/~vgg/research/lrtl/
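The exact formulation is in the paper above; as a rough illustration of the idea only, a trajectory-based supervisory signal can be as simple as penalizing the variance of the predicted mask probability along each long-term point track, so that points that move together are segmented together (a minimal NumPy sketch, not the authors' loss; all names are hypothetical):

```python
import numpy as np

def trajectory_consistency_loss(mask_probs, trajectories):
    """Penalize inconsistent mask assignments along point trajectories.

    mask_probs   : (T, H, W) predicted foreground probability per frame
    trajectories : list of (T, 2) integer arrays of (row, col) point
                   positions tracked across all T frames
    Returns the mean variance of the mask probability along each track;
    a motion-consistent segmentation drives this toward zero.
    """
    losses = []
    for track in trajectories:
        t = np.arange(mask_probs.shape[0])
        # probability the tracked point is labeled foreground, per frame
        probs = mask_probs[t, track[:, 0], track[:, 1]]
        losses.append(probs.var())
    return float(np.mean(losses))
```

A track that stays on one label contributes zero; a track whose point flips between foreground and background is penalized.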
🎨MatAnyone: Human Matting🎨
👉MatAnyone is a novel approach for human video matting that supports target assignment. Stable tracking in long videos, even with complex/ambiguous backgrounds. Code & 🤗 demo announced💙
👉Review https://t.ly/NVXsT
👉Paper arxiv.org/pdf/2501.14677
👉Project pq-yang.github.io/projects/MatAnyone
👉Repo TBA
🦕[SOTA] Visual Grounding VOS🦕
👉ReferDINO is the first end-to-end approach for adapting foundational visual grounding models to referring video object segmentation (RVOS). Code & models to be released soon💙
👉Review https://t.ly/SDFy9
👉Paper arxiv.org/pdf/2501.14607
👉Project isee-laboratory.github.io/ReferDINO/
👉Repo github.com/iSEE-Laboratory/ReferDINO
☀️ Relightable Full-Body Avatars ☀️
👉#Meta unveils the first approach ever to jointly model the relightable appearance of the body, face, and hands of drivable avatars.
👉Review https://t.ly/kx9gf
👉Paper arxiv.org/pdf/2501.14726
👉Project neuralbodies.github.io/RFGCA
🌅 Generative Human Mesh Recovery 🌅
👉GenHMR is a novel generative framework that reformulates monocular HMR as an image-conditioned generative task, explicitly modeling and mitigating uncertainties in the 2D-to-3D mapping process. Impressive results, but no code announced 🥺
👉Review https://t.ly/Rrzpj
👉Paper https://arxiv.org/pdf/2412.14444
👉Project m-usamasaleem.github.io/publication/GenHMR/GenHMR.html
Everyone's social feed is broken because of unnecessary opinions about DeepSeek. Your wish:
Anonymous Poll
37%
🛑 STOP posting about it!
63%
🟩 Keep posting, we want more!
💎AI-driven Docs Conversion💎
👉Docling, by IBM, is the ALL-in-ONE open-source solution for documents, parsing several popular formats into a unified, richly structured representation. Powered by SOTA models for layout (DocLayNet) and table structure (TableFormer), it runs efficiently on low-cost hardware. Code under MIT💙
👉Review https://t.ly/nSCfT
👉Paper https://lnkd.in/dc5Kpc2F
👉Repo https://lnkd.in/d9gvw9bt
🈯 SOTA 0-Shot Multi-View 🈯
👉MVGD by #TOYOTA is the SOTA method that generates images and scale-consistent depth maps from novel viewpoints given an arbitrary number of posed input views. A novel diffusion-based architecture capable of direct pixel-level generation. Code announced 💙
👉Review https://t.ly/_ecKl
👉Paper arxiv.org/pdf/2501.18804
👉Project mvgd.github.io/
👉Repo TBA
🐙MambaGlue: SOTA feats. matching🐙
👉MambaGlue is a hybrid neural network combining the Mamba and Transformer architectures to match local features. Source code announced, to be released💙
👉Review https://shorturl.at/LxDG1
👉Paper arxiv.org/pdf/2502.00462
👉Repo https://lnkd.in/dAujfGZQ
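MambaGlue's code isn't out yet, so its matching head can't be shown; for reference, the classical baseline that learned matchers improve on reduces local-feature matching to mutual nearest neighbors over descriptor similarity (a minimal NumPy sketch; names are illustrative, not from the paper):

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """Match L2-normalized local descriptors by mutual nearest neighbors.

    desc_a : (N, D) descriptors from image A
    desc_b : (M, D) descriptors from image B
    Returns (i, j) index pairs where i's best match in B is j AND
    j's best match in A is i.
    """
    sim = desc_a @ desc_b.T        # cosine similarity matrix, shape (N, M)
    best_b = sim.argmax(axis=1)    # best B-index for each A descriptor
    best_a = sim.argmax(axis=0)    # best A-index for each B descriptor
    return [(i, j) for i, j in enumerate(best_b) if best_a[j] == i]
```

The mutual check discards one-sided matches, a cheap way to suppress ambiguous correspondences before geometric verification.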
🛸Real-Time Differentiable Tracing🛸
👉 Radiant Foam is a novel scene representation that leverages the decades-old, efficient volumetric mesh ray-tracing algorithm (largely overlooked in recent research). It performs like Gaussian Splatting, without the constraints of rasterization. Code announced💙
👉Review https://shorturl.at/26U06
👉Paper https://arxiv.org/pdf/2502.01157
👉Project https://radfoam.github.io/
👉Repo https://github.com/theialab/radfoam
🔥 VideoJAM: #META's Video-Model (SOTA) 🔥
👉#META's VideoJAM: the new SOTA (by a large margin) in motion coherence for video generation, much better than SORA! It adds a strong motion prior to any video-gen model. Impressive results, no code announced🥲
👉Review https://shorturl.at/id7Bt
👉Paper https://arxiv.org/pdf/2502.02492
👉Project https://hila-chefer.github.io/videojam-paper.github.io/
👗3D Dynamic Garments👗
👉UCLA introduces Dress-1-to-3, a novel pipeline that reconstructs physics-plausible, simulation-ready separated garments with sewing patterns, together with the human, from a single in-the-wild image.
👉Review https://t.ly/qciHV
👉Paper arxiv.org/pdf/2502.03449
👉Project dress-1-to-3.github.io
🤖 META Human-Robot 🤖
👉#META PARTNR: novel benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration. The largest benchmark of its kind: 100,000+ natural language tasks, spanning 60 houses and 5,819 unique objects. Code & Data (🤗) under MIT💙
👉Review https://t.ly/zcN0K
👉Paper arxiv.org/pdf/2411.00081
👉Repo github.com/facebookresearch/partnr-planner
🤗Data huggingface.co/datasets/ai-habitat/partnr_episodes
💃HumanDiT Long-form Human💃
👉HumanDiT is a novel pose-guided Diffusion Transformer trained on a large in-the-wild dataset of 14,000 hours of HQ video to produce HD videos with fine-grained body detail. Stunning results, but no code announced🥲
👉Review https://t.ly/7rTRr
👉Paper https://arxiv.org/pdf/2502.04847
👉Project https://agnjason.github.io/HumanDiT-page/
🔮Flow-Based Foundation GenAI🔮
👉Goku is a novel SOTA family of joint image-and-video generation models leveraging rectified-flow Transformers to achieve industry-leading performance. Amazing results! Repo released (currently empty)💙
👉Review https://t.ly/dzi0O
👉Paper http://arxiv.org/pdf/2502.04896
👉Project saiyan-world.github.io/goku/
👉Repo github.com/Saiyan-World/goku
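Goku's architecture is in the paper, but the rectified-flow objective it builds on is compact: place samples on the straight line between noise and data, and regress the constant velocity between the two endpoints. A toy NumPy sketch of one loss evaluation (the lambda "model" and all names are hypothetical stand-ins, not Goku's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def rectified_flow_loss(model, x1, rng):
    """One rectified-flow training-objective evaluation.

    x1 : (B, D) batch of data samples. Noise x0 ~ N(0, I) is drawn,
    points are placed on the straight path x_t = (1 - t) x0 + t x1,
    and the model regresses the constant ground-truth velocity x1 - x0.
    """
    x0 = rng.standard_normal(x1.shape)        # noise endpoint
    t = rng.uniform(size=(x1.shape[0], 1))    # per-sample time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # linear interpolation
    v_target = x1 - x0                        # straight-line velocity
    v_pred = model(xt, t)
    return float(np.mean((v_pred - v_target) ** 2))

# toy "model" that always predicts zero velocity
loss = rectified_flow_loss(lambda xt, t: np.zeros_like(xt),
                           rng.standard_normal((4, 2)), rng)
```

The straight-line target is what makes rectified flow cheap to sample: a well-trained velocity field can be integrated in very few steps.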
🥛HAMSTER: Hierarchical VLA Manipulation🥛
👉#Nvidia unveils HAMSTER: a novel hierarchical VLA architecture enabling robotic manipulation with semantic, visual & geometric generalization, trained on easy-to-collect, off-domain data. Source code announced💙
👉Review https://t.ly/2yXaY
👉Paper https://arxiv.org/pdf/2502.05485
👉Project https://hamster-robot.github.io/
👉Repo TBA
🦶 It's all About Foot 🦶
👉 A collection of three works all about the human foot: synthetic foot renders, reconstruction, and surface normals. Repos & datasets available💙
👉Review https://t.ly/GY8mL
👉Paper (last) arxiv.org/pdf/2502.06367
👉Projects www.ollieboyne.com/
👉Repo github.com/OllieBoyne/FOUND
👉Repo github.com/OllieBoyne/SynFoot
👉Repo github.com/OllieBoyne/FOCUS (coming)
🪛 Make anything "Rig-Ready" 🪛
👉RigAnything is a novel autoregressive transformer-based model, which makes 3D assets rig-ready by probabilistically generating joints, skeleton topologies, and assigning skinning weights in a template-free manner. Online demo announced💙
👉Review https://t.ly/bNwxq
👉Paper arxiv.org/pdf/2502.09615
👉Project www.liuisabella.com/RigAnything
Hi friends, what other kind of content would you like to *OCCASIONALLY* see in this group?
Anonymous Poll
44%
🔔 Job/Research offers
65%
📦 AI tools/news (with NO papers)
32%
🔥 Events & Hackathon
3%
📝 Other (comment please)
🔥 Animate Anyone 2 🔥
👉 The evolution of the first version, now enabling character animation with environment affordance. Amazing results, but no code announced 🥲
👉Review https://t.ly/iNNLB
👉Paper https://arxiv.org/pdf/2502.06145
👉Project https://humanaigc.github.io/animate-anyone-2