🪷 Tunnel Try-on: SOTA VTON 🪷
👉"Tunnel Try-on", the first diffusion-based video virtual try-on model that demonstrates SOTA performance in complex scenarios. No code announced :(
👉Review https://t.ly/joMtJ
👉Paper arxiv.org/pdf/2404.17571
👉Project mengtingchen.github.io/tunnel-try-on-page/
👉"Tunnel Try-on", the first diffusion-based video virtual try-on model that demonstrates SOTA performance in complex scenarios. No code announced :(
👉Review https://t.ly/joMtJ
👉Paper arxiv.org/pdf/2404.17571
👉Project mengtingchen.github.io/tunnel-try-on-page/
❤9🔥4👍1🥰1🍾1
This media is not supported in your browser
VIEW IN TELEGRAM
🏝️1000x Scalable Neural 3D Fields🏝️
👉Highly scalable neural 3D fields: 1000x memory reduction while maintaining speed and quality (10 MB vs. 10 GB)! Code released 💙
👉Review https://t.ly/sLTK5
👉Paper https://lnkd.in/dEYM8-t2
👉Project https://lnkd.in/djptdujx
👉Code https://lnkd.in/dcCnFZ2n
🌐3D Scenes w/ Depth Inpainting🌐
👉Oxford announces two contributions to 3D scene generation: a new benchmark and a novel depth completion model. 🤗-Demo and Source Code released💙
👉Review https://t.ly/BKiny
👉Paper arxiv.org/pdf/2404.19758
👉Project research.paulengstler.com/invisible-stitch/
👉Code github.com/paulengstler/invisible-stitch
👉Demo huggingface.co/spaces/paulengstler/invisible-stitch
🌊 Diffusive 3D Human Recovery 🌊
👉Rutgers University unveils ScoreHMR at #CVPR24: a novel approach for 3D human pose and shape reconstruction. Impressive results.
👉Review https://t.ly/G0k2D
👉Paper https://arxiv.org/pdf/2403.09623
👉Code https://github.com/statho/ScoreHMR
👉Project https://statho.github.io/ScoreHMR/
🏷️DiffMOT (#CVPR24): diffusion-MOT🏷️
👉DiffMOT is a novel real-time diffusion-based MOT approach that tackles complex nonlinear motion. Impressive results & Source Code released💙
👉Review https://t.ly/ztlHi
👉Paper https://lnkd.in/d4K3c-nt
👉Project https://diffmot.github.io/
👉Code github.com/Kroery/DiffMOT
🍏 XFeat: Neural Features Matching 🍏
👉XFeat (Accelerated Features) is a lightweight, accurate architecture for efficient visual correspondence. It revisits fundamental design choices in CNNs for detecting, extracting & matching local features.
👉Review https://t.ly/ppb38
👉Paper arxiv.org/pdf/2404.19174
👉Code https://lnkd.in/dFzTpzN8
👉Project https://lnkd.in/d8JnV-iu
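👉A minimal matching sketch, assuming the torch.hub entry point and match_xfeat call shown in the repo README (may change between versions):
```python
# Sketch: sparse matching with XFeat via torch.hub (verlab/accelerated_features).
import torch

xfeat = torch.hub.load('verlab/accelerated_features', 'XFeat',
                       pretrained=True, top_k=4096)

# Stand-ins for two real RGB images, (B, C, H, W) in [0, 1]
im0 = torch.rand(1, 3, 480, 640)
im1 = torch.rand(1, 3, 480, 640)

# Returns matched keypoint coordinates in each image
mkpts0, mkpts1 = xfeat.match_xfeat(im0, im1)
print(mkpts0.shape, mkpts1.shape)
```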
🦑 Hyper-Detailed Image Descriptions 🦑
👉#Google unveils ImageInWords (IIW), a carefully designed human-in-the-loop (HIL) annotation framework for curating hyper-detailed image descriptions, and a new dataset resulting from this process.
👉Review https://t.ly/engkl
👉Paper arxiv.org/pdf/2405.02793
👉Repo github.com/google/imageinwords
👉Project google.github.io/imageinwords
👉Data huggingface.co/datasets/google/imageinwords
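👉A minimal loading sketch with the 🤗 datasets library (the config name "IIW-400" is an assumption based on the dataset card; inspect the returned object for the actual splits and fields):
```python
# Sketch: loading the IIW dataset from the Hugging Face Hub.
from datasets import load_dataset

# Config name "IIW-400" is assumed from the dataset card and may differ.
ds = load_dataset("google/imageinwords", name="IIW-400")
print(ds)  # inspect available splits and the hyper-detailed description fields
```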
🔫 Free-Moving Reconstruction 🔫
👉EPFL (+#MagicLeap) unveils a novel approach for reconstructing a free-moving object from a monocular RGB clip: free interaction with objects in front of a moving camera, relying on no priors and optimizing the sequence globally rather than in segments. Great, but no code announced🥺
👉Review https://t.ly/2xhtj
👉Paper arxiv.org/pdf/2405.05858
👉Project haixinshi.github.io/fmov/
💥FeatUp: Any Model at Any Resolution💥
👉FeatUp is a model- and task-agnostic framework to restore lost spatial information in deep features. It outperforms other methods in class activation map generation, transfer learning for segmentation & depth, and end-to-end training for semantic segmentation. Source Code released💙
👉Review https://t.ly/Evq_g
👉Paper https://lnkd.in/gweaN4s6
👉Project https://lnkd.in/gWcGXdxt
👉Code https://lnkd.in/gweq5NY4
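👉A minimal usage sketch, assuming the torch.hub entry points from the mhamilton723/FeatUp README ('dino16' selects a DINO ViT-S/16 backbone):
```python
# Sketch: upsampling backbone features with FeatUp via torch.hub.
import torch

upsampler = torch.hub.load("mhamilton723/FeatUp", "dino16", use_norm=True)

image = torch.rand(1, 3, 224, 224)   # stand-in for a preprocessed RGB image

hr_feats = upsampler(image)          # restored high-resolution features
lr_feats = upsampler.model(image)    # the backbone's original low-res features
print(lr_feats.shape, hr_feats.shape)
```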
🐏AniTalker: Universal Talking Humans🐏
👉SJTU (+AISpeech) unveils AniTalker, a framework that transforms a single static portrait and input audio into animated talking videos with naturally flowing movements.
👉Review https://t.ly/MD4yX
👉Paper https://arxiv.org/pdf/2405.03121
👉Project https://x-lance.github.io/AniTalker/
👉Repo https://github.com/X-LANCE/AniTalker
👻 3D Humans Motion from Text 👻
👉Zhejiang (+ANT) unveils a novel method to generate human motions containing accurate human-object interactions in 3D scenes based on textual descriptions. Code announced, coming 💙
👉Review https://t.ly/eOZnU
👉Paper https://arxiv.org/pdf/2405.07784
👉Project https://zju3dv.github.io/text_scene_motion/
🪬UHM: Authentic Hand by Phone🪬
👉 META unveils UHM, a novel 3D high-fidelity avatarization of your (yes, your own) hand. An adaptation pipeline fits the pre-trained UHM via a phone scan. Source Code released 💙
👉Review https://t.ly/fU5rA
👉Paper https://lnkd.in/dyGaiAnq
👉Code https://lnkd.in/d9B_XFAA
🔥EfficientTrain++: Efficient Foundation Visual Backbone Training🔥
👉Tsinghua unveils EfficientTrain++, a simple, general, surprisingly effective, off-the-shelf approach to reduce the training time of various popular models (e.g., ResNet, ConvNeXt, DeiT, PVT, Swin, CSWin, and CAFormer). Up to 3.0× faster on ImageNet-1K/22K without sacrificing accuracy. Source Code released 💙
👉Review https://t.ly/D8ttv
👉Paper https://arxiv.org/pdf/2405.08768
👉Code https://github.com/LeapLabTHU/EfficientTrain
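👉The core curriculum idea is to feed the model low-frequency image content first. A schematic sketch of Fourier-domain low-frequency cropping (my illustration of the idea, not the repo's exact code):
```python
# Schematic sketch of low-frequency cropping: keep the central low-frequency
# band of the spectrum and invert it to a smaller image, so early epochs
# train on smaller, low-frequency inputs (cheaper forward/backward passes).
import torch
import torch.fft as fft

def low_freq_crop(x: torch.Tensor, size: int) -> torch.Tensor:
    """x: (B, C, H, W) image batch -> (B, C, size, size) low-frequency crop."""
    H, W = x.shape[-2:]
    spec = fft.fftshift(fft.fft2(x), dim=(-2, -1))
    cy, cx, h = H // 2, W // 2, size // 2
    cropped = spec[..., cy - h:cy + h, cx - h:cx + h]
    cropped = cropped * (size * size) / (H * W)   # keep intensity scale
    return fft.ifft2(fft.ifftshift(cropped, dim=(-2, -1))).real

x = torch.rand(8, 3, 224, 224)
print(low_freq_crop(x, size=160).shape)  # early-epoch input: (8, 3, 160, 160)
```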
🫀 EchoTracker: Tracking Echocardiography🫀
👉EchoTracker: a two-fold coarse-to-fine model that tracks queried points on a tissue surface across an ultrasound sequence. Source Code released💙
👉Review https://t.ly/NyBe0
👉Paper https://arxiv.org/pdf/2405.08587
👉Code https://github.com/riponazad/echotracker/
🦕 Grounding DINO 1.5 Pro/Edge 🦕
👉Grounding DINO 1.5, a suite of advanced open-set object detection models to advance the "Edge" of open-set object detection. Source Code released under Apache 2.0💙
👉Review https://t.ly/kS-og
👉Paper https://lnkd.in/dNakMge2
👉Code https://lnkd.in/djhnQmrm
⚽3D Shot Posture in Broadcast⚽
👉Nagoya University unveils 3DSP, a dataset of soccer broadcast videos: the most extensive sports image dataset with 2D pose annotations to date.
👉Review https://t.ly/IIMeZ
👉Paper https://arxiv.org/pdf/2405.12070
👉Code https://github.com/calvinyeungck/3D-Shot-Posture-Dataset/tree/master
🖼️ Diffusive Images that Sound 🖼️
👉The University of Michigan unveils a diffusion model able to generate spectrograms that look like images but can also be played as sound.
👉Review https://t.ly/ADtYM
👉Paper arxiv.org/pdf/2405.12221
👉Project ificl.github.io/images-that-sound
👉Code github.com/IFICL/images-that-sound
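👉To actually "play" such a spectrogram, any phase-reconstruction method works. A minimal sketch with torchaudio's Griffin-Lim (a generic illustration of the idea, not the paper's pipeline):
```python
# Sketch: turning a magnitude spectrogram "image" into audio with
# Griffin-Lim phase reconstruction (torchaudio).
import torch
import torchaudio.transforms as T

n_fft = 1022                     # -> n_fft // 2 + 1 = 512 frequency bins
spec = torch.rand(1, 512, 256)   # stand-in spectrogram "image" (freq x time)

waveform = T.GriffinLim(n_fft=n_fft, n_iter=64)(spec)
print(waveform.shape)            # (1, num_samples), ready to save or play
```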
👚ViViD: Diffusion VTON👚
👉ViViD is a novel framework employing powerful diffusion models to tackle the task of video virtual try-on. Code announced, not released yet😢
👉Review https://t.ly/h_SyP
👉Paper arxiv.org/pdf/2405.11794
👉Repo https://lnkd.in/dT4_bzPw
👉Project https://lnkd.in/dCK5ug4v
🍀OmniGlue: Foundation Matcher🍀
👉#Google OmniGlue from #CVPR24: the first learnable image matcher powered by foundation models. Impressive OOD results!
👉Review https://t.ly/ezaIc
👉Paper https://arxiv.org/pdf/2405.12979
👉Project hwjiang1510.github.io/OmniGlue/
👉Code https://github.com/google-research/omniglue/
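👉A minimal inference sketch following the repo README (the model export paths and the FindMatches API are assumptions taken from that README; check the repo for exact setup):
```python
# Sketch: matching two images with OmniGlue (google-research/omniglue).
import numpy as np
import omniglue

og = omniglue.OmniGlue(
    og_export="./models/og_export",                    # assumed export paths
    sp_export="./models/sp_v6",
    dino_export="./models/dinov2_vitb14_pretrain.pth",
)

image0 = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-ins for real images
image1 = np.zeros((480, 640, 3), dtype=np.uint8)

match_kp0, match_kp1, match_confidences = og.FindMatches(image0, image1)
print(match_kp0.shape, match_confidences.shape)
```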
🔥 YOLOv10 is out 🔥
👉YOLOv10: a novel real-time, end-to-end object detector. Code released under GNU AGPL v3.0💙
👉Review https://shorturl.at/ZIHBh
👉Paper arxiv.org/pdf/2405.14458
👉Code https://github.com/THU-MIG/yolov10/
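👉A minimal inference sketch, assuming the Ultralytics-style API the THU-MIG/yolov10 README shows (class name and weight file are taken from that README and may change):
```python
# Sketch: running YOLOv10 detection via the repo's bundled Ultralytics API.
from ultralytics import YOLOv10

model = YOLOv10("yolov10n.pt")                   # nano variant weights
results = model.predict("image.jpg", conf=0.25)  # path to a real image
results[0].show()                                # visualize detections
```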