Robo-quadruped Parkour
LAAS-CNRS unveils a novel RL approach for agile, parkour-style skills: walking, climbing high steps, leaping over gaps, and crawling under obstacles. Data and Code available.
Review https://t.ly/-6VRm
Paper arxiv.org/pdf/2409.13678
Project gepetto.github.io/SoloParkour/
Code github.com/Gepetto/SoloParkour
Dressed Humans in the Wild
ETH (+ #Microsoft) presents ReLoo: novel high-quality 3D reconstruction of humans dressed in loose garments from monocular in-the-wild clips, with no prior assumptions about the garments. Source Code announced, coming soon.
Review https://t.ly/evgmN
Paper arxiv.org/pdf/2409.15269
Project moygcc.github.io/ReLoo/
Code github.com/eth-ait/ReLoo
New SOTA Edge Detection
CUP (+ ESPOCH) unveils NBED, the new SOTA for edge detection: consistently superior performance across multiple benchmarks, even against models with far higher computational cost and more complex training. Source Code released.
Review https://t.ly/zUMcS
Paper arxiv.org/pdf/2409.14976
Code github.com/Li-yachuan/NBED
SOTA Gaussian Haircut
ETH et al. unveil Gaussian Haircut, the new SOTA in hair reconstruction via a dual representation (classic + 3D Gaussian). Code and Model announced.
Review https://t.ly/aiOjq
Paper arxiv.org/pdf/2409.14778
Project https://lnkd.in/dFRm2ycb
Repo https://lnkd.in/d5NWNkb5
SPARK: Real-time Face Capture
Technicolor Group unveils SPARK, a novel high-precision 3D face capture method that uses a collection of unconstrained videos of a subject as prior information. New SOTA, able to handle unseen poses, expressions and lighting. Impressive results. Code & Model announced.
Review https://t.ly/rZOgp
Paper arxiv.org/pdf/2409.07984
Project kelianb.github.io/SPARK/
Repo github.com/KelianB/SPARK/
One-Image Object Detection
Delft University (+ Hensoldt Optronics) introduces OSSA, a novel unsupervised domain adaptation method for object detection that uses a single unlabeled target image to approximate the target-domain style (see the sketch after the links). Code released.
Review https://t.ly/-li2G
Paper arxiv.org/pdf/2410.00900
Code github.com/RobinGerster7/OSSA
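How can one image stand in for a whole domain? A common reading is that "style" lives in low-level feature statistics, which a single image already pins down. Below is a minimal, illustrative sketch of that idea (AdaIN-style statistic matching); it is our assumption-laden reading, not OSSA's released code — the backbone, layer choice and helper names are invented for illustration.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Illustrative backbone; OSSA's actual detector/backbone may differ.
backbone = models.resnet18(weights=None)
feat = nn.Sequential(*list(backbone.children())[:5])  # conv1 .. layer1

def channel_stats(x):
    # Per-channel mean/std over spatial dims -- a common proxy for "style".
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True) + 1e-6
    return mu, sigma

def match_style(source_feat, target_feat):
    # Re-normalize source features so their statistics match those
    # estimated from the single unlabeled target image.
    mu_s, sig_s = channel_stats(source_feat)
    mu_t, sig_t = channel_stats(target_feat)
    return (source_feat - mu_s) / sig_s * sig_t + mu_t

source = torch.randn(1, 3, 256, 256)  # labeled source-domain image
target = torch.randn(1, 3, 256, 256)  # the single unlabeled target image
styled = match_style(feat(source), feat(target))  # continue through the detector
```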
EVER Ellipsoid Rendering
UCSD & Google present EVER, a novel method for real-time differentiable emission-only volume rendering. Unlike 3DGS, it does not suffer from popping artifacts and view-dependent density, achieving ~30 FPS at 720p on an #NVIDIA RTX 4090 (background formula after the links).
Review https://t.ly/zAfGU
Paper arxiv.org/pdf/2410.01804
Project half-potato.gitlab.io/posts/ever/
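For background, this is the textbook emission-absorption volume-rendering integral that this family of methods evaluates differentiably — standard notation, not EVER's exact formulation:

```latex
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,dt,
\qquad
T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)
```

Here sigma is density and c is view-dependent radiance. As we read the abstract, evaluating this integral exactly over primitive shapes (ellipsoids) is what avoids the popping introduced by 3DGS's sorted alpha-blending approximation.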
"Deep Gen-AI" Full Course
A fresh course from Stanford on the probabilistic foundations and algorithms of deep generative models: an overview of the evolution of genAI in #computervision, language and more.
Review https://t.ly/ylBxq
Course https://lnkd.in/dMKH9gNe
Lectures https://lnkd.in/d_uwDvT6
EFM3D: 3D Ego-Foundation
#META presents EFM3D, the first benchmark for 3D object detection and surface regression on high-quality annotated egocentric data from Project Aria. Datasets & Code released.
Review https://t.ly/cDJv6
Paper arxiv.org/pdf/2406.10224
Project www.projectaria.com/datasets/aeo/
Repo github.com/facebookresearch/efm3d
Gaussian Splatting VTON
GS-VTON is a novel image-prompted 3D virtual try-on (VTON) method: by leveraging 3DGS as the 3D representation, it transfers pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced.
Review https://t.ly/sTPbW
Paper arxiv.org/pdf/2410.05259
Project yukangcao.github.io/GS-VTON/
Repo github.com/yukangcao/GS-VTON
Diffusion Models Relighting
#Netflix unveils DifFRelight, novel free-viewpoint facial relighting via a diffusion model: precise lighting control and high-fidelity relit facial images from flat-lit inputs.
Review https://t.ly/fliXU
Paper arxiv.org/pdf/2410.08188
Project www.eyelinestudios.com/research/diffrelight.html
POKEFLEX: Soft Object Dataset
PokeFlex from ETH is a dataset of deformable objects that includes 3D textured meshes, point clouds, and RGB & depth maps. Pretrained models & dataset announced.
Review https://t.ly/GXggP
Paper arxiv.org/pdf/2410.07688
Project https://lnkd.in/duv-jS7a
Repo
DEPTH ANY VIDEO is out!
DAV is a novel foundation model for image/video depth estimation: the new SOTA for accuracy & consistency, at up to 150 FPS!
Review https://t.ly/CjSz2
Paper arxiv.org/pdf/2410.10815
Project depthanyvideo.github.io/
Code github.com/Nightmare-n/DepthAnyVideo
Robo-Emulation via Video Imitation
OKAMI (UT & #Nvidia) is a novel method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.
Review https://t.ly/_N29-
Paper arxiv.org/pdf/2410.11792
Project https://lnkd.in/d6bHF_-s
CoTracker3 by #META is out!
#Meta (+ VGG Oxford) unveils CoTracker3, a new point tracker that outperforms the previous SOTA by a large margin using only 0.1% of the training data (usage sketch after the links).
Review https://t.ly/TcRIv
Paper arxiv.org/pdf/2410.11831
Project cotracker3.github.io/
Code github.com/facebookresearch/co-tracker
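A minimal usage sketch, assuming the repo keeps the torch.hub interface of earlier CoTracker releases; the entry-point name and call signature below are our guess from that pattern — check the repo's README before relying on them.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Entry-point name assumed from the CoTracker README pattern.
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_offline").to(device)

# Dummy clip: (batch, frames, channels, height, width).
video = torch.randn(1, 24, 3, 256, 256, device=device) * 255

# Track a regular grid of points across the whole clip.
pred_tracks, pred_visibility = cotracker(video, grid_size=10)
print(pred_tracks.shape)      # expected (1, 24, N, 2): x,y per point per frame
print(pred_visibility.shape)  # expected (1, 24, N)
```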
Neural Metamorphosis
NU Singapore unveils NeuMeta, which transforms neural nets by letting a single model adapt on the fly to different sizes, generating the right weights when needed (toy sketch after the links).
Review https://t.ly/DJab3
Paper arxiv.org/pdf/2410.11878
Project adamdad.github.io/neumeta
Code github.com/Adamdad/neumeta
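A toy sketch of how "generating the right weights when needed" can work, under our reading of the idea: treat a layer's weight matrix as a continuous function of normalized neuron coordinates, fit a small MLP to it, and query that MLP on whatever coordinate grid matches the desired layer size. All names are illustrative, not from the repo.

```python
import torch
import torch.nn as nn

class WeightINR(nn.Module):
    """Implicit representation of one weight matrix: (row, col) in [0,1]^2 -> weight."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def sample_weights(self, out_dim, in_dim):
        # Build the normalized coordinate grid for the requested size,
        # then evaluate the INR at every (row, col) location.
        rows = torch.linspace(0, 1, out_dim)
        cols = torch.linspace(0, 1, in_dim)
        grid = torch.cartesian_prod(rows, cols)
        return self.net(grid).view(out_dim, in_dim)

inr = WeightINR()
w_small = inr.sample_weights(16, 32)   # weights for a 32 -> 16 linear layer
w_large = inr.sample_weights(64, 128)  # same model, re-sampled at a larger size
print(w_small.shape, w_large.shape)    # torch.Size([16, 32]) torch.Size([64, 128])
```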
GS + Depth = SOTA
DepthSplat is the new SOTA in depth estimation & novel view synthesis; its key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source Code to be released soon.
Review https://t.ly/87HuH
Paper arxiv.org/abs/2410.13862
Project haofeixu.github.io/depthsplat/
Code github.com/cvg/depthsplat
BitNet: code of 1-bit LLM released
BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs: BitLinear acts as a drop-in replacement for the nn.Linear layer to train 1-bit weights from scratch (sketch after the links). Source Code just released.
Review https://t.ly/3G2LA
Paper arxiv.org/pdf/2310.11453
Code https://lnkd.in/duPADJVb
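A minimal sketch of the BitLinear idea as described in the paper: binarize zero-centered weights to ±1 in the forward pass, rescale by their average magnitude, and train the full-precision latent weights through a straight-through estimator. Simplified — the paper also quantizes activations, omitted here — and not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    """Drop-in replacement for nn.Linear with 1-bit weights (simplified sketch)."""
    def forward(self, x):
        w = self.weight
        w_centered = w - w.mean()
        beta = w_centered.abs().mean()         # scale preserving average magnitude
        w_bin = torch.sign(w_centered) * beta  # +-beta (torch.sign maps exact zeros to 0)
        # Straight-through estimator: forward uses the binarized weights,
        # gradients flow to the full-precision latent weights.
        w_ste = w + (w_bin - w).detach()
        return F.linear(x, w_ste, self.bias)

layer = BitLinear(128, 64)           # used exactly like nn.Linear(128, 64)
out = layer(torch.randn(2, 128))
print(out.shape)                     # torch.Size([2, 64])
```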
Look Ma, no markers
#Microsoft unveils the first technique for marker-free, high-quality reconstruction of the complete human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Impressive results! Training repo & Dataset released.
Review https://t.ly/5fN0g
Paper arxiv.org/pdf/2410.11520
Project microsoft.github.io/SynthMoCap/
Repo github.com/microsoft/SynthMoCap
PL2Map: efficient neural 2D-3D
PL2Map is a novel neural network tailored for the efficient representation of complex point & line maps, a natural representation of 2D-3D correspondences.
Review https://t.ly/D-bVD
Paper arxiv.org/pdf/2402.18011
Project https://thpjp.github.io/pl2map
Code https://github.com/ais-lab/pl2map