One-Image Object Detection
Delft University (+Hensoldt Optronics) introduces OSSA, a novel unsupervised domain adaptation method for object detection that uses a single, unlabeled target image to approximate the target-domain style. Code released!
Review https://t.ly/-li2G
Paper arxiv.org/pdf/2410.00900
Code github.com/RobinGerster7/OSSA
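For intuition, a minimal PyTorch sketch of the general single-image style-approximation idea via feature-statistic swapping, AdaIN-style: channel-wise mean/std taken from one target image restyle the source features. Illustrative only, not the official OSSA code; all names below are ours.

import torch

def channel_stats(feat: torch.Tensor, eps: float = 1e-5):
    # feat: (B, C, H, W) -> per-channel mean/std over spatial dims
    mean = feat.mean(dim=(2, 3), keepdim=True)
    std = feat.var(dim=(2, 3), keepdim=True).add(eps).sqrt()
    return mean, std

def stylize(source_feat: torch.Tensor, target_feat: torch.Tensor) -> torch.Tensor:
    # Re-normalize source features to carry the target image's style statistics.
    s_mean, s_std = channel_stats(source_feat)
    t_mean, t_std = channel_stats(target_feat)
    return (source_feat - s_mean) / s_std * t_std + t_mean

# Usage: stylized = stylize(backbone(source_img), backbone(target_img))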
EVER: Ellipsoid Rendering
UCSD & Google present EVER, a novel method for real-time differentiable emission-only volume rendering. Unlike 3DGS, it does not suffer from popping artifacts or view-dependent density, achieving ~30 FPS at 720p on an #NVIDIA RTX 4090.
Review https://t.ly/zAfGU
Paper arxiv.org/pdf/2410.01804
Project half-potato.gitlab.io/posts/ever/
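For reference, the standard emission-absorption volume rendering integral behind such methods; as we understand the paper, EVER evaluates it exactly for constant-density ellipsoid primitives instead of approximating it with sorted alpha compositing, which is what removes popping:

% Radiance c weighted by density sigma and transmittance T along a ray r(t)
C(\mathbf{r}) = \int_0^\infty T(t)\,\sigma(t)\,\mathbf{c}(t)\,\mathrm{d}t,
\qquad
T(t) = \exp\!\Big(-\int_0^t \sigma(s)\,\mathrm{d}s\Big)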
ðĨ "Deep Gen-AI" Full Course ðĨ
ðA fresh course from Stanford about the probabilistic foundations and algorithms for deep generative models. A novel overview about the evolution of the genAI in #computervision, language and more...
ðReview https://t.ly/ylBxq
ðCourse https://lnkd.in/dMKH9gNe
ðLectures https://lnkd.in/d_uwDvT6
EFM3D: 3D Ego-Foundation
#META presents EFM3D, the first benchmark for 3D object detection and surface regression on high-quality annotated egocentric data from Project Aria. Datasets & code released!
Review https://t.ly/cDJv6
Paper arxiv.org/pdf/2406.10224
Project www.projectaria.com/datasets/aeo/
Repo github.com/facebookresearch/efm3d
Gaussian Splatting VTON
GS-VTON is a novel image-prompted 3D virtual try-on (VTON) method which, by leveraging 3DGS as the 3D representation, transfers pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced!
Review https://t.ly/sTPbW
Paper arxiv.org/pdf/2410.05259
Project yukangcao.github.io/GS-VTON/
Repo github.com/yukangcao/GS-VTON
Diffusion Models Relighting
#Netflix unveils DifFRelight, a novel approach to free-viewpoint facial relighting via diffusion models: precise lighting control and high-fidelity relit facial images from flat-lit inputs.
Review https://t.ly/fliXU
Paper arxiv.org/pdf/2410.08188
Project www.eyelinestudios.com/research/diffrelight.html
POKEFLEX: Soft Object Dataset
PokeFlex, from ETH, is a dataset of deformable objects that includes 3D textured meshes, point clouds, RGB images & depth maps. Pretrained models & dataset announced!
Review https://t.ly/GXggP
Paper arxiv.org/pdf/2410.07688
Project https://lnkd.in/duv-jS7a
Repo
DEPTH ANY VIDEO is out!
DAV is a novel foundation model for image/video depth estimation. The new SOTA for accuracy & consistency, running at up to 150 FPS!
Review https://t.ly/CjSz2
Paper arxiv.org/pdf/2410.10815
Project depthanyvideo.github.io/
Code github.com/Nightmare-n/DepthAnyVideo
Robo-Emulation via Video Imitation
OKAMI (UT & #Nvidia) is a novel method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.
Review https://t.ly/_N29-
Paper arxiv.org/pdf/2410.11792
Project https://lnkd.in/d6bHF_-s
CoTracker3 by #META is out!
#Meta (+VGG Oxford) unveils CoTracker3, a new point tracker that outperforms the previous SoTA by a large margin using only 0.1% of the training data.
Review https://t.ly/TcRIv
Paper arxiv.org/pdf/2410.11831
Project cotracker3.github.io/
Code github.com/facebookresearch/co-tracker
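A possible quick-start via torch.hub, following the interface documented for earlier CoTracker releases; the hub entry-point name ("cotracker3_offline") and the grid_size keyword are assumptions to verify against the repo README.

import torch

# Dummy clip with shape (B, T, C, H, W); replace with a real video tensor.
video = torch.randn(1, 24, 3, 384, 512)

# Hub entry name is an assumption based on earlier releases -- see the README.
model = torch.hub.load("facebookresearch/co-tracker", "cotracker3_offline")
tracks, visibility = model(video, grid_size=10)  # track a 10x10 point grid
# tracks: (B, T, N, 2) per-frame point coordinates; visibility: (B, T, N)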
Neural Metamorphosis
NU Singapore unveils NeuMeta, which transforms neural nets by letting a single model adapt on the fly to different sizes, generating the right weights when needed.
Review https://t.ly/DJab3
Paper arxiv.org/pdf/2410.11878
Project adamdad.github.io/neumeta
Code github.com/Adamdad/neumeta
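To make "generating the right weights when needed" concrete, here is a minimal sketch of the weights-as-a-continuous-function idea: a small implicit neural representation (INR) maps normalized weight coordinates to values, so a weight matrix of any size can be sampled on demand. Illustrative only, not the authors' implementation.

import torch
import torch.nn as nn

class WeightINR(nn.Module):
    # Maps a normalized (row, col) coordinate in [0, 1]^2 to a weight value.
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def sample_matrix(self, rows: int, cols: int) -> torch.Tensor:
        r = torch.linspace(0, 1, rows)
        c = torch.linspace(0, 1, cols)
        coords = torch.cartesian_prod(r, c)       # (rows*cols, 2) grid
        return self.net(coords).view(rows, cols)  # weights at any size

inr = WeightINR()
w_small, w_big = inr.sample_matrix(16, 8), inr.sample_matrix(64, 32)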
GS + Depth = SOTA
DepthSplat is the new SOTA in depth estimation & novel view synthesis. Its key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source code to be released soon!
Review https://t.ly/87HuH
Paper arxiv.org/abs/2410.13862
Project haofeixu.github.io/depthsplat/
Code github.com/cvg/depthsplat
BitNet: 1-bit LLM code released!
BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. It introduces BitLinear as a drop-in replacement for the nn.Linear layer, in order to train 1-bit weights from scratch. Source code just released!
Review https://t.ly/3G2LA
Paper arxiv.org/pdf/2310.11453
Code https://lnkd.in/duPADJVb
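As a rough illustration of the drop-in idea, a minimal BitLinear-style layer: latent full-precision weights are binarized to {-1, +1} on the forward pass, with a straight-through estimator so gradients reach the latent weights. This simplifies the paper's layer (which also adds normalization and scaling); details below are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Binarize weights around their mean (1-bit weights).
        w_bin = torch.sign(w - w.mean())
        # Straight-through estimator: forward uses w_bin,
        # backward routes gradients to the latent full-precision w.
        w_q = w + (w_bin - w).detach()
        return F.linear(x, w_q, self.bias)

# Drop-in usage: swap nn.Linear(in_f, out_f) for BitLinear(in_f, out_f).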
Look Ma, no markers
#Microsoft unveils the first technique for marker-free, HQ reconstruction of the COMPLETE human body, including eyes and tongue, without requiring any calibration, manual intervention, or custom hardware. Impressive results! Training repo & dataset released!
Review https://t.ly/5fN0g
Paper arxiv.org/pdf/2410.11520
Project microsoft.github.io/SynthMoCap/
Repo github.com/microsoft/SynthMoCap
PL2Map: efficient neural 2D-3D
PL2Map is a novel neural network tailored for efficient representation of complex point & line maps, offering a natural representation of 2D-3D correspondences.
Review https://t.ly/D-bVD
Paper arxiv.org/pdf/2402.18011
Project https://thpjp.github.io/pl2map
Code https://github.com/ais-lab/pl2map
Plant Camouflage Detection
PlantCamo is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source code released!
Review https://t.ly/pYFX4
Paper arxiv.org/pdf/2410.17598
Code github.com/yjybuaa/PlantCamo
SMITE: SEGMENT IN TIME
SFU unveils SMITE, a novel model that, given only one or a few fine-grained segmentation references, can segment unseen videos while respecting those references. Dataset & code (under Apache 2.0) announced!
Review https://t.ly/w6aWJ
Paper arxiv.org/pdf/2410.18538
Project segment-me-in-time.github.io/
Repo github.com/alimohammadiamirhossein/smite
Blendify: #Python + Blender
A lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender, simplifying data augmentation & synthesis. Source code released!
Review https://t.ly/l0crA
Paper https://arxiv.org/pdf/2410.17858
Code https://virtualhumans.mpi-inf.mpg.de/blendify/
D-FINE: new SOTA Detector
D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding-box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available!
Review https://t.ly/aw9fN
Paper https://arxiv.org/pdf/2410.13842
Code https://github.com/Peterande/D-FINE
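For intuition on "redefining the bounding-box regression task": distribution-based heads (the family of ideas D-FINE's fine-grained distribution refinement builds on) predict a distribution over discretized offsets per box edge and decode each edge as its expectation, instead of regressing one scalar. A minimal sketch of that decode, not the authors' implementation:

import torch

def decode_edge(logits: torch.Tensor, max_offset: float = 16.0) -> torch.Tensor:
    # logits: (..., n_bins) scores over discretized edge offsets
    n_bins = logits.shape[-1]
    bins = torch.linspace(0.0, max_offset, n_bins)  # candidate offsets
    probs = logits.softmax(dim=-1)                  # per-edge distribution
    return (probs * bins).sum(dim=-1)               # expected (soft) offset

edge_logits = torch.randn(4, 17)    # 4 edges (l, t, r, b), 17 bins each
offsets = decode_edge(edge_logits)  # differentiable, distribution-aware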