Gaussian Splatting VTON
GS-VTON is a novel image-prompted 3D virtual try-on (VTON) method that, by using 3D Gaussian Splatting (3DGS) as the 3D representation, transfers pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced.
Review: https://t.ly/sTPbW
Paper: arxiv.org/pdf/2410.05259
Project: yukangcao.github.io/GS-VTON/
Repo: github.com/yukangcao/GS-VTON
Diffusion Models Relighting
#Netflix unveils DifFRelight, a novel free-viewpoint facial relighting method based on a diffusion model: precise lighting control and high-fidelity relit facial images from flat-lit inputs.
Review: https://t.ly/fliXU
Paper: arxiv.org/pdf/2410.08188
Project: www.eyelinestudios.com/research/diffrelight.html
POKEFLEX: Soft Object Dataset
PokeFlex, from ETH, is a dataset of 3D textured meshes, point clouds, and RGB & depth maps of deformable objects. Pretrained models & dataset announced.
Review: https://t.ly/GXggP
Paper: arxiv.org/pdf/2410.07688
Project: https://lnkd.in/duv-jS7a
Repo:
DEPTH ANY VIDEO is out!
DAV is a novel foundation model for image/video depth estimation: the new SOTA for accuracy & consistency, at up to 150 FPS!
Review: https://t.ly/CjSz2
Paper: arxiv.org/pdf/2410.10815
Project: depthanyvideo.github.io/
Code: github.com/Nightmare-n/DepthAnyVideo
Robo-Emulation via Video Imitation
OKAMI (UT & #Nvidia) is a novel method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.
Review: https://t.ly/_N29-
Paper: arxiv.org/pdf/2410.11792
Project: https://lnkd.in/d6bHF_-s
CoTracker3 by #META is out!
#Meta (+ VGG Oxford) unveils CoTracker3, a new tracker that outperforms the previous SOTA by a large margin using only 0.1% of the training data.
Review: https://t.ly/TcRIv
Paper: arxiv.org/pdf/2410.11831
Project: cotracker3.github.io/
Code: github.com/facebookresearch/co-tracker
Neural Metamorphosis
NU Singapore unveils NeuMeta, which transforms neural nets by allowing a single model to adapt on the fly to different sizes, generating the right weights when needed.
Review: https://t.ly/DJab3
Paper: arxiv.org/pdf/2410.11878
Project: adamdad.github.io/neumeta
Code: github.com/Adamdad/neumeta
GS + Depth = SOTA
DepthSplat is the new SOTA in depth estimation & novel view synthesis; its key feature is cross-task interaction between Gaussian Splatting & depth estimation. Source code to be released soon.
Review: https://t.ly/87HuH
Paper: arxiv.org/abs/2410.13862
Project: haofeixu.github.io/depthsplat/
Code: github.com/cvg/depthsplat
BitNet: code of 1-bit LLM released
BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. BitLinear is a drop-in replacement for the nn.Linear layer that trains 1-bit weights from scratch. Source code just released.
Review: https://t.ly/3G2LA
Paper: arxiv.org/pdf/2310.11453
Code: https://lnkd.in/duPADJVb
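The BitLinear idea can be sketched in a few lines of NumPy: weights are binarized to ±1 (sign of the zero-centered weights, scaled by their mean absolute value) and activations are absmax-quantized, so the matmul runs on 1-bit weights. A minimal sketch of the quantized forward pass only, not Microsoft's implementation (training with straight-through gradients, LayerNorm, and grouping are omitted):

```python
import numpy as np

def binarize_weights(w):
    # 1-bit weights: sign of zero-centered weights, scaled by mean |w|
    alpha = np.mean(np.abs(w))          # per-tensor scale
    wb = np.sign(w - w.mean())
    wb[wb == 0] = 1.0                   # avoid zeros in the binary matrix
    return wb, alpha

def absmax_quantize(x, bits=8):
    # absmax quantization of activations to signed `bits`-bit integers
    qmax = 2 ** (bits - 1) - 1
    scale = qmax / max(np.max(np.abs(x)), 1e-8)
    return np.clip(np.round(x * scale), -qmax, qmax), scale

def bitlinear_forward(x, w):
    # quantized matmul, then rescaling back to the original range
    wb, alpha = binarize_weights(w)
    xq, sx = absmax_quantize(x)
    return (xq @ wb.T) * (alpha / sx)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))   # batch of activations
w = rng.normal(size=(8, 16))   # full-precision master weights
y = bitlinear_forward(x, w)
print(y.shape)  # (4, 8)
```

The output approximates `x @ w.T` while the heavy multiply uses only ±1 weights and 8-bit activations.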
Look Ma, no markers
#Microsoft unveils the first technique for marker-free, high-quality reconstruction of the complete human body, including eyes and tongue, without requiring calibration, manual intervention, or custom hardware. Impressive results! Training repo & dataset released.
Review: https://t.ly/5fN0g
Paper: arxiv.org/pdf/2410.11520
Project: microsoft.github.io/SynthMoCap/
Repo: github.com/microsoft/SynthMoCap
PL2Map: efficient neural 2D-3D
PL2Map is a novel neural network tailored for efficient representation of complex point & line maps: a natural representation of 2D-3D correspondences.
Review: https://t.ly/D-bVD
Paper: arxiv.org/pdf/2402.18011
Project: https://thpjp.github.io/pl2map
Code: https://github.com/ais-lab/pl2map
Plant Camouflage Detection
PlantCamo is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source code released.
Review: https://t.ly/pYFX4
Paper: arxiv.org/pdf/2410.17598
Code: github.com/yjybuaa/PlantCamo
SMITE: SEGMENT IN TIME
SFU unveils SMITE, a novel model that, given only one or a few fine-grained segmentation references, can segment unseen videos consistently with those references. Dataset & code (under Apache 2.0) announced.
Review: https://t.ly/w6aWJ
Paper: arxiv.org/pdf/2410.18538
Project: segment-me-in-time.github.io/
Repo: github.com/alimohammadiamirhossein/smite
Blendify: #Python + Blender
A lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender, simplifying data augmentation & synthesis. Source code released.
Review: https://t.ly/l0crA
Paper: https://arxiv.org/pdf/2410.17858
Code: https://virtualhumans.mpi-inf.mpg.de/blendify/
D-FINE: new SOTA Detector
D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available.
Review: https://t.ly/aw9fN
Paper: https://arxiv.org/pdf/2410.13842
Code: https://github.com/Peterande/D-FINE
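Treating box regression as a distribution rather than a single coordinate can be sketched as follows: each box edge is predicted as a categorical distribution over discretized offsets, and the edge value is the softmax expectation over the bins. This is an illustrative sketch of distribution-based edge regression in general (the `edge_from_distribution` helper and `reg_max` value are assumptions), not D-FINE's exact fine-grained refinement scheme:

```python
import numpy as np

def edge_from_distribution(logits, reg_max=16):
    # Each box edge is predicted as a categorical distribution over
    # reg_max + 1 discrete offsets; the edge value is its expectation.
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    bins = np.arange(reg_max + 1)
    return float(probs @ bins)

# A distribution peaked around bin 4 yields an edge offset near 4;
# the spread of the distribution expresses localization uncertainty.
logits = -0.5 * (np.arange(17) - 4.0) ** 2
offset = edge_from_distribution(logits)
print(round(offset, 3))
```

The expectation makes the predicted edge differentiable with respect to the full distribution, which is what lets such detectors refine localization beyond a single regressed value.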
Free-Moving Reconstruction
EPFL (+#MagicLeap) unveils a novel approach for reconstructing a free-moving object from a monocular RGB clip: free interaction with objects in front of a moving camera without relying on any prior, optimizing the sequence globally…
Repo: github.com/HaixinShi/fmov_pose (official implementation of Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera, AAAI 2025)
REM: Segment What You Describe
REM is a framework for segmenting concepts in video that can be described via an LLM, suitable for rare & non-object dynamic concepts such as waves, smoke, etc. Code & data announced.
Review: https://t.ly/OyVtV
Paper: arxiv.org/pdf/2410.23287
Project: https://miccooper9.github.io/projects/ReferEverything/
Universal Relightable Avatars
#Meta unveils URAvatar, photorealistic & relightable avatars from a phone scan with unknown illumination. Stunning results!
Review: https://t.ly/U-ESX
Paper: arxiv.org/pdf/2410.24223
Project: junxuan-li.github.io/urgca-website
CityGaussianV2: Large-Scale City
A novel approach for large-scale scene reconstruction that addresses critical challenges in geometric accuracy and efficiency: 10x compression, 25% faster training, and 50% less memory! Source code released.
Review: https://t.ly/Xgn59
Paper: arxiv.org/pdf/2411.00771
Project: dekuliutesla.github.io/CityGaussianV2/
Code: github.com/DekuLiuTesla/CityGaussian