AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
All the AI, with papers. Fresh updates every day on Deep Learning, Machine Learning, and Computer Vision (with papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🥦Gaussian Splatting VTON🥦

👉GS-VTON is a novel image-prompted 3D virtual try-on (3D-VTON) method. By using 3D Gaussian Splatting (3DGS) as the 3D representation, it transfers pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced💙

👉Review https://t.ly/sTPbW
👉Paper arxiv.org/pdf/2410.05259
👉Project yukangcao.github.io/GS-VTON/
👉Repo github.com/yukangcao/GS-VTON
💡Diffusion Models Relighting💡

👉#Netflix unveils DifFRelight, a novel diffusion-based method for free-viewpoint facial relighting. It offers precise lighting control and produces high-fidelity relit facial images from flat-lit inputs.

👉Review https://t.ly/fliXU
👉Paper arxiv.org/pdf/2410.08188
👉Project www.eyelinestudios.com/research/diffrelight.html
🥎POKEFLEX: Soft Object Dataset🥎

👉PokeFlex from ETH is a dataset of deformable objects that includes 3D textured meshes, point clouds, and RGB & depth maps. Pretrained models & dataset announced💙

👉Review https://t.ly/GXggP
👉Paper arxiv.org/pdf/2410.07688
👉Project https://lnkd.in/duv-jS7a
👉Repo
🔥 DEPTH ANY VIDEO is out! 🔥

👉Depth Any Video (DAV) is a novel foundation model for image/video depth estimation. The new SOTA for accuracy & consistency, at up to 150 FPS!

👉Review https://t.ly/CjSz2
👉Paper arxiv.org/pdf/2410.10815
👉Project depthanyvideo.github.io/
👉Code github.com/Nightmare-n/DepthAnyVideo
🪞Robo-Emulation via Video Imitation🪞

👉OKAMI (UT & #Nvidia) is a novel foundation method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.

👉Review https://t.ly/_N29-
👉Paper arxiv.org/pdf/2410.11792
👉Project https://lnkd.in/d6bHF_-s
🔥 CoTracker3 by #META is out! 🔥

👉#Meta (+VGG Oxford) unveils CoTracker3, a new tracker that outperforms the previous SoTA by a large margin using only 0.1% of the training data 🤯🤯🤯

👉Review https://t.ly/TcRIv
👉Paper arxiv.org/pdf/2410.11831
👉Project cotracker3.github.io/
👉Code github.com/facebookresearch/co-tracker
🦠 Neural Metamorphosis 🦠

👉NUS (National University of Singapore) unveils NeuMeta, which lets a single model adapt on the fly to different sizes by generating the right weights when needed.

👉Review https://t.ly/DJab3
👉Paper arxiv.org/pdf/2410.11878
👉Project adamdad.github.io/neumeta
👉Code github.com/Adamdad/neumeta
☀️ GS + Depth = SOTA ☀️

👉DepthSplat is the new SOTA in depth estimation & novel view synthesis. The key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source Code to be released soon💙

👉Review https://t.ly/87HuH
👉Paper arxiv.org/abs/2410.13862
👉Project haofeixu.github.io/depthsplat/
👉Code github.com/cvg/depthsplat
🔥BitNet: code of 1-bit LLM released🔥

👉BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. It introduces BitLinear as a drop-in replacement for the nn.Linear layer, in order to train 1-bit weights from scratch. Source Code just released 💙

👉Review https://t.ly/3G2LA
👉Paper arxiv.org/pdf/2310.11453
👉Code https://lnkd.in/duPADJVb
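
👉Quick intuition for the BitLinear idea: a minimal pure-Python sketch, not Microsoft's released code. It assumes the weight binarization described in the paper (center the weights, take the sign, rescale by the mean absolute weight) and leaves out activation quantization, normalization and the training-time straight-through estimator. The names binarize_weights and bit_linear_forward are hypothetical.

```python
def binarize_weights(weights):
    """Map a full-precision weight matrix (list of rows) to +1/-1 values.

    Returns the sign matrix and a single scale factor (mean absolute weight)
    that is applied to the output so magnitudes stay comparable.
    """
    flat = [w for row in weights for w in row]
    alpha = sum(flat) / len(flat)                  # mean, used to center the weights
    beta = sum(abs(w) for w in flat) / len(flat)   # mean absolute value, used as scale
    signs = [[1 if (w - alpha) > 0 else -1 for w in row] for row in weights]
    return signs, beta


def bit_linear_forward(x, weights):
    """Forward pass of a linear layer using 1-bit weights (no bias).

    `weights` has shape (out_features, in_features); `x` is one input vector.
    """
    signs, beta = binarize_weights(weights)
    return [beta * sum(s * xi for s, xi in zip(row, x)) for row in signs]


if __name__ == "__main__":
    w = [[0.5, -0.5], [1.0, -1.0]]
    print(binarize_weights(w))
    print(bit_linear_forward([2.0, 1.0], w))
```

With +1/-1 weights the matrix product reduces to additions and subtractions, which is where the efficiency argument for 1-bit LLMs comes from.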
🧿 Look Ma, no markers 🧿

👉#Microsoft unveils the first technique for marker-free, HQ reconstruction of the COMPLETE human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Impressive results! Repo for training & Dataset released💙

👉Review https://t.ly/5fN0g
👉Paper arxiv.org/pdf/2410.11520
👉Project microsoft.github.io/SynthMoCap/
👉Repo github.com/microsoft/SynthMoCap
PL2Map: efficient neural 2D-3D

👉PL2Map is a novel neural network tailored for efficient representation of complex point & line maps, which offer a natural representation of 2D-3D correspondences.

👉Review https://t.ly/D-bVD
👉Paper arxiv.org/pdf/2402.18011
👉Project https://thpjp.github.io/pl2map
👉Code https://github.com/ais-lab/pl2map
🌻 Plant Camouflage Detection 🌻

👉The PlantCamo dataset is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source Code released 💙

👉Review https://t.ly/pYFX4
👉Paper arxiv.org/pdf/2410.17598
👉Code github.com/yjybuaa/PlantCamo
⛈️ SMITE: SEGMENT IN TIME ⛈️

👉SFU unveils SMITE, a novel method that, given only one or a few fine-grained segmentation references, segments different unseen videos consistently with those references. Dataset & Code (under Apache 2.0) announced 💙

👉Review https://t.ly/w6aWJ
👉Paper arxiv.org/pdf/2410.18538
👉Project segment-me-in-time.github.io/
👉Repo github.com/alimohammadiamirhossein/smite
🫐 Blendify: #Python + Blender 🫐

👉Blendify is a lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender. It simplifies data augmentation & synthesis. Source Code released💙

👉Review https://t.ly/l0crA
👉Paper https://arxiv.org/pdf/2410.17858
👉Code https://virtualhumans.mpi-inf.mpg.de/blendify/
🔥 D-FINE: new SOTA Detector 🔥

👉D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available 💙

👉Review https://t.ly/aw9fN
👉Paper https://arxiv.org/pdf/2410.13842
👉Code https://github.com/Peterande/D-FINE
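
👉What can "redefining box regression" look like? One general approach is to predict a probability distribution over candidate edge offsets instead of a single number, then decode the edge as the expected value. The sketch below illustrates only that generic idea; it is not taken from the D-FINE repo, and the bin layout and function names (softmax, expected_offset) are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_offset(logits, bin_values):
    """Decode one box-edge offset as the expectation over discrete bins.

    `logits` are the raw scores predicted for each candidate offset and
    `bin_values` are the offsets those bins stand for (hypothetical layout).
    """
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))


if __name__ == "__main__":
    bins = [-1.0, -0.5, 0.0, 0.5, 1.0]
    # A peaked distribution gives a confident offset; a flat one averages out.
    print(expected_offset([0.0, 0.0, 0.0, 4.0, 0.0], bins))
    print(expected_offset([0.0, 0.0, 0.0, 0.0, 0.0], bins))
```

A distribution also carries uncertainty about each edge, which a single regressed coordinate cannot express.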
🍜 REM: Segment What You Describe 🍜

👉REM is a framework for segmenting concepts in video that can be described through natural language. Suitable for rare & non-object dynamic concepts, such as waves, smoke, etc. Code & Data announced 💙

👉Review https://t.ly/OyVtV
👉Paper arxiv.org/pdf/2410.23287
👉Project https://miccooper9.github.io/projects/ReferEverything/
☀️ Universal Relightable Avatars ☀️

👉#Meta unveils URAvatar: photorealistic & relightable avatars from a phone scan under unknown illumination. Stunning results!

👉Review https://t.ly/U-ESX
👉Paper arxiv.org/pdf/2410.24223
👉Project junxuan-li.github.io/urgca-website
🏣 CityGaussianV2: Large-Scale City 🏣

👉CityGaussianV2 is a novel approach for large-scale scene reconstruction that addresses critical challenges related to geometric accuracy and efficiency: 10x compression, 25% faster & 50% less memory! Source code released💙

👉Review https://t.ly/Xgn59
👉Paper arxiv.org/pdf/2411.00771
👉Project dekuliutesla.github.io/CityGaussianV2/
👉Code github.com/DekuLiuTesla/CityGaussian