AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
96 photos
238 videos
11 files
1.27K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

MAVOS Object Segmentation

👉MAVOS is a transformer-based VOS w/ a novel, optimized and dynamic long-term modulated cross-attention memory. Code & Models announced (BSD 3-Clause)💙

👉Review https://t.ly/SKaRG
👉Paper https://lnkd.in/dQyifKa3
👉Project github.com/Amshaker/MAVOS

💦 ObjectDrop: automagical object removal 💦

👉#Google unveils ObjectDrop, the new SOTA in photorealistic object removal and insertion. Focus on shadows and reflections, impressive!

👉Review https://t.ly/ZJ6NN
👉Paper https://arxiv.org/pdf/2403.18818.pdf
👉Project https://objectdrop.github.io/

🪼 Universal Mono Metric Depth 🪼

👉ETH unveils UniDepth: metric 3D scenes from single images alone, across domains. A novel, universal and flexible MMDE solution. Source code released💙

👉Review https://t.ly/5C8eq
👉Paper arxiv.org/pdf/2403.18913.pdf
👉Code github.com/lpiccinelli-eth/unidepth

🔘 RELI11D: Multimodal Humans 🔘

👉RELI11D is a high-quality multimodal human motion dataset involving LiDAR, an IMU system, an RGB camera, and an event camera. Dataset & Source Code to be released soon💙

👉Review https://t.ly/5EG6X
👉Paper https://lnkd.in/ep6Utcik
👉Project https://lnkd.in/eDhNHYBb

🔥 ECoDepth: SOTA Diffusive Mono-Depth 🔥

👉New SIDE (single image depth estimation) model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released 💙

👉Review https://t.ly/s2pbB
👉Paper https://lnkd.in/eYt5yr_q
👉Code https://lnkd.in/eEcyPQcd

🕷️ Gen-NeRF2NeRF Translation 🕷️

👉GenN2N: unified NeRF-to-NeRF translation for editing tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.

👉Review https://t.ly/VMWAH
👉Paper arxiv.org/pdf/2404.02788.pdf
👉Project xiangyueliu.github.io/GenN2N/
👉Code github.com/Lxiangyue/GenN2N

👆 iSeg: Interactive 3D Segmentation 👆

👉iSeg: interactive segmentation technique for 3D shapes operating entirely in 3D. It accepts positive and negative clicks directly on the shape's surface, indicating inclusion and exclusion of regions.

👉Review https://t.ly/tyFnD
👉Paper https://lnkd.in/dydAz8zp
👉Project https://lnkd.in/de-h6SRi
👉Code (coming)

👗 Neural Bodies with Clothes 👗

👉Neural-ABC is a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for ID, clothing, shape, and pose.

👉Review https://t.ly/Un1wc
👉Project https://lnkd.in/dhDG6FF5
👉Paper https://lnkd.in/dhcfK7jZ
👉Code https://lnkd.in/dQvXWysP

🔌 BodyMAP: human body & pressure 🔌

👉#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming 💙

👉Review https://t.ly/8926S
👉Project bodymap3d.github.io/
👉Paper https://lnkd.in/gCxH4ev3
👉Code https://lnkd.in/gaifdy3q

🧞 XComposer2: 4K Vision-Language 🧞

👉InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released 💙

👉Review https://t.ly/GCHsz
👉Paper arxiv.org/pdf/2404.06512.pdf
👉Code github.com/InternLM/InternLM-XComposer

⚛️ Flying w/ Photons: Neural Render ⚛️

👉Novel neural rendering technique that synthesizes videos of light propagating through a scene from novel, moving camera viewpoints. Picosecond time resolution!

👉Review https://t.ly/ZqL3a
👉Paper arxiv.org/pdf/2404.06493.pdf
👉Project anaghmalik.com/FlyingWithPhotons/
👉Code github.com/anaghmalik/FlyingWithPhotons

☄️ Tracking Any 2D Pixels in 3D ☄️

👉SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.

👉Review https://t.ly/B28Cj
👉Paper https://lnkd.in/d8ers_nm
👉Project https://lnkd.in/deHjtZuE
👉Code https://lnkd.in/dMe3TvFT
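
The "lifts 2D pixels to 3D using monocular depth" step in the SpatialTracker post can be illustrated with a plain pinhole back-projection. This is only a minimal sketch, not SpatialTracker's code: the pinhole model, the intrinsics (fx, fy, cx, cy), the sample values and the function name lift_pixels_to_3d are all assumptions made for illustration, and the triplane encoding and transformer updates are not shown.

```python
from typing import List, Tuple

Pixel = Tuple[float, float]
Point3D = Tuple[float, float, float]


def lift_pixels_to_3d(
    pixels: List[Pixel],
    depths: List[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project (u, v) pixels with per-pixel depth into camera-space XYZ.

    Assumes an ideal pinhole camera (hypothetical intrinsics, no distortion).
    """
    points: List[Point3D] = []
    for (u, v), z in zip(pixels, depths):
        x = (u - cx) * z / fx  # horizontal offset from the principal point, scaled by depth
        y = (v - cy) * z / fy  # vertical offset from the principal point, scaled by depth
        points.append((x, y, z))
    return points


# Illustrative values only: two pixels on the same image row at different depths.
pixels = [(320.0, 240.0), (420.0, 240.0)]
depths = [2.0, 4.0]  # e.g. metres, as predicted by some monocular depth model
points = lift_pixels_to_3d(pixels, depths, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

A pixel at the principal point maps onto the optical axis, so only its depth is non-zero; the second pixel moves sideways in proportion to its depth.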

🪐 YOLO-CIANNA: Neural Astro 🪐

👉CIANNA is a general-purpose deep learning framework for (but not limited to) astronomical data analysis. Source Code released 💙

👉Review https://t.ly/441XS
👉Paper arxiv.org/pdf/2402.05925.pdf
👉Code github.com/Deyht/CIANNA
👉Wiki github.com/Deyht/CIANNA/wiki

🧤 Neuro MusculoSkeletal-MANO 🧤

👉SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced 💙

👉Review https://t.ly/HOQrn
👉Paper arxiv.org/pdf/2404.10227.pdf
👉Project https://ms-mano.robotflow.ai/
👉Code announced (no repo yet)

⚽ SoccerNet: Athlete Tracking ⚽

👉The SoccerNet Challenge introduces a novel high-level computer vision task specific to sports analytics. It aims to recognize the state of a sports game, i.e., to identify and localize all individuals (players, referees, etc.) on the field.

👉Review https://t.ly/Mdu9s
👉Paper arxiv.org/pdf/2404.11335.pdf
👉Code github.com/SoccerNet/sn-gamestate

🎲 Articulated Objs from MonoClips 🎲

👉REACTO is the new SOTA for reconstructing general articulated 3D objects from a single monocular video.

👉Review https://t.ly/REuM8
👉Paper https://lnkd.in/d6PWagij
👉Project https://lnkd.in/dpg3x4tm
👉Repo https://lnkd.in/dRZWj6_N

🪼 All You Need is SAM (+Flow) 🪼

👉Oxford unveils the new SOTA for moving object segmentation via SAM + Optical Flow. Two novel models & Source Code announced 💙

👉Review https://t.ly/ZRYtp
👉Paper https://lnkd.in/d4XqkEGF
👉Project https://lnkd.in/dHpmx3FF
👉Repo coming: https://github.com/Jyxarthur/

🛞 6Img-to-3D driving scenarios 🛞

👉EPFL (+ Continental) unveils 6Img-to-3D, a novel transformer-based encoder-renderer method to create unbounded 3D outdoor driving scenarios from only six pics.

👉Review https://shorturl.at/dZ018
👉Paper arxiv.org/pdf/2404.12378.pdf
👉Project 6img-to-3d.github.io/
👉Code github.com/continental/6Img-to-3D

🌹 Physics-Based 3D Video-Gen 🌹

👉PhysDreamer is a physics-based approach that leverages the object dynamics priors learned by video generation models. It enables realistic 3D interaction with objects.

👉Review https://t.ly/zxXf9
👉Paper arxiv.org/pdf/2404.13026.pdf
👉Project physdreamer.github.io/
👉Code github.com/a1600012888/PhysDreamer