AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🔘 RELI11D: Multimodal Humans 🔘

👉RELI11D is a high-quality multimodal human motion dataset captured with LiDAR, an IMU system, an RGB camera, and an event camera. Dataset & Source Code to be released soon 💙

👉Review https://t.ly/5EG6X
👉Paper https://lnkd.in/ep6Utcik
👉Project https://lnkd.in/eDhNHYBb
❀3πŸ”₯2
This media is not supported in your browser
VIEW IN TELEGRAM
🔥 ECoDepth: SOTA Diffusive Mono-Depth 🔥

👉New single-image depth estimation (SIDE) model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released 💙

👉Review https://t.ly/s2pbB
👉Paper https://lnkd.in/eYt5yr_q
👉Code https://lnkd.in/eEcyPQcd
🕷️ Gen-NeRF2NeRF Translation 🕷️

👉GenN2N: unified NeRF-to-NeRF translation for editing tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.

👉Review https://t.ly/VMWAH
👉Paper arxiv.org/pdf/2404.02788.pdf
👉Project xiangyueliu.github.io/GenN2N/
👉Code github.com/Lxiangyue/GenN2N
🀯4❀3πŸ₯°1
This media is not supported in your browser
VIEW IN TELEGRAM
👆iSeg: Interactive 3D Segmentation👆

👉 iSeg: interactive segmentation technique for 3D shapes operating entirely in 3D. It accepts both positive and negative clicks directly on the shape's surface, indicating inclusion and exclusion of regions.

👉Review https://t.ly/tyFnD
👉Paper https://lnkd.in/dydAz8zp
👉Project https://lnkd.in/de-h6SRi
👉Code (coming)
❀7πŸ‘2πŸ”₯1
This media is not supported in your browser
VIEW IN TELEGRAM
👗 Neural Bodies with Clothes 👗

👉Neural-ABC is a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for ID, clothing, shape, and pose.

👉Review https://t.ly/Un1wc
👉Project https://lnkd.in/dhDG6FF5
👉Paper https://lnkd.in/dhcfK7jZ
👉Code https://lnkd.in/dQvXWysP
🔌 BodyMAP: human body & pressure 🔌

👉#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming 💙

👉Review https://t.ly/8926S
👉Project bodymap3d.github.io/
👉Paper https://lnkd.in/gCxH4ev3
👉Code https://lnkd.in/gaifdy3q
❀8🀯4⚑1πŸ‘1πŸ”₯1
This media is not supported in your browser
VIEW IN TELEGRAM
🧞 XComposer2: 4K Vision-Language 🧞

👉InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released 💙

👉Review https://t.ly/GCHsz
👉Paper arxiv.org/pdf/2404.06512.pdf
👉Code github.com/InternLM/InternLM-XComposer
⚛️ Flying w/ Photons: Neural Render ⚛️

👉Novel neural rendering technique that synthesizes videos of light propagating through a scene from novel, moving camera viewpoints. Picosecond time resolution!

👉Review https://t.ly/ZqL3a
👉Paper arxiv.org/pdf/2404.06493.pdf
👉Project anaghmalik.com/FlyingWithPhotons/
👉Code github.com/anaghmalik/FlyingWithPhotons
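To put "picosecond time resolution" in perspective, here is a small back-of-the-envelope sketch: light in vacuum covers roughly 0.3 mm per picosecond, so picosecond-scale time bins correspond to sub-millimeter steps of a light front. The function name `light_travel_mm` is only illustrative and is not part of the project's code.

```python
# Speed of light in vacuum, exact by SI definition (meters per second).
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def light_travel_mm(picoseconds: float) -> float:
    """Distance (in millimeters) that light travels in vacuum in the given time."""
    seconds = picoseconds * 1e-12
    meters = SPEED_OF_LIGHT_M_PER_S * seconds
    return meters * 1e3


if __name__ == "__main__":
    for ps in (1, 10, 1000):
        print(f"{ps} ps -> {light_travel_mm(ps):.3f} mm")
```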
🀯6⚑3❀2πŸ‘1🀣1
This media is not supported in your browser
VIEW IN TELEGRAM
☄️ Tracking Any 2D Pixels in 3D ☄️

👉 SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.

👉Review https://t.ly/B28Cj
👉Paper https://lnkd.in/d8ers_nm
👉Project https://lnkd.in/deHjtZuE
👉Code https://lnkd.in/dMe3TvFT
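As a rough illustration of the first step only (lifting 2D pixels to 3D with a depth estimate), here is a minimal pinhole back-projection sketch. The intrinsics `fx, fy, cx, cy`, the function name `lift_pixel`, and the sample values are assumptions made for illustration; this is not SpatialTracker's code, and the triplane and transformer stages are not shown.

```python
from typing import Tuple


def lift_pixel(u: float, v: float, depth: float,
               fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with a depth value into camera-space (X, Y, Z).

    Assumes an ideal pinhole camera: focal lengths fx, fy and principal
    point (cx, cy), all in pixels. Depth is measured along the optical axis.
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


if __name__ == "__main__":
    # Hypothetical camera and pixel, chosen only to show the call.
    point = lift_pixel(u=400.0, v=300.0, depth=2.0,
                       fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    print(point)
```

A pixel at the principal point always lands on the optical axis, which is a quick sanity check for the intrinsics.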
❀10πŸ”₯5⚑1πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🪐YOLO-CIANNA: Neural Astro🪐

👉 CIANNA is a general-purpose deep learning framework built primarily, but not only, for astronomical data analysis. Source Code released 💙

👉Review https://t.ly/441XS
👉Paper arxiv.org/pdf/2402.05925.pdf
👉Code github.com/Deyht/CIANNA
👉Wiki github.com/Deyht/CIANNA/wiki
🧤Neuro MusculoSkeletal-MANO🧤

👉SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced 💙

👉Review https://t.ly/HOQrn
👉Paper arxiv.org/pdf/2404.10227.pdf
👉Project https://ms-mano.robotflow.ai/
👉Code announced (no repo yet)
⚽SoccerNET: Athlete Tracking⚽

👉The SoccerNet Challenge introduces a novel high-level computer vision task specific to sports analytics. It aims at recognizing the state of a sports game, i.e., identifying and localizing all individuals (players, referees, ...) on the field.

👉Review https://t.ly/Mdu9s
👉Paper arxiv.org/pdf/2404.11335.pdf
👉Code github.com/SoccerNet/sn-gamestate
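Localizing individuals "on the field" means mapping image detections to pitch coordinates. One common way to do this for a planar field is a 3×3 homography applied to the bottom-center of each bounding box; the sketch below assumes that approach. The `Detection` type, the role labels, and the matrix values are hypothetical and are not taken from the sn-gamestate code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Matrix3 = List[List[float]]


@dataclass
class Detection:
    role: str                                # e.g. "player" or "referee" (illustrative labels)
    bbox: Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max in pixels


def image_to_pitch(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Apply a 3x3 homography to an image point, including the projective divide."""
    xp = h[0][0] * x + h[0][1] * y + h[0][2]
    yp = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (xp / w, yp / w)


def localize(det: Detection, h: Matrix3) -> Tuple[float, float]:
    """Map the bottom-center of a bounding box (roughly the feet) onto the pitch plane."""
    x_min, _y_min, x_max, y_max = det.bbox
    return image_to_pitch(h, (x_min + x_max) / 2.0, y_max)


if __name__ == "__main__":
    # Toy homography (pure scaling), used only to show the call.
    toy_h = [[0.25, 0.0, 0.0],
             [0.0, 0.25, 0.0],
             [0.0, 0.0, 1.0]]
    det = Detection(role="player", bbox=(200.0, 100.0, 240.0, 220.0))
    print(det.role, localize(det, toy_h))
```

In practice the homography would come from camera calibration against the pitch markings; the scaling matrix here only stands in for it.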
❀9πŸ‘8πŸ”₯3⚑2🀯1
This media is not supported in your browser
VIEW IN TELEGRAM
🎲 Articulated Objs from MonoClips 🎲

👉REACTO is the new SOTA for reconstructing general articulated 3D objects from a single monocular video.

👉Review https://t.ly/REuM8
👉Paper https://lnkd.in/d6PWagij
👉Project https://lnkd.in/dpg3x4tm
👉Repo https://lnkd.in/dRZWj6_N
🀯6πŸ‘1πŸ”₯1πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🪼 All You Need is SAM (+Flow) 🪼

👉Oxford unveils the new SOTA for moving object segmentation via SAM + Optical Flow. Two novel models & Source Code announced 💙

👉Review https://t.ly/ZRYtp
👉Paper https://lnkd.in/d4XqkEGF
👉Project https://lnkd.in/dHpmx3FF
👉Repo coming: https://github.com/Jyxarthur/
❀12πŸ‘7πŸ”₯2🀯2
This media is not supported in your browser
VIEW IN TELEGRAM
🛞 6Img-to-3D driving scenarios 🛞

👉EPFL (+ Continental) unveils 6Img-to-3D, a novel transformer-based encoder-renderer method to create 3D unbounded outdoor driving scenarios from only six pics.

👉Review https://shorturl.at/dZ018
👉Paper arxiv.org/pdf/2404.12378.pdf
👉Project 6img-to-3d.github.io/
👉Code github.com/continental/6Img-to-3D
🌹 Physics-Based 3D Video-Gen 🌹

👉PhysDreamer is a physics-based approach that leverages the object dynamics priors learned by video generation models. It enables realistic 3D interaction with objects.

👉Review https://t.ly/zxXf9
👉Paper arxiv.org/pdf/2404.13026.pdf
👉Project physdreamer.github.io/
👉Code github.com/a1600012888/PhysDreamer
🎡 NER-Net: Seeing at Night-Time 🎡

👉Huazhong (+Beijing) unveils a novel event-based nighttime imaging solution under non-uniform illumination, plus a paired, multi-illumination-level real-world dataset. Repo online, code coming 💙

👉Review https://t.ly/Z9JMJ
👉Paper arxiv.org/pdf/2404.11884.pdf
👉Repo github.com/Liu-haoyue/NER-Net
👉Clip https://www.youtube.com/watch?v=zpfTLCF1Kw4
🀯3πŸ”₯2❀1πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🌊 FlowMap: dense depth video 🌊

👉MIT (+CSAIL) unveils FlowMap, a novel E2E differentiable method that solves for precise camera poses, camera intrinsics, and per-frame dense depth of a video sequence. Source Code released 💙

👉Review https://t.ly/CBH48
👉Paper arxiv.org/pdf/2404.15259.pdf
👉Project cameronosmith.github.io/flowmap
👉Code github.com/dcharatan/flowmap
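One way to see how depth, intrinsics, and camera poses tie together: jointly they determine where each pixel of a static scene should move between two frames (its induced optical flow). The sketch below computes that induced flow for a single pixel under a pinhole model. Treating this as the quantity the method compares against observed flow is an assumption not stated in the post, and every name and value here is illustrative rather than taken from the FlowMap repository.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def induced_flow(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float,
                 rotation: List[List[float]], translation: Vec3) -> Tuple[float, float]:
    """Flow of pixel (u, v) implied by its depth, the intrinsics, and a relative pose.

    `rotation` (3x3) and `translation` map points from the first camera's
    frame into the second camera's frame. The scene is assumed static.
    """
    # 1) Back-project the pixel into the first camera's 3D frame.
    p = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # 2) Move the point into the second camera's frame.
    q = [sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
         for i in range(3)]
    # 3) Project into the second image and subtract the original pixel.
    u2 = fx * q[0] / q[2] + cx
    v2 = fy * q[1] / q[2] + cy
    return (u2 - u, v2 - v)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Hypothetical sideways shift of 0.1 units for a point 2 units away.
    print(induced_flow(320.0, 240.0, 2.0, 500.0, 500.0, 320.0, 240.0,
                       identity, (0.1, 0.0, 0.0)))
```

With an identity pose the induced flow is zero, and for a pure sideways shift it scales inversely with depth, which is why flow carries information about both depth and pose.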
👗TELA: Text to 3D Clothed Human👗

👉 TELA is a novel approach for the new task of clothing-disentangled 3D human model generation from text. It unlocks many downstream applications (e.g., virtual try-on).

👉Review https://t.ly/6N7JV
👉Paper https://arxiv.org/pdf/2404.16748
👉Project https://jtdong.com/tela_layer/
👉Code https://github.com/DongJT1996/TELA