AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Fresh updates every day on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🔌 BodyMAP: human body & pressure 🔌

👉#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming 💙

👉Review https://t.ly/8926S
👉Project bodymap3d.github.io/
👉Paper https://lnkd.in/gCxH4ev3
👉Code https://lnkd.in/gaifdy3q
🧞 XComposer2: 4K Vision-Language 🧞

👉InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released 💙

👉Review https://t.ly/GCHsz
👉Paper arxiv.org/pdf/2404.06512.pdf
👉Code github.com/InternLM/InternLM-XComposer
⚛️ Flying w/ Photons: Neural Render ⚛️

👉Novel neural rendering technique that seeks to synthesize videos of light propagating through a scene from novel, moving camera viewpoints, at picosecond time resolution!

👉Review https://t.ly/ZqL3a
👉Paper arxiv.org/pdf/2404.06493.pdf
👉Project anaghmalik.com/FlyingWithPhotons/
👉Code github.com/anaghmalik/FlyingWithPhotons
☄️ Tracking Any 2D Pixels in 3D ☄️

👉 SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.

👉Review https://t.ly/B28Cj
👉Paper https://lnkd.in/d8ers_nm
👉Project https://lnkd.in/deHjtZuE
👉Code https://lnkd.in/dMe3TvFT
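
The "lifts 2D pixels to 3D using monocular depth" step above can be illustrated with a plain pinhole-camera unprojection. This is only a rough sketch under stated assumptions, not SpatialTracker's actual code: the function name `unproject_pixels` and the intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical, and the depth values stand in for the output of a monocular depth estimator. The triplane encoding and the transformer updates are not covered.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def unproject_pixels(
    pixels: List[Point2D],
    depths: List[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Lift (u, v) pixels with per-pixel depth z into camera-frame 3D points.

    Assumes an ideal pinhole camera with no lens distortion (an assumption
    made for this sketch, not something stated in the post).
    """
    points = []
    for (u, v), z in zip(pixels, depths):
        x = (u - cx) * z / fx  # horizontal offset from the principal point, scaled by depth
        y = (v - cy) * z / fy  # vertical offset from the principal point, scaled by depth
        points.append((x, y, z))
    return points


# Hypothetical intrinsics and depths, chosen only for illustration.
pts = unproject_pixels(
    pixels=[(320.0, 240.0), (420.0, 240.0)],
    depths=[2.0, 2.0],
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
print(pts)
```

A pixel at the principal point maps onto the optical axis, so its 3D point has x = y = 0 and keeps only its depth.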
🪐 YOLO-CIANNA: Neural Astro 🪐

👉 CIANNA is a general-purpose deep learning framework for (but not only for) astronomical data analysis. Source Code released 💙

👉Review https://t.ly/441XS
👉Paper arxiv.org/pdf/2402.05925.pdf
👉Code github.com/Deyht/CIANNA
👉Wiki github.com/Deyht/CIANNA/wiki
🧤 Neuro MusculoSkeletal-MANO 🧤

👉SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced 💙

👉Review https://t.ly/HOQrn
👉Paper arxiv.org/pdf/2404.10227.pdf
👉Project https://ms-mano.robotflow.ai/
👉Code announced (no repo yet)
⚽ SoccerNET: Athlete Tracking ⚽

👉SoccerNet Challenge is a novel high-level computer vision task that is specific to sports analytics. It aims at recognizing the state of a sports game, i.e., identifying and localizing all individuals (players, referees, ...) on the field.

👉Review https://t.ly/Mdu9s
👉Paper arxiv.org/pdf/2404.11335.pdf
👉Code github.com/SoccerNet/sn-gamestate
🎲 Articulated Objs from MonoClips 🎲

👉REACTO is the new SOTA to address the challenge of reconstructing general articulated 3D objects from a single monocular video

👉Review https://t.ly/REuM8
👉Paper https://lnkd.in/d6PWagij
👉Project https://lnkd.in/dpg3x4tm
👉Repo https://lnkd.in/dRZWj6_N
🪼 All You Need is SAM (+Flow) 🪼

👉Oxford unveils the new SOTA for moving object segmentation via SAM + Optical Flow. Two novel models & Source Code announced 💙

👉Review https://t.ly/ZRYtp
👉Paper https://lnkd.in/d4XqkEGF
👉Project https://lnkd.in/dHpmx3FF
👉Repo coming: https://github.com/Jyxarthur/
🛞 6Img-to-3D driving scenarios 🛞

👉EPFL (+ Continental) unveils 6Img-to-3D, a novel transformer-based encoder-renderer method to create 3D unbounded outdoor driving scenarios with only six pics

👉Review https://shorturl.at/dZ018
👉Paper arxiv.org/pdf/2404.12378.pdf
👉Project 6img-to-3d.github.io/
👉Code github.com/continental/6Img-to-3D
🌹 Physics-Based 3D Video-Gen 🌹

👉PhysDreamer, a physics-based approach that leverages the object dynamics priors learned by video generation models. It enables realistic 3D interaction with objects

👉Review https://t.ly/zxXf9
👉Paper arxiv.org/pdf/2404.13026.pdf
👉Project physdreamer.github.io/
👉Code github.com/a1600012888/PhysDreamer
🎡 NER-Net: Seeing at Night-Time 🎡

👉Huazhong (+Beijing) unveils a novel event-based nighttime imaging solution under non-uniform illumination, plus a paired multi-illumination level real-world dataset. Repo online, code coming 💙

👉Review https://t.ly/Z9JMJ
👉Paper arxiv.org/pdf/2404.11884.pdf
👉Repo github.com/Liu-haoyue/NER-Net
👉Clip https://www.youtube.com/watch?v=zpfTLCF1Kw4
🌊 FlowMap: dense depth video 🌊

👉MIT (+CSAIL) unveils FlowMap, a novel E2E differentiable method that solves for precise camera poses, camera intrinsics, and per-frame dense depth of a video sequence. Source Code released 💙

👉Review https://t.ly/CBH48
👉Paper arxiv.org/pdf/2404.15259.pdf
👉Project cameronosmith.github.io/flowmap
👉Code github.com/dcharatan/flowmap
👗 TELA: Text to 3D Clothed Human 👗

👉 TELA is a novel approach for the new task of clothing-disentangled 3D human model generation from text. It unleashes the potential of many downstream applications (e.g., virtual try-on).

👉Review https://t.ly/6N7JV
👉Paper https://arxiv.org/pdf/2404.16748
👉Project https://jtdong.com/tela_layer/
👉Code https://github.com/DongJT1996/TELA
🪷 Tunnel Try-on: SOTA VTON 🪷

👉"Tunnel Try-on", the first diffusion-based video virtual try-on model that demonstrates SOTA performance in complex scenarios. No code announced :(

👉Review https://t.ly/joMtJ
👉Paper arxiv.org/pdf/2404.17571
👉Project mengtingchen.github.io/tunnel-try-on-page/
1000x Scalable Neural 3D Fields

👉Highly scalable neural 3D fields: 1000x memory reduction while maintaining speed and quality (10 MB vs. 10 GB)! Code released 💙

👉Review https://t.ly/sLTK5
👉Paper https://lnkd.in/dEYM8-t2
👉Project https://lnkd.in/djptdujx
👉Code https://lnkd.in/dcCnFZ2n
3D Scenes w/ Depth Inpainting

👉Oxford announced two novel contributions to the field of 3D scene generation: a new benchmark and a novel depth completion model. 🤗 Demo and Source Code released 💙

👉Review https://t.ly/BKiny
👉Paper arxiv.org/pdf/2404.19758
👉Project research.paulengstler.com/invisible-stitch/
👉Code github.com/paulengstler/invisible-stitch
👉Demo huggingface.co/spaces/paulengstler/invisible-stitch
🌊 Diffusive 3D Human Recovery 🌊

👉Rutgers University unveils ScoreHMR at #CVPR24, a novel approach for 3D human pose and shape reconstruction. Impressive results.

👉Review https://t.ly/G0k2D
👉Paper https://arxiv.org/pdf/2403.09623
👉Code https://github.com/statho/ScoreHMR
👉Project https://statho.github.io/ScoreHMR/
🏷️ DiffMOT (#CVPR24): diffusion-MOT 🏷️

👉DiffMOT is a novel real-time diffusion-based MOT approach to tackle complex nonlinear motion. Impressive results & Source Code released 💙

👉Review https://t.ly/ztlHi
👉Paper https://lnkd.in/d4K3c-nt
👉Project https://diffmot.github.io/
👉Code github.com/Kroery/DiffMOT
XFeat: Neural Features Matching

👉XFeat (Accelerated Features) is a lightweight, accurate architecture for efficient visual correspondence. It revisits fundamental design choices in CNNs for detecting, extracting & matching local features

👉Review https://t.ly/ppb38
👉Paper arxiv.org/pdf/2404.19174
👉Code https://lnkd.in/dFzTpzN8
👉Project https://lnkd.in/d8JnV-iu