AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Fresh updates every day on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🧤Neuro MusculoSkeletal-MANO🧤

👉SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced 💙

👉Review https://t.ly/HOQrn
👉Paper arxiv.org/pdf/2404.10227.pdf
👉Project https://ms-mano.robotflow.ai/
👉Code announced (no repo yet)
⚽SoccerNET: Athlete Tracking⚽

👉The SoccerNet Challenge introduces a novel high-level computer vision task specific to sports analytics: recognizing the state of a sports game, i.e., identifying and localizing all sports individuals (players, referees, etc.) on the field.

👉Review https://t.ly/Mdu9s
👉Paper arxiv.org/pdf/2404.11335.pdf
👉Code github.com/SoccerNet/sn-gamestate
🎲 Articulated Objs from MonoClips 🎲

👉REACTO is the new SOTA for reconstructing general articulated 3D objects from a single monocular video.

👉Review https://t.ly/REuM8
👉Paper https://lnkd.in/d6PWagij
👉Project https://lnkd.in/dpg3x4tm
👉Repo https://lnkd.in/dRZWj6_N
🪼 All You Need is SAM (+Flow) 🪼

👉Oxford unveils the new SOTA for moving object segmentation via SAM + optical flow. Two novel models & Source Code announced 💙 (toy sketch after the links)

👉Review https://t.ly/ZRYtp
👉Paper https://lnkd.in/d4XqkEGF
👉Project https://lnkd.in/dHpmx3FF
👉Repo coming: https://github.com/Jyxarthur/
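
👉Toy sketch (not the paper's models): one simple way to combine the two ingredients is to prompt an off-the-shelf SAM with a point taken from the optical-flow field. The frame filenames, the Farneback flow, the single-point prompt, and the SAM checkpoint path below are assumptions for illustration only.

```python
# Minimal SAM + optical-flow sketch: prompt SAM with the fastest-moving pixel between two frames.
# Assumes frame_0.png / frame_1.png exist, `segment_anything` and OpenCV are installed,
# and the ViT-H SAM checkpoint has been downloaded to sam_vit_h_4b8939.pth.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

f0 = cv2.imread("frame_0.png")
f1 = cv2.imread("frame_1.png")
g0, g1 = (cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in (f0, f1))

# Dense optical flow (Farneback) as a stand-in for a learned flow network.
flow = cv2.calcOpticalFlowFarneback(g0, g1, None, 0.5, 3, 15, 3, 5, 1.2, 0)
mag = np.linalg.norm(flow, axis=-1)
y, x = np.unravel_index(np.argmax(mag), mag.shape)   # fastest-moving pixel as a point prompt

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(cv2.cvtColor(f1, cv2.COLOR_BGR2RGB))
masks, scores, _ = predictor.predict(
    point_coords=np.array([[x, y]]),   # (x, y) pixel coordinates
    point_labels=np.array([1]),        # 1 = foreground
    multimask_output=True,
)
cv2.imwrite("moving_object_mask.png", (masks[np.argmax(scores)] * 255).astype(np.uint8))
```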
🛞 6Img-to-3D driving scenarios 🛞

👉EPFL (+ Continental) unveils 6Img-to-3D, a novel transformer-based encoder-renderer method to create unbounded outdoor 3D driving scenarios from only six images.

👉Review https://shorturl.at/dZ018
👉Paper arxiv.org/pdf/2404.12378.pdf
👉Project 6img-to-3d.github.io/
👉Code github.com/continental/6Img-to-3D
🌹 Physics-Based 3D Video-Gen 🌹

👉PhysDreamer is a physics-based approach that leverages the object dynamics priors learned by video generation models, enabling realistic 3D interaction with objects.

👉Review https://t.ly/zxXf9
👉Paper arxiv.org/pdf/2404.13026.pdf
👉Project physdreamer.github.io/
👉Code github.com/a1600012888/PhysDreamer
🎡 NER-Net: Seeing at Night-Time 🎡

👉Huazhong (+Beijing) unveils a novel event-based solution for nighttime imaging under non-uniform illumination, plus a paired real-world dataset with multiple illumination levels. Repo online, code coming 💙

👉Review https://t.ly/Z9JMJ
👉Paper arxiv.org/pdf/2404.11884.pdf
👉Repo github.com/Liu-haoyue/NER-Net
👉Clip https://www.youtube.com/watch?v=zpfTLCF1Kw4
🌊 FlowMap: dense depth video 🌊

👉MIT (+CSAIL) unveils FlowMap, a novel end-to-end differentiable method that solves for precise camera poses, camera intrinsics, and per-frame dense depth of a video sequence. Source Code released 💙 (toy sketch after the links)

👉Review https://t.ly/CBH48
👉Paper arxiv.org/pdf/2404.15259.pdf
👉Project cameronosmith.github.io/flowmap
👉Code github.com/dcharatan/flowmap
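
👉To make "end-to-end differentiable" concrete, here is a toy sketch (my own illustration, not FlowMap's code): per-pixel depth, one relative pose, and a shared focal length are recovered by gradient descent on the reprojection error of given 2D correspondences. The placeholder correspondences and the simple parameterization are assumptions.

```python
# Toy differentiable structure-from-motion: optimize depth, pose, and focal length jointly.
import torch

H, W = 48, 64
xs, ys = torch.meshgrid(torch.arange(W, dtype=torch.float32),
                        torch.arange(H, dtype=torch.float32), indexing="xy")
uv0 = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
uv1 = uv0 + 1.0                     # placeholder correspondences; use real optical flow here

log_depth = torch.zeros(H * W, requires_grad=True)   # per-pixel depth (log-parameterized)
t = torch.zeros(3, requires_grad=True)               # camera translation
w = torch.zeros(3, requires_grad=True)               # axis-angle rotation
log_f = torch.tensor(4.0, requires_grad=True)        # log focal length (shared intrinsics)
cx, cy = W / 2.0, H / 2.0

def rotation(w):
    """Rotation matrix from an axis-angle vector via the matrix exponential."""
    zero = torch.zeros((), dtype=w.dtype)
    skew = torch.stack([torch.stack([zero, -w[2],  w[1]]),
                        torch.stack([ w[2],  zero, -w[0]]),
                        torch.stack([-w[1],  w[0],  zero])])
    return torch.matrix_exp(skew)

opt = torch.optim.Adam([log_depth, t, w, log_f], lr=1e-2)
for step in range(200):
    f, z = log_f.exp(), log_depth.exp()
    # back-project pixels of frame 0, apply the rigid transform, re-project into frame 1
    x = (uv0[:, 0] - cx) / f * z
    y = (uv0[:, 1] - cy) / f * z
    P = torch.stack([x, y, z], dim=-1) @ rotation(w).T + t
    proj = torch.stack([f * P[:, 0] / P[:, 2] + cx,
                        f * P[:, 1] / P[:, 2] + cy], dim=-1)
    loss = (proj - uv1).abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```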
👗TELA: Text to 3D Clothed Human👗

👉TELA is a novel approach for the new task of clothing-disentangled 3D human model generation from text, unleashing the potential of many downstream applications (e.g., virtual try-on).

👉Review https://t.ly/6N7JV
👉Paper https://arxiv.org/pdf/2404.16748
👉Project https://jtdong.com/tela_layer/
👉Code https://github.com/DongJT1996/TELA
🪷 Tunnel Try-on: SOTA VTON 🪷

👉"Tunnel Try-on" is the first diffusion-based video virtual try-on model to demonstrate SOTA performance in complex scenarios. No code announced :(

👉Review https://t.ly/joMtJ
👉Paper arxiv.org/pdf/2404.17571
👉Project mengtingchen.github.io/tunnel-try-on-page/
🏝️1000x Scalable Neural 3D Fields🏝️

👉Highly scalable neural 3D fields: a 1000x reduction in memory (10 MB vs. 10 GB!) while maintaining speed and quality. Code released 💙

👉Review https://t.ly/sLTK5
👉Paper https://lnkd.in/dEYM8-t2
👉Project https://lnkd.in/djptdujx
👉Code https://lnkd.in/dcCnFZ2n
🌍3D Scenes w/ Depth Inpainting🌍

👉Oxford announced two novel contributions to the field of 3D scene generation: a new benchmark and a novel depth completion model. 🤗-Demo and Source Code released 💙

👉Review https://t.ly/BKiny
👉Paper arxiv.org/pdf/2404.19758
👉Project research.paulengstler.com/invisible-stitch/
👉Code github.com/paulengstler/invisible-stitch
👉Demo huggingface.co/spaces/paulengstler/invisible-stitch
🌊 Diffusive 3D Human Recovery 🌊

👉Rutgers University unveils ScoreHMR at #CVPR24, a novel approach for 3D human pose and shape reconstruction. Impressive results.

👉Review https://t.ly/G0k2D
👉Paper https://arxiv.org/pdf/2403.09623
👉Code https://github.com/statho/ScoreHMR
👉Project https://statho.github.io/ScoreHMR/
🏷️DiffMOT (#CVPR24): diffusion-MOT🏷️

👉DiffMOT is a novel real-time diffusion-based MOT approach that tackles complex nonlinear motion. Impressive results & Source Code released 💙 (baseline sketch after the links)

👉Review https://t.ly/ztlHi
👉Paper https://lnkd.in/d4K3c-nt
👉Project https://diffmot.github.io/
👉Code github.com/Kroery/DiffMOT
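
👉For context (my own illustration, not DiffMOT's predictor): classical trackers propagate boxes with a linear constant-velocity model like the one below; that assumption is exactly what breaks under abrupt, nonlinear motion and what a learned diffusion-based motion predictor replaces.

```python
# SORT-style constant-velocity prediction: the linear motion assumption that a learned,
# history-conditioned motion predictor is designed to replace (illustration only).
import numpy as np

def constant_velocity_predict(history: np.ndarray) -> np.ndarray:
    """history: (T, 4) boxes as [cx, cy, w, h]; returns the next box under constant velocity."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])

track = np.array([[100.0, 200.0, 40.0, 80.0],
                  [108.0, 198.0, 40.0, 80.0],
                  [117.0, 195.0, 41.0, 80.0]])
print(constant_velocity_predict(track))   # a learned predictor conditions on the full history instead
```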
🏁 XFeat: Neural Features Matching 🏁

👉XFeat (Accelerated Features) is a lightweight, accurate architecture for efficient visual correspondence. It revisits fundamental design choices in CNNs for detecting, extracting & matching local features (stand-in pipeline sketch after the links).

👉Review https://t.ly/ppb38
👉Paper arxiv.org/pdf/2404.19174
👉Code https://lnkd.in/dFzTpzN8
👉Project https://lnkd.in/d8JnV-iu
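
👉Stand-in sketch (not XFeat's API, which isn't verified here): the generic detect → describe → match pipeline that XFeat accelerates, shown with OpenCV ORB; swap in the XFeat extractor from the linked repo for real use. Image paths are assumptions.

```python
# Generic sparse matching pipeline (detect -> describe -> match) with ORB as a stand-in
# for a learned extractor such as XFeat.
import cv2

img1 = cv2.imread("frame_a.png", cv2.IMREAD_GRAYSCALE)   # assumed input images
img2 = cv2.imread("frame_b.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=4096)
kp1, des1 = orb.detectAndCompute(img1, None)              # detect keypoints + extract descriptors
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

vis = cv2.drawMatches(img1, kp1, img2, kp2, matches[:50], None)
cv2.imwrite("matches.png", vis)
```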
🦑 Hyper-Detailed Image Descriptions 🦑

👉#Google unveils ImageInWords (IIW), a carefully designed human-in-the-loop (HIL) annotation framework for curating hyper-detailed image descriptions, and a new dataset resulting from this process.

👉Review https://t.ly/engkl
👉Paper arxiv.org/pdf/2405.02793
👉Repo github.com/google/imageinwords
👉Project google.github.io/imageinwords
👉Data huggingface.co/datasets/google/imageinwords
🔫 Free-Moving Reconstruction 🔫

👉EPFL (+#MagicLeap) unveils a novel approach for reconstructing free-moving objects from a monocular RGB clip. It allows free interaction with objects in front of a moving camera without relying on any prior, and it optimizes the sequence globally without splitting it into segments. Great, but no code announced 🥺

👉Review https://t.ly/2xhtj
👉Paper arxiv.org/pdf/2405.05858
👉Project haixinshi.github.io/fmov/
💥FeatUp: Any Model at Any Resolution💥

👉FeatUp is a task- and model-agnostic framework to restore lost spatial information in deep features. It outperforms other methods in class activation map generation, transfer learning for segmentation & depth, and end-to-end training for semantic segmentation. Source Code released 💙 (baseline sketch after the links)

👉Review https://t.ly/Evq_g
👉Paper https://lnkd.in/gweaN4s6
👉Project https://lnkd.in/gWcGXdxt
👉Code https://lnkd.in/gweq5NY4
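
👉To see what "lost spatial information" means in practice, here is a shapes-only sketch (not FeatUp's code): backbone features come out heavily downsampled, and naive bilinear upsampling restores resolution without restoring detail, which is the gap a learned feature upsampler closes.

```python
# Deep features are spatially coarse: a 224x224 input gives a 7x7 feature map (32x downsampling);
# bilinear upsampling only interpolates it. Random weights and random input, shapes only.
import torch
import torch.nn.functional as F
import torchvision

backbone = torchvision.models.resnet50(weights=None)                 # demo only, no pretrained weights
feats = torch.nn.Sequential(*list(backbone.children())[:-2]).eval()  # drop avgpool + fc

x = torch.randn(1, 3, 224, 224)                                      # stand-in for a normalized image
with torch.no_grad():
    low_res = feats(x)                                               # (1, 2048, 7, 7)
    naive_hi = F.interpolate(low_res, size=x.shape[-2:], mode="bilinear", align_corners=False)
print(low_res.shape, naive_hi.shape)
```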
🐝AniTalker: Universal Talking Humans🐝

👉SJTU (+AISpeech) unveils AniTalker, a framework that transforms a single static portrait and input audio into animated talking videos with naturally flowing movements.

👉Review https://t.ly/MD4yX
👉Paper https://arxiv.org/pdf/2405.03121
👉Project https://x-lance.github.io/AniTalker/
👉Repo https://github.com/X-LANCE/AniTalker
👻 3D Human Motion from Text 👻

👉Zhejiang (+ANT) unveils a novel method to generate human motions containing accurate human-object interactions in 3D scenes based on textual descriptions. Code announced, coming 💙

👉Review https://t.ly/eOZnU
👉Paper https://arxiv.org/pdf/2405.07784
👉Project https://zju3dv.github.io/text_scene_motion/