AI with Papers - Artificial Intelligence & Deep Learning

📱3D Human-Object Contact📱

👉Pi-HOC by CMU + NREC is a novel single-pass, instance-aware framework for dense 3D semantic contact prediction of all human-object pairs. Repo announced💙

👉Review https://t.ly/TAgG1
👉Paper https://arxiv.org/pdf/2604.12923
👉Project https://pi-hoc.github.io/
👉Repo https://github.com/SravanChittupalli/Pi-HOC

🔥3❤2👏2👍1🤩1

4.97K views · 12:19

🐞GCT 3D Reconstruction🐞

👉ANT unveils LingBot-Map, a feed-forward 3D foundation model for reconstructing scenes from streaming data, built upon a geometric context transformer (GCT) architecture. Repo under Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)💙

👉Review https://t.ly/ExodA
👉Paper https://arxiv.org/pdf/2604.14141
👉Project https://arxiv.org/pdf/2604.14141
👉Repo github.com/robbyant/lingbot-map

🔥9❤4👍2👏1

5.27K views · 06:57

👩‍🦰Deformable 3D Hair👩‍🦰

👉Xi’an Jiaotong University unveils a novel method that reconstructs decoupled 3D Gaussian head avatars from a single input image: effortless hairstyle transfer with natural dynamic hair motion. Code announced💙

👉Review https://t.ly/kWZdd
👉Paper https://arxiv.org/pdf/2604.14782
👉Project yuansun-xjtu.github.io/CompHairHead.io/
👉Repo yuansun-xjtu.github.io/CompHairHead.io/

❤6🔥3👏1🤩1

4.76K views · edited 06:35

🌗Mobile Ultra-detailed Avatars🌗

👉Given skeletal poses and a virtual camera as inputs, MUA by Max Planck Institute produces photorealistic renderings and hyper-detailed geometry of animatable clothed humans. Repo announced💙

👉Review https://t.ly/QPCy6
👉Paper https://arxiv.org/pdf/2604.18583
👉Project https://vcai.mpi-inf.mpg.de/projects/MUA/
👉Repo TBA

❤11🔥1

4.71K views · 06:55

🎈Face Anything 4D (SOTA)🎈

👉A novel unified framework for 4D facial reconstruction and dense tracking from image sequences: new SOTA in facial single-image and mono-video depth estimation, dense 4D reconstruction, and 3D point tracking. Repo & Dataset announced💙

👉Review https://t.ly/zItie
👉Paper https://arxiv.org/pdf/2604.19702
👉Project kocasariumut.github.io/FaceAnything
👉Repo TBA

❤5🔥2👍1🤯1

5.8K views · 06:33

💙 PY4AI 2026: here we are! 💙

👉The third edition of our conference is official! Speaker list and (free) tickets: https://t.ly/L4_52

❤10👍1🤯1😢1🤩1

5.71K views · edited 06:41

🛒 Reshoot-Anything is out 🛒

👉Reshoot-Anything reshoots dynamic monocular videos under novel camera trajectories. Code under Apache 2.0 💙

👉Review https://t.ly/MIqAc
👉Paper https://arxiv.org/pdf/2604.21776
👉Project adithyaiyer1999.github.io/reshoot-anything/
👉Repo github.com/morphicfilms/video-to-video

❤5🔥4👍1

5.1K views · 06:38

🧘‍♀️Holistic Shot Boundary Detection🧘‍♀️

👉OmniShotCut detects shot changes in videos from diverse sources (anime, vlog, game, shorts, sports, screen recording, etc.) and recognizes Sudden Jump and Transitions (dissolve, fade, wipe, etc.) using a proposed Shot-Query-based Video Transformer. Repo, demo & benchmark💙

👉Review https://t.ly/sTi7N
👉Paper https://arxiv.org/pdf/2604.24762
👉Project uva-computer-vision-lab.github.io/OmniShotCut_website/
👉Repo github.com/UVA-Computer-Vision-Lab/OmniShotCut

🔥6❤3👍1👏1

5.62K views · 07:16

🪝Syn4D: Multiview Synthetic 4D Dataset🪝

👉Syn4D is a novel multi-view synthetic dataset of dynamic scenes that includes ground-truth camera motion, depth maps, dense tracking, and parametric human pose annotations💙

👉Review https://t.ly/SL1mk
👉Paper https://arxiv.org/pdf/2605.05207
👉Project https://jzr99.github.io/Syn4D/
👉Repo https://github.com/jzr99/Syn4D
👉Data huggingface.co/datasets/Syn4D/Syn4D_RGBD/tree/main

❤7🔥5👏2👍1

5.11K views · 10:34

About the frequency of posting in the channel:

Anonymous Poll

62% · 💚 1 per day is great

38% · 💞 a few posts per day (such as breaking news with less details) would be better

❤4👏1🤩1

266 voters · 5.53K views · 10:46

🦄Unified Correspondence Transformer🦄

👉UniCorrn is the first correspondence model with shared weights that unifies 2D-2D, 2D-3D, and 3D-3D geometric matching with a transformer. CC BY-NC-SA 4.0💙

👉Review https://t.ly/2OBdq
👉Paper https://arxiv.org/pdf/2605.04044
👉Project https://neu-vi.github.io/UniCorrn/
👉Repo https://github.com/neu-vi/UniCorrn

👍5🔥5❤4🤯4👏2

6.65K views · edited 06:51

🍒Count Anything, Any Granularity🍒

👉Reframes open-world counting as multi-grained counting, where visual exemplars specify target appearance and fine-grained text specifies the intended semantic granularity across five explicit levels. Repo/Data under Apache💙

👉Review https://t.ly/nqz80
👉Paper https://lnkd.in/dp7khTRU
👉Project https://lnkd.in/d_jfX_Yn
👉Repo https://lnkd.in/dkTRGZkG
👉Data https://lnkd.in/dB83jRyT

1❤15👍6👏2🔥1

7.36K views · 06:51

🪔Latent Decoding Pixel Diffusion🪔

👉PiD by Nvidia is a plug-and-play diffusion decoder that replaces VAE/RAE decoders, turning latent representations directly into super-resolved pixels in a single pass. Repo under Apache 2.0💙

👉Review https://t.ly/y19mA
👉Paper https://lnkd.in/duVC25C2
👉Project https://lnkd.in/dW6TkzCB
👉Repo https://lnkd.in/dnGdgKRr

❤8🔥6👍1👏1

6.51K views · edited 07:06

🔍 Nvidia Locate Anything 🔍

👉Diverse localization tasks under a unified vision-language model, including document understanding, GUI grounding, dense detection, and OCR. Repo released💙

👉Review https://t.ly/PvwFo
👉Paper https://lnkd.in/dWfNpzPZ
👉Project https://lnkd.in/dM89BX-8
👉Repo https://lnkd.in/dC4KCQSM

❤13🔥13👏1

5.74K views · 06:37

🕷️Human Universal Grasping🕷️

👉HUG is a flow-matching model that generates diverse human grasps for any user-specified object in a single RGB-D image captured from a stereo camera.

👉Review https://t.ly/VG1Eu
👉Paper https://arxiv.org/pdf/2606.17054
👉Repo https://github.com/KevinyWu/hug
👉Project https://grasping.io/

❤10🔥4👍1👏1

3.56K views · 07:05

🔊VolHuMe - Volumetric Human Meshes🔊

👉VolHuMe (H/T @Martinella_94) is a novel, high-resolution large-scale dataset of volumetric human meshes with complete 4D GT: multi-view RGB-D, textured meshes, dense point clouds, normal maps, rigged assets, garment segmentation, and SMPL-X fittings in one dataset. Insane💙

👉Review https://t.ly/b5vxy
👉Paper https://arxiv.org/pdf/2606.23062
👉Project giuli13.github.io/volhume-website/#
👉Repo TBA soon

❤4🔥2⚡1👏1

1.86K views · 11:48

👋 Hi everyone!

Over the past few weeks, the number of join requests has increased dramatically, which unfortunately also means far more spam and bots (around five hundred have been cut off in the last few days).

To help me distinguish real people from fake profiles - and avoid rejecting genuine requests by mistake - I'd really appreciate it if your profile included:
📷 A real profile photo
👤 Your full name (or something reasonably identifiable)
💬 If you contact me, please use English if possible.

I don't speak Russian, Arabic, or Chinese, so if your profile and messages are only in those languages, it's very difficult for me to tell whether you're a real person or an automated account. Thank you for your understanding and for helping keep this damn community welcoming and spam-free!

With love,
Alessandro 😈

❤18👍14⚡2🔥1

1.82K views · 12:10

🍀OctoSense: Open Sensing🍀

👉OctoSense is an open-source sensor platform with stereo RGB and event cameras, LiDAR, a thermal camera, an inertial measurement unit, RTK-corrected global positioning system, and proprioception.

👉Review https://t.ly/oFN8L
👉Paper https://lnkd.in/dM3zpyju
👉Project https://lnkd.in/ddrQ3uJ6
👉Repo https://lnkd.in/dhSDjSfG

❤11🔥5💩3

1.68K views · 06:59

🛸PriorEye: Geospatial Self-Driving🛸

👉MRG (Oxford) introduces geospatial visual priors that leverage street-level images for autonomous driving, with consistent performance improvements. Repo under Apache💙

👉Review https://t.ly/7Jgav
👉Paper https://lnkd.in/dYeD2m7n
👉Project https://lnkd.in/dWJvNemr
👉Repo https://lnkd.in/dNExGGtx

🔥5❤4👍2👏1

1.34K views · 06:34
