AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
⛈️Unsupervised Neuromorphic Motion⛈️

👉Western Sydney University unveils a novel unsupervised event-based motion segmentation algorithm, using the #Prophesee Gen4 HD event camera.

👉Review https://t.ly/UZzIZ
👉Paper arxiv.org/pdf/2405.15209
👉Project samiarja.github.io/evairborne
👉Repo (empty) github.com/samiarja/ev_deep_motion_segmentation
🦓 Z.S. Diffusive Segmentation 🦓

👉KAUST (+MPI) announces the first zero-shot approach for Video Semantic Segmentation (VSS) based on pre-trained diffusion models. Source code released under MIT💙

👉Review https://t.ly/v_64K
👉Paper arxiv.org/pdf/2405.16947
👉Project https://lnkd.in/dcSt4dQx
👉Code https://lnkd.in/dcZfM8F3
🪰 Dynamic Gaussian Fusion via 4D Motion Scaffolds 🪰

👉MoSca introduces 4D Motion Scaffolds to reconstruct and synthesize novel views of dynamic scenes from monocular videos in the wild!

👉Review https://t.ly/nSdEL
👉Paper arxiv.org/pdf/2405.17421
👉Code github.com/JiahuiLei/MoSca
👉Project https://lnkd.in/dkjMVcqZ
🧤Transformer-based 4D Hands🧤

👉4DHands is a novel and robust approach to recovering interactive hand meshes and their relative movement from monocular inputs. Authors: Beijing NU, Tsinghua & Lenovo. No code announced 😢

👉Review https://t.ly/wvG-l
👉Paper arxiv.org/pdf/2405.20330
👉Project 4dhands.github.io/
🎭New 2D Landmarks SOTA🎭

👉Flawless AI unveils FaceLift, a novel semi-supervised approach that learns 3D landmarks by directly lifting (visible) hand-labeled 2D landmarks and ensures better definition alignment, with no need for 3D landmark datasets. No code announced🥹

👉Review https://t.ly/lew9a
👉Paper arxiv.org/pdf/2405.19646
👉Project davidcferman.github.io/FaceLift
🐳 MultiPly: in-the-wild Multi-People 🐳

👉MultiPly: a novel framework to reconstruct multiple people in 3D from monocular in-the-wild videos. It's the new SOTA on both publicly available datasets and in-the-wild videos. Source code announced, coming soon💙

👉Review https://t.ly/_xjk_
👉Paper arxiv.org/pdf/2406.01595
👉Project eth-ait.github.io/MultiPly
👉Repo github.com/eth-ait/MultiPly
👹AI and the Everything in the Whole Wide World Benchmark👹

👉Last week Yann LeCun said something like "LLMs will not reach human intelligence". It's clear that current #deeplearning is not ready for "general AI"; a "radical alternative" is needed to create a "superintelligence".

👉Review https://t.ly/isdxM
👉News https://lnkd.in/dFraieZS
👉Paper https://lnkd.in/da-7PnVT
📞FacET: Video Calls Change Your Expression📞

👉Columbia University unveils FacET, a method for discovering behavioral differences between conversing face-to-face (F2F) and on video calls (VCs).

👉Review https://t.ly/qsQmt
👉Paper arxiv.org/pdf/2406.00955
👉Project facet.cs.columbia.edu/
👉Repo (empty) github.com/stellargo/facet
🚙 UA-Track: Uncertainty-Aware MOT🚙

👉UA-Track: a novel Uncertainty-Aware 3D MOT framework that tackles the uncertainty problem from multiple aspects. Code announced, not released yet.

👉Review https://t.ly/RmVSV
👉Paper https://arxiv.org/pdf/2406.02147
👉Project https://liautoad.github.io/ua-track-website
🧊 Universal 6D Pose/Tracking 🧊

👉Omni6DPose is a novel dataset for 6D Object Pose with 1.5M+ annotations. Extra: GenPose++, the new SOTA in category-level 6D estimation/tracking thanks to two pivotal improvements.

👉Review https://t.ly/Ywgl1
👉Paper arxiv.org/pdf/2406.04316
👉Project https://lnkd.in/dHBvenhX
👉Lib https://lnkd.in/d8Yc-KFh
👗 SOTA Multi-Garment VTOn Editing 👗

👉#Google (+UWA) unveils M&M VTO, a novel mix 'n' match virtual try-on method that takes as input multiple garment images, a text description of the garment layout, and an image of a person. It's the new SOTA both qualitatively and quantitatively. Impressive results!

👉Review https://t.ly/66mLN
👉Paper arxiv.org/pdf/2406.04542
👉Project https://mmvto.github.io
👑 Kling AI vs. OpenAI Sora 👑

👉Kling: the ultimate Chinese text-to-video model, a rival to #OpenAI's Sora. No papers or tech info to check, but stunning results on the official site.

👉Review https://t.ly/870DQ
👉Paper ???
👉Project https://kling.kuaishou.com/
🍉 MASA: MOT Anything By SAM 🍉

👉MASA: the Matching Anything by Segmenting Anything pipeline learns object-level associations from unlabeled images of any domain. A universal instance appearance model for matching any object in any domain. Source code in June 💙

👉Review https://t.ly/pKdEV
👉Paper https://lnkd.in/dnjuT7xm
👉Project https://lnkd.in/dYbWzG4E
👉Code https://lnkd.in/dr5BJCXm
🎹 PianoMotion10M for gen-hands 🎹

👉PianoMotion10M: 116 hours of piano-playing videos from a bird's-eye view with 10M+ annotated hand poses. A big contribution to hand motion generation. Code & Dataset released💙

👉Review https://t.ly/_pKKz
👉Paper arxiv.org/pdf/2406.09326
👉Code https://lnkd.in/dcBP6nvm
👉Project https://lnkd.in/d_YqZk8x
👉Dataset https://lnkd.in/dUPyfNDA
📫MeshPose: DensePose+HMR📫

👉MeshPose: a novel approach to jointly tackle DensePose and Human Mesh Reconstruction. A natural fit for #AR applications requiring real-time mobile inference.

👉Review https://t.ly/a-5uN
👉Paper arxiv.org/pdf/2406.10180
👉Project https://meshpose.github.io/
🌵 RobustSAM for Degraded Images 🌵

👉RobustSAM, the evolution of SAM for degraded images: it enhances SAM's performance on low-quality pics while preserving promptability & zero-shot generalization. Dataset & Code released💙

👉Review https://t.ly/mnyyG
👉Paper arxiv.org/pdf/2406.09627
👉Project robustsam.github.io
👉Code github.com/robustsam/RobustSAM
🧤HOT3D Hand/Object Tracking🧤

👉#Meta opens a novel egocentric dataset for 3D hand & object tracking. A new benchmark for vision-based understanding of 3D hand-object interactions. Dataset available 💙

👉Review https://t.ly/cD76F
👉Paper https://lnkd.in/e6_7UNny
👉Data https://lnkd.in/e6P-sQFK
💦 Self-driving in wet conditions 💦

👉BMW SemanticSpray: a novel dataset containing scenes in wet surface conditions captured by camera, LiDAR and radar. Camera: 2D Boxes | LiDAR: 3D Boxes, Semantic Labels | Radar: Semantic Labels.

👉Review https://t.ly/8S93j
👉Paper https://lnkd.in/dnN5MCZC
👉Project https://lnkd.in/dkUaxyEF
👉Data https://lnkd.in/ddhkyXv8
🌱 TokenHMR: new 3D human pose SOTA 🌱

👉TokenHMR is a new HPS (human pose and shape) method that combines 2D keypoint and 3D pose accuracy, thus leveraging Internet data without known camera parameters. It's the new SOTA by a large margin.

👉Review https://t.ly/K9_8n
👉Paper arxiv.org/pdf/2404.16752
👉Project tokenhmr.is.tue.mpg.de/
👉Code github.com/saidwivedi/TokenHMR
🤓Glasses-Removal in Videos🤓

👉Lightricks unveils a novel method that takes an input video of a person wearing glasses and removes the glasses while preserving the person's identity. It works even with reflections, heavy makeup, and blinks. Code announced, not yet released.

👉Review https://t.ly/Hgs2d
👉Paper arxiv.org/pdf/2406.14510
👉Project https://v-lasik.github.io/
👉Code github.com/v-lasik/v-lasik-code