AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🎭 ULTRA-Realistic Avatar 🎭

👉Novel 3D avatar with enhanced fidelity of geometry, and superior quality of physically based rendering (PBR) textures without unwanted lighting.

👉Review https://t.ly/B3BEu
👉Project https://lnkd.in/dkUQHFEV
👉Paper https://lnkd.in/dtEQxrBu
👉Code coming 🩷
🔥Lumiere: SOTA video-gen🔥

👉#Google unveils Lumiere: Space-Time Diffusion Model for Realistic Video Generation. It's the new SOTA, tasks: Text-to-Video, Video Stylization, Cinemagraphs & Video Inpainting.

👉Review https://t.ly/nalJR
👉Paper https://lnkd.in/d-PvrGjT
👉Project https://t.ly/gK8hz
🧪 SUPIR: SOTA restoration 🧪

👉SUPIR is the new SOTA in image restoration; suitable for restoration of blurry objects, defining the material texture of objects, and adjusting restoration based on high-level semantics

👉Review https://t.ly/wgObH
👉Project https://supir.xpixel.group/
👉Paper https://lnkd.in/dZPYcUuq
👉Demo coming 🩷 but no code announced :(
❀8πŸ”₯4πŸ₯°1🍾1
This media is not supported in your browser
VIEW IN TELEGRAM
🫧 SAM + Open Models 🫧

👉Grounded SAM (w/ DINO) as an open-set detector to combine with SAM. It can seamlessly integrate with other Open-World models to accomplish more intricate visual tasks.

👉Review https://t.ly/FwasQ
👉Paper arxiv.org/pdf/2401.14159.pdf
👉Code github.com/IDEA-Research/Grounded-Segment-Anything
👒"Virtual Try-All" by #Amazon 👒

👉#Amazon announces "Diffuse to Choose": diffusion-based image-conditioned inpainting for virtual try-on (VTON). Virtually place any e-commerce item in any setting.

👉Review https://t.ly/at07Y
👉Paper https://lnkd.in/dxR7nGtd
👉Project diffuse2choose.github.io/
❀15πŸ‘7🀯4πŸ”₯1πŸ₯°1
This media is not supported in your browser
VIEW IN TELEGRAM
🦩 WildRGB-D: Objects in the Wild 🦩

👉#NVIDIA unveils a novel RGB-D object dataset captured in the wild: ~8500 recorded objects, ~20,000 RGBD videos, 46 categories with corresponding masks and 3D point clouds.

👉Review https://t.ly/WCqVz
👉Data github.com/wildrgbd/wildrgbd
👉Paper arxiv.org/pdf/2401.12592.pdf
👉Project wildrgbd.github.io/
🌋EasyVolcap: Accelerating Neural Volumetric🌋

👉Novel #PyTorch library for accelerating neural volumetric video: capturing, reconstruction & rendering

👉Review https://t.ly/8BISl
👉Paper arxiv.org/pdf/2312.06575.pdf
👉Code github.com/zju3dv/EasyVolcap
Rock-Track announced!

👉Rock-Track: the evolution of Poly-MOT, the previous SOTA 3D MOT tracking-by-detection framework.

👉Review https://t.ly/hC0ak
👉Repo, coming: https://lnkd.in/dtDkPwCC
👉Paper coming
🧠350+ Free #AI Courses by #Google🧠

👉350+ free courses from #Google to become a professional in #AI & #Cloud. The full catalog (900+) includes a variety of activities: videos, documents, labs, coding, and quizzes. 15+ supported languages. No excuse.

✅Generative AI
✅Intro to LLMs
✅CV with TF
✅Data, ML, AI
✅Responsible AI

👉Review: https://t.ly/517Dr
👉Full list: https://www.cloudskillsboost.google/catalog?page=1
❀13πŸ‘3πŸ‘2🍾2πŸ”₯1
This media is not supported in your browser
VIEW IN TELEGRAM
Diffutoon: new SOTA video

👉Diffutoon is a cartoon shading approach, aiming to transform photorealistic videos into anime styles. It can handle exceptionally high resolutions and rapid motions. Source code released!

👉Review https://t.ly/sim2O
👉Paper https://lnkd.in/dPcSnAUu
👉Code https://lnkd.in/d9B_dGrf
👉Project https://lnkd.in/dpcsJcX2
🥓 RANSAC -> PARSAC (neural) 🥓

👉Neural PARSAC: estimating multiple vanishing points (V), fundamental matrices (F) or homographies (H) at the speed of light! Source Code released 💙

👉Review https://t.ly/r9ngg
👉Paper https://lnkd.in/dadQ4Qec
👉Code https://lnkd.in/dYp6gADd
❀14πŸ‘3⚑1πŸ₯°1πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
↘️ SEELE: "moving" the subjects ➡️

👉Subject repositioning: manipulating an input image to reposition one of its subjects to a desired location while preserving the image's fidelity. SEELE is a single diffusion model that addresses this novel generative sub-task

👉Review https://t.ly/4FS4H
👉Paper arxiv.org/pdf/2401.16861.pdf
👉Project yikai-wang.github.io/seele/
🎉 ADΔER: Event-Camera Suite 🎉

👉ADΔER: a novel/unified framework for event-based video. Encoder / transcoder / decoder for ADΔER (Address, Decimation, Δt Event Representation) video streams. Source code (Rust) released 💙

👉Review https://t.ly/w5_KC
👉Paper arxiv.org/pdf/2401.17151.pdf
👉Repo github.com/ac-freeman/adder-codec-rs
❀7πŸ‘3πŸ₯°1
This media is not supported in your browser
VIEW IN TELEGRAM
🚦(add) Anything in Any Video🚦

👉 XPeng Motors announced Anything in Any Scene: novel #AI for realistic video simulation that seamlessly inserts any object into an existing dynamic video. Strong emphasis on realism, the objects in the BBs don't exist. Source Code released 💙

👉Review https://t.ly/UYhl0
👉Code https://lnkd.in/gyi7Dhkn
👉Paper https://lnkd.in/gXyAJ6GZ
👉Project https://lnkd.in/gVA5vduD
🍬 ABS: SOTA collision-free 🍬

👉ABS (Agile But Safe): learning-based control framework for agile and collision-free locomotion for quadrupedal robots. Source Code announced (coming) 💙

👉Review https://t.ly/AYu-Z
👉Paper arxiv.org/pdf/2401.17583.pdf
👉Project agile-but-safe.github.io/
👉Repo github.com/LeCAR-Lab/ABS
Bootstrapping TAP

👉#Deepmind shows how large-scale, unlabeled, uncurated real-world data can improve TAP (Tracking Any Point) with minimal architectural changes, via a self-supervised student-teacher setup. Source Code released 💙

👉Review https://t.ly/-S_ZL
👉Paper arxiv.org/pdf/2402.00847.pdf
👉Code https://github.com/google-deepmind/tapnet
💥Py4AI 2x Speakers, 2x Tickets💥

✅Doubling the speakers (6 -> 12!)
✅A new track (2 tracks in parallel)
✅A new batch of 100 tickets!

👉 More: https://t.ly/WmVrM
❀7πŸ‘2🀯1
This media is not supported in your browser
VIEW IN TELEGRAM
🪵 HASSOD Object Detection 🪵

👉 HASSOD: fully self-supervised detection and instance segmentation. The new SOTA able to understand the part-to-whole object composition like humans do.

👉Review https://t.ly/66qHF
👉Paper arxiv.org/pdf/2402.03311.pdf
👉Project hassod-neurips23.github.io/
👉Repo github.com/Shengcao-Cao/HASSOD
🌡 G-Splatting Portraits 🌡

👉From monocular/casual video captures, Rig3DGS rigs 3D Gaussian Splatting to enable the creation of re-animatable portrait videos with control over facial expressions, head-pose and viewing direction

👉Review https://t.ly/fq71w
👉Paper https://arxiv.org/pdf/2402.03723.pdf
👉Project shahrukhathar.github.io/2024/02/05/Rig3DGS.html
🌆 Up to 69x Faster SAM 🌆

👉EfficientViT-SAM is a new family of accelerated Segment Anything Models. It keeps SAM's lightweight prompt encoder and mask decoder while replacing the heavy image encoder with EfficientViT. Up to 69x faster, source code released. Authors: Tsinghua, MIT & #Nvidia

👉Review https://t.ly/zGiE9
👉Paper arxiv.org/pdf/2402.05008.pdf
👉Code github.com/mit-han-lab/efficientvit