AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
☀️SunStage: Selfie with the Sun☀️

👉Accurate/tailored reconstruction of facial geometry/reflectance

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel personalized scanning
Disentanglement of scene params
Geometry, materials, lighting, poses
Photorealistic with a single selfie video

More: https://bit.ly/36W1Oqx
📫 Generative Neural Avatars 📫

👉3D shapes of people in a variety of garments with corresponding skinning weights

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ETH + Uni-Tübingen + Max Planck
Animatable #3D human in garment
Directly from raw posed 3D scans
NO canonical, registration, manual w.
Geometric detail in clothing deformation


More: https://bit.ly/3M7mCdB
🗨️Conversational program synthesis🗨️

👉Conversational synthesis to translate English into executable code

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Conversational program synthesis
New multi-turn programming benchmark
Open-source custom library: JAXFORMER
Source code under BSD-3 license

More: https://bit.ly/3jjWWhk
🧯Long Video Diffusion Models🧯

👉#Google unveils a novel diffusion model for video generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Straightforward extension of 2D UNet
Longer videos via new conditional generation
SOTA in unconditional generation

More: https://bit.ly/35Y2rzg
🚙 AutoRF: #3D objects in-the-wild 🚙

👉From #Meta: #3D object reconstruction from just a single in-the-wild image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel view synthesis from in-the-wild
Normalized, object-centric representation
Disentangling shape, appearance & pose
Exploiting bounding boxes & panoptic segmentation
Shape/appearance properties for objects


More: https://bit.ly/3O4ONeQ
🌠GAN-based Darkest Dataset🌠

👉Berkeley + #Intel announce first photorealistic dataset under starlight (no moon, <0.001 lx)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
"Darkest" dataset ever seen
Moonless, no external illumination
GAN-tuned physics-based model
Clips with dancing, volleyball, flags...

More: https://bit.ly/3LXxMkN
🤖Populating with digital humans🤖

👉ETHZ unveils GAMMA to populate #3D scenes with digital humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
GenerAtive Motion primitive MArkers
Realistic, controllable, infinite motions
Tree-based search to preserve quality
SOTA in realistic/controllable motion

More: https://bit.ly/3OgY4AG
🔥#AIwithPapers: we are ~2,000!🔥

💙💛 Simply amazing. Thank you all 💙💛

😈 Invite your friends -> https://t.me/AI_DeepLearning
😼GARF: Gaussian Activated NeRF😼

👉GARF: Gaussian Activated Radiance Fields for high-fidelity reconstruction and pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
NeRF from imperfect camera poses
NO hyper-parameter tuning/initialization
Theoretical insight on Gaussian activation
Unlocking NeRF for real-world application?

More: https://bit.ly/36bvdfU
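
To make the "Gaussian activation" idea concrete, here is a minimal sketch of a coordinate MLP whose hidden layers use a Gaussian nonlinearity instead of ReLU. It is an illustration only, not the authors' code: the function names, layer sizes, random weights, and the value of `sigma` are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_activation(x, sigma=0.1):
    # Gaussian nonlinearity: exp(-x^2 / (2 * sigma^2)); sigma is an assumed value.
    return np.exp(-(x ** 2) / (2.0 * sigma ** 2))

def tiny_coordinate_mlp(coords, weights, biases, sigma=0.1):
    # Hypothetical MLP: Gaussian activation on hidden layers, linear output layer.
    h = coords
    for w, b in zip(weights[:-1], biases[:-1]):
        h = gaussian_activation(h @ w + b, sigma)
    return h @ weights[-1] + biases[-1]

# Toy usage: map five 3D points to three output channels with random weights.
rng = np.random.default_rng(0)
sizes = [3, 16, 16, 3]  # illustrative layer widths
weights = [rng.normal(scale=0.1, size=(i, o)) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(o) for o in sizes[1:]]
points = rng.uniform(-1.0, 1.0, size=(5, 3))
outputs = tiny_coordinate_mlp(points, weights, biases)
```

Note that raw coordinates go straight into the network here, with no positional encoding step.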
🎭Novel pre-training strategy for #AI🎭

👉EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multimodal: additional modalities beyond RGB
Multi-task: multiple outputs beyond RGB
General: MultiMAE by pseudo-labeling
Classification, segmentation, depth
Code under NonCommercial 4.0 International license

More: https://bit.ly/3jRhNsN
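
As a rough illustration of masking across several modalities, the sketch below keeps a fixed number of visible patch tokens drawn from one pooled set and treats the rest as masked. This is a simplification under stated assumptions: uniform sampling over the pool, the modality names, and the token counts are all hypothetical, and the actual MultiMAE sampling scheme may differ.

```python
import numpy as np

def sample_visible_tokens(tokens_per_modality, num_visible, rng):
    # tokens_per_modality: dict mapping modality name -> number of patch tokens.
    names = list(tokens_per_modality)
    # Pool all tokens, remembering which modality owns each and its local index.
    owners = np.concatenate(
        [np.full(tokens_per_modality[n], i) for i, n in enumerate(names)]
    )
    local_idx = np.concatenate(
        [np.arange(tokens_per_modality[n]) for n in names]
    )
    # Uniformly pick the visible tokens from the pool (an assumed strategy).
    chosen = rng.choice(owners.size, size=num_visible, replace=False)
    return {
        n: np.sort(local_idx[chosen[owners[chosen] == i]])
        for i, n in enumerate(names)
    }

# Toy usage: three modalities with 196 patches each, 98 tokens kept visible.
rng = np.random.default_rng(0)
counts = {"rgb": 196, "depth": 196, "semseg": 196}  # illustrative names and sizes
visible = sample_visible_tokens(counts, 98, rng)
```

Every token index not listed in `visible` is what the decoder would be asked to reconstruct.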
🧪 A new SOTA in Dataset Distillation 🧪

👉A new approach by Matching Training Trajectories is out!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Distilling data "to match" a bigger dataset
Distilled data to guide a network
Trajectories of experts from real data
SOTA + distilling higher-res visual data

More: https://bit.ly/3JwYOxW
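
The "matching trajectories" idea can be sketched as a loss comparing where a student network ends up after training on the distilled data with where an expert, trained on real data, ended up. The snippet below is a minimal sketch, assuming flattened parameter vectors and a normalized squared distance; the function and variable names are hypothetical, and the inner training loop on distilled data is omitted.

```python
import numpy as np

def trajectory_matching_loss(student_params, expert_start, expert_target):
    # Squared distance from the student's final parameters to the expert's
    # later checkpoint, normalized by how far the expert itself moved.
    num = np.sum((student_params - expert_target) ** 2)
    den = np.sum((expert_start - expert_target) ** 2)
    return num / den

# Toy usage with made-up parameter vectors.
expert_start = np.array([0.0, 0.0, 0.0])
expert_target = np.array([1.0, 2.0, 2.0])
student_params = np.array([0.5, 1.0, 1.0])  # student got halfway there
loss = trajectory_matching_loss(student_params, expert_start, expert_target)
```

A loss of 0 means the student landed exactly on the expert checkpoint, while a loss of 1 means it made no net progress from the shared starting point.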
🧤 Two-Hand tracking via GCN 🧤

👉The first-ever GCN for two interacting hands in a single RGB image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Reconstruction by GCN mesh regression
PIFA: pyramid attention for local occlusion
CHA: cross hand attention for interaction
SOTA + generalization in-the-wild scenario
Source code available under GNU 🤯

More: https://bit.ly/3KH5FWO
🕹️Video K-Net, SOTA in Segmentation🕹️

👉Simple, strong, and unified framework for fully end-to-end video panoptic segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Learnable kernels from K-Net
K-Net learns to segment & track
Appearance / cross-temporal kernel interaction
New SOTA without bells and whistles 🤷‍♂️

More: https://bit.ly/3uEEZQR
🐭DeepLabCut: tracking animals in the wild🐭

👉A toolbox for markerless pose estimation of animals performing various tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multi-animal pose estimation
Datasets for multi-animal pose
Key-points, limbs, animal identity
Optimal key-points without user input

More: https://bit.ly/37L1mLE
🍡Neural Articulated Human Body🍡

👉Novel neural implicit representation for articulated body

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
COmpositional Articulated People
Large variety of shapes & poses
Novel encoder-decoder architecture

More: https://bit.ly/3xvn7dl
🦚 2K Resolution Generative #AI 🦚

👉Novel continuous-scale training with variable output resolutions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Mixed-resolution data
Arbitrary scales during training
Generations beyond 1024×1024
Variant of FID metric for scales
Source code under MIT license

More: https://bit.ly/3uNfVY6
🐍DS Unsupervised Video Decomposition🐍

👉Novel method to extract persistent elements of a scene

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Scene element as Deformable Sprite (DS)
Deformable Sprites by video auto-encoder
Canonical texture image for appearance
Non-rigid geometric transformation

More: https://bit.ly/37WV9w1
🥓 L-SVPE for Deep Deblurring 🥓

👉L-SVPE to deblur scenes while recovering high-freq details

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Learned Spatially Varying Pixel Exposures
Next-gen focal-plane sensor + DL
Deep conv decoder for motion deblurring
Superior results over non-optimized exposures

More: https://bit.ly/3uRYQMT
🧧Hyper-Fast Instance Segmentation🧧

👉Novel Temporally Efficient Vision Transformer (TeViT) for VIS

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Video instance segmentation transformer
Contextual-info at frame/instance level
Nearly convolution-free framework 🤷‍♂️
The new SOTA for VIS, ~70 FPS!
Code & models under MIT license

More: https://bit.ly/3rCMXIn