AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
4D-Net: Lidar + RGB synchronization

👉Google unveils 4D-Net to combine 3D LiDAR and onboard RGB camera

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Point clouds/images in time
Fusing multiple modalities in 4D
Novel sampling for 3D point clouds in time
New SOTA for 3D detection

More: https://bit.ly/3hZCFwN
👍12🔥2🤯1
🐌 New SOTA in video synthesis! 🐌

👉Snap unveils a novel multimodal video generation framework via text/images

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multimodal video generation
Bidirectional transformer
Video tokens with self-learning
Text augmentation for robustness
Longer sequence synthesis

More: https://bit.ly/3hZLXsG
🤯4👍1🔥1👏1
🎁 StyleNeRF source code is out 🎁

👉3D consistent photo-realistic image synthesis

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
NeRF + style generator
3D consistency for HD image
Novel regularization loss
Camera control on styles

More: https://bit.ly/3t5xC49
🔥4🥰1🤯1
🦎CLD-based generative #AI by #Nvidia🦎

👉Nvidia unveils a novel critically-damped Langevin diffusion (CLD) for synthetic data

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
A novel diffusion process for SGMs
Novel score matching objective for CLD
Hybrid denoising score matching
Efficient sampling from CLD model
Source code under a specific license

More: https://bit.ly/35MToBe
🔥2🤩2👍1🤯1
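
To give a feel for the CLD post above, here is a minimal sketch of a critically-damped Langevin forward process, simulated with plain Euler-Maruyama on 1D data. It assumes the usual underdamped Langevin form with a velocity variable coupled to the data, and takes "critically damped" to mean Gamma^2 = 4 * M. The function names and the values of beta, mass, dt, and steps are illustrative assumptions, not taken from Nvidia's released code.

```python
import numpy as np


def critical_gamma(mass):
    """Friction giving critical damping, assuming Gamma^2 = 4 * M."""
    return 2.0 * np.sqrt(mass)


def simulate_cld(x0, steps=2000, dt=1e-3, beta=4.0, mass=0.25, seed=0):
    """Euler-Maruyama simulation of a critically-damped Langevin diffusion.

    x is the data variable, v an auxiliary velocity. Noise is injected
    only into v and reaches x through the coupling term v / mass.
    """
    rng = np.random.default_rng(seed)
    gamma = critical_gamma(mass)
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)  # velocities start at rest (an assumption)
    for _ in range(steps):
        noise = rng.standard_normal(x.shape)
        dx = (v / mass) * beta * dt
        dv = (-x - gamma * v / mass) * beta * dt
        dv = dv + np.sqrt(2.0 * gamma * beta * dt) * noise
        x, v = x + dx, v + dv
    return x, v


# Start all samples far from the origin and diffuse them toward the prior.
x_T, v_T = simulate_cld(np.full(1000, 3.0))
```

Under these assumptions the samples should drift from x = 3 toward a roughly standard normal distribution, which is the prior a score-based model would then learn to reverse.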
🛸UFO: segmentation @140+ FPS🛸

👉Unified Transformer Framework for Co-Segmentation, Co-Saliency & Salient Object Detection. All in one!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unified framework for co-segmentation
Co-segmentation, co-saliency, saliency
Block for long-range dependencies
Reaches 140 FPS at inference
The new SOTA on multiple datasets
Source code under MIT License

More: https://bit.ly/3KLd9b9
🔥6👍1🤯1
👗 Multi-GANs fashion 👗

👉Global GAN blended with other GANs for faces, shoes, etc.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multi-GAN framework
Several generators
Free of artifacts
Full-body generation
Humans, 1024x1024

More: https://bit.ly/37mfOte
🔥2👏21🤯1
🚧 FLAG: #3D Avatar Generation 🚧

👉A flow-based generative model of the 3D human body from sparse observations.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
FLow-based Avatar Generative model (FLAG)
Conditional distribution of body pose
Exact pose likelihood process
Invertibility -> oracle latent code

More: https://bit.ly/3CQpk3p
👏2🔥1🤯1
💃 Dancing in the wild with StyleGAN 💃

👉StyleGAN-based animations for AR/VR apps

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Video based motion retargeting
StyleGAN-based architecture
Novel explicit motion representation
SOTA qualitatively & quantitatively

More: https://bit.ly/3CZbL1W
👍6🤯3🥰2
🪀TensoRF: the 4D evolution of NeRF 🪀

👉TensoRF models a radiance field as a 4D tensor: a 3D voxel grid with per-voxel multi-channel features.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
VM decomposition technique
Low-rank tensor factorization
Lower memory footprint, faster reconstruction
TensoRF is the new SOTA in radiance fields
Code under the MIT License

More: https://bit.ly/3qffZgI
👍2🔥1
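
As a rough illustration of the vector-matrix (VM) idea in the TensoRF post above, the sketch below rebuilds a dense 3D grid from per-axis vectors and per-plane matrices, then compares parameter counts. It is simplified to a single scalar channel and uses random factors in place of learned ones. The function names, grid size, and rank are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def vm_reconstruct(vecs, mats):
    """Rebuild an (N, N, N) grid from rank-R vector-matrix components.

    vecs: (vx, vy, vz), each of shape (R, N)
    mats: (myz, mxz, mxy), each of shape (R, N, N)
    """
    vx, vy, vz = vecs
    myz, mxz, mxy = mats
    grid = np.einsum("ri,rjk->ijk", vx, myz)   # X vector with YZ plane
    grid += np.einsum("rj,rik->ijk", vy, mxz)  # Y vector with XZ plane
    grid += np.einsum("rk,rij->ijk", vz, mxy)  # Z vector with XY plane
    return grid


def dense_params(n):
    """Values stored by a dense single-channel grid."""
    return n ** 3


def vm_params(n, rank):
    """Values stored by the factorized form: 3 * R * (N + N^2)."""
    return 3 * rank * (n + n * n)


rng = np.random.default_rng(0)
n, rank = 64, 4  # illustrative sizes only
vecs = tuple(rng.standard_normal((rank, n)) for _ in range(3))
mats = tuple(rng.standard_normal((rank, n, n)) for _ in range(3))
grid = vm_reconstruct(vecs, mats)
print(dense_params(n), vm_params(n, rank))
```

With these sizes the factorized form stores 49,920 values against 262,144 for the dense grid, which is where the smaller memory footprint comes from.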
🔼 GAN-meshes without key-points 🔼

👉ETH unveils a GAN framework for generating textured triangle meshes without annotations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Generation of textured meshes
3D generator for all categories
3D pose estimation framework
Code licensed under MIT License

More: https://bit.ly/3qfH9nJ
🤩3🤯2👍1🔥1
🐯 S.S. Latent Image Animator 🐯

👉Self-supervised autoencoder to animate unseen images by linear navigation in latent space

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latent Image Animator
Linear displacement in latent space
SOTA: VoxCeleb, Taichi, TED-talk
Source code available soon

More: https://bit.ly/36pgLAC
👍5🔥3🤯2💩1
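
The "linear navigation" in the Latent Image Animator post above can be sketched as adding a weighted sum of direction vectors to a source latent code. In the sketch below the directions are random orthonormal vectors standing in for learned ones, and the magnitudes are hand-picked rather than predicted from a driving frame. All names and sizes are assumptions for illustration.

```python
import numpy as np


def orthonormal_directions(num_directions, latent_dim, seed=0):
    """Random orthonormal directions, a stand-in for learned motion directions."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((latent_dim, num_directions)))
    return q.T  # shape: (num_directions, latent_dim)


def navigate(z_source, directions, magnitudes):
    """Move a latent code along a linear combination of the directions."""
    return z_source + magnitudes @ directions


directions = orthonormal_directions(num_directions=20, latent_dim=512)
z_source = np.zeros(512)          # hypothetical latent code of the source image
magnitudes = np.zeros(20)
magnitudes[0] = 1.5               # push along the first direction only
z_target = navigate(z_source, directions, magnitudes)
```

Because the directions are orthonormal, projecting the displacement back onto them recovers the magnitudes, so each direction can be varied independently.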
🪨 Google URF for neural-synthesis 🪨

👉Sequence of RGB + Lidar -> 3D surfaces and novel RGB images synthesized

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Extending Neural Radiance Fields
Leveraging asynchronous lidar data
Addressing exposure variation
Leveraging segmentations for sky
SOTA #3D reconstruction and synthesis

More: https://bit.ly/3L2vTDb
🔥11👍4👏1🤯1
🚛 AV2: next-gen. self driving 🚛

👉One of the biggest datasets ever for #autonomousdriving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
1k seq. of multimodal data
3D annotations, 26 categories
20k lidar & map-aligned pose
250k challenging interactions
HD Map: 3D lane & crosswalk
CC BY-NC-SA 4.0 license

More: https://bit.ly/3trx3lw
🔥3👍1🤯1
🤖CaTGrasp in Clutter from Simulation🤖

👉Task-relevant grasping: trained solely in simulation with synthetic data + self-supervised hand-object interaction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel category-level, task-relevant grasping
Self-supervised hand-object contact
Tiny objects from dense clutter
Trained in simulation -> transfers to real
Source code under Apache 2.0

More: https://bit.ly/3L2YVCo
👍1🔥1
🛼 Drive & Segment without Supervision 🛼

👉Learning pixel-wise semantic segmentation from non-curated data collected by cars (cameras + LiDAR) driving around a city

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Cross-modal unsupervised
Synchronized LiDAR & RGB
Object proposal on LiDAR points
SOTA, significant improvements

More: https://bit.ly/3L0wWTW
👍3🔥1🤯1
🌍 NeRF-free Neural Rendering 🌍

👉A simple 2D-only method with a single pass of a neural network

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Synthesis with NO 3D reasoning
Autoregressive & masked transformer
Pose -> object, object -> pose
Attention: branching attention
Source code under MIT License

More: https://bit.ly/3JC7unt
🔥3😱2👍1🤩1
🤓👌Hey, TAKE OFF my eyeglasses! 😙👌

👉A novel framework to remove eyeglasses as well as their cast shadows from faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel mask-guided multi-step network
Leveraging 3D synthetic data only
Synthetic portraits with supervision
Eyeglasses & shadows simultaneously

More: https://bit.ly/3IvQzlf
👍7🔥1
🏥 #AI models/dataset for open surgery 🏥

👉Multi-task #AI model/dataset of real-time surgical behaviors, hands, and tools.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Annotated Videos of Open Surgery
Largest dataset of open surgical videos
2k clips and 23 procedures
12k annotations, 11k+ keypoints
Models/Dataset soon available!

More: https://bit.ly/3tvDdkK
👍8🤯1😱1
🥽 #metaverse in 1991 🥽

👉Q: is #VR the technology that developed least in the last 30 years? 🤔

Discussion: https://bit.ly/3txWF07
👍3🤬3🥰1🤯1
🫕NeRFusion: Large-Scale Reconstruction🫕

👉Efficient large-scale reconstruction & photo-realistic rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Frame-by-frame radiance fields
Neural reconstruction
Real-time at 20+ fps
SOTA on indoor / objects

More: https://bit.ly/3iyfoCo
🤯7🔥4👍3👏2