AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
💃 Dancing in the wild with StyleGAN 💃

👉StyleGAN-based animations for AR/VR apps

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Video based motion retargeting
StyleGAN-based architecture
Novel explicit motion representation
SOTA qualitatively & quantitatively

More: https://bit.ly/3CZbL1W
👍6🤯3🥰2
🪀TensoRF: the 4D evolution of NeRF 🪀

👉TensoRF: a novel radiance field modeled as a 4D tensor, i.e. a 3D voxel grid with per-voxel multi-channel features.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
VM decomposition technique
Low-rank tensor factorization
Lower memory footprint & faster
TensoRF is the new SOTA in R.F.
Code under the MIT License

More: https://bit.ly/3qffZgI
👍2🔥1
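
A minimal NumPy sketch of the vector-matrix (VM) idea behind the highlights above, assuming a single-channel grid and random factors: each rank component pairs a vector along one axis with a matrix over the other two. The function names and shapes are illustrative only and are not taken from the TensoRF code.

```python
import numpy as np

def vm_reconstruct(vx, myz, vy, mxz, vz, mxy):
    """Rebuild a dense (X, Y, Z) grid from vector-matrix factors.

    vx: (R, X), myz: (R, Y, Z)
    vy: (R, Y), mxz: (R, X, Z)
    vz: (R, Z), mxy: (R, X, Y)
    """
    grid = np.einsum("rx,ryz->xyz", vx, myz)   # X-vector times YZ-matrix
    grid += np.einsum("ry,rxz->xyz", vy, mxz)  # Y-vector times XZ-matrix
    grid += np.einsum("rz,rxy->xyz", vz, mxy)  # Z-vector times XY-matrix
    return grid

def vm_param_count(x, y, z, rank):
    """Number of stored values for the factors (hypothetical helper)."""
    return rank * ((x + y * z) + (y + x * z) + (z + x * y))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, Y, Z, R = 8, 9, 10, 4
    grid = vm_reconstruct(
        rng.standard_normal((R, X)), rng.standard_normal((R, Y, Z)),
        rng.standard_normal((R, Y)), rng.standard_normal((R, X, Z)),
        rng.standard_normal((R, Z)), rng.standard_normal((R, X, Y)),
    )
    print(grid.shape)
    print(vm_param_count(128, 128, 128, 16), "factor values vs", 128 ** 3, "dense voxels")
```

The factors grow with the square of the grid resolution while a dense grid grows with its cube, which is where the lower memory footprint comes from.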
🔼 GAN-meshes without key-points 🔼

👉ETH unveils a GAN framework for generating textured triangle meshes without annotations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Generation of textured meshes
3D generator for all categories
3D pose estimation framework
Code licensed under MIT License

More: https://bit.ly/3qfH9nJ
🤩3🤯2👍1🔥1
🐯 S.S. Latent Image Animator 🐯

👉Self-supervised autoencoder to animate unseen images by linear navigation in latent space

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latent Image Animator
Linear displacement in latent space
SOTA: VoxCeleb, Taichi, TED-talk
Source code available soon

More: https://bit.ly/36pgLAC
👍5🔥3🤯2💩1
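
A rough sketch of what "linear displacement in latent space" can look like, assuming the motion is a weighted sum of orthonormal directions added to a source code. The random QR basis below is a stand-in for learned directions, and all names and sizes are hypothetical rather than the authors' implementation.

```python
import numpy as np

def orthonormal_directions(latent_dim, num_dirs, seed=0):
    """Return (num_dirs, latent_dim) orthonormal rows; a stand-in for learned directions."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((latent_dim, num_dirs)))
    return q.T

def navigate(z_source, directions, magnitudes):
    """Move a latent code linearly: z_source + sum_i magnitudes[i] * directions[i]."""
    return z_source + magnitudes @ directions

if __name__ == "__main__":
    dirs = orthonormal_directions(latent_dim=512, num_dirs=20)
    z_src = np.zeros(512)
    mags = np.linspace(-1.0, 1.0, 20)
    z_new = navigate(z_src, dirs, mags)
    # With an orthonormal basis, projecting the displacement recovers the magnitudes.
    print(np.allclose((z_new - z_src) @ dirs.T, mags))
```

In a real model an encoder would predict the magnitudes from a driving frame and a decoder would render the moved code; both are omitted here.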
🪨 Google URF for neural-synthesis 🪨

👉Sequence of RGB + Lidar -> 3D surfaces and novel RGB images synthesized

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Extending Neural Radiance Fields
Leveraging asynch. lidar data
Addressing exposure variation
Leveraging segmentations for sky
SOTA #3D reconstruction/synthesis

More: https://bit.ly/3L2vTDb
🔥11👍4👏1🤯1
🚛 AV2: next-gen. self driving 🚛

👉One of the biggest datasets ever for #autonomousdriving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
1k seq. of multimodal data
3D annotations, 26 categories
20k lidar & map-aligned pose
250k challenging interactions
HD Map: 3D lane & crosswalk
CC BY-NC-SA 4.0 license

More: https://bit.ly/3trx3lw
🔥3👍1🤯1
🤖CaTGrasp in Clutter from Simulation🤖

👉Task-relevant grasping: trained solely in simulation with synthetic + SS. hand-object interaction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel category-level, task-relevant grasping
S.S. hand-object-contact
Tiny objects from dense clutter
Trained in simulation -> real
Source code under Apache 2.0

More: https://bit.ly/3L2YVCo
👍1🔥1
🛼 Drive & Segment without Supervision 🛼

👉Learning pixel-wise semantic seg. from non-curated data collected by cars (cameras + LiDAR) driving around a city

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Cross-modal unsupervised
Synchronized LiDAR & RGB
Object proposal on LiDAR points
SOTA, significant improvements

More: https://bit.ly/3L0wWTW
👍3🔥1🤯1
🌍 NeRF-free Neural Rendering 🌍

👉A simple 2D-only method with a single pass of a neural network

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Synthesis with NO 3D reasoning
Autoregressive & masked transf.
Pose -> object, object -> pose
Attention: branching attention
Source code under MIT License

More: https://bit.ly/3JC7unt
🔥3😱2👍1🤩1
🤓👌Hey, TAKE OFF my eyeglasses! 😙👌

👉A novel framework to remove eyeglasses as well as their cast shadows from faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel mask-guided multi-step network
Leveraging 3D synthetic data only
Synthetic portraits with supervision
Eyeglasses & shadows simultaneously

More: https://bit.ly/3IvQzlf
👍7🔥1
🏥 #AI models/dataset for open surgery 🏥

👉Multi-task #AI model/dataset of real-time surgical behaviors, hands, and tools.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Annotated Videos of Open Surgery
Largest dataset of open surgery videos
2k clips and 23 procedures
12k annotations, 11k+ keypoints
Models/Dataset soon available!

More: https://bit.ly/3tvDdkK
👍8🤯1😱1
🥽 #metaverse in 1991 🥽

👉Q: is #VR the technology that developed least in the last 30 years? 🤔

Discussion: https://bit.ly/3txWF07
👍3🤬3🥰1🤯1
🫕NeRFusion: Large-Scale Reconstruction🫕

👉Efficient large-scale reconstruction & photo-realistic rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Frame-by-frame R.F.
Neural reconstruction
Real-time at 20+ fps
SOTA on indoor / objects

More: https://bit.ly/3iyfoCo
🤯7🔥4👍3👏2
ORViT for video understanding tasks

👉ORViT: object-centric approach that extends ViT layers incorporating object representations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Spatio-temporal through the net
"Object-Region Attention"
"Object-Dynamics" module
Code just released! Apache 2.0

More: https://bit.ly/3wAUavW
🔥5👍3😱2👏1
🪅Insane Neural Sketching from #MIT🪅

👉Line drawing generation as unsupervised image translation with various losses

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unpaired method for line drawing
Geometry loss to predict depth
Semantic loss to match CLIP feats
SOTA on unpaired translation/generation
Code and Models under MIT License

More: https://bit.ly/36JRr8A
🤯7🔥41👍1🥰1👏1😁1
🏔️MPS-Net: new SOTA for #3D human🏔️

👉MPS-Net: accurate & temporally coherent 3D human pose/shape from video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
MoCA: visual cues from motion
HAFI to mix past/future feats
Stronger temporal correlation
SOTA on multiple datasets

More: https://bit.ly/3uAI5EB
🤯9🔥1🥰1😱1
🤿Transfiner: hyper-detailed segmentation🤿

👉Mask Transfiner: #AI for HQ & efficient instance segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Transfiner: HQ segmentation
HQ seg. via quadtree structure
SOTA & extreme details
Code under MIT License

More: https://bit.ly/3KVzseM
👍5🔥3🤯1
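
To give a feel for the quadtree idea in the highlights above, here is a toy sketch that subdivides a binary mask and keeps only the blocks mixing foreground and background, i.e. the boundary regions where extra detail matters. Using "mixed" blocks as the refinement criterion is an assumption made for illustration (the actual method detects its own error-prone regions), and the function name is hypothetical.

```python
import numpy as np

def mixed_leaves(mask, y=0, x=0, size=None, min_size=2):
    """Return (y, x, size) blocks of min_size that contain both 0s and 1s.

    Assumes a square mask whose side is a power of two. Uniform blocks are
    dropped early, so only boundary regions are subdivided further.
    """
    if size is None:
        size = mask.shape[0]
    block = mask[y:y + size, x:x + size]
    if block.min() == block.max():   # uniform block: nothing to refine
        return []
    if size <= min_size:             # finest level reached: keep as a leaf
        return [(y, x, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += mixed_leaves(mask, y + dy, x + dx, half, min_size)
    return leaves

if __name__ == "__main__":
    mask = np.zeros((8, 8), dtype=np.uint8)
    mask[:, :3] = 1                  # vertical edge between columns 2 and 3
    leaves = mixed_leaves(mask)
    print(len(leaves), "of", (8 // 2) ** 2, "blocks need refinement")
```

Only the blocks straddling the edge survive, so a refinement step would touch a small fraction of the image.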
🥙 DualStyleGAN: SOTA in style transfer🥙

👉Flexible control of dual styles of face domain and extended artistic portrait domain

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
High-resolution (1024×1024)
Intrinsic/extrinsic style path
Hierarchical style manipulation
Novel progressive fine-tuning
Source code under MIT License

More: https://bit.ly/3uS26Xp
👍11🤩4🔥1
🍚 GTR: Global Tracking Transformers 🍚

👉UTexas + Apple: transformer for global multi-object tracking

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
GTR operates on any object
Few frames->global trajectories
SOTA on detectors for any object
Code under Apache License 2.0

More: https://bit.ly/3DiqkxF
🔥7👍2🤯2😱1
🧠E2E Perception for #selfdrivingcars🧠

👉HybridNets: multi-task net with several key optimizations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
End-to-end perception network
Traffic object & lane detection
Drivable area segmentation
Real-time on embedded systems
Source code under MIT License

More: https://bit.ly/3JMk8Az
👍84👏2🤯1😱1