AI with Papers - Artificial Intelligence & Deep Learning
All the AI, with papers. Fresh daily updates on Deep Learning, Machine Learning, and Computer Vision (with papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🐲A novel AI-controllable synthesis🐲

👉Modeling local semantic parts separately and synthesizing images in a compositional way

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Structure & texture locally controlled
Disentanglement between areas
Fine-grained editing of images
Extendible via transfer learning
Just accepted to #CVPR2022

More: https://bit.ly/3IBgkBy
🥣 #AI-Generation with Dream Fields 🥣

👉Neural rendering with multi-modal image and text representations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Aligned image & text models
3D from natural language
No additional data
Dream Fields (D.F.) neural scene representation

More: https://bit.ly/3Mhwm5D
🟪 Mip-NeRF 360 for unbounded scenes 🟪

👉An extension of NeRF to overcome the challenges presented by unbounded scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Realistic synthesized views
Intricate/unbounded scenes
Detailed depth maps
Mean-squared error reduced by 54%
No code provided 😥

More: https://bit.ly/36ZxsD4
🐓 PINA: personal Neural Avatar 🐓

👉A novel method to acquire neural avatars from RGB-D videos

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
A virtual copy of the user
Realistic clothing deformations
Shape & non-rigid deformation
Avatars from RGB-D sequences
Creative Commons Zero v1.0

More: https://bit.ly/3HAtRIh
🐦 EfficientVIS: new SOTA for VIS 🐦

👉Simultaneous classification, segmentation, and tracking of multiple object instances in videos

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Efficient and fully end-to-end
Iterative query-video interaction
First RoI-wise clip-level RT-VIS
Requires 15× fewer epochs

More: https://bit.ly/3KfqurN
🐠#AI-clips from single frame🐠

👉Moving objects in #3D while generating a video by a sequence of desired actions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Playable environments
A single starting image🤯
Controllable camera
Unsupervised learning

More: https://bit.ly/35VDrYO
🧊Kubric: AI dataset generator🧊

👉Open-source #Python framework for photo-realistic scenes: full control, rich annotations, TBs of fresh data 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Synthetic datasets with GT
From NeRF to optical flow
Full control over data
No privacy or licensing concerns
Apache License 2.0

More: https://bit.ly/3hQCaFs
🪂µTransfer for enormous NNs 🪂

👉Microsoft unveils how to tune enormous neural networks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
New hyperparameter (HP) tuning: µTransfer
Zero-shot transfer to the full model
Outperforming BERT-large
Outperforming 6.7B GPT-3
Code under MIT license

More: https://bit.ly/3qc37Ij
🐧Semantic segmentation via text-only supervision🐧

👉GroupViT with a text encoder on a large-scale image-text dataset: semantic segmentation without any pixel-level annotations in training!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Hierarchical Grouping Vision Transformer
Additional text encoder
NO pixel-level annotations
Zero-shot semantic segmentation
Source code available soon

More: https://bit.ly/3hPGeWr
4D-Net: LiDAR + RGB synchronization

👉Google unveils 4D-Net to combine 3D LiDAR and onboard RGB camera

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Point clouds/images in time
Fusing multiple modalities in 4D
Novel sampling for 3D point clouds in time
New SOTA for 3D detection

More: https://bit.ly/3hZCFwN
🐌 New SOTA in video synthesis! 🐌

👉Snap unveils a novel multimodal video generation framework via text/images

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multimodal video generation
Bidirectional transformer
Video token trained with self-learning
Text augmentation for robustness
Longer sequence synthesis

More: https://bit.ly/3hZLXsG
🎁 StyleNeRF source code is out 🎁

👉3D consistent photo-realistic image synthesis

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
NeRF + style generator
3D consistency for high-resolution images
Novel regularization loss
Camera control on styles

More: https://bit.ly/3t5xC49
🦎CLD-based generative #AI by #Nvidia🦎

👉Nvidia unveils a novel critically-damped Langevin diffusion (CLD) for synthetic data

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
A novel diffusion process for SGMs
Novel score matching objective for CLD
Hybrid denoising score matching
Efficient sampling from CLD model
Source code under a specific license

More: https://bit.ly/35MToBe
🛸UFO: segmentation @140+ FPS🛸

👉Unified Transformer Framework for Co-Segmentation, Co-Saliency & Salient Object Detection. All in one!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Unified framework for co-segmentation
Co-segmentation, co-saliency, saliency
Block for long-range dependencies
Reaches 140 FPS at inference
The new SOTA on multiple datasets
Source code under MIT License

More: https://bit.ly/3KLd9b9
👗 Multi-GANs fashion 👗

👉Global GAN blended with other GANs for faces, shoes, etc.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multi-GAN framework
Several generators
Free of artifacts
Full-body generation
Humans, 1024x1024

More: https://bit.ly/37mfOte
🚧 FLAG: #3D Avatar Generation 🚧

👉A flow-based generative model of the 3D human body from sparse observations.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
FLow-based Avatar Generative model (FLAG)
Conditional distribution of body pose
Exact pose likelihood process
Invertibility -> oracle latent code

More: https://bit.ly/3CQpk3p
💃 Dancing in the wild with StyleGAN 💃

👉StyleGAN-based animations for AR/VR apps

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Video based motion retargeting
Based on a StyleGAN architecture
Novel explicit motion representation
SOTA qualitatively & quantitatively

More: https://bit.ly/3CZbL1W
🪀TensoRF: the 4D evolution of NeRF 🪀

👉TensoRF models a radiance field as a 4D tensor: a 3D voxel grid with per-voxel multi-channel features.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
VM decomposition technique
Low-rank tensor factorization
Lower memory footprint, faster reconstruction
TensoRF is the new SOTA in radiance fields
Code under the MIT License

More: https://bit.ly/3qffZgI
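
To make the low-rank idea in the TensoRF post concrete, here is a minimal pure-Python sketch of a vector-matrix (VM) style factorization of a scalar 3D grid. It is an illustration under stated assumptions, not the TensoRF code: the function names (`vm_value`, `dense_param_count`, `vm_param_count`), the cubic grid of side `n`, and the single scalar channel are all hypothetical simplifications. Each rank component pairs a vector along one axis with a matrix over the other two axes.

```python
def vm_value(factors, i, j, k):
    """Reconstruct one voxel value from VM-style factors (illustrative only).

    `factors` is a list of rank components. Each component is a dict with
    three vectors (vx, vy, vz) and three matrices (myz, mxz, mxy), where
    each vector is paired with the matrix over the two remaining axes.
    """
    total = 0.0
    for comp in factors:
        total += comp["vx"][i] * comp["myz"][j][k]
        total += comp["vy"][j] * comp["mxz"][i][k]
        total += comp["vz"][k] * comp["mxy"][i][j]
    return total


def dense_param_count(n):
    """Number of values stored by a dense n x n x n scalar grid."""
    return n ** 3


def vm_param_count(n, rank):
    """Values stored by the factors: per component, 3 vectors and 3 matrices."""
    return 3 * rank * (n + n * n)
```

Under these assumptions the factors grow with the square of the grid side rather than its cube, which is where the memory saving in the highlights comes from; the rank controls the trade-off between compactness and fidelity.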
🔼 GAN-meshes without key-points 🔼

👉ETH unveils a GAN framework for generating textured triangle meshes without annotations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Generation of textured meshes
3D generator for all categories
3D pose estimation framework
Code licensed under MIT License

More: https://bit.ly/3qfH9nJ
🐯 Self-Supervised Latent Image Animator 🐯

👉Self-supervised autoencoder that animates unseen images by linear navigation in the latent space

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latent Image Animator
Linear displacement in the latent space
SOTA: VoxCeleb, Taichi, TED-talk
Source code available soon

More: https://bit.ly/36pgLAC
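
As a rough illustration of what "linear navigation in latent space" can mean, the sketch below moves a source latent code along a weighted sum of direction vectors. It is a toy under stated assumptions, not the Latent Image Animator implementation: the function name `navigate`, the plain-list latent codes, and the hand-picked directions and magnitudes are all hypothetical. In a real model the directions would be learned, and an encoder and a generator would sit before and after this step.

```python
def navigate(z_source, directions, magnitudes):
    """Return z_source + sum_i magnitudes[i] * directions[i] (toy example).

    z_source:   latent code of the source image, as a list of floats
    directions: list of direction vectors, each the same length as z_source
    magnitudes: one scalar per direction, saying how far to move along it
    """
    if len(directions) != len(magnitudes):
        raise ValueError("need exactly one magnitude per direction")
    z = list(z_source)  # copy so the source code is left untouched
    for direction, magnitude in zip(directions, magnitudes):
        if len(direction) != len(z):
            raise ValueError("direction length must match latent length")
        for idx, component in enumerate(direction):
            z[idx] += magnitude * component
    return z
```

In this toy setting a driving frame is summarized by nothing more than the list of magnitudes, so animating a new image amounts to reusing those magnitudes with a different source code.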