AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🐠#AI-clips from single frame🐠

👉Moving objects in #3D while generating a video from a sequence of desired actions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Playable environments
✅A single starting image🤯
✅Controllable camera
✅Unsupervised learning

More: https://bit.ly/35VDrYO
🧊Kubric: AI dataset generator🧊

👉Open-source #Python framework for photo-realistic scenes: full control, rich annotations, TBs of fresh data 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Synthetic datasets with GT
✅From NeRF to optical flow
✅Full control over data
✅Ok privacy & licensing
✅Apache License 2.0

More: https://bit.ly/3hQCaFs
🪂µTransfer for enormous NNs 🪂

👉Microsoft unveils how to tune enormous neural networks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅New HP tuning: µTransfer
✅Zero-shot transfer to full-model
✅Outperforming BERT-large
✅Outperforming 6.7B GPT-3
✅Code under MIT license

More: https://bit.ly/3qc37Ij
🐧Semantic segmentation via text-only supervision🐧

👉GroupViT with a text encoder on a large-scale image-text dataset: semantic segmentation without any pixel-level annotations in training!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hierarc. Grouping Vision Transf.
✅Additional text encoder
✅NO pixel-level annotations
✅Semantic-seg task via zero-shot
✅Source code available soon

More: https://bit.ly/3hPGeWr
⌚4D-Net: Lidar + RGB synchronization⌚

👉Google unveils 4D-Net to combine 3D LiDAR and onboard RGB camera

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Point clouds/images in time
✅Fusing multiple modalities in 4D
✅Novel sampling for 3D P.C. in time
✅New SOTA for 3D detection

More: https://bit.ly/3hZCFwN
🐌 New SOTA in video synthesis! 🐌

👉Snap unveils a novel multimodal video generation framework via text/images

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multimodal video generation
✅Bidirectional transformer
✅Video token with self-learn.
✅Text augmentation for robustness
✅Longer sequence synthesis

More: https://bit.ly/3hZLXsG
🎁 StyleNeRF source code is out 🎁

👉3D consistent photo-realistic image synthesis

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NeRF + style generator
✅3D consistency for HD image
✅Novel regularization loss
✅Camera control on styles

More: https://bit.ly/3t5xC49
🦎CLD-based generative #AI by #Nvidia🦎

👉Nvidia unveils a novel critically-damped Langevin diffusion (CLD) for synthetic data

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅A novel diffusion process for SGMs
✅Novel score matching obj. for CLD
✅Hybrid denoising score matching
✅Efficient sampling from CLD model
✅Source code under a specific license

More: https://bit.ly/35MToBe
🛸UFO: segmentation @140+ FPS🛸

👉Unified Transformer Framework for Co-Segmentation, Co-Saliency & Salient Object Detection. All in one!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unified framework for co-segmentation
✅Co-segmentation, co-saliency, saliency
✅Block for long-range dependencies
✅Reaches 140 FPS at inference
✅The new SOTA on multiple datasets
✅Source code under MIT License

More: https://bit.ly/3KLd9b9
👗 Multi-GANs fashion 👗

👉Global GAN blended with other GANs for faces, shoes, etc.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-GAN framework
✅Several generators
✅Free of artifacts
✅Full-body generation
✅Humans, 1024x1024

More: https://bit.ly/37mfOte
🚧 FLAG: #3D Avatar Generation 🚧

👉A flow-based generative model of the 3D human body from sparse observations.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅FLow-based Avatar Generative
✅Conditional distro of body pose
✅Exact pose likelihood process
✅Invertibility -> oracle latent code

More: https://bit.ly/3CQpk3p
💃 Dancing in the wild with StyleGAN 💃

👉StyleGAN-based animations for AR/VR apps

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video-based motion retargeting
✅StyleGAN-based architecture
✅Novel explicit motion representation
✅SOTA qualitatively & quantitatively

More: https://bit.ly/3CZbL1W
🪀TensoRF: the 4D evolution of NeRF 🪀

👉TensoRF: novel radiance fields modeled as a 4D tensor, i.e. a 3D voxel grid with per-voxel multi-channel feats.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅VM decomposition technique
✅Low-rank tensor factorization
✅Lower memory footprint (speed)
✅TensoRF is the new SOTA in R.F.
✅Code under the MIT License

More: https://bit.ly/3qffZgI
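
To give a feel for why a low-rank factorization of a 4D feature grid saves memory, here is a minimal NumPy sketch. It uses a plain CP-style factorization (a sum of rank-one terms), not the VM decomposition mentioned above, and the sizes `N`, `C`, `R` are hypothetical values picked only for illustration.

```python
import numpy as np

# Hypothetical sizes for illustration only (not taken from the paper).
N, C, R = 32, 4, 8  # grid resolution per axis, feature channels, rank

rng = np.random.default_rng(0)

# One factor matrix per spatial axis (x, y, z) plus one for the channels.
fx, fy, fz = (rng.standard_normal((N, R)) for _ in range(3))
fc = rng.standard_normal((C, R))

# Rebuild the dense 4D tensor (3D voxel grid x channels) as a sum of
# R rank-one terms. A real model would keep only the factors.
dense = np.einsum("ir,jr,kr,cr->ijkc", fx, fy, fz, fc)

# Features of a single voxel can be read from the factors directly,
# without ever materialising the dense grid.
voxel_feats = (fx[1] * fy[2] * fz[3]) @ fc.T

dense_params = dense.size                                # N * N * N * C
factor_params = fx.size + fy.size + fz.size + fc.size    # R * (3 * N + C)

print(dense.shape, dense_params, factor_params)
```

With these toy sizes the dense grid holds 131072 values while the factors hold 800, which is the kind of gap the "lower memory footprint" highlight refers to.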
🔼 GAN-meshes without key-points 🔼

👉ETH unveils a GAN framework for generating textured triangle meshes without annotations

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Generation of textured meshes
✅3D generator for all categories
✅3D pose estimation framework
✅Code licensed under MIT License

More: https://bit.ly/3qfH9nJ
🐯 S.S. Latent Image Animator 🐯

👉Self-supervised autoencoder to animate unseen images by linear navigation in latent space

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Latent Image Animator
✅Linear displacement in latent space
✅SOTA: VoxCeleb, Taichi, TED-talk
✅Source code (soon) available

More: https://bit.ly/36pgLAC
🪨 Google URF for neural-synthesis 🪨

👉Sequence of RGB + Lidar -> 3D surfaces and novel RGB images synthesized

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Extending Neural Radiance Fields
✅Leveraging asynch. lidar data
✅Addressing exposure variation
✅Leveraging segmentations for sky
✅SOTA #3D reconstruction/synthesis

More: https://bit.ly/3L2vTDb
🚛 AV2: next-gen. self driving 🚛

👉One of the biggest datasets ever for #autonomousdriving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅1k seq. of multimodal data
✅3D annotations, 26 categories
✅20k lidar & map-aligned pose
✅250k challenging interactions
✅HD Map: 3D lane & crosswalk
✅CC BY-NC-SA 4.0 license

More: https://bit.ly/3trx3lw
🤖CaTGrasp in Clutter from Simulation🤖

👉Task-relevant grasping: trained solely in simulation with synthetic + S.S. hand-object interaction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel category-level, task-relevant grasping
✅S.S. hand-object-contact
✅Tiny objects from dense clutter
✅Trained in simulation -> real
✅Source code under Apache 2.0

More: https://bit.ly/3L2YVCo
🛼 Drive & Segment without Supervision 🛼

👉Learning pixel-wise semantic seg. on non-curated data collected by cars (cameras + LiDAR) driving around a city

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Cross-modal unsupervised
✅Synchronized LiDAR & RGB
✅Object proposal on LiDAR points
✅SOTA, significant improvements

More: https://bit.ly/3L0wWTW
🌍 NeRF-free Neural Rendering 🌍

👉A simple 2D-only method with a single pass of a neural network

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Synthesis with NO 3D reasoning
✅Autoregressive & masked transf.
✅Pose -> object, object -> pose
✅Attention: branching attention
✅Source code under MIT License

More: https://bit.ly/3JC7unt