AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🔥Neighborhood Attention Transformer 🔥

👉A novel transformer for both image classification and downstream vision tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Neighborhood Attention (NA)
✅Neighborhood Attention Transformer, NAT
✅Faster training/inference, good throughput
✅Checkpoints, training code, #CUDA kernel available

More: https://bit.ly/3F5aVSo
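
To make the Neighborhood Attention idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a 1D token sequence instead of a 2D feature map, skips the learned projections and position biases, and the function name `neighborhood_attention_1d` is made up for illustration. Each token attends only to a small window of nearby tokens; near the borders the window is shifted inward so every token still sees the same number of neighbors.

```python
import numpy as np

def neighborhood_attention_1d(q, k, v, kernel_size=3):
    """Toy 1D neighborhood attention (hypothetical helper, not the NAT code).

    q, k: (n, d) arrays; v: (n, dv) array. Token i attends only to a
    window of `kernel_size` tokens around it. At the borders the window
    is shifted inward, so each token always has `kernel_size` neighbors.
    """
    n, d = q.shape
    assert kernel_size % 2 == 1 and kernel_size <= n
    half = kernel_size // 2
    out = np.empty_like(v, dtype=float)
    for i in range(n):
        # Clamp the window start so the window stays inside [0, n).
        start = min(max(i - half, 0), n - kernel_size)
        idx = slice(start, start + kernel_size)
        scores = q[i] @ k[idx].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # stable softmax
        weights /= weights.sum()
        out[i] = weights @ v[idx]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(8, 4)) for _ in range(3))
    print(neighborhood_attention_1d(q, k, v, kernel_size=3).shape)
```

When `kernel_size` equals the sequence length, the sketch reduces to ordinary global self-attention, which is a handy sanity check.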
🔥🔥FANs: Fully Attentional Networks🔥🔥

👉#Nvidia unveils Fully Attentional Networks (FANs)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient fully attentional design
✅Semantic seg. & object detection
✅Model/source code available soon!

More: https://bit.ly/3vtpITs
👨🏼‍🎨 Open-Source DALL·E 2 is out 👨🏼‍🎨

👉#PyTorch implementation of DALL·E 2, #OpenAI's latest text-to-image neural net.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA for text-to-image generation
✅Source code/model under MIT License
✅"Medieval painting of wifi not working"

More: https://bit.ly/3vzsff6
⛺ViTPose: Transformer for Pose⛺

👉ViTPose from ViTAE, ViT for human pose

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Plain, non-hierarchical ViT for pose
✅Deconv layers after ViT for keypoints
✅Even the plain baseline is the new SOTA
✅Source code & models available soon!

More: https://bit.ly/3MJ0kz1
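
For readers wondering what happens after the deconv layers: heatmap-based pose heads usually output one heatmap per keypoint, and the keypoint is read off at the peak. The NumPy sketch below shows only that decoding step. It is an illustration under assumptions, not ViTPose code: the helper name `decode_heatmaps` and the `stride=4` ratio between input image and heatmap are hypothetical choices.

```python
import numpy as np

def decode_heatmaps(heatmaps, stride=4):
    """Read keypoints off per-joint heatmaps (hypothetical helper).

    heatmaps: (K, H, W) array, one map per keypoint.
    stride:   assumed ratio between input-image and heatmap resolution.
    Returns (K, 2) array of (x, y) in input pixels and (K,) peak scores.
    """
    k, h, w = heatmaps.shape
    flat = heatmaps.reshape(k, -1)
    idx = flat.argmax(axis=1)                 # peak index per keypoint
    ys, xs = np.unravel_index(idx, (h, w))    # back to 2D coordinates
    coords = np.stack([xs, ys], axis=1).astype(float) * stride
    scores = flat.max(axis=1)
    return coords, scores

if __name__ == "__main__":
    hm = np.zeros((2, 64, 48))
    hm[0, 10, 20] = 1.0   # keypoint 0 peaks at row 10, column 20
    hm[1, 5, 7] = 0.5     # keypoint 1 peaks at row 5, column 7
    coords, scores = decode_heatmaps(hm)
    print(coords, scores)
```

Real pipelines typically add sub-pixel refinement around the peak; that is left out here to keep the sketch short.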
🧳 Unsupervised HD Motion Transfer 🧳

👉Novel E2E unsupervised motion transfer for image animation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅TPS motion estimation + Dropout
✅Novel E2E unsupervised motion transfer
✅Optical flow + multi-res. occlusion mask
✅Code and models under MIT license

More: https://bit.ly/3MGNPns
🚀 Neural Self-Calibration in the wild 🚀

👉 Learning algorithm to regress calibration params from in-the-wild clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Params purely from self-supervision
✅S.S. depth/pose learning as objective
✅Perspective, fisheye, catadioptric: no changes
✅SOTA results on EuRoC MAV dataset

More: https://bit.ly/3w1n6LB
🦅 ConDor: S.S. Canonicalization 🦅

👉Self-Supervised Canonicalization for full/partial 3D point clouds

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RRC + Stanford + KAIST + Brown
✅On top of Tensor Field Networks (TFNs)
✅Unseen 3D -> equivariant canonical pose
✅Co-segmentation, NO supervision
✅Code and model under MIT license

More: https://bit.ly/3MNDyGa
🦀 Event-aided Direct Sparse Odometry 🦀

👉EDS: direct monocular visual odometry using events/frames

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mono 6-DOF visual odometry + events
✅Direct photometric bundle adjustment
✅Camera motion tracking by sparse pixels
✅A new dataset with HQ events and frames

More: https://bit.ly/3s9FiBN
🫀BlobGAN: Blob-Disentangled Scene🫀

👉Unsupervised, mid-level (blobs) generation of scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Spatial, depth-ordered Gaussian blobs
✅Reaching for supervised level, and more
✅Source under BSD-2 "Simplified" License

More: https://bit.ly/3kRyGnj
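
A rough picture of what "spatial, depth-ordered Gaussian blobs" can mean in practice: each blob is a soft opacity map with a feature vector, and blobs are alpha-composited from back to front. The NumPy sketch below is a simplification under assumptions, not BlobGAN's code. Blobs here are isotropic Gaussians, the list order is taken as the depth order, and `splat_blobs` is a hypothetical name.

```python
import numpy as np

def splat_blobs(centers, scales, features, size=33):
    """Composite isotropic Gaussian blobs back to front (toy sketch).

    centers:  (K, 2) blob centers as (x, y) in [0, 1].
    scales:   (K,) Gaussian standard deviations.
    features: (K, C) feature vector carried by each blob.
    Blobs are assumed to be listed from farthest to nearest.
    """
    ys, xs = np.mgrid[0:size, 0:size] / (size - 1)   # pixel grid in [0, 1]
    canvas = np.zeros((size, size, features.shape[1]))
    for (cx, cy), s, f in zip(centers, scales, features):
        d2 = (xs - cx) ** 2 + (ys - cy) ** 2
        alpha = np.exp(-d2 / (2 * s ** 2))[..., None]  # soft opacity map
        canvas = alpha * f + (1 - alpha) * canvas      # "over" compositing
    return canvas

if __name__ == "__main__":
    centers = np.array([[0.5, 0.5], [0.5, 0.5]])
    scales = np.array([0.2, 0.1])
    features = np.array([[1.0, 0.0], [0.0, 1.0]])
    grid = splat_blobs(centers, scales, features)
    print(grid.shape, grid[16, 16])
```

Because later blobs are drawn over earlier ones, moving a blob in the list changes which object occludes which, and moving its center shifts that object in the layout.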
🦕E2EVE editor via pre-trained artist🦕

👉E2EVE generates a new version of the source image that resembles the "driver" one

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Blending regions by driver image
✅E2E cond-probability of the edits
✅S.S. augmenting in target domain
✅Implemented as SOTA transformer
✅Code/models available (soon)

More: https://bit.ly/3P9TDYW
🐢 Bringing pets in #metaverse 🐢

πŸ‘‰ARTEMIS: pipeline for generating articulated neural pets for virtual worlds

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…ARTiculated, appEarance, Mo-synthesIS
βœ…Motion control, animation & rendering
βœ…Neural-generated (NGI) animal engine
βœ…SOTA animal mocap + neural control

More: https://bit.ly/3LZSLDU
❀4πŸ‘2πŸ₯°2🀩1
This media is not supported in your browser
VIEW IN TELEGRAM
😍Animated hand in 1972, damn romantic😍

πŸ‘‰Q: is #VR the technology that developed least in the last 30 years? πŸ€”

More: https://bit.ly/3snxNaq
⏏️Ensembling models for GAN training⏏️

👉Pretrained vision models improve GAN training: FID improves by 1.5 to 2×!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅CV models as ensemble of discriminators
✅Improving GANs in limited and large-scale settings
✅10k samples match StyleGAN2 trained on 1.6M
✅Source code / models under MIT license

More: https://bit.ly/3wgUVsr
🤯Cooperative Driving + AUTOCASTSIM🤯

👉COOPERNAUT: cross-vehicle perception for vision-based cooperative driving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅UTexas + #Stanford + #Sony #AI
✅LiDAR encoded into compact point-based representations
✅Network-augmented simulator
✅Source code and models available

More: https://bit.ly/3sr5HLk
💄NeuralHDHair: 3D Neural Hair💄

👉NeuralHDHair: fully automatic system for modeling HD hair from a single image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅IRHairNet for hair geometric features
✅GrowingNet: 3D hair strands in parallel
✅VIFu: novel voxel-aligned implicit function
✅SOTA in 3D hair modeling from single pic

More: https://bit.ly/38iR0mQ
🐑DyNeRF: Neural 3D Video Synthesis🐑

πŸ‘‰#Meta unveils DyNeRF, novel rendering HQ 3D video

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Novel NeRF-based on temp-latent codes
βœ…Novel training based on hierarchical step
βœ…Datasets of time-synch/calibrated clips
βœ…Attribution-NonCommercial 4.0 Int.

More: https://bit.ly/3MlBRA9
🐋GATO: agent for multiple tasks🐋

👉The same network with the same weights can play Atari, caption pics, chat, and more🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅General-purpose agent, multiple tasks
✅Multi-modal, multi-task, multi-embodiment
✅Inspired by large-scale language models

More: https://bit.ly/3LbBOWb
🐪NeRF powered by keypoints🐪

👉ETHZ + META unveil how to encode relative spatial #3D info via sparse 3D keypoints

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Sparse 3D keypoints for SOTA avatars
✅Unseen subjects from 2 or 3 views
✅Never-before-seen iPhone captures

More: https://bit.ly/39NQqhe
🐌Self-Supervised human co-evolution🐌

πŸ‘‰Self-supervised 3D by co-evolution of pose estimator, imitator, and hallucinator

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Novel self-supervised 3D pose
βœ…Co-evo of pose, imitator, hallucinator
βœ…Realist 3D pose and 2D-3D supervision
βœ…Source code / model under MIT license

More: https://bit.ly/37J5ImL
🐲 Diff-SDF #3D Rendering 🐲

πŸ‘‰Reconstruction with no complex reg. or priors, using only a per-pixel RGB loss

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…Diff-render to optimize geometry/albedo
βœ…No ad-hoc object mask or supervision
βœ…Extended sphere tracing algorithm

More: https://bit.ly/3yKWPnI
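
As background for the "extended sphere tracing" highlight, here is the classic, non-differentiable sphere tracing loop in NumPy. It is only the textbook baseline, not the paper's extended variant. The unit-sphere SDF and the names `sphere_sdf` and `sphere_trace` are illustrative assumptions. The idea is that the SDF value at a point is a safe step size, since no surface can be closer than that distance.

```python
import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return np.linalg.norm(p - center) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-6, t_max=100.0):
    """Classic sphere tracing: march along the ray by the SDF value.

    Returns the ray parameter t at the first hit, or None on a miss.
    """
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:        # close enough to the surface: report a hit
            return t
        t += d             # safe step: nothing is nearer than d
        if t > t_max:      # ray left the scene
            break
    return None

if __name__ == "__main__":
    origin = np.array([0.0, 0.0, -3.0])
    print(sphere_trace(origin, np.array([0.0, 0.0, 1.0]), sphere_sdf))  # hits at t = 2.0
    print(sphere_trace(origin, np.array([0.0, 1.0, 0.0]), sphere_sdf))  # None (miss)
```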