AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🌹 Generalizable Neural Performer 🌹

👉General neural framework to synthesize free-viewpoint images of arbitrary human performers

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Free-viewpoint synthesis of humans
✅Implicit Geometric Body Embedding
✅Screen-Space Occlusion-Aware Blending
✅GeneBody: 4M frames, multi-view cams

More: https://cutt.ly/SGcnQzn
🚌 Tire-defect inspection 🚌

👉Unsupervised detection of defects in tires using neural networks

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Impurity, same material as tire
✅Impurity, with different material
✅Damage by temp/pressure
✅Crack or etched material

More: https://bit.ly/37GX1JT
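The reconstruction-error idea behind unsupervised defect detection can be sketched as follows. The trained autoencoder is replaced by a hypothetical stand-in reconstructor, so the names and threshold here are illustrative only, not the paper's method:

```python
import numpy as np

def defect_mask(image, reconstruct, thresh=0.1):
    """Flag pixels whose reconstruction error exceeds a threshold.

    `reconstruct` stands in for a model trained only on normal
    (defect-free) tire texture; defects reconstruct poorly and
    show up as high per-pixel error.
    """
    recon = reconstruct(image)
    error = np.abs(image - recon)   # per-pixel L1 error map
    return error > thresh           # boolean defect mask

# Toy stand-in reconstructor: always returns the flat "normal texture".
normal_texture = np.full((8, 8), 0.5)
img = normal_texture.copy()
img[3, 4] = 1.0                     # simulated impurity
mask = defect_mask(img, lambda x: normal_texture, thresh=0.2)
```

The same thresholding works regardless of defect type (impurity, crack, temperature damage), which is what makes the unsupervised setup attractive.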
âĪ5👍3ðŸĪĐ1
This media is not supported in your browser
VIEW IN TELEGRAM
🧋#4D Neural Fields🧋

👉4D Neural Field visual representations from monocular RGB-D 🤯

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅4D scene completion (occlusions)
✅Scene completion in cluttered scenes
✅Novel #AI for contextual point clouds
✅Data, code, models under MIT license

More: https://cutt.ly/6GveKiJ
👔 Largest human-object interaction dataset 👔

👉BEHAVE: the largest dataset of human-object interactions

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅8 subjects, 20 objects, 5 envs.
✅321 clips with 4 Kinect RGB-D
✅Masks and segmented point clouds
✅3D SMPL & mesh registration
✅Textured scan reconstructions

More: https://bit.ly/3Lx6NNo
🦴 ENARF-GAN Neural Articulations 🦴

👉Unsupervised method for 3D geometry-aware representation of articulated objects

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Novel efficient neural representation
✅Tri-planes deformation fields for training
✅Novel GAN for articulated representations
✅Controllable 3D from real unlabeled pic

More: https://bit.ly/3xYqedN
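The tri-plane representation these deformation fields build on can be sketched with a minimal feature lookup: project each 3D point onto the xy, xz, and yz planes and combine the sampled features. Nearest-neighbour sampling and random grids are stand-ins here; the paper's actual architecture differs:

```python
import numpy as np

def sample_triplane(planes, pts):
    """planes: dict of three (R, R, C) feature grids; pts: (N, 3) in [0, 1).
    Each 3D point is projected onto the xy, xz, and yz planes, the nearest
    feature is fetched from each grid, and the three features are summed."""
    R = planes["xy"].shape[0]
    idx = np.clip((pts * R).astype(int), 0, R - 1)  # (N, 3) grid indices
    f = (planes["xy"][idx[:, 0], idx[:, 1]]
         + planes["xz"][idx[:, 0], idx[:, 2]]
         + planes["yz"][idx[:, 1], idx[:, 2]])
    return f                                         # (N, C) point features

rng = np.random.default_rng(0)
planes = {k: rng.normal(size=(16, 16, 8)) for k in ("xy", "xz", "yz")}
feats = sample_triplane(planes, rng.random((5, 3)))
```

Three 2D grids scale as O(R²) instead of O(R³) for a dense voxel grid, which is why tri-planes are the "efficient" part of the representation.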
ðŸ–ēïļ HuMMan: 4D human dataset ðŸ–ēïļ

👉HuMMan: 4D dataset with 1000 humans, 400k sequences & 60M frames 🤯

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅RGB, pt-clouds, keypts, SMPL, texture
✅Mobile device in the sensor suite
✅500+ actions to cover movements

More: https://bit.ly/3vTRW8Z
🔥 Neighborhood Attention Transformer 🔥

👉A novel transformer for both image classification and downstream vision tasks

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Neighborhood Attention (NA)
✅Neighborhood Attention Transformer, NAT
✅Faster training/inference, good throughput
✅Checkpoints, train, #CUDA kernel available

More: https://bit.ly/3F5aVSo
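Neighborhood Attention restricts each query token to a local window instead of the full sequence. A 1-D numpy sketch with identity Q/K/V projections, a simplification: NAT uses learned projections over 2-D windows and keeps the window size fixed at image borders:

```python
import numpy as np

def neighborhood_attention_1d(x, k=3):
    """Each token attends only to its k nearest neighbors (itself
    included) rather than to all tokens as in full self-attention."""
    n, d = x.shape
    r = k // 2
    out = np.empty_like(x)
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        nb = x[lo:hi]                         # the local neighborhood
        scores = nb @ x[i] / np.sqrt(d)       # scaled dot-product
        w = np.exp(scores - scores.max())
        w /= w.sum()                          # softmax over neighbors
        out[i] = w @ nb                       # weighted sum of neighbors
    return out

x = np.random.default_rng(1).normal(size=(10, 4))
y = neighborhood_attention_1d(x, k=3)
```

Cost drops from O(n²) to O(n·k) per layer, which is where the throughput gains in the highlights come from.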
🔥🔥 FANs: Fully Attentional Networks 🔥🔥

👉#Nvidia unveils the fully attentional networks (FANs)

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Efficient fully attentional design
✅Semantic seg. & object detection
✅Model/source code soon available!

More: https://bit.ly/3vtpITs
👨‍🎨 Open-Source DALL·E 2 is out 👨‍🎨

👉#Pytorch implementation of DALL-E 2, #OpenAI's latest text-to-image neural net.

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅SOTA for text-to-image generation
✅Source code/model under MIT License
✅"Medieval painting of wifi not working"

More: https://bit.ly/3vzsff6
⛹ViTPose: Transformer for Pose⛹

👉ViTPose from ViTAE, ViT for human pose

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Plain, non-hierarchical ViT for pose
✅Deconv-layers after ViT for keypoints
✅Just the baseline is the new SOTA
✅Source code & models available soon!

More: https://bit.ly/3MJ0kz1
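The deconv head regresses one heatmap per keypoint, so decoding reduces to an argmax per map scaled back to input resolution. A minimal sketch; the output stride `patch` is a hypothetical parameter for illustration:

```python
import numpy as np

def decode_keypoints(heatmaps, patch=4):
    """Decode (K, h, w) heatmaps (one per keypoint, as a ViT + deconv
    head would produce) into (K, 2) pixel coordinates via argmax,
    mapped back to the input resolution."""
    K, h, w = heatmaps.shape
    flat = heatmaps.reshape(K, -1).argmax(axis=1)  # peak index per map
    ys, xs = np.divmod(flat, w)                    # back to 2D indices
    return np.stack([xs, ys], axis=1) * patch      # scale to input pixels

hm = np.zeros((2, 16, 16))
hm[0, 5, 7] = 1.0   # keypoint 0 peaks at (x=7, y=5) in heatmap space
hm[1, 2, 3] = 1.0
kps = decode_keypoints(hm, patch=4)   # -> [[28, 20], [12, 8]]
```

Production decoders typically add sub-pixel refinement around the peak; the hard argmax keeps the sketch short.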
🧳 Unsupervised HD Motion Transfer 🧳

👉Novel e2e unsupervised motion transfer for image animation

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅TPS motion estimation + Dropout
✅Novel E2E unsupervised motion transfer
✅Optical flow + multi-res. occlusion mask
✅Code and models under MIT license

More: https://bit.ly/3MGNPns
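TPS motion estimation builds on thin-plate spline warps fitted to control-point pairs. A self-contained 2-D TPS fit in the textbook formulation, not the paper's learned estimator:

```python
import numpy as np

def tps_fit(src, dst):
    """Fit a 2D thin-plate spline mapping src control points onto dst.
    Returns a function that warps arbitrary (N, 2) points."""
    def U(r):  # TPS radial basis: r^2 log r, with U(0) = 0
        with np.errstate(divide="ignore", invalid="ignore"):
            return np.where(r > 0, r**2 * np.log(r), 0.0)

    n = len(src)
    K = U(np.linalg.norm(src[:, None] - src[None, :], axis=-1))
    P = np.hstack([np.ones((n, 1)), src])
    L = np.block([[K, P], [P.T, np.zeros((3, 3))]])
    Y = np.vstack([dst, np.zeros((3, 2))])
    params = np.linalg.solve(L, Y)      # radial weights + affine part
    W, A = params[:n], params[n:]

    def warp(pts):
        D = U(np.linalg.norm(pts[:, None] - src[None, :], axis=-1))
        return D @ W + np.hstack([np.ones((len(pts), 1)), pts]) @ A
    return warp

src = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [0.5, 0.5]])
dst = src + np.array([0.1, 0.0])        # pure translation of controls
warp = tps_fit(src, dst)
```

The warp interpolates the control points exactly and reproduces affine motions with zero bending energy, which is why TPS is a popular motion model for image animation.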
🚤 Neural Self-Calibration in the wild 🚤

👉 Learning algorithm to regress calibration params from in-the-wild clips

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Params purely from self-supervision
✅S.S. depth/pose learning as objective
✅Perspective, fisheye, catadioptric: no changes
✅SOTA results on EuRoC MAV dataset

More: https://bit.ly/3w1n6LB
🦅 ConDor: S.S. Canonicalization 🦅

👉Self-supervised canonicalization for full/partial 3D point clouds

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅RRC + Stanford + KAIST + Brown
✅On top of Tensor Field Networks (TFNs)
✅Unseen 3D -> equivariant canonical
✅Co-segmentation, NO supervision
✅Code and model under MIT license

More: https://bit.ly/3MNDyGa
🦀 Event-aided Direct Sparse Odometry 🦀

👉EDS: direct monocular visual odometry using events/frames

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Mono 6-DOF visual odometry + events
✅Direct photometric bundle adjustment
✅Camera motion tracking by sparse pixels
✅A new dataset with HQ events and frames

More: https://bit.ly/3s9FiBN
🫀 BlobGAN: Blob-Disentangled Scene 🫀

👉Unsupervised, mid-level (blobs) generation of scenes

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Spatial, depth-ordered Gaussian blobs
✅Approaching supervised-level quality, and more
✅Source under BSD-2 "Simplified" License

More: https://bit.ly/3kRyGnj
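The blob representation can be pictured as depth-ordered Gaussian opacities composited over a grid. A toy sketch with isotropic blobs and opacity only; BlobGAN additionally splats per-blob feature vectors and uses learned aspect ratio and angle:

```python
import numpy as np

def splat_blobs(centers, scales, res=32):
    """Render depth-ordered isotropic Gaussian blobs into a (res, res)
    opacity map: later (nearer) blobs composite over earlier ones."""
    ys, xs = np.mgrid[0:res, 0:res] / res
    grid = np.stack([xs, ys], axis=-1)        # (res, res, 2) coordinates
    alpha = np.zeros((res, res))
    for c, s in zip(centers, scales):         # back-to-front order
        d2 = ((grid - c) ** 2).sum(-1)
        g = np.exp(-d2 / (2 * s**2))          # Gaussian opacity
        alpha = alpha * (1 - g) + g           # "over" compositing
    return alpha

centers = np.array([[0.3, 0.3], [0.7, 0.6]])
amap = splat_blobs(centers, scales=[0.1, 0.15])
```

Because blob positions and sizes are explicit latents, moving a blob moves the corresponding object in the generated scene, which is the disentanglement the post refers to.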
🦕 E2EVE editor via pre-trained artist 🦕

👉E2EVE generates a new version of the source image that resembles the "driver" one

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅Blending regions by driver image
✅E2E cond-probability of the edits
✅S.S. augmenting in target domain
✅Implemented as SOTA transformer
✅Code/models available (soon)

More: https://bit.ly/3P9TDYW
ðŸķ Bringing pets in #metaverse ðŸķ

👉ARTEMIS: pipeline for generating articulated neural pets for virtual worlds

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅ARTiculated, appEarance, Mo-synthesIS
✅Motion control, animation & rendering
✅Neural-generated (NGI) animal engine
✅SOTA animal mocap + neural control

More: https://bit.ly/3LZSLDU
âĪ4👍2ðŸĨ°2ðŸĪĐ1
This media is not supported in your browser
VIEW IN TELEGRAM
😍Animated hand in 1972, damn romantic😍

👉Q: is #VR the technology that has advanced the least over the last 30 years? 🤔

More: https://bit.ly/3snxNaq
⏏ïļEnsembling models for GAN training⏏ïļ

👉Pretrained vision models to improve GAN training. FID improved by 1.5 to 2×!

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅CV models as ensemble of discriminators
✅Improving GAN in limited / large-scale set
✅10k samples matches StyleGAN2 w/ 1.6M
✅Source code / models under MIT license

More: https://bit.ly/3wgUVsr
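Using frozen vision backbones as an ensemble of discriminators boils down to averaging per-backbone adversarial losses. A toy sketch with linear heads over random stand-in features; all names and shapes are illustrative, not the paper's implementation:

```python
import numpy as np

def ensemble_d_loss(feats_real, feats_fake, heads):
    """Non-saturating GAN discriminator loss averaged over an ensemble.
    Each entry pairs a frozen feature extractor's output with a small
    linear head (here just a weight vector) that scores real vs. fake."""
    def logistic(x):
        return 1.0 / (1.0 + np.exp(-x))

    losses = []
    for fr, ff, w in zip(feats_real, feats_fake, heads):
        p_real, p_fake = logistic(fr @ w), logistic(ff @ w)
        losses.append(-np.log(p_real) - np.log(1 - p_fake))
    return float(np.mean(losses))   # average over the ensemble

rng = np.random.default_rng(0)
feats_real = [rng.normal(size=8) for _ in range(3)]  # 3 frozen backbones
feats_fake = [rng.normal(size=8) for _ in range(3)]
heads = [rng.normal(size=8) for _ in range(3)]
loss = ensemble_d_loss(feats_real, feats_fake, heads)
```

Only the small heads are trained; the frozen backbones supply diverse features, which is what stabilizes training in the limited-data regime the highlights mention.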
🤯 Cooperative Driving + AUTOCASTSIM 🤯

👉COOPERNAUT: cross-vehicle perception for vision-based cooperative driving

𝐇ðĒð ðĄðĨðĒð ðĄð­ðŽ:
✅UTexas + #Stanford + #Sony #AI
✅LiDAR into compact point-based
✅Network-augmented simulator
✅Source code and models available

More: https://bit.ly/3sr5HLk