Generalizable Neural Performer
A general neural framework to synthesize free-viewpoint images of arbitrary human performers.
Highlights:
✅ Free-viewpoint synthesis of humans
✅ Implicit Geometric Body Embedding
✅ Screen-Space Occlusion-Aware Blending
✅ GeneBody: 4M frames, multi-view cams
More: https://cutt.ly/SGcnQzn
Tire-defect inspection
Unsupervised detection of tire defects using neural networks.
Highlights:
✅ Impurity, same material as the tire
✅ Impurity, with a different material
✅ Damage by temperature/pressure
✅ Crack or etched material
More: https://bit.ly/37GX1JT
#4D Neural Fields
4D neural-field visual representations from monocular RGB-D.
Highlights:
✅ 4D scene completion (occlusions)
✅ Scene completion in cluttered scenes
✅ Novel #AI for contextual point clouds
✅ Data, code, models under MIT license
More: https://cutt.ly/6GveKiJ
Largest dataset of human-object interactions
BEHAVE by Google: the largest dataset of human-object interactions.
Highlights:
✅ 8 subjects, 20 objects, 5 environments
✅ 321 clips with 4 Kinect RGB-D cameras
✅ Masks and segmented point clouds
✅ 3D SMPL & mesh registration
✅ Textured scan reconstructions
More: https://bit.ly/3Lx6NNo
ENARF-GAN: Neural Articulations
Unsupervised method for a 3D geometry-aware representation of articulated objects.
Highlights:
✅ Novel efficient neural representation
✅ Tri-plane deformation fields for training
✅ Novel GAN for articulated representations
✅ Controllable 3D from real, unlabeled pictures
More: https://bit.ly/3xYqedN
HuMMan: 4D human dataset
HuMMan: a 4D dataset with 1,000 humans, 400k sequences & 60M frames.
Highlights:
✅ RGB, point clouds, keypoints, SMPL, texture
✅ Mobile device in the sensor suite
✅ 500+ actions to cover human movements
More: https://bit.ly/3vTRW8Z
Neighborhood Attention Transformer
A novel transformer for both image classification and downstream vision tasks.
Highlights:
✅ Neighborhood Attention (NA), sketched below
✅ Neighborhood Attention Transformer, NAT
✅ Faster training/inference, good throughput
✅ Checkpoints, training code, #CUDA kernel available
More: https://bit.ly/3F5aVSo
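To make the core idea concrete, here is a minimal PyTorch sketch of neighborhood attention: each pixel's query attends only to the k×k window around it. The unfold-based gather and zero-padded borders are my simplifications (the paper ships a dedicated #CUDA kernel and treats borders differently); names are illustrative, not from the repo.

```python
import torch
import torch.nn.functional as F

def neighborhood_attention(x, k=7):
    """Single-head neighborhood attention over a (B, H, W, C) feature map."""
    B, H, W, C = x.shape
    scale = C ** -0.5
    q = x.reshape(B, H * W, 1, C)                              # one query per pixel
    # Gather every pixel's k x k neighborhood with unfold (zero-padded borders).
    cols = F.unfold(x.permute(0, 3, 1, 2), k, padding=k // 2)  # (B, C*k*k, H*W)
    kv = cols.reshape(B, C, k * k, H * W).permute(0, 3, 2, 1)  # (B, HW, k*k, C)
    attn = ((q * scale) @ kv.transpose(-2, -1)).softmax(-1)    # (B, HW, 1, k*k)
    return (attn @ kv).reshape(B, H, W, C)

y = neighborhood_attention(torch.randn(1, 14, 14, 64))         # -> (1, 14, 14, 64)
```

Restricting keys/values to a local window is what gives NA linear cost in image size while keeping translational equivariance, unlike block-partitioned window attention.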
FANs: Fully Attentional Networks
#Nvidia unveils the Fully Attentional Networks (FANs).
Highlights:
✅ Efficient fully attentional design
✅ Semantic segmentation & object detection
✅ Model/source code available soon!
More: https://bit.ly/3vtpITs
Open-Source DALL·E 2 is out
#Pytorch implementation of DALL·E 2, #OpenAI's latest text-to-image neural net.
Highlights:
✅ SOTA for text-to-image generation
✅ Source code/model under MIT License
✅ "Medieval painting of wifi not working"
More: https://bit.ly/3vzsff6
ViTPose: Transformer for Pose
ViTPose, from the ViTAE authors: a plain ViT for human pose estimation.
Highlights:
✅ Plain, non-hierarchical ViT for pose
✅ Deconv layers after the ViT for keypoints (sketch below)
✅ Just the baseline is the new SOTA
✅ Source code & models available soon!
More: https://bit.ly/3MJ0kz1
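The "deconv layers after the ViT" recipe is simple enough to sketch. Below is a hedged PyTorch version, assuming a plain ViT that outputs (B, N, C) patch tokens; module names, channel widths, and the 17-keypoint COCO head are my choices, not the official code.

```python
import torch
import torch.nn as nn

class DeconvKeypointHead(nn.Module):
    """Upsample ViT patch tokens into per-keypoint heatmaps (minimal sketch)."""
    def __init__(self, embed_dim=768, num_keypoints=17):
        super().__init__()
        self.deconv = nn.Sequential(                      # 2x spatial upsampling, twice
            nn.ConvTranspose2d(embed_dim, 256, 4, stride=2, padding=1),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 256, 4, stride=2, padding=1),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(256, num_keypoints, 1)      # 1x1 conv -> heatmaps

    def forward(self, tokens, hw):
        # tokens: (B, N, C) patch tokens; hw: (h, w) patch grid with h*w == N.
        B, N, C = tokens.shape
        feat = tokens.transpose(1, 2).reshape(B, C, *hw)  # tokens back to a 2D grid
        return self.head(self.deconv(feat))               # (B, K, 4h, 4w)

heatmaps = DeconvKeypointHead()(torch.randn(2, 192, 768), (16, 12))  # (2, 17, 64, 48)
```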
Unsupervised HD Motion Transfer
Novel end-to-end unsupervised motion transfer for image animation.
Highlights:
✅ Thin-plate spline (TPS) motion estimation + dropout (TPS fit sketched below)
✅ Novel E2E unsupervised motion transfer
✅ Optical flow + multi-resolution occlusion masks
✅ Code and models under MIT license
More: https://bit.ly/3MGNPns
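The TPS ingredient is classic and worth a sketch: fit a thin-plate spline that maps driving-frame keypoints to source-frame keypoints, then evaluate it densely to get a backward-warp grid. This is the textbook single-TPS fit the paper builds on (the method actually estimates several local TPS transforms plus affine and learns the keypoints); all names here are illustrative.

```python
import torch

def tps_warp_grid(src_kp, drv_kp, hw=(64, 64)):
    """Fit a 2D TPS mapping drv_kp -> src_kp; return an (H, W, 2) sampling grid.

    src_kp, drv_kp: (N, 2) keypoints in [-1, 1] (grid_sample convention).
    """
    N = drv_kp.shape[0]
    U = lambda r2: r2 * torch.log(r2 + 1e-9)              # TPS kernel r^2 log r^2
    d2 = ((drv_kp[:, None] - drv_kp[None]) ** 2).sum(-1)  # (N, N) pairwise dists
    P = torch.cat([drv_kp, torch.ones(N, 1)], 1)          # (N, 3) affine part
    L = torch.zeros(N + 3, N + 3)
    L[:N, :N], L[:N, N:], L[N:, :N] = U(d2), P, P.T
    Y = torch.cat([src_kp, torch.zeros(3, 2)], 0)
    params = torch.linalg.solve(L, Y)                     # radial weights + affine
    H, W = hw
    gy, gx = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    grid = torch.stack([gx, gy], -1).reshape(-1, 2)       # (HW, 2) query points
    g2 = ((grid[:, None] - drv_kp[None]) ** 2).sum(-1)    # (HW, N)
    Pg = torch.cat([grid, torch.ones(H * W, 1)], 1)
    return (U(g2) @ params[:N] + Pg @ params[N:]).reshape(H, W, 2)

# Usage: warped = F.grid_sample(src_img, tps_warp_grid(src_kp, drv_kp)[None])
```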
Neural Self-Calibration in the wild
Learning algorithm to regress camera calibration parameters from in-the-wild clips.
Highlights:
✅ Parameters purely from self-supervision
✅ Self-supervised depth/pose learning as the objective
✅ Perspective, fisheye, catadioptric: no changes needed
✅ SOTA results on the EuRoC MAV dataset
More: https://bit.ly/3w1n6LB
ConDor: Self-Supervised Canonicalization
Self-supervised canonicalization for full/partial 3D point clouds.
Highlights:
✅ RRC + Stanford + KAIST + Brown
✅ Built on top of Tensor Field Networks (TFNs)
✅ Unseen 3D shapes -> equivariant canonical frame
✅ Co-segmentation, NO supervision
✅ Code and model under MIT license
More: https://bit.ly/3MNDyGa
Event-aided Direct Sparse Odometry
EDS: direct monocular visual odometry using events and frames.
Highlights:
✅ Monocular 6-DOF visual odometry + events
✅ Direct photometric bundle adjustment
✅ Camera motion tracking from sparse pixels
✅ A new dataset with HQ events and frames
More: https://bit.ly/3s9FiBN
BlobGAN: Blob-Disentangled Scenes
Unsupervised, mid-level (blob-based) generation of scenes.
Highlights:
✅ Spatial, depth-ordered Gaussian blobs (see sketch below)
✅ Approaches supervised-level performance, and more
✅ Source under the BSD-2 "Simplified" License
More: https://bit.ly/3kRyGnj
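The "spatial, depth-ordered Gaussian blobs" idea can be sketched in a few lines: each blob contributes an isotropic Gaussian opacity map, and a softmax over blobs, biased by a per-blob depth score, resolves occlusion before features are splatted onto the grid. This is a simplified reading of the layout step (isotropic blobs only, no background slot); names are mine, not the paper's.

```python
import torch

def splat_blobs(xy, scale, depth, feats, hw=(64, 64)):
    """Splat K depth-ordered Gaussian blobs into a (C, H, W) feature grid.

    xy: (K, 2) centers in [0, 1]; scale: (K,) radii; depth: (K,) ordering
    scores (larger = nearer, wins occlusions); feats: (K, C) blob features.
    """
    H, W = hw
    gy, gx = torch.meshgrid(torch.linspace(0, 1, H),
                            torch.linspace(0, 1, W), indexing="ij")
    grid = torch.stack([gx, gy], -1)                         # (H, W, 2)
    d2 = ((grid[None] - xy[:, None, None]) ** 2).sum(-1)     # (K, H, W)
    # Gaussian log-opacity per blob, biased by depth; softmax over the K blobs
    # gives a per-pixel occlusion-aware blend that sums to 1.
    logits = -d2 / (2 * scale[:, None, None] ** 2) + depth[:, None, None]
    w = torch.softmax(logits, dim=0)                         # (K, H, W)
    return torch.einsum("khw,kc->chw", w, feats)

fmap = splat_blobs(torch.rand(5, 2), torch.full((5,), 0.1),
                   torch.randn(5), torch.randn(5, 32))       # -> (32, 64, 64)
```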
E2EVE: editing via a pre-trained artist
E2EVE generates a new version of the source image that resembles the "driver" one.
Highlights:
✅ Blending regions guided by the driver image
✅ End-to-end conditional probability of the edits
✅ Self-supervised augmentation in the target domain
✅ Implemented as a SOTA transformer
✅ Code/models available (soon)
More: https://bit.ly/3P9TDYW
Bringing pets into the #metaverse
ARTEMIS: a pipeline for generating articulated neural pets for virtual worlds.
Highlights:
✅ ARTiculated, appEarance, Mo-synthesIS
✅ Motion control, animation & rendering
✅ Neural-generated (NGI) animal engine
✅ SOTA animal mocap + neural control
More: https://bit.ly/3LZSLDU
Animated hand in 1972, damn romantic
Q: is #VR the technology that has developed the least in the last 30 years?
More: https://bit.ly/3snxNaq
Ensembling models for GAN training
Pretrained vision models to improve GAN training. FID improved by 1.5 to 2×!
Highlights:
✅ CV models as an ensemble of discriminators (see sketch below)
✅ Improves GANs in limited-data and large-scale settings
✅ 10k samples matches StyleGAN2 with 1.6M
✅ Source code / models under MIT license
More: https://bit.ly/3wgUVsr
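The discriminator-ensemble trick is easy to sketch: freeze a few off-the-shelf vision backbones, bolt a tiny trainable head on each, and sum their real/fake losses next to the regular GAN loss. A minimal PyTorch sketch assuming backbones that map images to feature vectors; the hinge loss and linear heads are my choices (the paper studies several variants).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EnsembleDiscriminator(nn.Module):
    """Frozen pretrained backbones + tiny trainable heads act as extra critics."""
    def __init__(self, backbones, feat_dims):
        super().__init__()
        self.backbones = nn.ModuleList(backbones)
        for b in self.backbones:
            b.requires_grad_(False)              # pretrained features stay frozen
        self.heads = nn.ModuleList(nn.Linear(d, 1) for d in feat_dims)

    def forward(self, img):
        # One real/fake logit per pretrained feature space.
        return [h(b(img)) for b, h in zip(self.backbones, self.heads)]

def ensemble_d_loss(ens, real, fake):
    """Hinge loss summed over the ensemble; added to the base GAN loss."""
    loss = 0.0
    for lr, lf in zip(ens(real), ens(fake.detach())):
        loss = loss + F.relu(1 - lr).mean() + F.relu(1 + lf).mean()
    return loss
```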
Cooperative Driving + AUTOCASTSIM
COOPERNAUT: cross-vehicle perception for vision-based cooperative driving.
Highlights:
✅ UTexas + #Stanford + #Sony #AI
✅ LiDAR into compact point-based representations
✅ Network-augmented simulator
✅ Source code and models available
More: https://bit.ly/3sr5HLk