OSSO: Skeletal Shape from Outside
👉Infers the anatomic skeleton of a person from the 3D surface of the body 🦴
Highlights:
✅ Max Planck + IMATI-CNR + INRIA
✅ DXA images to obtain #3D shape
✅ External body to internal skeleton
More: https://bit.ly/3v7Z5TQ

Pix2Seq: object detection by #Google
👉A novel framework that performs object detection as a language modeling task
Highlights:
✅ Obj. detection as a lang-modeling task
✅ BBs/labels -> seq. of discrete tokens
✅ Encoder-decoder (one token at a time)
✅ Code under Apache License 2.0
More: https://bit.ly/3F49PX3
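The "BBs/labels -> seq. of discrete tokens" step can be illustrated with a minimal sketch. It is not the released Pix2Seq code: the names `quantize` and `objects_to_tokens`, the `(ymin, xmin, ymax, xmax)` coordinate order, the default of 1000 bins, and the vocabulary layout (coordinate bins first, then class ids, then one end-of-sequence token) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (ymin, xmin, ymax, xmax), each normalized to [0, 1].
Box = Tuple[float, float, float, float]


def quantize(coord: float, n_bins: int) -> int:
    """Map a normalized coordinate to an integer bin in [0, n_bins - 1]."""
    clamped = min(max(coord, 0.0), 1.0)
    # coord == 1.0 would land in bin n_bins, so cap it at the last bin.
    return min(int(clamped * n_bins), n_bins - 1)


def objects_to_tokens(
    objects: List[Tuple[Box, int]],
    n_bins: int = 1000,
    n_classes: int = 80,
) -> List[int]:
    """Flatten (box, class_id) pairs into one sequence of discrete tokens.

    Hypothetical vocabulary layout:
      [0, n_bins)                   coordinate tokens
      [n_bins, n_bins + n_classes)  class tokens
      n_bins + n_classes            end-of-sequence token
    """
    tokens: List[int] = []
    for box, class_id in objects:
        if not 0 <= class_id < n_classes:
            raise ValueError(f"class_id {class_id} outside [0, {n_classes})")
        tokens.extend(quantize(c, n_bins) for c in box)
        tokens.append(n_bins + class_id)
    tokens.append(n_bins + n_classes)  # end-of-sequence
    return tokens


if __name__ == "__main__":
    print(objects_to_tokens([((0.125, 0.25, 0.5, 0.75), 3)]))
```

With the defaults, one object with box `(0.125, 0.25, 0.5, 0.75)` and class 3 becomes `[125, 250, 500, 750, 1003, 1080]`: four coordinate tokens, one class token, and the end-of-sequence token. A decoder trained on such sequences can then emit detections one token at a time.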

Generalizable Neural Performer
👉General neural framework to synthesize free-viewpoint images of arbitrary human performers
Highlights:
✅ Free-viewpoint synthesis of humans
✅ Implicit Geometric Body Embedding
✅ Screen-Space Occlusion-Aware Blending
✅ GeneBody: 4M frames, multi-view cams
More: https://cutt.ly/SGcnQzn

Tire-defect inspection
👉Unsupervised detection of defects in tires using neural networks
Highlights:
✅ Impurity, same material as tire
✅ Impurity, with different material
✅ Damage by temp/pressure
✅ Crack or etched material
More: https://bit.ly/37GX1JT

#4D Neural Fields
👉4D neural field (N.F.) visual representations from monocular RGB-D 🤯
Highlights:
✅ 4D scene completion (occlusions)
✅ Scene completion in cluttered scenes
✅ Novel #AI for contextual point clouds
✅ Data, code, models under MIT license
More: https://cutt.ly/6GveKiJ

Largest dataset of human-object interactions
👉BEHAVE by Google: largest dataset of human-object interactions
Highlights:
✅ 8 subjects, 20 objects, 5 envs.
✅ 321 clips from 4 Kinect RGB-D cameras
✅ Masks and segmented point clouds
✅ 3D SMPL & mesh registration
✅ Textured scan reconstructions
More: https://bit.ly/3Lx6NNo

ENARF-GAN Neural Articulations
👉Unsupervised method for 3D geometry-aware representation of articulated objects
Highlights:
✅ Novel efficient neural representation
✅ Tri-planes deformation fields for training
✅ Novel GAN for articulated representations
✅ Controllable 3D from real unlabeled pics
More: https://bit.ly/3xYqedN

HuMMan: 4D human dataset
👉HuMMan: 4D dataset with 1000 humans, 400k sequences & 60M frames 🤯
Highlights:
✅ RGB, pt-clouds, keypts, SMPL, texture
✅ Mobile device in the sensor suite
✅ 500+ actions to cover movements
More: https://bit.ly/3vTRW8Z

Neighborhood Attention Transformer
👉A novel transformer for both image classification and downstream vision tasks
Highlights:
✅ Neighborhood Attention (NA)
✅ Neighborhood Attention Transformer, NAT
✅ Faster training/inference, good throughput
✅ Checkpoints, train, #CUDA kernel available
More: https://bit.ly/3F5aVSo
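To make the idea of Neighborhood Attention concrete, here is a minimal pure-Python sketch in which each token attends only to its `k` nearest tokens instead of the whole sequence. It is a simplified 1D illustration, not the released CUDA kernel: NAT works on 2D pixel neighborhoods and includes details omitted here (for example, relative position biases), and the function names below are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def neighborhood_window(i: int, n: int, k: int) -> range:
    """Indices of the k tokens nearest to position i.

    Near the borders the window is shifted inward, so every token
    still attends to exactly k neighbors.
    """
    start = min(max(i - k // 2, 0), n - k)
    return range(start, start + k)


def neighborhood_attention_1d(
    queries: List[Vector],
    keys: List[Vector],
    values: List[Vector],
    k: int = 3,
) -> List[Vector]:
    """Scaled dot-product attention restricted to a local window of size k."""
    n = len(queries)
    if not 0 < k <= n:
        raise ValueError("k must satisfy 0 < k <= sequence length")
    scale = 1.0 / math.sqrt(len(queries[0]))
    out_dim = len(values[0])
    outputs: List[Vector] = []
    for i in range(n):
        window = neighborhood_window(i, n, k)
        scores = [
            scale * sum(q * key for q, key in zip(queries[i], keys[j]))
            for j in window
        ]
        # Numerically stable softmax over the window only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * values[j][d] for w, j in zip(weights, window))
            for d in range(out_dim)
        ])
    return outputs
```

The cost per token depends on `k` rather than on the sequence length, and setting `k` equal to the sequence length recovers ordinary full self-attention.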

FANs: Fully Attentional Networks
👉#Nvidia unveils Fully Attentional Networks (FANs)
Highlights:
✅ Efficient fully attentional design
✅ Semantic seg. & object detection
✅ Model/source code available soon!
More: https://bit.ly/3vtpITs

Open-Source DALL·E 2 is out
👉#PyTorch implementation of DALL·E 2, #OpenAI's latest text-to-image neural net.
Highlights:
✅ SOTA for text-to-image generation
✅ Source code/model under MIT License
✅ "Medieval painting of wifi not working"
More: https://bit.ly/3vzsff6

ViTPose: Transformer for Pose
👉ViTPose from ViTAE: ViT for human pose estimation
Highlights:
✅ Plain/nonhierarchical ViT for pose
✅ Deconv-layers after ViT for keypoints
✅ Just the baseline is the new SOTA
✅ Source code & models available soon!
More: https://bit.ly/3MJ0kz1
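As a rough illustration of how keypoints are typically read out of a heatmap-style pose head, the sketch below takes the argmax of one score map per keypoint and scales it back to image pixels. It assumes the deconv layers produce per-keypoint heatmaps at 1/4 of the input resolution. The post does not state this, and the function name `decode_keypoints` and the `stride` parameter are hypothetical.

```python
from typing import List, Tuple

# One 2D score map per keypoint, indexed as heatmap[row][col].
Heatmap = List[List[float]]


def decode_keypoints(
    heatmaps: List[Heatmap],
    stride: int = 4,  # assumed ratio between input image and heatmap size
) -> List[Tuple[int, int, float]]:
    """Return one (x, y, score) triple per keypoint, in input-image pixels."""
    keypoints: List[Tuple[int, int, float]] = []
    for heatmap in heatmaps:
        best_score, best_x, best_y = float("-inf"), 0, 0
        for y, row in enumerate(heatmap):
            for x, score in enumerate(row):
                if score > best_score:
                    best_score, best_x, best_y = score, x, y
        keypoints.append((best_x * stride, best_y * stride, best_score))
    return keypoints


if __name__ == "__main__":
    toy = [[[0.0, 0.1], [0.9, 0.2]]]  # a single 2x2 heatmap
    print(decode_keypoints(toy))
```

For the toy heatmap above the peak sits at row 1, column 0, so the call returns `[(0, 4, 0.9)]`. Real pipelines usually add sub-pixel refinement around the peak, which is left out here.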

🧳 Unsupervised HD Motion Transfer 🧳
👉Novel e2e unsupervised motion transfer for image animation
Highlights:
✅ TPS motion estimation + Dropout
✅ Novel E2E unsupervised motion transfer
✅ Optical flow + multi-res. occlusion mask
✅ Code and models under MIT license
More: https://bit.ly/3MGNPns

Neural Self-Calibration in the wild
👉Learning algorithm to regress calibration params from in-the-wild clips
Highlights:
✅ Params purely from self-supervision
✅ S.S. depth/pose learning as objective
✅ POV, fisheye, catadioptric: no changes
✅ SOTA results on EuRoC MAV dataset
More: https://bit.ly/3w1n6LB

ConDor: S.S. Canonicalization
👉Self-Supervised Canonicalization for full/partial 3D point clouds
Highlights:
✅ RRC + Stanford + KAIST + Brown
✅ On top of Tensor Field Networks (TFNs)
✅ Unseen 3D -> equivariant canonical
✅ Co-segmentation, NO supervision
✅ Code and model under MIT license
More: https://bit.ly/3MNDyGa

Event-aided Direct Sparse Odometry
👉EDS: direct monocular visual odometry using events/frames
Highlights:
✅ Mono 6-DOF visual odometry + events
✅ Direct photometric bundle adjustment
✅ Camera motion tracking by sparse pixels
✅ A new dataset with HQ events and frames
More: https://bit.ly/3s9FiBN

BlobGAN: Blob-Disentangled Scene
👉Unsupervised, mid-level (blobs) generation of scenes
Highlights:
✅ Spatial, depth-ordered Gaussian blobs
✅ Reaching for supervised level, and more
✅ Source under BSD-2 "Simplified" License
More: https://bit.ly/3kRyGnj

E2EVE editor via pre-trained artist
👉E2EVE generates a new version of the source image that resembles the "driver" one
Highlights:
✅ Blending regions by driver image
✅ E2E cond-probability of the edits
✅ S.S. augmenting in target domain
✅ Implemented as SOTA transformer
✅ Code/models available (soon)
More: https://bit.ly/3P9TDYW

Bringing pets into the #metaverse
👉ARTEMIS: pipeline for generating articulated neural pets for virtual worlds
Highlights:
✅ ARTiculated, appEarance, Mo-synthesIS
✅ Motion control, animation & rendering
✅ Neural-generated (NGI) animal engine
✅ SOTA animal mocap + neural control
More: https://bit.ly/3LZSLDU

Animated hand in 1972, damn romantic
👉Q: is #VR the technology that has developed the least in the last 30 years?
More: https://bit.ly/3snxNaq