Old Films Back to Life with #AI
Recurrent transformer network (RTN) to restore heavily degraded old films
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Transformer blocks for spatial restoration
✅ Knowledge from adjacent frames
✅ Color from keyframes to whole clip
✅ Source code available in days!
More: https://bit.ly/3wZbV8y

Neural Head #Avatars from RGB
Novel neural representation for animatable head avatars
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Novel articulated human head
✅ Full-geometry reconstruction
✅ Differentiable optimization pipeline
✅ Disentanglement of shape/color
More: https://bit.ly/3DxUGMI

MyStyle: personal generative #AI
Personalized deep generation from a few shots of a person
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Small set of portraits (~100)
✅ Local, low-dim, personal manifold
✅ Personal #AI for ill-posed tasks
✅ SOTA vs. previous few-shot methods
More: https://bit.ly/3wWMwMu

GAN + Dense Map
CoordGAN: structure-texture disentangled GAN with dense correspondence map
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Novel coordinate space
✅ Warping to learn coordinates
✅ Encoder for structure representation
✅ HQ structure/texture editable images
More: https://bit.ly/3DOlOaB

Unified shape & non-rigid motion
CaDeX: SOTA in both shape & non-rigid motion
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Canonical Deformation Coordinate Space
✅ Shape + non-rigid motion representation
✅ Factorization of deformation via homeomorphisms
✅ Cycle consistency, topology & volume
✅ SOTA in modelling deformable objects
More: https://bit.ly/3NM5NX1

~6 BILLION CLIP-filtered pairs
A dataset 14x larger than the previous largest openly accessible image-text dataset in the world.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ 2.3B English image-text pairs
✅ 2.2B from 100+ other languages
✅ 1.3B with no detected language
✅ KNN index for quick search
More: https://bit.ly/3LFhKvT
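
To make the "KNN index" highlight concrete, here is a minimal sketch of nearest-neighbor search by cosine similarity. It is a brute-force stand-in, not the dataset's actual index (the post does not describe how that is built); the three-dimensional vectors are toy placeholders for CLIP embeddings, and the names `cosine_similarity`, `knn_search`, and `toy_index` are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_search(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k entries most similar to the query, best match first."""
    scored = ((cosine_similarity(query, vec), key) for key, vec in index.items())
    top = heapq.nlargest(k, scored)
    return [(key, score) for score, key in top]


# Toy embeddings (assumption: real entries would be CLIP image/text vectors).
toy_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
neighbors = knn_search([1.0, 0.0, 0.0], toy_index, k=2)
```

A brute-force scan like this is linear in the number of entries, so at billions of pairs a precomputed approximate index is what makes search quick.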

PP-YOLOE: evolved version of YOLO
SOTA object detector at 149+ FPS!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Optimized PP-YOLOv2
✅ S/M/L/XL for different scenarios
✅ 149+ FPS, with TensorRT & FP16
✅ Source code & models available
More: https://bit.ly/3x454uy

HD synthesis with LDM
Low-cost diffusion models (DMs) in the latent space of powerful pretrained autoencoders
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Hi-res synthesis of megapixel images
✅ Synthesis, inpainting, stochastic SR
✅ Large, consistent images of ~1024px
✅ General conditioning via cross-attention
✅ Code licensed under MIT License
More: https://bit.ly/3LIVOzS
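
The "conditioning via cross-attention" highlight can be illustrated with a minimal sketch of single-head scaled dot-product cross-attention. This is not the released LDM code: the learned query/key/value projections are omitted, the vectors are toy numbers, and the names `cross_attention`, `latent_tokens`, and `text_tokens` are hypothetical. Queries stand in for image latents; keys and values stand in for conditioning tokens such as an encoded text prompt.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(
    queries: List[Vector], keys: List[Vector], values: List[Vector]
) -> List[Vector]:
    """For each query, return a softmax-weighted mix of the value vectors."""
    if not keys or len(keys) != len(values):
        raise ValueError("keys and values must be non-empty and equally long")
    scale = 1.0 / math.sqrt(len(keys[0]))  # 1 / sqrt(key dimension)
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Toy example: one latent token attends over two conditioning tokens.
latent_tokens = [[0.0, 0.0]]
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
text_values = [[2.0, 0.0], [0.0, 4.0]]
conditioned = cross_attention(latent_tokens, text_tokens, text_values)
```

Because the keys and values come from a separate sequence, the same mechanism accepts any conditioning input that can be encoded as tokens.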

SinNeRF: Single Image NeRF
Neural Radiance Field from a single view only
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ UT Austin + UIUC + UOregon + Picsart AI
✅ "Looking only once" approach
✅ Semi-supervised learning process
✅ Geometry/semantic pseudo-labels
✅ SOTA in novel-view synthesis
More: https://bit.ly/3ujMZqF

Transformer-based Tracking
Tracker via Transformer-based model prediction module
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Tracking by Transformer prediction
✅ Extending model predictor for bounding boxes
✅ SOTA on three public benchmarks
✅ Code/models under GNU License 3.0
More: https://bit.ly/3ucYvUI

In-The-Wild Virtual Try-On
StyleGAN-based architecture for appearance flow estimation in virtual try-on (VTON)
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Global appearance flow estimation
✅ Robust to person/garment misalignment
✅ "In-the-wild": person with natural poses
✅ Code under CC BY-NC-SA 4.0 license
More: https://bit.ly/3LPR9wl

DALL·E 2 just announced!
DALL·E 2 creates realistic images and art from natural language
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ More realistic/accurate, 4x resolution
✅ Better caption matching
✅ Not available yet, waiting list!
More: https://bit.ly/3j9v3bR

Forecasting interactions via attention
Predicting the hand motion trajectory and the future contact points on the next active object
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Object-Centric Transformer (OCT)
✅ Self-attention Transformer mechanism
✅ Framework to handle uncertainty
✅ SOTA on Epic-Kitchens and EGTEA
More: https://bit.ly/3v3PpbI

SmeLU: Smooth Activation Function
Google unveils a new smooth activation function: easy to implement, cheap & less error-prone
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Smooth to mitigate irreproducibility
✅ Cheap function, better than GELU/Swish
✅ Slope rises from 0 to 1 through a quadratic middle region
✅ SmeLU as convolution of ReLU with a box
✅ Best reproducibility-accuracy tradeoff
More: https://bit.ly/3xcskXm
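
The piecewise shape described in the highlights can be sketched as follows. This is a minimal sketch, assuming the commonly cited definition with a half-width parameter `beta` (the parameter name and its default are illustrative, not taken from the post): zero below `-beta`, identity above `beta`, and a quadratic in between whose slope rises linearly from 0 to 1. The helper `relu_box_average` is hypothetical; it only checks numerically that averaging ReLU over a box of width `2 * beta` gives the same value.

```python
def smelu(x: float, beta: float = 1.0) -> float:
    """Smooth ReLU sketch: quadratic on [-beta, beta], ReLU outside it."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return x
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_slope(x: float, beta: float = 1.0) -> float:
    """Derivative of smelu: 0 on the left, 1 on the right, linear between."""
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)


def relu_box_average(x: float, beta: float = 1.0, steps: int = 20000) -> float:
    """Midpoint-rule average of ReLU(x - t) for t in [-beta, beta]."""
    width = 2.0 * beta
    h = width / steps
    total = 0.0
    for i in range(steps):
        t = -beta + (i + 0.5) * h
        total += max(0.0, x - t)
    return total * h / width
```

The two outer branches meet the quadratic with matching value and slope at `-beta` and `beta`, which is what removes the kink that ReLU has at zero.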

Hyper-Dense Landmarks at 150FPS
#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Accurately predicts 10× as many landmarks as usual
✅ Synthetic data, perfect annotations
✅ No appearance/lighting modeling or differentiable rendering
✅ #3D @150+FPS with a single CPU thread
✅ SOTA in monocular 3D reconstruction
More: https://bit.ly/37pQS40

SunStage: Selfie with the Sun
Accurate/tailored reconstruction of facial geometry/reflectance
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Novel personalized scanning
✅ Disentanglement of scene params
✅ Geometry, materials, lighting, poses
✅ Photorealistic results from a single selfie video
More: https://bit.ly/36W1Oqx

Generative Neural Avatars
3D shapes of people in a variety of garments with corresponding skinning weights
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ ETH + Uni-Tübingen + Max Planck
✅ Animatable #3D human in garment
✅ Directly from raw posed 3D scans
✅ NO canonical, registration, manual w.
✅ Geometric detail in clothing deformation
More: https://bit.ly/3M7mCdB

Conversational program synthesis
Conversational synthesis to translate English into executable code
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Conversational program synthesis
✅ New multi-turn programming benchmark
✅ Open custom library: JAXFORMER
✅ Source code under BSD-3 license
More: https://bit.ly/3jjWWhk

Long Video Diffusion Models
#Google unveils a novel diffusion model for video generation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Straightforward extension of the 2D U-Net
✅ Longer videos via new conditional generation
✅ SOTA in unconditional generation
More: https://bit.ly/35Y2rzg

AutoRF: #3D objects in-the-wild
From #Meta: #3D objects from just a single in-the-wild image
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Novel view synthesis from in-the-wild images
✅ Normalized, object-centric representation
✅ Disentangling shape, appearance & pose
✅ Exploiting BBs & panoptic segmentation
✅ Shape/appearance properties for objects
More: https://bit.ly/3O4ONeQ