Hyper-Fast Refinement
👉SharpContour: novel contour-based boundary refinement for instance segmentation
Highlights:
✅ Instance-aware Point Classifier
✅ Deforming by discrete updating
✅ Estimating offsets independently
✅ Source code soon available!
More: https://bit.ly/3qL04GY
Neural Mesh via Text only
👉Zero-shot generation of a 3D model using only a target text prompt
Highlights:
✅ ZS 3D model with text only
✅ ZS text-guided generation
✅ Meshes with texture/normal
✅ Differentiable LLS implementation
More: https://bit.ly/3u0qnvb
#3D, Materials, and Lighting from 2D
👉Nvidia: topology, materials & map lighting jointly from 2D. INSANE
Highlights:
✅ Topology, materials and lighting
✅ Meshes with materials/lighting
✅ Compact volumetric texturing
✅ Differentiable all-frequency lighting
✅ Code under #NVIDIA License
More: https://bit.ly/3IUoF2t
Ref-NeRF for extreme realism
👉Ref-NeRF: reflected radiance & structures via collection of spatially-varying scene properties
Highlights:
✅ Realism and accuracy
✅ Replacing NeRF's params
✅ Regularization of volume density
✅ Integrated Directional Encoding
More: https://bit.ly/3tTlS5l
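A minimal sketch of the "reflected radiance" idea, assuming the usual mirror-reflection formula r = 2(w·n)n − w, where w points from the surface toward the camera and n is the surface normal. The function names are hypothetical and this is only the geometric step, not the Ref-NeRF model itself:

```python
import math


def _normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / norm for x in v)


def reflect(view_dir, normal):
    """Reflect the (surface-to-camera) view direction about the normal.

    Both inputs are normalized first; the result is a unit vector.
    """
    w = _normalize(view_dir)
    n = _normalize(normal)
    w_dot_n = sum(a * b for a, b in zip(w, n))
    return tuple(2.0 * w_dot_n * nj - wj for wj, nj in zip(w, n))


# Looking straight down the normal: the reflection is the normal itself.
print(reflect((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))
# A 45-degree view flips its tangential component and keeps the normal one.
print(reflect((1.0, 0.0, 1.0), (0.0, 0.0, 1.0)))
```

Querying radiance along the reflected direction rather than the raw view direction is what lets glossy highlights vary smoothly over a surface.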
OFA for all: Cross, Vision, Language
👉Unified multimodal model for image generation, visual grounding, etc.
Highlights:
✅ Sequence-to-sequence learning
✅ Image Captioning / Generation
✅ Visual Grounding / Classification
✅ Text-to-Image Generation
✅ Visual Question Answering
More: https://bit.ly/3wSTGlc
Old Films Back to Life with #AI
👉Recurrent transformer network (RTN) to restore heavily degraded old films
Highlights:
✅ Transformer blocks for spatial restoration
✅ Knowledge from adjacent frames
✅ Color from keyframes to whole clip
✅ Source code available in days!
More: https://bit.ly/3wZbV8y
Neural Head #Avatars from RGB
👉Novel neural representation for animatable head avatar
Highlights:
✅ Novel articulated human head
✅ Full-geometry reconstruction
✅ Differentiable optimization pipeline
✅ Disentanglement of shape/color
More: https://bit.ly/3DxUGMI
MyStyle: personal generative #AI
👉Personalized deep generation with a few shots of a person
Highlights:
✅ Small set of portraits (~100)
✅ Local, low-dim, personal manifold
✅ Personal #AI for ill-posed tasks
✅ SOTA vs. previous few-shot methods
More: https://bit.ly/3wWMwMu
GAN + Dense Map
👉CoordGAN: structure-texture disentangled GAN with dense correspondence map
Highlights:
✅ Novel coordinate space
✅ Warping to learn coordinate
✅ Encoder for structure representation
✅ HQ structure/texture editable images
More: https://bit.ly/3DOlOaB
Unified shape & non-rigid motion
👉CaDeX: SOTA in both shape & non-rigid motion
Highlights:
✅ Canonical Deformation Coordinate Space
✅ Shape + non-rigid motion representation
✅ Factorization of def-homeomorphisms
✅ Cycle consistency, topology & volume
✅ SOTA in modelling deformable objects
More: https://bit.ly/3NM5NX1
~6 BILLION CLIP-filtered pairs
👉A dataset 14x bigger than the previously biggest openly accessible image-text dataset in the world.
Highlights:
✅ 2.3B English image-text pairs
✅ 2.2B from 100+ other languages
✅ 1.3B language not detected
✅ KNN index for quick search
More: https://bit.ly/3LFhKvT
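To make the "KNN index" point concrete, here is a toy brute-force nearest-neighbour search by cosine similarity. It is only a sketch: the three 2-D vectors are made-up stand-ins for image/text embeddings, the function names are hypothetical, and an index over billions of pairs would rely on an approximate nearest-neighbour structure rather than a full scan:

```python
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return [x / norm for x in v]


def knn(query, embeddings, k=2):
    """Return the k (index, cosine similarity) pairs most similar to the query."""
    q = normalize(query)
    scored = []
    for idx, emb in enumerate(embeddings):
        e = normalize(emb)
        scored.append((idx, sum(a * b for a, b in zip(q, e))))
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: -pair[1])
    return scored[:k]


# Made-up embeddings: items 0 and 2 point roughly the same way as the query.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
for idx, score in knn([1.0, 0.0], toy_embeddings, k=2):
    print(idx, round(score, 4))
```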
PP-YOLOE: e-version of YOLO
👉SOTA object detector up to 149+ FPS!
Highlights:
✅ Optimized PP-YOLOv2
✅ S/M/L/XL for different scenarios
✅ 149+ FPS, with TensorRT & FP16
✅ Source code & models available
More: https://bit.ly/3x454uy
HD synthesis with LDM
👉Low-cost DM via latent space of powerful pretrained autoencoders
Highlights:
✅ Hi-res synthesis of megapixel images
✅ Synthesis, inpainting, stochastic SR
✅ Large, consistent images of ~1024px
✅ General conditioning via cross-attention
✅ Code licensed under MIT License
More: https://bit.ly/3LIVOzS
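A rough sketch of what "conditioning via cross-attention" means: queries come from the image latents, while keys and values come from the conditioning tokens (for example a text prompt). This is plain scaled dot-product attention with the learned projection matrices left out, and all names are hypothetical; it is not the LDM code:

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """For each query (latent token), mix the values (conditioning tokens)
    with weights softmax(q . k / sqrt(d))."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Two latent tokens attend over two conditioning tokens.
latents = [[1.0, 0.0], [0.0, 1.0]]
cond_keys = [[4.0, 0.0], [0.0, 4.0]]
cond_values = [[1.0, 0.0], [0.0, 1.0]]
for row in cross_attention(latents, cond_keys, cond_values):
    print([round(x, 3) for x in row])
```

Each latent token ends up dominated by the conditioning token whose key it matches best, which is how the prompt steers the denoiser.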
SinNeRF: Single Image NeRF
👉Neural Radiance Field via single view only
Highlights:
✅ UT Austin + UIUC + UOregon + Picsart AI
✅ "Looking only once" approach
✅ Semi-supervised learning process
✅ Geometry/semantic pseudo-labels
✅ SOTA in novel-view synthesis
More: https://bit.ly/3ujMZqF
Transformer-based Tracking
👉Tracker via Transformer-based model prediction module
Highlights:
✅ Tracking by Transformer prediction
✅ Extending model predictor for BBs
✅ SOTA on three public benchmarks
✅ Code/models under GNU GPL 3.0
More: https://bit.ly/3ucYvUI
In-The-Wild Virtual Try-On
👉StyleGAN-based architecture for appearance flow estimation in VTON application
Highlights:
✅ Global appearance flow estimation
✅ Robust to person/garment misalignment
✅ "In-the-wild": person with natural poses
✅ Code under CC BY-NC-SA 4.0 license
More: https://bit.ly/3LPR9wl
DALL·E 2 just announced!
👉DALL·E 2 to create realistic images and art from natural language
Highlights:
✅ More realistic/accurate, 4x res.
✅ Better caption matching
✅ Not available yet, waiting list!
More: https://bit.ly/3j9v3bR
Forecasting interactions via attention
👉Predicting the hand motion trajectory and the future contact points on the next active object
Highlights:
✅ Object-Centric Transformer (OCT)
✅ Self-attention Transformer mechanism
✅ Framework to handle uncertainty
✅ SOTA on Epic-Kitchens and EGTEA
More: https://bit.ly/3v3PpbI
SmeLU: Smooth Activation Function
👉Google unveils a new smooth activation function: easy to implement, cheap & less error-prone
Highlights:
✅ Smooth to mitigate irreproducibility
✅ Cheap function, better than GELU/Swish
✅ 0-1 slope through quadratic middle region
✅ SmeLU as convolution of ReLU with box
✅ Best reproducibility-accuracy tradeoff
More: https://bit.ly/3xcskXm
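The two middle bullets can be checked in a few lines. The sketch below assumes the piecewise form: 0 for x ≤ −β, (x + β)² / (4β) for |x| < β, and x for x ≥ β, with β as the half-width of the quadratic region. The helper `relu_box_average` is a hypothetical numerical check of the "ReLU convolved with a box" view, not part of any library:

```python
def smelu(x, beta=1.0):
    """Smooth ReLU: zero on the left, identity on the right,
    quadratic blend on (-beta, beta)."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return float(x)
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_grad(x, beta=1.0):
    """Slope of SmeLU: rises linearly from 0 to 1 across the middle region."""
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)


def relu_box_average(x, beta=1.0, steps=1000):
    """Average ReLU(x - t) over t in [-beta, beta] (midpoint rule),
    i.e. ReLU convolved with a unit-area box of width 2 * beta."""
    width = 2.0 * beta / steps
    total = 0.0
    for i in range(steps):
        t = -beta + (i + 0.5) * width
        total += max(0.0, x - t)
    return total / steps


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, smelu(x), smelu_grad(x), round(relu_box_average(x), 4))
```

The first and last columns should agree up to the integration error, which is the box-convolution claim in numbers.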
Hyper-Dense Landmarks at 150FPS
👉#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.
Highlights:
✅ Accurately predicts 10× as many landmarks as usual
✅ Synthetic data, perfect annotations
✅ NO appearance, light, diff-rendering
✅ #3D @150+FPS with a single CPU thread
✅ SOTA in monocular 3D reconstruction
More: https://bit.ly/37pQS40