GARF: Gaussian Activated NeRF
GARF: Gaussian Activated Radiance Fields for high-fidelity reconstruction and pose estimation
Highlights:
✅ NeRF from imperfect camera poses
✅ No hyper-parameter tuning or initialization
✅ Theoretical insight into the Gaussian activation (see the sketch below)
✅ Unlocking NeRF for real-world applications?
More: https://bit.ly/36bvdfU
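A minimal sketch of the core idea named in the highlights above: replacing the usual ReLU-plus-positional-encoding coordinate MLP with a Gaussian-activated one. The layer sizes and the `sigma` value are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GaussianActivation(nn.Module):
    """Gaussian activation: exp(-x^2 / (2 * sigma^2))."""
    def __init__(self, sigma: float = 0.1):
        super().__init__()
        self.sigma = sigma

    def forward(self, x):
        return torch.exp(-x.pow(2) / (2 * self.sigma ** 2))

# Illustrative coordinate MLP: raw 3D points in, density + RGB out,
# with no positional encoding thanks to the Gaussian activation.
coord_mlp = nn.Sequential(
    nn.Linear(3, 256), GaussianActivation(),
    nn.Linear(256, 256), GaussianActivation(),
    nn.Linear(256, 4),  # (density, r, g, b)
)

points = torch.rand(1024, 3)      # 3D samples along camera rays
density_rgb = coord_mlp(points)   # per-point predictions
```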
Novel pre-training strategy for #AI
EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)
Highlights:
✅ Multi-modal: additional input modalities beyond RGB
✅ Multi-task: multiple output tasks beyond RGB
✅ General: MultiMAE trained via pseudo-labeling (see the sketch below)
✅ Classification, segmentation, depth
✅ Code released under a NonCommercial 4.0 International license
More: https://bit.ly/3jRhNsN
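A rough sketch of the pre-training recipe implied by the highlights above: patches from several input modalities are embedded, jointly masked, and only the visible tokens go through the encoder. The patch size, embedding width, modality set (RGB plus a pseudo-labeled depth map), and encoder depth are assumptions for illustration; this is not the official MultiMAE code.

```python
import torch
import torch.nn as nn

patch, dim = 16, 256

def patchify(x):
    """Split a (B, C, H, W) map into flattened (B, N, C*patch*patch) patches."""
    b, c, h, w = x.shape
    x = x.unfold(2, patch, patch).unfold(3, patch, patch)
    return x.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * patch * patch)

# Illustrative modalities: RGB plus a pseudo-labeled depth map.
rgb   = torch.rand(2, 3, 224, 224)
depth = torch.rand(2, 1, 224, 224)

embed_rgb   = nn.Linear(3 * patch * patch, dim)
embed_depth = nn.Linear(1 * patch * patch, dim)

tokens = torch.cat([embed_rgb(patchify(rgb)),
                    embed_depth(patchify(depth))], dim=1)   # (B, N_total, dim)

# Jointly mask tokens across modalities; encode only the visible subset.
keep_ratio = 0.25
n_total = tokens.shape[1]
idx = torch.rand(tokens.shape[0], n_total).argsort(dim=1)[:, : int(keep_ratio * n_total)]
visible = torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, dim))

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True), num_layers=2)
latent = encoder(visible)   # task-specific decoders would reconstruct the masked patches
```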
A new SOTA in Dataset Distillation
A new approach based on Matching Training Trajectories is out!
Highlights:
✅ Distilling a small dataset "to match" a much bigger one
✅ Distilled data used to guide a network's training
✅ Trajectories of expert networks trained on real data (see the sketch below)
✅ New SOTA, plus distillation of higher-resolution visual data
More: https://bit.ly/3JwYOxW
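A condensed sketch of the trajectory-matching idea behind the highlights above: the synthetic images are optimized so that a few training steps on them move a student from an earlier expert checkpoint toward a later one. The linear classifier, step counts, and learning rates are placeholder assumptions, not the authors' setup.

```python
import torch
import torch.nn.functional as F

# Learnable synthetic dataset: 100 distilled "images" (flattened) with fixed labels.
syn_x = torch.randn(100, 3 * 32 * 32, requires_grad=True)
syn_y = torch.arange(10).repeat(10)
opt_syn = torch.optim.SGD([syn_x], lr=0.1)

# Placeholder linear classifier; two checkpoints from a pre-recorded expert
# trajectory (random stand-ins here, normally trained on the real dataset).
def init_params():
    return torch.randn(3 * 32 * 32, 10) * 0.01

expert_start, expert_target = init_params(), init_params()

# The student starts at the earlier expert checkpoint and takes a few
# differentiable SGD steps on the synthetic data.
w = expert_start.clone().requires_grad_(True)
inner_lr = 0.01
for _ in range(5):
    loss = F.cross_entropy(syn_x @ w, syn_y)
    (grad_w,) = torch.autograd.grad(loss, w, create_graph=True)
    w = w - inner_lr * grad_w

# Trajectory-matching loss: distance to the later expert checkpoint,
# normalized by how far the expert itself travelled.
num = (w - expert_target).pow(2).sum()
den = (expert_start - expert_target).pow(2).sum()
match_loss = num / (den + 1e-8)

opt_syn.zero_grad()
match_loss.backward()   # gradients reach syn_x through the unrolled student steps
opt_syn.step()
```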
Two-Hand Tracking via GCN
The first-ever GCN for two interacting hands in a single RGB image
Highlights:
✅ Reconstruction via GCN-based mesh regression
✅ PIFA: pyramid attention for local occlusions
✅ CHA: cross-hand attention for interaction (see the sketch below)
✅ SOTA plus generalization to in-the-wild scenarios
✅ Source code available under the GNU license
More: https://bit.ly/3KH5FWO
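A toy sketch of the cross-hand attention idea listed above: per-vertex features of one hand attend to the other hand's features, so the interaction informs each hand's mesh regression. The feature width, vertex count, and the final linear head are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

dim, n_verts = 128, 778   # illustrative feature width and per-hand vertex count

# Cross-hand attention: queries from one hand, keys/values from the other.
cross_attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)

left  = torch.rand(1, n_verts, dim)   # per-vertex features, left hand
right = torch.rand(1, n_verts, dim)   # per-vertex features, right hand

left_refined,  _ = cross_attn(query=left,  key=right, value=right)
right_refined, _ = cross_attn(query=right, key=left,  value=left)

# The refined features would then feed a GCN head that regresses mesh vertices;
# a bare linear layer stands in for that head here.
regress = nn.Linear(dim, 3)
left_vertices = regress(left_refined)    # (1, n_verts, 3)
```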
Video K-Net: a new SOTA in Segmentation
A simple, strong, and unified framework for fully end-to-end video panoptic segmentation
Highlights:
✅ Learnable kernels from K-Net (see the sketch below)
✅ K-Net learns to segment and track
✅ Appearance and cross-temporal kernel interaction
✅ New SOTA without bells and whistles
More: https://bit.ly/3uEEZQR
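A bare-bones sketch of the learnable-kernel idea K-Net builds on: each kernel is a learned vector that yields one mask via a dot product with the per-pixel features, and reusing the same kernels across frames is what links segmentation to tracking. Feature width, kernel count, and spatial size are illustrative assumptions.

```python
import torch
import torch.nn as nn

dim, n_kernels = 256, 100          # illustrative feature width / kernel count

kernels = nn.Parameter(torch.randn(n_kernels, dim))   # one learnable kernel per mask

def predict_masks(features):
    """features: (B, dim, H, W) -> per-kernel mask logits (B, n_kernels, H, W)."""
    return torch.einsum("kd,bdhw->bkhw", kernels, features)

frame_t  = torch.rand(1, dim, 64, 64)   # backbone features, frame t
frame_t1 = torch.rand(1, dim, 64, 64)   # backbone features, frame t+1

masks_t  = predict_masks(frame_t)
masks_t1 = predict_masks(frame_t1)      # same kernels => mask k follows the same instance
```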
DeepLabCut: tracking animals in the wild
A toolbox for markerless pose estimation of animals performing various tasks
Highlights:
✅ Multi-animal pose estimation
✅ Datasets for multi-animal pose estimation
✅ Keypoints, limbs, and animal identity
✅ Optimal keypoints without manual input
More: https://bit.ly/37L1mLE
Neural Articulated Human Body
A novel neural implicit representation for the articulated human body
Highlights:
✅ COmpositional Articulated People
✅ Large variety of shapes & poses
✅ Novel encoder-decoder architecture
More: https://bit.ly/3xvn7dl
2K Resolution Generative #AI
Novel continuous-scale training with variable output resolutions
Highlights:
✅ Mixed-resolution training data
✅ Arbitrary scales during training
✅ Generations beyond 1024×1024
✅ A variant of the FID metric across scales
✅ Source code under the MIT license
More: https://bit.ly/3uNfVY6
DS: Unsupervised Video Decomposition
A novel method to extract persistent elements of a scene
Highlights:
✅ Each scene element modeled as a Deformable Sprite (DS)
✅ Deformable Sprites learned by a video auto-encoder
✅ Canonical texture image for appearance
✅ Non-rigid geometric transformation per frame (see the sketch below)
More: https://bit.ly/37WV9w1
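A minimal sketch of the decomposition the bullets above describe: each sprite keeps one canonical texture, and a per-frame non-rigid warp resamples that texture into the frame before compositing. The texture size, warp parameterization, and background handling are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn.functional as F

# One sprite: a canonical RGBA texture shared across the whole video.
texture = torch.rand(1, 4, 128, 128, requires_grad=True)   # learnable appearance

# Per-frame non-rigid warp, parameterized here as a dense sampling grid
# (values in [-1, 1]) that a network would predict; random stand-in below.
H, W = 96, 96
frame_grid = torch.rand(1, H, W, 2) * 2 - 1

# Resample the canonical texture into the current frame.
warped = F.grid_sample(texture, frame_grid, align_corners=True)   # (1, 4, H, W)
rgb, alpha = warped[:, :3], warped[:, 3:]

# Composite the sprite over a background layer; the reconstruction error
# against the real frame is what would train texture and warp.
background = torch.zeros(1, 3, H, W)
recon = alpha * rgb + (1 - alpha) * background
```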
L-SVPE for Deep Deblurring
L-SVPE deblurs scenes while recovering high-frequency details
Highlights:
✅ Learned Spatially Varying Pixel Exposures
✅ Next-gen focal-plane sensor + deep learning
✅ Deep convolutional decoder for motion deblurring
✅ Superior results over non-optimized exposures
More: https://bit.ly/3uRYQMT
Hyper-Fast Instance Segmentation
Novel Temporally Efficient Vision Transformer (TeViT) for video instance segmentation (VIS)
Highlights:
✅ Video instance segmentation transformer
✅ Contextual information at frame and instance level
✅ Nearly convolution-free framework
✅ The new SOTA for VIS, at ~70 FPS!
✅ Code & models under the MIT license
More: https://bit.ly/3rCMXIn
Unified Scene Text and Layout Detection
The world's first hierarchical scene text dataset, plus a novel detection method
Highlights:
✅ Unified detection and geometric layout analysis
✅ Hierarchical annotations in natural scenes
✅ Word-, line-, and paragraph-level annotations
✅ Source released under CC Attribution-ShareAlike 4.0
More: https://bit.ly/3jRpezV
#Oculus' new Hand Tracking
Hands can move as naturally and intuitively in the #metaverse as they do in real life
Highlights:
✅ Hands 2.0 powered by CV & ML
✅ Tracking hand-over-hand interactions
✅ Crossing hands, clapping, high-fives
✅ Accurate thumbs-up gestures
More: https://bit.ly/3JXPvY2
New SOTA in #3D human avatars
PHORHUM: photorealistic 3D humans from monocular RGB
Highlights:
✅ Pixel-aligned method for 3D geometry
✅ Unshaded surface color + illumination estimation
✅ Patch-based rendering losses for visible regions
✅ Plausible color estimation for non-visible regions
More: https://bit.ly/3MkvBrA
What's in your hands (#3D)?
Reconstructing hand-held objects from a single RGB image without knowing their 3D templates
Highlights:
✅ The hand is highly predictive of the object's shape
✅ Reconstruction conditioned on the hand articulation
✅ Visual features / articulation-aware coordinates
✅ Code and models available!
More: https://bit.ly/3vuYn2a
YODO: You Only Demonstrate Once
A novel category-level manipulation method learned in simulation from a single demonstration video
Highlights:
✅ One-shot imitation learning, model-free 6D pose tracking
✅ Demonstration via a single third-person view
✅ Manipulation including high-precision tasks
✅ Category-level behavior cloning
✅ Attention for dynamic coordinate selection
✅ Generalization to novel unseen objects/environments
More: https://bit.ly/3v0V4R4
Dress Code for Virtual Try-On
UniMORE (+ YOOX) unveils a novel dataset and approach for virtual try-on
Highlights:
✅ High-resolution paired front-view / full-body images
✅ Pixel-level Semantic-Aware Discriminator
✅ 9 SOTA VTON approaches / 3 baselines evaluated
✅ New SOTA considering resolution & garments
More: https://bit.ly/3xKXSUw
Deep Equilibrium for Optical Flow
DEQ flow: converges faster, uses less memory, and is often more accurate
Highlights:
✅ Novel fixed-point formulation of optical flow (see the sketch below)
✅ Compatible with prior modeling- and data-related improvements
✅ Sparse fixed-point correction for stability
✅ Code/models under the GNU Affero GPL v3.0
More: https://bit.ly/3v4fZmi
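A compact sketch of the deep-equilibrium formulation mentioned above: instead of unrolling a recurrent update a fixed number of times, the flow is defined as the fixed point of a single update cell, found here with plain fixed-point iteration. The cell, feature width, and solver settings are placeholder assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

dim = 64

# Placeholder update cell f(z, x): refines the flow state z given image features x.
cell = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())

def f(z, x):
    return cell(torch.cat([z, x], dim=-1))

def deq_solve(x, iters=50, tol=1e-4):
    """Find the equilibrium z* = f(z*, x) by plain fixed-point iteration."""
    z = torch.zeros(x.shape[0], dim)
    for _ in range(iters):
        z_next = f(z, x)
        if (z_next - z).norm() < tol:
            return z_next
        z = z_next
    return z

# Stand-in for per-pixel correlation/context features of an image pair.
features = torch.rand(8, dim)
flow_state = deq_solve(features)   # a small head would decode the flow from this state
# Training differentiates through the fixed point implicitly rather than through
# the unrolled iterations, which is where the memory savings come from.
```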
Ultra High-Resolution Neural Saliency
A novel ultra high-resolution saliency detector, with a dataset!
Highlights:
✅ Ultra high-resolution saliency detection
✅ 5,920 images at 4K-8K resolution
✅ Pyramid Grafting Network
✅ Cross-Model Grafting Module
✅ AGL: Attention Guided Loss
✅ Code/models under the MIT license
More: https://bit.ly/3MnU1Rf