GAN-based Darkest Dataset
Berkeley + #Intel announce first photorealistic dataset under starlight (no moon, <0.001 lx)
Highlights:
- "Darkest" dataset ever seen
- Moonless, no external illumination
- GAN-tuned physics-based model
- Clips with dancing, volleyball, flags...
More: https://bit.ly/3LXxMkN

Populating with digital humans
ETHZ unveils GAMMA to populate the #3D scene with digital humans
Highlights:
- GenerAtive Motion primitive MArkers
- Realistic, controllable, infinite motions
- Tree-based search to preserve quality
- SOTA in realistic/controllable motion
More: https://bit.ly/3OgY4AG

#AIwithPapers: we are ~2,000!
Simply amazing. Thank you all
Invite your friends -> https://t.me/AI_DeepLearning

GARF: Gaussian Activated NeRF
GARF: Gaussian Activated R.F. for Hi-Fi reconstruction/pose
Highlights:
- NeRF from imperfect camera poses
- NO hyper-parameter tuning/initialization
- Theoretical insight on Gaussian activation
- Unlocking NeRF for real-world application?
More: https://bit.ly/36bvdfU

Novel pre-training strategy for #AI
EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)
Highlights:
- Multi-modal: additional modalities beyond RGB
- Multi-task: multiple outputs beyond RGB
- General: MultiMAE trained via pseudo-labeling
- Classification, segmentation, depth
- Code under NonCommercial 4.0 International
More: https://bit.ly/3jRhNsN

A new SOTA in Dataset Distillation
A new approach by Matching Training Trajectories is out!
Highlights:
- Distilling data "to match" a bigger dataset
- Distilled data to guide a network
- Trajectories of experts trained on real data
- SOTA + distilling higher-res visual data
More: https://bit.ly/3JwYOxW

Two-Hand tracking via GCN
The first-ever GCN for two interacting hands from a single RGB image
Highlights:
- Reconstruction by GCN mesh regression
- PIFA: pyramid attention for local occlusion
- CHA: cross hand attention for interaction
- SOTA + generalization to in-the-wild scenarios
- Source code available under GNU
More: https://bit.ly/3KH5FWO

Video K-Net, SOTA in Segmentation
Simple, strong, and unified framework for fully end-to-end video panoptic segmentation
Highlights:
- Learnable kernels from K-Net
- K-Net learns to segment & track
- Appearance / cross-T kernel interaction
- New SOTA without bells and whistles
More: https://bit.ly/3uEEZQR

DeepLabCut: tracking animals in the wild
A toolbox for markerless pose estimation of animals performing various tasks
Highlights:
- Multi-animal pose estimation
- Datasets for multi-animal pose
- Key-points, limbs, animal identity
- Optimal key-points without input
More: https://bit.ly/37L1mLE

Neural Articulated Human Body
Novel neural implicit representation for articulated body
Highlights:
- COmpositional Articulated People
- Large variety of shapes & poses
- Novel encoder-decoder architecture
More: https://bit.ly/3xvn7dl

2K Resolution Generative #AI
Novel continuous-scale training with variable output resolutions
Highlights:
- Mixed-resolution data
- Arbitrary scales during training
- Generations beyond 1024×1024
- Variant of FID metric for scales
- Source code under MIT license
More: https://bit.ly/3uNfVY6

DS Unsupervised Video Decomposition
Novel method to extract persistent elements of a scene
Highlights:
- Scene element as Deformable Sprite (DS)
- Deformable Sprites by video auto-encoder
- Canonical texture image for appearance
- Non-rigid geometric transformation
More: https://bit.ly/37WV9w1

L-SVPE for Deep Deblurring
L-SVPE to deblur scenes while recovering high-frequency details
Highlights:
- Learned Spatially Varying Pixel Exposures
- Next-gen focal-plane sensor + DL
- Deep conv decoder for motion deblurring
- Superior results over non-optimized exposures
More: https://bit.ly/3uRYQMT

Hyper-Fast Instance Segmentation
Novel Temporally Efficient Vision Transformer (TeViT) for VIS
Highlights:
- Video instance segmentation transformer
- Contextual-info at frame/instance level
- Nearly convolution-free framework
- The new SOTA for VIS, ~70 FPS!
- Code & models under MIT license
More: https://bit.ly/3rCMXIn

Unified Scene Text/Layout Detection
World's first hierarchical scene text dataset + novel detection method
Highlights:
- Unified detection & geometric layout
- Hierarchical annotations in natural scenes
- Word, line, & paragraph level annotations
- Source under CC Attribution Share Alike 4.0
More: https://bit.ly/3jRpezV

#Oculus' new Hand Tracking
Hands are able to move as naturally and intuitively in the #metaverse as they do in real life
Highlights:
- Hands2.0 powered by CV & ML
- Tracking hand-over-hand interactions
- Crossing hands, clapping, high-fives
- Accurate thumbs-up gesture
More: https://bit.ly/3JXPvY2

New SOTA in #3D human avatar
PHORHUM: photorealistic 3D human from mono-RGB
Highlights:
- Pixel-aligned method for 3D geometry
- Unshaded surface color + illumination
- Patch-based rendering losses for visible regions
- Plausible color estimation for non-visible regions
More: https://bit.ly/3MkvBrA

What's in your hands (#3D)?
Reconstructing hand-held objects (from single RGB) without knowing their 3D templates
Highlights:
- Hand is highly predictive of object shape
- Conditioned on the hand articulation
- Visual features / articulation-aware coordinates
- Code and models available!
More: https://bit.ly/3vuYn2a

YODO: You Only Demonstrate Once
A novel category-level manipulation method learned in sim from a single demonstration video
Highlights:
- One-shot IL, model-free 6D pose tracking
- Demonstration by a single third-person view
- Manipulation including high-precision tasks
- Category-level Behavior Cloning
- Attention for dynamic coordinate selection
- Generalizability to novel unseen objects/environments
More: https://bit.ly/3v0V4R4