Hyper-Dense Landmarks at 150 FPS
#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.
Highlights:
✅ Accurate: 10× as many landmarks as usual
✅ Synthetic training data with perfect annotations
✅ NO appearance model, lighting, or differentiable rendering needed
✅ #3D at 150+ FPS on a single CPU thread
✅ SOTA in monocular 3D reconstruction
More: https://bit.ly/37pQS40
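Not from the paper, but as a rough sketch of the landmark-with-uncertainty idea: a toy PyTorch head that predicts 2D coordinates plus a per-landmark uncertainty and is trained with a Gaussian negative log-likelihood. The feature size and landmark count are made-up placeholders.
    import torch
    import torch.nn as nn

    class DenseLandmarkHead(nn.Module):
        """Toy head: predicts (x, y, log_sigma) for K dense landmarks."""
        def __init__(self, feat_dim=512, num_landmarks=700):
            super().__init__()
            self.num_landmarks = num_landmarks
            self.fc = nn.Linear(feat_dim, num_landmarks * 3)

        def forward(self, feats):                    # feats: (B, feat_dim)
            out = self.fc(feats).view(-1, self.num_landmarks, 3)
            return out[..., :2], out[..., 2]         # xy, log_sigma

    def gaussian_nll(pred_xy, log_sigma, gt_xy):
        """Isotropic 2D Gaussian NLL: uncertain landmarks are down-weighted."""
        sq_err = ((pred_xy - gt_xy) ** 2).sum(-1)    # (B, K)
        return (0.5 * sq_err * torch.exp(-2 * log_sigma) + 2 * log_sigma).mean()

    head = DenseLandmarkHead()
    feats = torch.randn(4, 512)                      # dummy backbone features
    gt = torch.rand(4, 700, 2)                       # dummy ground-truth landmarks
    xy, log_sigma = head(feats)
    gaussian_nll(xy, log_sigma, gt).backward()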
SunStage: Selfie with the Sun
Accurate, personalized reconstruction of facial geometry and reflectance
Highlights:
✅ Novel personalized scanning approach
✅ Disentanglement of scene parameters
✅ Geometry, materials, lighting, poses
✅ Photorealistic results from a single selfie video
More: https://bit.ly/36W1Oqx
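As a toy illustration of how the disentangled factors recombine (not SunStage's actual renderer, which is far more sophisticated): a minimal Lambertian shading of geometry (normals) and material (albedo) under a directional sun plus an ambient term. All values below are placeholders.
    import torch
    import torch.nn.functional as F

    def lambertian_sun_shading(normals, albedo, sun_dir, sun_rgb, ambient_rgb):
        """Recombine disentangled factors: geometry (normals), material (albedo)
        and lighting (directional sun color + ambient term)."""
        sun_dir = sun_dir / sun_dir.norm()
        cos = (normals * sun_dir).sum(-1, keepdim=True).clamp(min=0.0)   # (H, W, 1)
        return albedo * (cos * sun_rgb + ambient_rgb)                    # (H, W, 3)

    H, W = 64, 64
    normals = F.normalize(torch.randn(H, W, 3), dim=-1)   # dummy geometry
    albedo = torch.rand(H, W, 3)                          # dummy material
    img = lambertian_sun_shading(
        normals, albedo,
        sun_dir=torch.tensor([0.3, 1.0, 0.2]),
        sun_rgb=torch.tensor([1.0, 0.95, 0.9]),
        ambient_rgb=torch.tensor([0.10, 0.10, 0.12]))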
Generative Neural Avatars
3D shapes of people in a variety of garments with corresponding skinning weights
Highlights:
✅ ETH Zurich + University of Tübingen + Max Planck Institute
✅ Animatable #3D humans in garments
✅ Directly from raw posed 3D scans
✅ NO canonical pose, registration, or manual skinning weights
✅ Geometric detail in clothing deformations
More: https://bit.ly/3M7mCdB
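Since the generated shapes ship with skinning weights, here is a minimal standard linear blend skinning sketch (generic LBS, not the paper's pipeline) showing how per-vertex weights animate a posed shape. The vertex/joint counts are placeholders.
    import torch

    def linear_blend_skinning(verts, weights, joint_transforms):
        """verts: (V, 3); weights: (V, J) skinning weights summing to 1 per vertex;
        joint_transforms: (J, 4, 4) rigid transforms per joint."""
        V = verts.shape[0]
        verts_h = torch.cat([verts, torch.ones(V, 1)], dim=-1)            # (V, 4)
        # Blend the per-joint transforms with the skinning weights.
        blended = torch.einsum('vj,jab->vab', weights, joint_transforms)  # (V, 4, 4)
        posed = torch.einsum('vab,vb->va', blended, verts_h)              # (V, 4)
        return posed[:, :3]

    verts = torch.rand(1000, 3)
    weights = torch.softmax(torch.randn(1000, 24), dim=-1)   # dummy weights, 24 joints
    transforms = torch.eye(4).repeat(24, 1, 1)               # identity pose
    posed = linear_blend_skinning(verts, weights, transforms)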
Conversational Program Synthesis
Conversational synthesis for translating English into executable code
Highlights:
✅ Conversational program synthesis
✅ New multi-turn programming benchmark
✅ Open custom library: JAXFORMER
✅ Source code under a BSD-3 license
More: https://bit.ly/3jjWWhk
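A hedged sketch of the multi-turn prompting pattern using Hugging Face transformers. The "Salesforce/codegen-350M-mono" checkpoint name, the turn format, and the generation settings are assumptions for illustration, not the paper's exact protocol.
    # Minimal multi-turn sketch (assumes the `transformers` package and a public
    # CodeGen checkpoint are available; not the paper's exact setup).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
    model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

    context = ""
    for turn in ["# Write a function that reverses a string.",
                 "# Now add a docstring and type hints."]:
        context += turn + "\n"
        ids = tok(context, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=64, do_sample=False,
                             pad_token_id=tok.eos_token_id)
        completion = tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
        context += completion + "\n"
        print(completion)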
Long Video Diffusion Models
#Google unveils a novel diffusion model for video generation
Highlights:
✅ Straightforward extension of the 2D U-Net to video
✅ Longer videos via a new conditional generation technique
✅ SOTA in unconditional video generation
More: https://bit.ly/35Y2rzg
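A toy sketch of the block-autoregressive pattern used to get longer videos: sample a first clip, then condition each new block on the tail of what has been generated so far. The sampler below is a dummy stand-in, and the paper's actual conditioning technique is more involved than simple autoregression.
    import torch

    def sample_block(cond_frames=None, block_len=16, shape=(3, 64, 64)):
        """Dummy stand-in for a video diffusion sampler: a real one would run the
        reverse diffusion process conditioned on `cond_frames`."""
        return torch.randn(block_len, *shape)

    # Block-autoregressive extension: each new block is conditioned on the tail
    # of the video generated so far (a simplified version of the idea).
    video = sample_block()                           # unconditional first block
    for _ in range(3):
        new_block = sample_block(cond_frames=video[-4:])
        video = torch.cat([video, new_block], dim=0)
    print(video.shape)                               # (64, 3, 64, 64)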
AutoRF: #3D Objects in the Wild
From #Meta: #3D objects from just a single in-the-wild image
Highlights:
✅ Novel view synthesis from in-the-wild images
✅ Normalized, object-centric representation
✅ Disentangling shape, appearance & pose
✅ Exploiting bounding boxes & panoptic segmentation
✅ Shape/appearance properties for objects
More: https://bit.ly/3O4ONeQ
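A toy sketch of the object-centric idea: a radiance-field MLP conditioned on per-object shape and appearance codes (my simplification, not the paper's architecture; all layer sizes and code dimensions are placeholders).
    import torch
    import torch.nn as nn

    class CodeConditionedField(nn.Module):
        """Toy radiance field conditioned on per-object shape/appearance codes."""
        def __init__(self, code_dim=128, hidden=256):
            super().__init__()
            self.density_net = nn.Sequential(
                nn.Linear(3 + code_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden + 1))              # density + features
            self.color_net = nn.Sequential(
                nn.Linear(hidden + 3 + code_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 3))

        def forward(self, x, view_dir, shape_code, app_code):
            h = self.density_net(torch.cat([x, shape_code], dim=-1))
            sigma, feat = h[..., :1], h[..., 1:]
            rgb = self.color_net(torch.cat([feat, view_dir, app_code], dim=-1))
            return torch.relu(sigma), torch.sigmoid(rgb)

    field = CodeConditionedField()
    x = torch.rand(1024, 3); d = torch.rand(1024, 3)
    shape = torch.randn(1024, 128); app = torch.randn(1024, 128)
    sigma, rgb = field(x, d, shape, app)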
GAN-based Darkest Dataset
Berkeley + #Intel announce the first photorealistic dataset captured under starlight (no moon, <0.001 lux)
Highlights:
✅ "Darkest" dataset ever seen
✅ Moonless, no external illumination
✅ GAN-tuned physics-based noise model
✅ Clips with dancing, volleyball, flags...
More: https://bit.ly/3LXxMkN
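The highlights mention a GAN-tuned physics-based model; below is a minimal purely physics-based sensor sketch (Poisson shot noise, Gaussian read noise, quantization). The GAN-learned noise components from the paper are not modeled, and the constants are placeholders.
    import numpy as np

    def simulate_low_light(clean, photons_per_unit=5.0, read_noise_std=2.0,
                           bit_depth=10, rng=np.random.default_rng(0)):
        """Toy physics-based sensor model: Poisson shot noise on photon counts,
        Gaussian read noise, then quantization."""
        photons = rng.poisson(np.clip(clean, 0, 1) * photons_per_unit)
        raw = photons + rng.normal(0.0, read_noise_std, size=clean.shape)
        raw = np.clip(raw / photons_per_unit, 0, 1)
        levels = 2 ** bit_depth - 1
        return np.round(raw * levels) / levels

    clean = np.random.rand(128, 128, 3)      # stand-in for a bright reference frame
    noisy = simulate_low_light(clean)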
Populating 3D Scenes with Digital Humans
ETH Zurich unveils GAMMA to populate #3D scenes with digital humans
Highlights:
✅ GenerAtive Motion primitive MArkers
✅ Realistic, controllable, infinite motions
✅ Tree-based search to preserve motion quality
✅ SOTA in realistic/controllable motion
More: https://bit.ly/3OgY4AG
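A toy sketch of the tree-style search idea: sample candidate motion primitives, score the resulting branches against a goal, and keep only the best ones. This is a 2D toy with random primitives, not GAMMA's learned policy or marker representation.
    import numpy as np

    rng = np.random.default_rng(0)
    goal = np.array([5.0, 3.0])

    def sample_primitive(pos):
        """Dummy motion primitive: a short random trajectory starting at `pos`."""
        return pos + rng.normal(0.0, 0.5, size=(8, 2)).cumsum(axis=0)

    # Best-first expansion: keep the branches whose endpoints are closest to the goal.
    branches = [np.zeros((1, 2))]
    for _ in range(20):
        candidates = []
        for path in branches:
            for _ in range(4):                                  # 4 children per branch
                prim = sample_primitive(path[-1])
                candidates.append(np.concatenate([path, prim], axis=0))
        candidates.sort(key=lambda p: np.linalg.norm(p[-1] - goal))
        branches = candidates[:3]                               # keep top-3 branches
    print(np.linalg.norm(branches[0][-1] - goal))               # distance of best branch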
#AIwithPapers: we are ~2,000!
Simply amazing. Thank you all!
Invite your friends -> https://t.me/AI_DeepLearning
GARF: Gaussian Activated NeRF
GARF: Gaussian Activated Radiance Fields for high-fidelity reconstruction and pose estimation
Highlights:
✅ NeRF from imperfect camera poses
✅ NO hyper-parameter tuning or special initialization
✅ Theoretical insight into Gaussian activations
✅ Unlocking NeRF for real-world applications?
More: https://bit.ly/36bvdfU
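The core ingredient is the Gaussian activation. A minimal sketch of a coordinate MLP that uses phi(x) = exp(-x^2 / (2*sigma^2)) in place of ReLU plus positional encoding; the sigma value and layer sizes are placeholders, not the paper's settings.
    import torch
    import torch.nn as nn

    class GaussianActivation(nn.Module):
        """Gaussian activation: phi(x) = exp(-x^2 / (2 * sigma^2))."""
        def __init__(self, sigma=0.1):
            super().__init__()
            self.sigma = sigma

        def forward(self, x):
            return torch.exp(-x ** 2 / (2 * self.sigma ** 2))

    # Coordinate MLP with Gaussian activations; the raw 3D coordinate is fed
    # directly, without a positional encoding.
    mlp = nn.Sequential(
        nn.Linear(3, 256), GaussianActivation(),
        nn.Linear(256, 256), GaussianActivation(),
        nn.Linear(256, 4))           # e.g. density + RGB

    out = mlp(torch.rand(1024, 3))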
Novel Pre-Training Strategy for #AI
EPFL unveils Multi-modal Multi-task Masked Autoencoders (MultiMAE)
Highlights:
✅ Multi-modal: accepts additional modalities beyond RGB
✅ Multi-task: predicts multiple outputs besides RGB
✅ General: made broadly applicable via pseudo-labeling
✅ Classification, segmentation, depth
✅ Code under a NonCommercial 4.0 International license
More: https://bit.ly/3jRhNsN
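A hedged sketch of the masking step: patch tokens from all modalities are pooled and only a random subset is passed to the encoder, MAE-style. Token counts, dimensions, and the keep ratio are placeholders, not MultiMAE's actual configuration.
    import torch

    def multimodal_random_masking(token_dict, keep_ratio=0.25, generator=None):
        """token_dict maps modality name -> (B, N_m, D) tokens. Concatenate all
        modalities and keep a random subset of tokens for the encoder."""
        tokens = torch.cat(list(token_dict.values()), dim=1)      # (B, N_total, D)
        B, N, D = tokens.shape
        n_keep = int(N * keep_ratio)
        idx = torch.rand(B, N, generator=generator).argsort(dim=1)[:, :n_keep]
        visible = torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, D))
        return visible, idx                                       # encoder input + indices

    tokens = {"rgb":   torch.randn(2, 196, 768),
              "depth": torch.randn(2, 196, 768),
              "sem":   torch.randn(2, 196, 768)}
    visible, idx = multimodal_random_masking(tokens)
    print(visible.shape)   # (2, 147, 768)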
A New SOTA in Dataset Distillation
A new approach based on Matching Training Trajectories is out!
Highlights:
✅ Distills a small dataset "to match" a bigger one
✅ Distilled data guides a network's training
✅ Matches trajectories of experts trained on real data
✅ SOTA + distillation of higher-resolution visual data
More: https://bit.ly/3JwYOxW
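A toy linear-regression version of the trajectory-matching objective: unroll a few gradient steps on the synthetic set starting from an expert checkpoint, then penalize the distance to a later expert checkpoint, normalized by how far the expert itself moved. The real method differs in many details; all sizes and learning rates are placeholders.
    import torch

    def trajectory_matching_loss(syn_x, syn_y, theta_start, theta_target, lr=0.1, steps=5):
        """Unroll `steps` gradient steps on the synthetic set from `theta_start`,
        then match the later expert checkpoint `theta_target`."""
        theta = theta_start.clone().requires_grad_(True)
        for _ in range(steps):
            inner_loss = ((syn_x @ theta - syn_y) ** 2).mean()
            (grad,) = torch.autograd.grad(inner_loss, theta, create_graph=True)
            theta = theta - lr * grad
        num = ((theta - theta_target) ** 2).sum()
        den = ((theta_start - theta_target) ** 2).sum() + 1e-8
        return num / den

    # Optimize a tiny synthetic set so that training on it mimics the expert trajectory.
    d = 8
    syn_x = torch.randn(16, d, requires_grad=True)
    syn_y = torch.randn(16, requires_grad=True)
    theta_start, theta_target = torch.randn(d), torch.randn(d)   # dummy expert checkpoints
    opt = torch.optim.Adam([syn_x, syn_y], lr=1e-2)
    for _ in range(100):
        opt.zero_grad()
        trajectory_matching_loss(syn_x, syn_y, theta_start, theta_target).backward()
        opt.step()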
Two-Hand Tracking via GCN
The first-ever GCN for two interacting hands from a single RGB image
Highlights:
✅ Reconstruction by GCN mesh regression
✅ PIFA: pyramid attention for local occlusions
✅ CHA: cross-hand attention for interactions
✅ SOTA + generalization to in-the-wild scenarios
✅ Source code available under a GNU license
More: https://bit.ly/3KH5FWO
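A hedged sketch of the cross-hand attention idea using nn.MultiheadAttention: each hand's vertex features query the other hand's, with residual connections. This is not the paper's exact module; 778 is the MANO vertex count, the rest are placeholders.
    import torch
    import torch.nn as nn

    class CrossHandAttention(nn.Module):
        """Toy cross-hand attention: each hand's vertex features query the other hand."""
        def __init__(self, dim=128, heads=4):
            super().__init__()
            self.attn_l = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.attn_r = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, left, right):                      # (B, V, dim) vertex features
            left_out, _ = self.attn_l(left, right, right)    # left queries right
            right_out, _ = self.attn_r(right, left, left)    # right queries left
            return left + left_out, right + right_out        # residual connections

    cha = CrossHandAttention()
    left = torch.randn(2, 778, 128)    # 778 = MANO vertex count
    right = torch.randn(2, 778, 128)
    left_feats, right_feats = cha(left, right)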
Video K-Net: SOTA in Segmentation
Simple, strong, and unified framework for fully end-to-end video panoptic segmentation
Highlights:
✅ Learnable kernels from K-Net
✅ K-Net learns to segment & track
✅ Appearance & cross-temporal kernel interaction
✅ New SOTA without bells and whistles
More: https://bit.ly/3uEEZQR
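A toy sketch of the kernel idea: each learnable kernel is dotted with the feature map to produce one mask, and reusing the same kernels across frames is what links segments over time. This is not the actual K-Net code; shapes and counts are placeholders.
    import torch

    def kernels_to_masks(kernels, feats):
        """kernels: (N, C) learnable mask kernels; feats: (B, C, H, W) features.
        Each kernel produces one (soft) segmentation mask per image."""
        logits = torch.einsum('nc,bchw->bnhw', kernels, feats)
        return logits.sigmoid()

    kernels = torch.randn(100, 256, requires_grad=True)    # 100 candidate masks
    feats = torch.randn(2, 256, 64, 64)                    # per-frame features
    masks_t = kernels_to_masks(kernels, feats)             # frame t
    masks_t1 = kernels_to_masks(kernels, feats + 0.1)      # frame t+1: same kernels, same IDs
    print(masks_t.shape)                                   # (2, 100, 64, 64)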
DeepLabCut: Tracking Animals in the Wild
A toolbox for markerless pose estimation of animals performing various tasks
Highlights:
✅ Multi-animal pose estimation
✅ Datasets for multi-animal pose estimation
✅ Keypoints, limbs, animal identity
✅ Optimal keypoints without user input
More: https://bit.ly/37L1mLE
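A typical minimal single-animal workflow with the deeplabcut Python package; the function names follow its standard API, but exact arguments and the multi-animal variants may differ across versions, and the project name and video paths below are placeholders.
    # Minimal workflow sketch with the `deeplabcut` package.
    import deeplabcut

    config = deeplabcut.create_new_project(
        "mouse-reaching", "me", ["/data/videos/session1.mp4"], copy_videos=True)

    deeplabcut.extract_frames(config)            # pick frames to annotate
    deeplabcut.label_frames(config)              # opens the labeling GUI
    deeplabcut.create_training_dataset(config)
    deeplabcut.train_network(config)
    deeplabcut.analyze_videos(config, ["/data/videos/session2.mp4"])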
Neural Articulated Human Body
Novel neural implicit representation for articulated human bodies
Highlights:
✅ COmpositional Articulated People
✅ Large variety of shapes & poses
✅ Novel encoder-decoder architecture
More: https://bit.ly/3xvn7dl
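A generic sketch of querying a neural implicit body: an MLP maps a 3D point plus a latent body code to an occupancy probability (occupancy-network style). The paper's compositional per-part design is not reproduced here; all dimensions are placeholders.
    import torch
    import torch.nn as nn

    class OccupancyDecoder(nn.Module):
        """Generic implicit decoder: (3D point, body latent code) -> occupancy in [0, 1]."""
        def __init__(self, code_dim=256, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + code_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1))

        def forward(self, points, code):            # points: (B, N, 3); code: (B, code_dim)
            code = code.unsqueeze(1).expand(-1, points.shape[1], -1)
            return torch.sigmoid(self.net(torch.cat([points, code], dim=-1)))

    decoder = OccupancyDecoder()
    occ = decoder(torch.rand(2, 4096, 3) * 2 - 1, torch.randn(2, 256))
    print(occ.shape)   # (2, 4096, 1)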
2K Resolution Generative #AI
Novel continuous-scale training with variable output resolutions
Highlights:
✅ Mixed-resolution training data
✅ Arbitrary scales during training
✅ Generations beyond 1024×1024
✅ Variant of the FID metric for multiple scales
✅ Source code under an MIT license
More: https://bit.ly/3uNfVY6
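A minimal sketch of FID computed from two feature sets, evaluated at several scales. The multi-scale loop is my simplification of the "FID variant for scales" idea, not the paper's exact metric, and the random features below are stand-ins for real per-scale embeddings.
    import numpy as np
    from scipy import linalg

    def fid(feats_a, feats_b):
        """Frechet distance between Gaussian fits of two feature sets of shape (N, D)."""
        mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
        cov_a = np.cov(feats_a, rowvar=False)
        cov_b = np.cov(feats_b, rowvar=False)
        covmean = linalg.sqrtm(cov_a @ cov_b).real        # drop tiny imaginary parts
        return ((mu_a - mu_b) ** 2).sum() + np.trace(cov_a + cov_b - 2 * covmean)

    # Toy multi-scale evaluation: compute FID on features extracted at each scale.
    rng = np.random.default_rng(0)
    for scale in (256, 512, 1024, 2048):
        real_feats = rng.normal(size=(500, 64))           # stand-ins for per-scale features
        fake_feats = rng.normal(0.1, 1.0, size=(500, 64))
        print(scale, round(float(fid(real_feats, fake_feats)), 3))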
DS: Unsupervised Video Decomposition
Novel method to extract persistent elements of a scene
Highlights:
✅ Each scene element as a Deformable Sprite (DS)
✅ Deformable Sprites fitted by a video auto-encoder
✅ Canonical texture image for appearance
✅ Non-rigid geometric transformations
More: https://bit.ly/37WV9w1
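A toy sketch of the reconstruction step: warp a sprite's canonical texture into a frame with a non-rigid sampling grid via grid_sample. This is not the paper's code; here the deformation offset is random, whereas in practice it would be predicted per frame.
    import torch
    import torch.nn.functional as F

    def warp_canonical_texture(texture, grid):
        """texture: (1, C, Ht, Wt) canonical sprite; grid: (1, H, W, 2) sampling
        coordinates in [-1, 1] (the non-rigid deformation)."""
        return F.grid_sample(texture, grid, align_corners=True)

    H, W = 128, 128
    texture = torch.rand(1, 3, 256, 256)                          # canonical appearance
    # Identity grid plus a small non-rigid offset (random here, predicted in practice).
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing='ij')
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0)             # (1, H, W, 2)
    offset = 0.02 * torch.randn(1, H, W, 2)
    frame_layer = warp_canonical_texture(texture, grid + offset)  # (1, 3, H, W)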
L-SVPE for Deep Deblurring
L-SVPE deblurs scenes while recovering high-frequency details
Highlights:
✅ Learned Spatially Varying Pixel Exposures
✅ Next-gen focal-plane sensors + deep learning
✅ Deep convolutional decoder for motion deblurring
✅ Superior results over non-optimized exposures
More: https://bit.ly/3uRYQMT
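A toy sketch of simulating a spatially varying exposure capture: tile a small exposure pattern over the sensor, apply it to a crudely motion-blurred frame, saturate, and add noise; a learned decoder would then reconstruct the sharp frame. This is not the paper's sensor model, and the exposure values are placeholders.
    import torch

    def svpe_capture(frames, exposure_tile, noise_std=0.01):
        """frames: (T, 1, H, W) grayscale burst; exposure_tile: (th, tw) pattern of
        per-pixel exposure gains tiled over the sensor. Toy model: gain, saturate, noise."""
        T, _, H, W = frames.shape
        th, tw = exposure_tile.shape
        gains = exposure_tile.repeat(H // th, W // tw)           # tile over the frame
        blurred = frames.mean(dim=0, keepdim=True)               # crude motion-blur proxy
        capture = (blurred * gains).clamp(0, 1)
        return capture + noise_std * torch.randn_like(capture)

    frames = torch.rand(8, 1, 128, 128)                          # toy moving scene
    exposure_tile = torch.tensor([[0.25, 0.5], [1.0, 2.0]])      # learnable in practice
    measurement = svpe_capture(frames, exposure_tile)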