☀️SunStage: Selfie with the Sun☀️
👉Accurate/tailored reconstruction of facial geometry/reflectance
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel personalized scanning
✅Disentanglement of scene params
✅Geometry, materials, lighting, poses
✅Photorealistic with a single selfie video
More: https://bit.ly/36W1Oqx
📫 Generative Neural Avatars 📫
👉3D shapes of people in a variety of garments with corresponding skinning weights
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + Uni-Tübingen + Max Planck
✅Animatable #3D human in garment
✅Directly from raw posed 3D scans
✅NO canonical shapes, registration, or manual weights
✅Geometric detail in clothing deformation
More: https://bit.ly/3M7mCdB
🗨️Conversational program synthesis🗨️
👉Conversational synthesis to translate English into executable code
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Conversational program synthesis
✅New multi-turn programming benchmark
✅Open custom library: JAXFORMER
✅Source code under BSD-3 license
More: https://bit.ly/3jjWWhk
🧯Long Video Diffusion Models🧯
👉#Google unveils a novel diffusion model for video generation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Straightforward extension of 2D UNet
✅Longer videos via new conditional generation
✅SOTA in unconditional generation
More: https://bit.ly/35Y2rzg
🚙 AutoRF: #3D objects in-the-wild 🚙
👉From #Meta: #3D object from just a single in-the-wild image
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel view synthesis from in-the-wild
✅Normalized, object-centric representation
✅Disentangling shape, appearance & pose
✅Exploiting bounding boxes & panoptic segmentation
✅Shape/appearance properties for objects
More: https://bit.ly/3O4ONeQ
🌠GAN-based Darkest Dataset🌠
👉Berkeley + #Intel announce first photorealistic dataset under starlight (no moon, <0.001 lx)
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅"Darkest" dataset ever seen
✅Moonless, no external illumination
✅GAN-tuned physics-based model
✅Clips with dancing, volleyball, flags...
More: https://bit.ly/3LXxMkN
🤖Populating with digital humans🤖
👉ETHZ unveils GAMMA to populate #3D scenes with digital humans
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅GenerAtive Motion primitive MArkers
✅Realistic, controllable, infinite motions
✅Tree-based search to preserve quality
✅SOTA in realistic/controllable motion
More: https://bit.ly/3OgY4AG
🔥#AIwithPapers: we are ~2,000!🔥
💙💛 Simply amazing. Thank you all 💙💛
😈 Invite your friends -> https://t.me/AI_DeepLearning
😼GARF: Gaussian Activated NeRF😼
👉GARF: Gaussian Activated Radiance Fields for Hi-Fi reconstruction & pose estimation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NeRF from imperfect camera poses
✅NO hyper-parameter tuning/initialization
✅Theoretical insight on Gaussian activation
✅Unlocking NeRF for real-world application?
More: https://bit.ly/36bvdfU
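To make "Gaussian activation" concrete, here is a minimal NumPy sketch of a single coordinate-MLP layer with a Gaussian nonlinearity. It assumes the activation has the form exp(-x² / (2σ²)); the function names, the layer sizes, and the value of `sigma` are hypothetical choices for illustration, not taken from the GARF code.

```python
import numpy as np

def gaussian_activation(x, sigma=1.0):
    # Assumed form: exp(-x^2 / (2 * sigma^2)); sigma is a hypothetical bandwidth.
    return np.exp(-(x ** 2) / (2.0 * sigma ** 2))

def gaussian_layer(coords, weight, bias, sigma=1.0):
    # One linear layer followed by the Gaussian activation.
    # Raw coordinates go in directly, with no positional encoding in this sketch.
    return gaussian_activation(coords @ weight + bias, sigma)

rng = np.random.default_rng(0)
coords = rng.uniform(-1.0, 1.0, size=(4, 3))  # four 3D sample points
weight = rng.normal(size=(3, 8))              # hypothetical layer width of 8
bias = np.zeros(8)
features = gaussian_layer(coords, weight, bias)
```

The output is bounded between 0 and 1 and peaks where the pre-activation is zero, which is the property the sketch is meant to show.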
🎭Novel pre-training strategy for #AI🎭
👉EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multimodal: additional modalities besides RGB
✅Multi-task: multiple outputs besides RGB
✅General: MultiMAE by pseudo-labeling
✅Classification, segmentation, depth
✅Code under NonCommercial 4.0 International license
More: https://bit.ly/3jRhNsN
🧪 A new SOTA in Dataset Distillation 🧪
👉A new approach by Matching Training Trajectories is out!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Distilling data "to match" a bigger dataset
✅Distilled data to guide a network
✅Trajectories of experts from real data
✅SOTA + distilling higher-res visual data
More: https://bit.ly/3JwYOxW
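One way to read "matching training trajectories": a network trained for a few steps on the distilled data, starting from an expert checkpoint, should land close to where the expert lands later on real data. A minimal sketch of such an objective, assuming a squared parameter distance normalized by how far the expert itself moved; all names are hypothetical and the parameters are flattened into plain vectors.

```python
import numpy as np

def trajectory_matching_loss(student_params, expert_start, expert_target):
    # Squared distance between the student (trained on distilled data)
    # and the later expert checkpoint (trained on real data) ...
    numerator = np.sum((student_params - expert_target) ** 2)
    # ... normalized by how far the expert moved over the same segment.
    denominator = np.sum((expert_start - expert_target) ** 2)
    return numerator / denominator

expert_start = np.array([0.0, 0.0, 0.0])   # expert checkpoint at step t
expert_target = np.array([1.0, 2.0, 2.0])  # expert checkpoint some steps later
student = np.array([0.5, 1.0, 1.0])        # student after a few distilled-data steps
loss = trajectory_matching_loss(student, expert_start, expert_target)
```

Under this normalization, a student that never leaves the starting checkpoint scores 1 and a student that reaches the expert target scores 0. In the full method the distilled images would be updated by backpropagating through the student's training steps, which this sketch leaves out.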
🧤 Two-Hand tracking via GCN 🧤
👉The first-ever GCN for two interacting hands in a single RGB image
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Reconstruction by GCN mesh regression
✅PIFA: pyramid attention for local occlusion
✅CHA: cross hand attention for interaction
✅SOTA + generalization to in-the-wild scenarios
✅Source code available under GNU 🤯
More: https://bit.ly/3KH5FWO
🕹️Video K-Net, SOTA in Segmentation🕹️
👉Simple, strong, and unified framework for fully end-to-end video panoptic segmentation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learnable kernels from K-Net
✅K-Net learns to segment & track
✅Appearance / cross-T kernel interaction
✅New SOTA without bells and whistles 🤷‍♂️
More: https://bit.ly/3uEEZQR
🐭DeepLabCut: tracking animals in the wild🐭
👉A toolbox for markerless pose estimation of animals performing various tasks
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-animal pose estimation
✅Datasets for multi-animal pose
✅Key-points, limbs, animal identity
✅Optimal key-points without user input
More: https://bit.ly/37L1mLE
🍡Neural Articulated Human Body🍡
👉Novel neural implicit representation for articulated body
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅COmpositional Articulated People
✅Large variety of shapes & poses
✅Novel encoder-decoder architecture
More: https://bit.ly/3xvn7dl
🦚 2K Resolution Generative #AI 🦚
👉Novel continuous-scale training with variable output resolutions
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mixed-resolution data
✅Arbitrary scales during training
✅Generations beyond 1024×1024
✅Variant of FID metric for scales
✅Source code under MIT license
More: https://bit.ly/3uNfVY6
🐍DS Unsupervised Video Decomposition🐍
👉Novel method to extract persistent elements of a scene
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Scene element as Deformable Sprite (DS)
✅Deformable Sprites by video auto-encoder
✅Canonical texture image for appearance
✅Non-rigid geometric transformation
More: https://bit.ly/37WV9w1
🥓 L-SVPE for Deep Deblurring 🥓
👉L-SVPE to deblur scenes while recovering high-freq details
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learned Spatially Varying Pixel Exposures
✅Next-gen focal-plane sensor + DL
✅Deep conv decoder for motion deblurring
✅Superior results over non-optimized exposures
More: https://bit.ly/3uRYQMT
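To illustrate what "spatially varying pixel exposures" means, here is a toy NumPy simulation in which each pixel integrates light for its own number of sub-frames. It assumes every exposure starts at the first sub-frame and is normalized by its length; the function name and the exposure map are hypothetical, and the learned exposure pattern and the deblurring decoder are not modeled.

```python
import numpy as np

def coded_exposure_capture(frames, exposure_len):
    # frames: (T, H, W) sub-frames of a short video burst.
    # exposure_len: (H, W) integers in [1, T], one exposure length per pixel.
    num_frames = frames.shape[0]
    t = np.arange(num_frames)[:, None, None]
    mask = t < exposure_len[None, :, :]  # True while a pixel is still integrating
    # Average over each pixel's own exposure window.
    return (frames * mask).sum(axis=0) / exposure_len

# Toy burst: brightness grows linearly over 4 sub-frames (0, 1, 2, 3).
frames = np.arange(4, dtype=float)[:, None, None] * np.ones((4, 2, 2))
exposure_len = np.array([[1, 2], [3, 4]])  # hypothetical per-pixel exposure map
capture = coded_exposure_capture(frames, exposure_len)
```

Short-exposure pixels stay close to the first sub-frame (less motion blur), while long-exposure pixels average over more of the burst (more light, more blur), which is the trade-off a learned pattern would balance.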
🧧Hyper-Fast Instance Segmentation🧧
👉Novel Temporally Efficient Vision Transformer (TeViT) for VIS
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video instance segmentation transformer
✅Contextual-info at frame/instance level
✅Nearly convolution-free framework 🤷‍♂️
✅The new SOTA for VIS, ~70 FPS!
✅Code & models under MIT license
More: https://bit.ly/3rCMXIn