DALL·E 2 just announced!
👉DALL·E 2 to create realistic images and art from natural language
Highlights:
✅ More realistic/accurate, 4x res.
✅ Better caption matching
✅ Not available yet, waiting list!
More: https://bit.ly/3j9v3bR

Forecasting interactions via attention
👉Predicting the hand motion trajectory and the future contact points on the next active object
Highlights:
✅ Object-Centric Transformer (OCT)
✅ Self-attention Transformer mechanism
✅ Framework to handle uncertainty
✅ SOTA on Epic-Kitchens and EGTEA
More: https://bit.ly/3v3PpbI

SmeLU: Smooth Activation Function
👉Google unveils a new smooth activation function: easy to implement, cheap & less error-prone
Highlights:
✅ Smooth to mitigate irreproducibility
✅ Cheap function, better than GELU/Swish
✅ 0-1 slope through quadratic middle region
✅ SmeLU as convolution of ReLU with box
✅ Best reproducibility-accuracy tradeoff
More: https://bit.ly/3xcskXm
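
To make the "quadratic middle region" and "convolution of ReLU with box" highlights concrete, here is a minimal sketch in plain Python. It assumes the piecewise form of SmeLU as we understand it from the linked work: zero left of -beta, identity right of +beta, and (x + beta)^2 / (4 * beta) in between. The names `smelu`, `smelu_slope`, `box_smoothed_relu` and the half-width parameter `beta` are illustrative choices, not taken from the post.

```python
def smelu(x: float, beta: float = 1.0) -> float:
    """Assumed piecewise SmeLU: 0 for x <= -beta, x for x >= beta,
    and a quadratic joining the two pieces on [-beta, beta]."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return float(x)
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_slope(x: float, beta: float = 1.0) -> float:
    """Derivative of the function above: rises linearly from 0 to 1."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)


def box_smoothed_relu(x: float, beta: float = 1.0, steps: int = 20_000) -> float:
    """ReLU averaged over the window [x - beta, x + beta], i.e. ReLU
    convolved with a unit-area box of width 2 * beta (midpoint rule)."""
    width = 2.0 * beta
    h = width / steps
    total = 0.0
    for i in range(steps):
        u = x - beta + (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += max(u, 0.0)          # ReLU(u)
    return total * h / width
```

Under these assumptions the numerical box average and the closed form agree up to integration error, which is one way to read the "convolution of ReLU with box" highlight. The slope moves from 0 to 1 across the middle region instead of jumping at the origin as it does for ReLU.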

Hyper-Dense Landmarks at 150FPS
👉#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.
Highlights:
✅ Accurately predicts 10× as many landmarks as usual
✅ Synthetic data, perfect annotations
✅ NO appearance, light, diff-rendering
✅ #3D @150+FPS with a single CPU thread
✅ SOTA in monocular 3D reconstruction
More: https://bit.ly/37pQS40

☀️SunStage: Selfie with the Sun☀️
👉Accurate/tailored reconstruction of facial geometry/reflectance
Highlights:
✅ Novel personalized scanning
✅ Disentanglement of scene params
✅ Geometry, materials, lighting, poses
✅ Photorealistic with a single selfie video
More: https://bit.ly/36W1Oqx

Generative Neural Avatars
👉3D shapes of people in a variety of garments with corresponding skinning weights
Highlights:
✅ ETH + Uni-Tübingen + Max Planck
✅ Animatable #3D human in garment
✅ Directly from raw posed 3D scans
✅ NO canonical, registration, manual w.
✅ Geometric detail in clothing deformation
More: https://bit.ly/3M7mCdB

🗨️Conversational program synthesis🗨️
👉Conversational synthesis to translate English into executable code
Highlights:
✅ Conversational program synthesis
✅ New multi-turn programming benchmark
✅ Open custom library: JAXFORMER
✅ Source code under BSD-3 license
More: https://bit.ly/3jjWWhk

🧯Long Video Diffusion Models🧯
👉#Google unveils a novel diffusion model for video generation
Highlights:
✅ Straightforward extension of 2D UNet
✅ Longer by new conditional generation
✅ SOTA in unconditional generation
More: https://bit.ly/35Y2rzg

AutoRF: #3D objects in-the-wild
👉From #Meta: #3D object from just a single, in-the-wild image
Highlights:
✅ Novel view synthesis from in-the-wild
✅ Normalized, object-centric representation
✅ Disentangling shape, appearance & pose
✅ Exploiting bounding boxes & panoptic segmentation
✅ Shape/appearance properties for objects
More: https://bit.ly/3O4ONeQ

GAN-based Darkest Dataset
👉Berkeley + #Intel announce first photorealistic dataset under starlight (no moon, <0.001 lx)
Highlights:
✅ "Darkest" dataset ever seen
✅ Moonless, no external illumination
✅ GAN-tuned physics-based model
✅ Clips with dancing, volleyball, flags...
More: https://bit.ly/3LXxMkN

Populating with digital humans
👉ETHZ unveils GAMMA to populate the #3D scene with digital humans
Highlights:
✅ GenerAtive Motion primitive MArkers
✅ Realistic, controllable, infinite motions
✅ Tree-based search to preserve quality
✅ SOTA in realistic/controllable motion
More: https://bit.ly/3OgY4AG

#AIwithPapers: we are ~2,000!
Simply amazing. Thank you all
Invite your friends -> https://t.me/AI_DeepLearning

GARF: Gaussian Activated NeRF
👉GARF: Gaussian Activated R.F. for Hi-Fi reconstruction/pose
Highlights:
✅ NeRF from imperfect camera poses
✅ NO hyper-parameter tuning/initialization
✅ Theoretical insight on Gaussian activation
✅ Unlocking NeRF for real-world application?
More: https://bit.ly/36bvdfU

Novel pre-training strategy for #AI
👉EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)
Highlights:
✅ Multimodal: additional modal. over RGB
✅ Multi-task: multiple outputs over RGB
✅ General: MultiMAE by pseudo-labeling
✅ Classification, segmentation, depth
✅ Code under NonCommercial 4.0 Int.
More: https://bit.ly/3jRhNsN

🧪 A new SOTA in Dataset Distillation 🧪
👉A new approach by Matching Training Trajectories is out!
Highlights:
✅ Distilling data "to match" a bigger one
✅ Distilled data to guide a network
✅ Trajectories of experts from real data
✅ SOTA + distilling higher-res visual data
More: https://bit.ly/3JwYOxW

🧤 Two-Hand tracking via GCN 🧤
👉The first-ever GCN for two interacting hands in a single RGB image
Highlights:
✅ Reconstruction by GCN mesh regression
✅ PIFA: pyramid attention for local occlusion
✅ CHA: cross hand attention for interaction
✅ SOTA + generalization in-the-wild scenario
✅ Source code available under GNU 🤯
More: https://bit.ly/3KH5FWO

🕹️Video K-Net, SOTA in Segmentation🕹️
👉Simple, strong, and unified framework for fully end-to-end video panoptic segmentation
Highlights:
✅ Learnable kernels from K-Net
✅ K-Net learns to segment & track
✅ Appearance / cross-T kernel interaction
✅ New SOTA without bells and whistles 🤷
More: https://bit.ly/3uEEZQR

DeepLabCut: tracking animals in the wild
👉A toolbox for markerless pose estimation of animals performing various tasks
Highlights:
✅ Multi-animal pose estimation
✅ Datasets for multi-animal pose
✅ Key-points, limbs, animal identity
✅ Optimal key-points without input
More: https://bit.ly/37L1mLE

Neural Articulated Human Body
👉Novel neural implicit representation for articulated body
Highlights:
✅ COmpositional Articulated People
✅ Large variety of shapes & poses
✅ Novel encoder-decoder architecture
More: https://bit.ly/3xvn7dl