#AI clips from a single frame
Moving objects in #3D while generating a video from a sequence of desired actions
Highlights:
✅ Playable environments
✅ A single starting image 🤯
✅ Controllable camera
✅ Unsupervised learning
More: https://bit.ly/35VDrYO

Kubric: AI dataset generator
Open-source #Python framework for photo-realistic scenes: full control, rich annotations, TBs of fresh data 🤯
Highlights:
✅ Synthetic datasets with ground truth (GT)
✅ From NeRF to optical flow
✅ Full control over data
✅ Privacy- and licensing-friendly
✅ Apache License 2.0
More: https://bit.ly/3hQCaFs

µTransfer for enormous NNs
Microsoft unveils how to tune enormous neural networks
Highlights:
✅ New hyperparameter (HP) tuning: µTransfer
✅ Zero-shot HP transfer to the full-size model
✅ Outperforming BERT-large
✅ Outperforming 6.7B GPT-3
✅ Code under MIT license
More: https://bit.ly/3qc37Ij
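
A rough sketch of the zero-shot idea, under stated assumptions: the base learning rate is tuned once on a narrow proxy model and copied unchanged to the wide model, while the parametrization derives a per-layer effective rate. The `1 / width multiplier` rule for hidden weight matrices under Adam is an assumption of this sketch, the function names are hypothetical, and the proxy losses are made-up placeholders, not results.

```python
def pick_best_lr(proxy_losses):
    """Return the base learning rate with the lowest loss on the small proxy model."""
    return min(proxy_losses, key=proxy_losses.get)


def effective_hidden_lr(base_lr, base_width, target_width):
    """Effective Adam learning rate for hidden weight matrices.

    Assumption of this sketch: the rate shrinks by the width multiplier
    (target_width / base_width), so the tuned base_lr itself stays fixed.
    """
    if base_width <= 0 or target_width <= 0:
        raise ValueError("widths must be positive")
    return base_lr * base_width / target_width


# Hypothetical sweep on a narrow proxy model (placeholder numbers).
proxy_losses = {1e-4: 2.91, 1e-3: 2.40, 1e-2: 2.75}
best_lr = pick_best_lr(proxy_losses)

# The same base_lr is reused for the wide model; only the derived rate changes.
hidden_lr_large = effective_hidden_lr(best_lr, base_width=256, target_width=4096)
print(best_lr, hidden_lr_large)
```

The point of the design is that the expensive sweep happens only on the small model; the large model is trained once with the copied value.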

Semantic segmentation via text supervision only
GroupViT trained with a text encoder on a large-scale image-text dataset: semantic segmentation without any pixel-level annotations in training!
Highlights:
✅ Hierarchical Grouping Vision Transformer
✅ Additional text encoder
✅ NO pixel-level annotations
✅ Zero-shot transfer to semantic segmentation
✅ Source code available soon
More: https://bit.ly/3hPGeWr
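
A minimal sketch of how zero-shot labeling can work once segment and text embeddings share one space: each segment takes the class whose text embedding is most similar. The embeddings and the helper names below are hypothetical toy values, not GroupViT outputs or its API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def label_segments(segment_embeddings, class_text_embeddings):
    """Assign each segment the class name with the highest cosine similarity."""
    labels = []
    for seg in segment_embeddings:
        best = max(
            class_text_embeddings,
            key=lambda name: cosine(seg, class_text_embeddings[name]),
        )
        labels.append(best)
    return labels


# Toy 3-D embeddings (hypothetical); real ones come from trained encoders.
text = {"cat": [1.0, 0.1, 0.0], "grass": [0.0, 1.0, 0.2], "sky": [0.0, 0.1, 1.0]}
segments = [[0.9, 0.2, 0.1], [0.1, 0.1, 0.8]]
labels = label_segments(segments, text)
print(labels)
```

No pixel-level labels appear anywhere: the class list is just text, so new classes can be added by embedding new names.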

4D-Net: LiDAR + RGB synchronization
Google unveils 4D-Net to combine 3D LiDAR and onboard RGB cameras
Highlights:
✅ Point clouds and images in time
✅ Fusing multiple modalities in 4D
✅ Novel sampling for 3D point clouds in time
✅ New SOTA for 3D detection
More: https://bit.ly/3hZCFwN

New SOTA in video synthesis!
Snap unveils a novel multimodal video generation framework driven by text and images
Highlights:
✅ Multimodal video generation
✅ Bidirectional transformer
✅ Video tokens trained with self-learning
✅ Text augmentation for robustness
✅ Longer sequence synthesis
More: https://bit.ly/3hZLXsG

StyleNeRF source code is out
3D-consistent photo-realistic image synthesis
Highlights:
✅ NeRF + style generator
✅ 3D consistency for HD images
✅ Novel regularization loss
✅ Camera control on styles
More: https://bit.ly/3t5xC49

CLD-based generative #AI by #Nvidia
Nvidia unveils a novel critically-damped Langevin diffusion (CLD) for synthetic data
Highlights:
✅ A novel diffusion process for score-based generative models (SGMs)
✅ Novel score matching objective for CLD
✅ Hybrid denoising score matching
✅ Efficient sampling from CLD models
✅ Source code under a specific license
More: https://bit.ly/35MToBe
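
A small sketch of the forward dynamics such a diffusion might follow, assuming the usual form where the data variable x is coupled to an auxiliary velocity v, noise enters only through v, and the friction satisfies the critical-damping condition gamma^2 = 4 * mass. The Euler-Maruyama discretization, the zero initial velocity, the parameter values, and the function names are all illustrative assumptions, not Nvidia's released code.

```python
import math
import random


def critical_gamma(mass):
    """Friction coefficient at critical damping: gamma**2 == 4 * mass."""
    return 2.0 * math.sqrt(mass)


def simulate_cld(x0, steps=1000, dt=0.01, beta=4.0, mass=0.25,
                 noise_scale=1.0, seed=0):
    """Euler-Maruyama simulation of a 1-D position/velocity diffusion.

    dx = (v / mass) * beta * dt
    dv = (-x - gamma * v / mass) * beta * dt + sqrt(2 * gamma * beta) * dW
    """
    gamma = critical_gamma(mass)
    rng = random.Random(seed)
    x, v = x0, 0.0  # assumption: start with zero velocity for simplicity
    for _ in range(steps):
        noise = noise_scale * rng.gauss(0.0, 1.0)
        dx = (v / mass) * beta * dt
        dv = (-x - gamma * v / mass) * beta * dt
        dv += math.sqrt(2.0 * gamma * beta * dt) * noise
        x, v = x + dx, v + dv
    return x, v


print(simulate_cld(3.0))                   # noisy end state
print(simulate_cld(3.0, noise_scale=0.0))  # noise off: decays toward (0, 0)
```

With the noise switched off, the starting point relaxes to the origin without oscillating, which is what "critically damped" refers to.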

UFO: segmentation @140+ FPS
Unified Transformer Framework for Co-Segmentation, Co-Saliency & Salient Object Detection. All in one!
Highlights:
✅ Unified framework for co-segmentation tasks
✅ Co-segmentation, co-saliency, salient object detection
✅ Block for long-range dependencies
✅ Reaches 140 FPS in inference
✅ New SOTA on multiple datasets
✅ Source code under MIT License
More: https://bit.ly/3KLd9b9

Multi-GANs fashion
Global GAN blended with other GANs for faces, shoes, etc.
Highlights:
✅ Multi-GAN framework
✅ Several generators
✅ Free of artifacts
✅ Full-body generation
✅ Humans, 1024x1024
More: https://bit.ly/37mfOte

FLAG: #3D Avatar Generation
A flow-based generative model of the 3D human body from sparse observations.
Highlights:
✅ FLow-based Avatar Generative model (FLAG)
✅ Conditional distribution of body pose
✅ Exact pose likelihood
✅ Invertibility -> oracle latent code
More: https://bit.ly/3CQpk3p
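
To make "exact likelihood" and "invertibility" concrete, here is a toy conditional normalizing flow in one dimension. It is a generic affine flow, not FLAG's architecture: the linear conditioning functions and their weights are hypothetical, and `cond` stands in for the sparse observation.

```python
import math


def _shift_and_log_scale(cond, shift_w=1.0, scale_w=0.5):
    """Hypothetical conditioning: both terms are linear in the observation."""
    return shift_w * cond, scale_w * cond


def to_latent(x, cond):
    """Forward pass: map a pose value x to its latent code z."""
    shift, log_scale = _shift_and_log_scale(cond)
    return (x - shift) * math.exp(-log_scale)


def from_latent(z, cond):
    """Inverse pass: map a latent code z back to a pose value."""
    shift, log_scale = _shift_and_log_scale(cond)
    return z * math.exp(log_scale) + shift


def log_prob(x, cond):
    """Exact log p(x | cond) by change of variables with a standard normal base."""
    _, log_scale = _shift_and_log_scale(cond)
    z = to_latent(x, cond)
    log_base = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    return log_base - log_scale  # log|dz/dx| = -log_scale


x_true, cond = 2.3, 0.8
z_oracle = to_latent(x_true, cond)       # latent code of a known pose
x_back = from_latent(z_oracle, cond)     # exact reconstruction
print(z_oracle, x_back, log_prob(x_true, cond))
```

Because the map is invertible, a ground-truth pose can be pushed through it to read off the latent code that would reproduce it, which is the sense of "oracle latent code" here.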

Dancing in the wild with StyleGAN
StyleGAN-based animations for AR/VR apps
Highlights:
✅ Video-based motion retargeting
✅ StyleGAN-based architecture
✅ Novel explicit motion representation
✅ SOTA qualitatively & quantitatively
More: https://bit.ly/3CZbL1W

TensoRF: the 4D evolution of NeRF
TensoRF, a novel radiance field representation via a 4D tensor: a 3D voxel grid with per-voxel multi-channel features.
Highlights:
✅ Vector-matrix (VM) decomposition technique
✅ Low-rank tensor factorization
✅ Lower memory footprint, higher speed
✅ TensoRF is the new SOTA in radiance fields
✅ Code under the MIT License
More: https://bit.ly/3qffZgI
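
A back-of-the-envelope sketch of why a vector-matrix factorization saves memory. It assumes each rank component pairs one vector along an axis with one matrix over the other two axes, for all three axes, plus a small basis that maps components to feature channels. The grid size, rank, and channel count are illustrative, and the function names are hypothetical.

```python
def dense_params(n, channels):
    """Parameters in a dense n x n x n grid with `channels` features per voxel."""
    return n ** 3 * channels


def vm_params(n, rank, channels):
    """Parameters in a VM factorization: 3 axes x rank x (vector + matrix),
    plus an assumed (3 * rank) x channels feature basis."""
    return 3 * rank * (n + n * n) + 3 * rank * channels


def vm_value(i, j, k, vx, m_yz, vy, m_xz, vz, m_xy):
    """Scalar value at voxel (i, j, k) reconstructed from the VM factors."""
    return sum(
        vx[r][i] * m_yz[r][j][k]
        + vy[r][j] * m_xz[r][i][k]
        + vz[r][k] * m_xy[r][i][j]
        for r in range(len(vx))
    )


print(dense_params(300, 27))   # 729000000
print(vm_params(300, 16, 27))  # 4335696

# Tiny rank-1 example on a 2 x 2 x 2 grid.
vx, m_yz = [[1, 2]], [[[1, 0], [0, 1]]]
vy, m_xz = [[1, 1]], [[[0, 1], [1, 0]]]
vz, m_xy = [[2, 3]], [[[1, 1], [1, 1]]]
print(vm_value(1, 0, 0, vx, m_yz, vy, m_xz, vz, m_xy))  # 5
```

The dense count grows with n cubed while the factorized count grows with n squared, which is where the memory saving comes from.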

GAN-meshes without key-points
ETH unveils a GAN framework for generating textured triangle meshes without annotations
Highlights:
✅ Generative model of textured meshes
✅ 3D generator for all categories
✅ 3D pose estimation framework
✅ Code licensed under MIT License
More: https://bit.ly/3qfH9nJ

Self-supervised Latent Image Animator
Self-supervised autoencoder to animate unseen images by linear navigation in latent space
Highlights:
✅ Latent Image Animator (LIA)
✅ Linear displacement in latent space
✅ SOTA: VoxCeleb, Taichi, TED-talk
✅ Source code available soon
More: https://bit.ly/36pgLAC
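
"Linear navigation" can be pictured as adding a weighted sum of direction vectors to a source latent code. The sketch below assumes a small set of fixed directions and per-frame magnitudes; in a trained model both would be learned or predicted, and the names and values here are hypothetical.

```python
def navigate(z_source, directions, magnitudes):
    """Return z_source + sum_i magnitudes[i] * directions[i]."""
    if len(directions) != len(magnitudes):
        raise ValueError("need one magnitude per direction")
    return [
        z + sum(a * d[idx] for a, d in zip(magnitudes, directions))
        for idx, z in enumerate(z_source)
    ]


# Toy 3-D latent space with two motion directions (hypothetical values).
z_source = [0.0, 0.0, 0.0]
directions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# One set of magnitudes per target frame gives one latent code per frame.
frame_magnitudes = [[0.5, -2.0], [1.0, -1.0]]
frames = [navigate(z_source, directions, a) for a in frame_magnitudes]
print(frames)
```

Each resulting code would then be decoded into a frame; the decoder is outside the scope of this sketch.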

Google URF for neural synthesis
Sequence of RGB + LiDAR -> 3D surfaces and novel synthesized RGB images
Highlights:
✅ Extending Neural Radiance Fields
✅ Leveraging asynchronous LiDAR data
✅ Addressing exposure variation
✅ Leveraging segmentations for sky
✅ SOTA #3D reconstruction and synthesis
More: https://bit.ly/3L2vTDb

AV2: next-gen self-driving
One of the biggest datasets ever for #autonomousdriving
Highlights:
✅ 1k sequences of multimodal data
✅ 3D annotations, 26 categories
✅ 20k LiDAR & map-aligned pose
✅ 250k challenging interactions
✅ HD map: 3D lanes & crosswalks
✅ CC BY-NC-SA 4.0 license
More: https://bit.ly/3trx3lw

CaTGrasp in Clutter from Simulation
Task-relevant grasping: trained solely in simulation with synthetic + self-supervised hand-object interaction
Highlights:
✅ Novel category-level, task-relevant grasping
✅ Self-supervised hand-object contact
✅ Tiny objects from dense clutter
✅ Trained in simulation -> transferred to real
✅ Source code under Apache 2.0
More: https://bit.ly/3L2YVCo

Drive & Segment without Supervision
Learning pixel-wise semantic segmentation from non-curated data collected by cars (cameras + LiDAR) driving around a city
Highlights:
✅ Cross-modal unsupervised learning
✅ Synchronized LiDAR & RGB
✅ Object proposals on LiDAR points
✅ SOTA, significant improvements
More: https://bit.ly/3L0wWTW

NeRF-free Neural Rendering
A simple 2D-only method with a single pass of a neural network
Highlights:
✅ Synthesis with NO 3D reasoning
✅ Autoregressive & masked transformer
✅ Pose -> object, object -> pose
✅ Attention: branching attention
✅ Source code under MIT License
More: https://bit.ly/3JC7unt