♟️Neural RGB-D Reconstruction♟️
👉Novel approach for #3D mixing implicit surface representations with NeRFs
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RGB-D based reconstruction
✅Leveraging color & depth
✅Depth into the NeRF
✅Pose & camera refinement
More: https://bit.ly/3iN6e54
🔥5👍2🤯2🤩1
🦓 Hyper-Fast Refinement 🦓
👉SharpContour: novel contour-based boundary refinement for instance segmentation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Instance-aware Point Classifier
✅Deforming by discrete updating
✅Estimating offsets independently
✅Source code soon available!
More: https://bit.ly/3qL04GY
👍5🔥4🤯1😱1
🥗 Neural Mesh via Text only 🥗
👉Zero-shot generation of 3D model using only a target text prompt
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ZS 3D model with text only
✅ZS text-guided generation
✅Meshes with texture/normal
✅Differentiable LLS implementation
More: https://bit.ly/3u0qnvb
🤯8👍1🔥1
🪆#3D, Materials, and Lighting from 2D🪆
👉Nvidia: topology, materials & map lighting jointly from 2D. INSANE 😮
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Topology, materials and lighting
✅Meshes with materials/lighting
✅Compact volumetric texturing
✅Differentiable all-frequency lighting
✅Code under #NVIDIA License
More: https://bit.ly/3IUoF2t
👏5👍1🤯1😱1
🍜Ref-NeRF for extreme realism🍜
👉Ref-NeRF: reflected radiance & structures via collection of spatially-varying scene properties
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Realism and accuracy
✅Replacing NeRF’s params
✅Regularization of volume density
✅Integrated Directional Encoding
More: https://bit.ly/3tTlS5l
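A minimal sketch of the idea behind "Replacing NeRF's params", assuming the Ref-NeRF formulation in which outgoing radiance is conditioned on the view direction reflected about the surface normal instead of on the view direction itself. The function name and the plain-list vectors are illustrative only, not taken from the released code.

```python
def reflect(view_dir, normal):
    """Reflect a direction about a surface normal.

    view_dir: vector pointing from the surface point toward the camera.
    normal:   unit surface normal at that point.
    Returns 2 * (view_dir . normal) * normal - view_dir.
    """
    dot = sum(v * n for v, n in zip(view_dir, normal))
    return [2.0 * dot * n - v for v, n in zip(view_dir, normal)]


# A ray leaving along +x/+z bounces to -x/+z on a surface facing +z.
print(reflect([1.0, 0.0, 1.0], [0.0, 0.0, 1.0]))
```

Looking straight down the normal leaves the direction unchanged, which is a quick sanity check for the sign convention.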
👍4🤯2🔥1
🦧OFA for all: Cross, Vision, Language🦧
👉Unified multimodal model for image generation, visual grounding, etc.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Sequence-to-sequence learning
✅Image Captioning / Generation
✅Visual Grounding / Classification
✅Text-to-Image Generation
✅Visual Question Answering
More: https://bit.ly/3wSTGlc
👍7🤯6👏1
🍿Old Films Back to Life with #AI🍿
👉Recurrent transformer network (RTN) to restore heavily degraded old films
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Transformer blocks for spatial restoration
✅Knowledge from adjacent frames
✅Color from keyframes to whole clip
✅Source code available in days!
More: https://bit.ly/3wZbV8y
❤12👍2🤯1
🍊Neural Head #Avatars from RGB🍊
👉Novel neural representation for animatable head avatar
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel articulated human head
✅Full-geometry reconstruction
✅Differentiable optimization pipeline
✅Disentanglement of shape/color
More: https://bit.ly/3DxUGMI
🔥3🤯2😱1
🌶️ MyStyle: personal generative #AI 🌶️
👉Personalized deep generation with a few shots of a person
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Small set of portraits (∼100)
✅Local, low-dim, personal manifold
✅Personal #AI for ill-posed tasks
✅SOTA vs. previous few-shot methods
More: https://bit.ly/3wWMwMu
🔥5👍4🤯1
🦆 GAN + Dense Map 🦆
👉CoordGAN: structure-texture disentangled GAN with dense correspondence map
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel coordinate space
✅Warping to learn coordinates
✅Encoder for structure representation
✅HQ structure/texture editable images
More: https://bit.ly/3DOlOaB
🤯4❤2🔥2👏1
⚓Unified shape & non-rigid motion⚓
👉CaDeX: SOTA in both shape & non-rigid motion
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Canonical Deformation Coordinate Space
✅Shape + non-rigid motion representation
✅Factorization of def-homeomorphisms
✅Cycle consistency, topology & volume
✅SOTA in modelling deformable objects
More: https://bit.ly/3NM5NX1
❤4🤯1😱1
📸 ~6 BILLION CLIP-filtered pairs 📸
👉A dataset 14x bigger than the previously biggest openly accessible image-text dataset in the world.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅2.3B English image-text pairs
✅2.2B from 100+ other languages
✅1.3B language not detected
✅KNN index for quick search
More: https://bit.ly/3LFhKvT
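A toy illustration of what "KNN index for quick search" means in practice: rank stored embeddings by cosine similarity to a query embedding and return the closest ones. This is an exhaustive search over plain Python lists, used here only as a stand-in; at billions of pairs a real index would rely on an approximate nearest-neighbour library. The function names and the tiny 2-D vectors are assumptions made for the example.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(c * c for c in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [c / norm for c in vec]


def knn_search(query, embeddings, k=2):
    """Return the indices of the k embeddings most similar to the query."""
    q = normalize(query)
    scored = []
    for idx, emb in enumerate(embeddings):
        e = normalize(emb)
        similarity = sum(a * b for a, b in zip(q, e))
        scored.append((similarity, idx))
    # Highest similarity first; ties broken by lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Hypothetical embeddings: items 0 and 2 point roughly the same way as the query.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(knn_search([1.0, 0.0], embeddings, k=2))
```

In a text-to-image search setting the query would be a text embedding and the stored vectors image embeddings from the same model, so the nearest neighbours are the best-matching images.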
❤3🤯1
🥮 PP-YOLOE: e-version of YOLO 🥮
👉 SOTA object detector up to 149+ FPS!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Optimized PP-YOLOv2
✅S/M/L/XL for different scenarios
✅149+ FPS, with TensorRT & FP16
✅Source code & models available
More: https://bit.ly/3x454uy
🔥5👍3👏1🤯1🤩1
🧙 HD synthesis with LDM 🧙
👉Low-cost DM via latent space of powerful pretrained autoencoders
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hi-res synthesis of megapixel images
✅Synthesis, inpainting, stochastic SR
✅Large, consistent images of ∼1024px
✅General conditioning via cross-attention
✅Code licensed under MIT License
More: https://bit.ly/3LIVOzS
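A rough sketch of "General conditioning via cross-attention": queries come from the image latents, while keys and values come from the conditioning tokens (for example an encoded text prompt), so each latent position mixes in information from the condition. Learned projection matrices and multiple heads are deliberately left out, and all names and shapes are assumptions for illustration, not the LDM code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention with separate query and key/value sources.

    queries: one vector per latent position.
    keys, values: one vector each per conditioning token.
    Returns one output vector per latent position.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Two conditioning tokens; the second latent strongly matches the first token.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
print(cross_attention([[0.0, 0.0], [10.0, 0.0]], keys, values))
```

A latent with no preference averages the conditioning tokens evenly, while one aligned with a token copies almost all of that token's value.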
🔥6👍3🤩1
🎩 SinNeRF: Single Image NeRF 🎩
👉Neural Radiance Field via single view only
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅UT Austin + UIUC + UOregon + Picsart AI
✅"Looking only once" approach
✅Semi-supervised learning process
✅Geometry/semantic pseudo-labels
✅SOTA in novel-view synthesis
More: https://bit.ly/3ujMZqF
👍7🔥2👏1🤯1
🔥 Transformer-based Tracking 🔥
👉Tracker via Transformer-based model prediction module
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Tracking by Transformer prediction
✅Extending model predictor for BBs
✅SOTA on three public benchmarks
✅Code/models under GNU License 3.0
More: https://bit.ly/3ucYvUI
🔥9🤯2🥰1
👗 In-The-Wild Virtual Try-On 👗
👉StyleGAN-based architecture for appearance flow estimation in VTON application
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Global appearance flow estimation
✅Robust to person/garment misalignment
✅"In-the-wild": person with natural poses
✅Code under CC BY-NC-SA 4.0 license
More: https://bit.ly/3LPR9wl
👏6❤3🔥1🤔1🤯1
🎇DALL·E 2 just announced!🎇
👉DALL·E 2 to create realistic images and art from natural language
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅More realistic/accurate, 4x res.
✅Better caption matching
✅Not available yet, waiting list!
More: https://bit.ly/3j9v3bR
🔥12🤯5👍2🤩1
👋Forecasting interactions via attention👋
👉Predicting the hand motion trajectory and the future contact points on the next active object
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Object-Centric Transformer (OCT)
✅Self-attention Transformer mechanism
✅Framework to handle uncertainty
✅SOTA on Epic-Kitchens and EGTEA
More: https://bit.ly/3v3PpbI
👍4🔥2👏1🤔1🤯1
🍇SmeLU: Smooth Activation Function🍇
👉Google unveils a new smooth activation function: easy to implement, cheap & less error-prone
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Smooth to mitigate irreproducibility
✅Cheap function, better than GELU/Swish
✅0-1 slope through quadratic middle region
✅SmeLU as convolution of ReLU with box
✅Best reproducibility-accuracy tradeoff
More: https://bit.ly/3xcskXm
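The highlights above translate into a short piecewise function. A minimal sketch, assuming the usual SmeLU definition with half-width `beta`: zero below `-beta`, identity above `beta`, and a quadratic `(x + beta)^2 / (4 * beta)` in between, which is what you get by convolving ReLU with a box of width `2 * beta`. The scalar pure-Python form and the function names are choices made for this example.

```python
def smelu(x, beta=1.0):
    """Smooth ReLU: flat, then quadratic, then linear."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return float(x)
    # Quadratic middle region joining the two linear pieces smoothly.
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_grad(x, beta=1.0):
    """Derivative of smelu: rises linearly from 0 to 1 across [-beta, beta]."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)


for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, smelu(x), smelu_grad(x))
```

Both the value and the slope match at `x = -beta` and `x = beta`, so the gradient has no jump, unlike ReLU at zero.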
😱8👍4❤1🔥1😁1🤯1