⏏️Ensembling models for GAN training⏏️
👉Pretrained vision models improve GAN training: FID gets better by 1.5 to 2×!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅CV models as ensemble of discriminators
✅Improving GAN in limited / large-scale set
✅Training on 10k samples matches StyleGAN2 trained w/ 1.6M
✅Source code / models under MIT license
More: https://bit.ly/3wgUVsr
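To make the "ensemble of discriminators" idea concrete, here is a minimal sketch of how a generator could be scored by several discriminators at once. It assumes each discriminator returns one logit per sample and that the per-discriminator losses are simply summed with the standard non-saturating GAN loss; the function names are hypothetical and this is not the paper's code.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def ensemble_generator_loss(fake_logits):
    """Non-saturating generator loss summed over all discriminators.

    fake_logits: one logit per discriminator for the same fake sample
    (assumption: higher logit means "looks real").
    """
    return sum(softplus(-logit) for logit in fake_logits)


def ensemble_discriminator_loss(real_logits, fake_logits):
    """Discriminator loss summed over the ensemble.

    Each discriminator is pushed to score reals high and fakes low.
    """
    return sum(
        softplus(-real) + softplus(fake)
        for real, fake in zip(real_logits, fake_logits)
    )
```

In a real setup each logit would come from a small trainable head on top of a frozen pretrained vision backbone; here the logits are plain floats so the loss arithmetic stays visible.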
🤯6🔥2
🤯Cooperative Driving + AUTOCASTSIM🤯
👉COOPERNAUT: cross-vehicle perception for vision-based cooperative driving
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅UTexas + #Stanford + #Sony #AI
✅LiDAR encoded into compact point-based representations
✅Network-augmented simulator
✅Source code and models available
More: https://bit.ly/3sr5HLk
🔥6🤯3🥰1
💄NeuralHDHair: 3D Neural Hair💄
👉NeuralHDHair: fully automatic system for modeling HD hair from a single image
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅IRHairNet for hair geometric features
✅GrowingNet: 3D hair strands in parallel
✅VIFu: novel voxel-aligned implicit function
✅SOTA in 3D hair modeling from single pic
More: https://bit.ly/38iR0mQ
👍5🥰3❤1
🐡DyNeRF: Neural 3D Video Synthesis🐡
👉#Meta unveils DyNeRF, a novel approach to rendering HQ 3D video
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel NeRF based on temporal latent codes
✅Novel hierarchical training scheme
✅Datasets of time-synch/calibrated clips
✅Attribution-NonCommercial 4.0 Int.
More: https://bit.ly/3MlBRA9
🤯8👍2🔥1🤩1
🍋GATO: agent for multiple tasks🍋
👉The same network with the same weights can play Atari, caption pics, chat, and more🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅General-purpose agent, multiple tasks
✅Multi-modal-task, multi-embodiment
✅Inspired by large-scale language model
More: https://bit.ly/3LbBOWb
🤯10❤3👍2🔥2
🪐NeRF powered by keypoints🪐
👉ETHZ + META unveil how to encode relative spatial #3D info via sparse 3D keypoints
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Sparse 3D keypoints for SOTA avatars
✅Unseen subjects from 2/3 views
✅Never-before-seen iPhone captures
More: https://bit.ly/39NQqhe
🤯5🔥2❤1👍1
🐌Self-Supervised human co-evolution🐌
👉Self-supervised 3D by co-evolution of pose estimator, imitator, and hallucinator
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel self-supervised 3D pose
✅Co-evo of pose, imitator, hallucinator
✅Realistic 3D pose and 2D-3D supervision
✅Source code / model under MIT license
More: https://bit.ly/37J5ImL
🔥4👍3❤1🤯1
🐲 Diff-SDF #3D Rendering 🐲
👉Reconstruction with no complex reg. or priors, using only a per-pixel RGB loss
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Diff-render to optimize geometry/albedo
✅No ad-hoc object mask or supervision
✅Extended sphere tracing algorithm
More: https://bit.ly/3yKWPnI
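For readers new to sphere tracing, here is a minimal sketch of the plain (non-extended) algorithm: march along a ray and step forward by the signed distance at each point, which can never overshoot the surface. The scene (a single sphere), the function names, and the step/tolerance defaults are assumptions for illustration; the paper's extended, differentiable variant is not reproduced here.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius


def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4, max_dist=100.0):
    """Return the ray distance t to the first surface hit, or None on a miss.

    direction is assumed to be a unit vector.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t  # close enough to the surface
        t += dist  # safe step: nothing is closer than dist
        if t > max_dist:
            break  # ray left the scene
    return None


hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = sphere_trace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
```

With the camera at the origin looking down +z, the ray reaches the sphere after two steps at t = 2.0, while the ray along +y never gets close and returns None.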
🤯10👍4🔥2❤1🤩1
👄LVD: new SOTA for #3D human👄
👉Corona et al. unveil a novel approach to 3D human model fitting
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Solution via neural field
✅Not sensitive to initialization
✅SOTA in shape from single pic
✅SOTA in fitting 3D scans
More: https://bit.ly/3Ng4lLr
👍4🔥2🤯1
🏳️🌈Deep Clustering on ImageNet & Co.🏳️🌈
👉World's first deep nonparametric clustering on a large dataset such as ImageNet
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Deep clustering that infers nr. of clusters
✅Loss: amortized inference in mixt-models
✅Deep nonparametric clustering on ImageNet
✅Code and model available under MIT license
More: https://bit.ly/38p62rn
🔥9🤯3👍2🤩2
💥HQ-E²FGVI just released💥💥
👉Flow-Guided Video Inpainting through three trainable modules
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Flow, pixel-prop, content hallucination
✅Three stage-modules, jointly optimized
✅The new SOTA, promising efficiency
✅Code and Models under MIT license
More: https://bit.ly/3Ln0ICj
🤯10👍1😱1
🪔 AvatarCLIP: Text-Driven Avatar 🪔
👉Zero-shot text-driven generation of #3D avatars for the #metaverse
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅First text-driven synthesis
✅Shape, texture, and motion
✅Animation-ready, HQ texture/geometry
✅Zero-shot text-guided ref-based motion
✅Code and model under MIT license
More: https://bit.ly/3LjTWgB
🔥4👍2🤯2❤1
🔥#AIwithPapers: we are 2,500!🔥
💙💛Only 2 Billion papers remaining on arXiv. The more we are, the faster we read💙💛
😈 Invite your friends -> https://t.me/AI_DeepLearning
🔥9❤4👍2🤔2👏1
💥Podcasting AI & CV💥
👉🏼For people fluent in Italian: a 1-hour podcast in which I talk about AI, CV, startups and more (including this wonderful project).
More: https://bit.ly/38DtBwB
👏6❤3👍1
🔥Inpainting: new SOTA! INSANE🔥
👉Novel two-stream approach: inpainting at the next level!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅High-freq locally, low-freq globally
✅Local to global -> error correction
✅44% / 26% improvements FID/scores
✅Source code, more clips available
More: https://bit.ly/3ltIX9R
👍8🤯3🔥1🥰1
🔥Super-Human Crossword Solver🔥
👉Solves crosswords, outperforming the best humans
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Crossword solving based on NNs
✅Q&A, structured decoding, local search
✅Wide domains with perfect accuracy
✅Large question-answer dataset
More: https://bit.ly/3a3zzqQ
🔥4🤯3👏2👍1
🥸Imagen: far beyond DALL·E 2🥸
👉#Google: unprecedented photorealism and deep level of language understanding
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic thresh diffusion sampling
✅Efficient U-Net, efficient++ variant
✅DrawBench, new text-to-image benchmark
✅The new SOTA, COCO FID of 7.27
More: https://bit.ly/3lVtkbz
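Dynamic thresholding is easy to show in a few lines. At each sampling step, take a high percentile s of the absolute values in the predicted clean image; if s exceeds 1, clip the prediction to [-s, s] and divide by s so it lands back in [-1, 1]. The sketch below works on a flat list of pixel values; the function name, the 0.995 default, and the nearest-rank percentile are assumptions, not values taken from the post.

```python
def dynamic_threshold(x0, percentile=0.995):
    """Clip a predicted image to [-s, s] and rescale it into [-1, 1].

    x0: flat list of predicted pixel values (nominal range [-1, 1]).
    s is a percentile of |x0|, floored at 1 so in-range predictions
    pass through unchanged.
    """
    abs_sorted = sorted(abs(v) for v in x0)
    # Nearest-rank percentile (a simplification of interpolated quantiles).
    idx = min(len(abs_sorted) - 1, int(percentile * len(abs_sorted)))
    s = max(1.0, abs_sorted[idx])
    return [max(-s, min(s, v)) / s for v in x0]


in_range = dynamic_threshold([0.2, -0.3])
saturated = dynamic_threshold([0.5, -2.0, 4.0, 1.0], percentile=0.5)
```

Compared with static clipping to [-1, 1], this keeps the relative contrast of the pixels that are within the threshold instead of flattening everything that overshoots.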
🔥9🤯6👍1
🪤Tracking over SOTA detectors🪤
👉Lightweight Python lib for real-time 2D object tracking 💥
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Layer of tracking over SOTA detectors
✅Suitable for complex video processing
✅Source code under BSD 3-Clause
✅Maintained by Tryolabs team
More: https://bit.ly/3wKtGqg
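The "layer of tracking over detectors" idea can be sketched as follows: the detector outputs points per frame, and a thin tracker assigns stable IDs by matching each detection to the nearest track from the previous frame. This is a generic greedy centroid tracker written for illustration; the class name, parameters, and matching rule are hypothetical and are not the library's API.

```python
import math
from itertools import count


class CentroidTracker:
    """Greedy nearest-centroid tracker (illustrative, not a library API)."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # max pixel jump between frames
        self.tracks = {}  # track id -> last known (x, y)
        self._ids = count(1)

    def update(self, detections):
        """Match (x, y) detections to tracks; return [(track_id, (x, y)), ...]."""
        unmatched = dict(self.tracks)
        new_tracks = {}
        results = []
        for det in detections:
            best_id, best_d = None, self.max_distance
            for tid, pos in unmatched.items():
                d = math.dist(det, pos)
                if d <= best_d:
                    best_id, best_d = tid, d
            if best_id is None:
                best_id = next(self._ids)  # nothing close enough: new track
            else:
                del unmatched[best_id]  # each track is matched at most once
            new_tracks[best_id] = det
            results.append((best_id, det))
        # Simplification: tracks with no match this frame are dropped.
        self.tracks = new_tracks
        return results


tracker = CentroidTracker()
frame1 = tracker.update([(0, 0), (100, 100)])
frame2 = tracker.update([(102, 101), (3, 4)])
frame3 = tracker.update([(500, 500)])
```

The IDs follow the objects even though the detections arrive in a different order in the second frame. A production tracker would add motion prediction and keep unmatched tracks alive for a few frames.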
👍7🔥3🤩3
🥷🏿 FCA: #3D Neural Camouflage 🥷🏿
👉#3D full-camouflage adversarial patch to fool neural detectors
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Attack by diff-neural render
✅E2E physical adversarial attack
✅Envs, vehicles & detectors
✅Source code available!
More: https://bit.ly/38kKyfa
👍5🔥3🤯2👏1