LVD: new SOTA for #3D human
Corona et al. unveil a novel approach to 3D human model fitting
Highlights:
- Solution via neural field
- Not sensitive to initialization
- SOTA in shape from a single image
- SOTA in fitting 3D scans
More: https://bit.ly/3Ng4lLr

Deep Clustering on ImageNet & Co.
World's first deep nonparametric clustering on a large dataset such as ImageNet
Highlights:
- Deep clustering that infers the number of clusters
- Loss: amortized inference in mixture models
- Deep nonparametric clustering on ImageNet
- Code and model available under MIT license
More: https://bit.ly/38p62rn

HQ-E²FGVI just released
Flow-guided video inpainting through three trainable modules
Highlights:
- Flow, pixel propagation, content hallucination
- Three stage-modules, jointly optimized
- The new SOTA, promising efficiency
- Code and models under MIT license
More: https://bit.ly/3Ln0ICj

AvatarCLIP: Text-Driven Avatar
Zero-shot text-driven generation of #3D avatars in the #metaverse
Highlights:
- First text-driven synthesis
- Shape, texture, and motion
- Animation-ready, HQ texture/geometry
- Zero-shot text-guided ref-based motion
- Code and model under MIT license
More: https://bit.ly/3LjTWgB

#AIwithPapers: we are 2,500!
Only 2 billion papers remaining on arXiv. The more we are, the faster we read
Invite your friends -> https://t.me/AI_DeepLearning

Podcasting AI & CV
For people fluent in Italian: a 1-hour podcast in which I talk about AI, CV, startups and more (including this wonderful project).
More: https://bit.ly/38DtBwB

Inpainting: new SOTA! INSANE
Novel two-stream approach: inpainting at the next level!
Highlights:
- High-freq locally, low-freq globally
- Local to global -> error correction
- 44% / 26% improvements FID/scores
- Source code, more clips available
More: https://bit.ly/3ltIX9R

Super-Human Crossword Solver
Solving crosswords, outperforming the best humans
Highlights:
- Crossword solving based on NNs
- Q&A, structured decoding, local search
- Wide domains with perfect accuracy
- Large question-answer dataset
More: https://bit.ly/3a3zzqQ

Imagen: far beyond DALL·E 2
#Google: unprecedented photorealism and a deep level of language understanding
Highlights:
- Dynamic thresholding for diffusion sampling
- Efficient U-Net, efficient++ variant
- DrawBench, new text-to-image benchmark
- The new SOTA, COCO FID of 7.27
More: https://bit.ly/3lVtkbz
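
For the curious, the dynamic thresholding highlight can be sketched in a few lines. This is only a rough illustration, assuming the predicted clean image is a flat list of pixel values nominally in [-1, 1]; the names `percentile` and `dynamic_threshold` and the default `q=99.5` are hypothetical choices for this sketch, not the authors' code.

```python
def percentile(values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def dynamic_threshold(x0_pred, q=99.5):
    """Clip the predicted clean image to [-s, s] and rescale by s.

    s is a high percentile of the absolute pixel values, floored at 1.0,
    so predictions already inside [-1, 1] pass through unchanged.
    """
    s = max(percentile([abs(v) for v in x0_pred], q), 1.0)
    return [max(-s, min(s, v)) / s for v in x0_pred]
```

In a sampler this would be applied to the predicted clean image at every step, which keeps saturated pixels in check when the guidance weight is high.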

Tracking over SOTA detectors
Lightweight Python lib for real-time 2D object tracking
Highlights:
- Layer of tracking over SOTA detectors
- Suitable for complex video processing
- Source code under BSD 3-Clause
- Maintained by Tryolabs team
More: https://bit.ly/3wKtGqg
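
To give an idea of what a "layer of tracking over a detector" means, here is a toy sketch. It is not the library's API: `CentroidTracker`, `max_distance` and the greedy nearest-point matching are all assumptions made for illustration, and the detections are assumed to be (x, y) points produced by any detector.

```python
import math


class CentroidTracker:
    """Toy tracker: greedily matches detections to the nearest existing track."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # matches farther than this are rejected
        self.tracks = {}                  # track id -> last known (x, y)
        self.next_id = 0

    def update(self, detections):
        """Assign a track id to each (x, y) detection of the current frame."""
        # All candidate (distance, track id, detection index) pairs, closest first.
        pairs = sorted(
            (math.dist(pos, det), tid, i)
            for tid, pos in self.tracks.items()
            for i, det in enumerate(detections)
        )
        assigned, used_tracks, used_dets = {}, set(), set()
        for distance, tid, i in pairs:
            if distance > self.max_distance:
                break  # every remaining pair is even farther away
            if tid in used_tracks or i in used_dets:
                continue
            assigned[tid] = detections[i]
            used_tracks.add(tid)
            used_dets.add(i)
        # Detections left unmatched start new tracks.
        for i, det in enumerate(detections):
            if i not in used_dets:
                assigned[self.next_id] = det
                self.next_id += 1
        # Tracks without a match are dropped immediately in this sketch.
        self.tracks = assigned
        return assigned
```

Calling `update` once per frame with the detector's output keeps ids stable as long as objects move less than `max_distance` between frames. A real tracker would also keep unmatched tracks alive for a few frames and use a motion model.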

FCA: #3D Neural Camouflage
#3D full-camouflage adversarial patch to fool neural detectors
Highlights:
- Attack via differentiable neural rendering
- E2E physical adversarial attack
- Envs, vehicles & detectors
- Source code available!
More: https://bit.ly/38kKyfa

One-Shot Object Pose
A novel one-shot object pose estimator
Highlights:
- Visual localization pipeline for object pose
- Handling novel objects without CAD model
- Novel graph attention for 2D-3D matching
- Large dataset for one-shot object pose
More: https://bit.ly/3MTogjJ

STEVE: Slot-TransformEr for VidEos
STEVE: unsupervised model for object-centric learning in videos
Highlights:
- Adoption of a slot decoder (SLATE)
- SLATE with slot-level recurrence model
- Complex and naturalistic videos
- Significantly outperforms previous SOTA
More: https://bit.ly/3PNxxM3

CogVideo: insane text-to-clip
CogVideo: 9B-parameter, world's first large-scale open-source text-to-video model
Highlights:
- Largest open-source T2C transformer
- Finetuning of text-to-image model
- Multi-frame-rate hierarchical training
- From pretrained model CogView2
More: https://bit.ly/3Gzfl4n

Time-Aware Neural Voxels
TiNeuVox: "NeRF" with time-aware voxel features
Highlights:
- Dynamic scene w/ optimizable structure
- Temporal information in radiance net
- Small/large motion w/ single-res of feats
- 192× faster than previous Hyper-NeRF
More: https://bit.ly/3wR4O08

Neural Anomaly Detection by AWS
Ultra-competitive inference and SOTA for both detection and localization
Highlights:
- Locally aggregated, mid-level patch features
- Maximizing nominal information at test time
- Reducing biases towards ImageNet classes
- Image-level anomaly AUROC of up to 99.6%
More: https://bit.ly/3t7Ndjg
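
One way to read the first two highlights is as a nearest-neighbour lookup against a memory bank of patch features collected from nominal (defect-free) images. The sketch below assumes exactly that and nothing more: the function names are hypothetical, features are plain tuples, and any feature extraction, subsampling or score re-weighting the authors may use is left out.

```python
import math


def patch_anomaly_scores(memory_bank, test_patches):
    """Distance from each test patch feature to its nearest nominal feature."""
    return [min(math.dist(p, m) for m in memory_bank) for p in test_patches]


def image_anomaly_score(memory_bank, test_patches):
    """Image-level score: the most anomalous patch decides."""
    return max(patch_anomaly_scores(memory_bank, test_patches))


# Toy usage with 2-D "features": the second patch is far from anything nominal.
bank = [(0.0, 0.0), (1.0, 0.0)]
patches = [(0.0, 0.0), (1.0, 3.0)]
scores = patch_anomaly_scores(bank, patches)
```

The per-patch scores double as a coarse localization map, since each one belongs to a known image position.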

Project Skate from Google #AI
#AI tool to analyze skateboarders' tricks in real time
More: https://bit.ly/3zbQS3M

Neural Text2Human Generation
Text-driven neural human generation
Highlights:
- Full-body from a given human pose
- Hierarchical texture-aware codebook
- DeepFashion -> 44k Hi-Res images
- Code and models available!
More: https://bit.ly/3Mdnpt0

EfficientFormers: 1.6ms inference
Transformers as fast as MobileNet? Snap shows that on #iphone!
Highlights:
- Low latency on mobile, high performance!
- Revisiting the design of ViT through latency
- New dimension-consistent design paradigm
- EfficientFormers: a new ViT for mobile!
More: https://bit.ly/3MdgW15