☄️STEVE: Slot-TransformEr for VidEos☄️
👉STEVE: unsupervised model for object-centric learning in videos
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Adoption of a slot decoder (SLATE)
✅SLATE with slot-level recurrence model
✅Complex and naturalistic videos
✅Significantly outperforms previous SOTA
More: https://bit.ly/3PNxxM3
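To make "slot-level recurrence" concrete, here is a minimal sketch of the slot-attention idea that object-centric models of this kind build on. It is a simplification under stated assumptions, not STEVE's actual code: the learned projections, GRU update and transformer decoder are all omitted, and the names `slot_attention_step` and `track_slots` are hypothetical. Slots compete for each input feature (softmax over slots), and the slots from one frame initialize the next.

```python
import math


def slot_attention_step(slots, inputs):
    """One simplified slot-attention update (no learned projections, GRU or MLP)."""
    dim = len(inputs[0])
    scale = 1.0 / math.sqrt(dim)
    # attn[k][n]: share of input n claimed by slot k (softmax over slots).
    attn = [[0.0] * len(inputs) for _ in slots]
    for n, x in enumerate(inputs):
        logits = [scale * sum(s_i * x_i for s_i, x_i in zip(s, x)) for s in slots]
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        for k, e in enumerate(exps):
            attn[k][n] = e / total
    # Each slot becomes the attention-weighted mean of the inputs it claimed.
    new_slots = []
    for k in range(len(slots)):
        mass = sum(attn[k]) + 1e-8
        new_slots.append([
            sum(attn[k][n] * inputs[n][d] for n in range(len(inputs))) / mass
            for d in range(dim)
        ])
    return new_slots, attn


def track_slots(frames, init_slots, steps_per_frame=2):
    """Slot-level recurrence: slots from frame t-1 initialize frame t."""
    slots = init_slots
    history = []
    for frame in frames:
        for _ in range(steps_per_frame):
            slots, _ = slot_attention_step(slots, frame)
        history.append(slots)
    return history
```

Calling `track_slots(frames, init_slots)` with each frame given as a list of feature vectors returns one set of slots per frame, so the same slot index can follow the same object over time.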
🦔 CogVideo: insane text-to-clip 🦔
👉CogVideo: 9B-parameter model, the world's first large-scale open-source text-to-video 😵
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Largest open-source T2C transformer
✅Finetuning of text-to-image model
✅Multi-frame-rate hierarchical training
✅From pretrained model CogView2
More: https://bit.ly/3Gzfl4n
🦄Time-Aware Neural Voxels🦄
👉TiNeuVox: "NeRF" with time-aware voxel features 😵
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic scene w/ optimizable structure
✅Temporal information in radiance net
✅Small/large motion w/ single-res of feats
✅192× faster than previous Hyper-NeRF
More: https://bit.ly/3wR4O08
🫐Neural Anomaly Detection by AWS🫐
👉Highly competitive inference time and SOTA for both anomaly detection and localization
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Locally aggregated, mid-level patch feats
✅Maximizing nominal information at test time
✅Reducing biases towards ImageNet classes
✅Image-level anomaly AUROC of up to 99.6%
More: https://bit.ly/3t7Ndjg
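For readers unfamiliar with the metric, the sketch below shows how an image-level AUROC can be computed from anomaly scores, plus one way such scores might come from patch features. Both functions are illustrative assumptions, not the released method: `image_score` is a hypothetical, stripped-down nearest-neighbour score against a memory bank of nominal patch features, with no subsampling or re-weighting.

```python
import math


def image_score(patches, memory_bank):
    """Anomaly score: the largest distance from any test patch to its
    nearest nominal patch feature (simplified, hypothetical scoring rule)."""
    return max(
        min(math.dist(patch, nominal) for nominal in memory_bank)
        for patch in patches
    )


def auroc(scores, labels):
    """Probability that a random anomalous image (label 1) scores higher
    than a random nominal image (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one nominal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUROC of 99.6% therefore means that in 99.6% of anomalous/nominal image pairs, the anomalous image receives the higher score.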
🛹 Project Skate from Google #AI 🛹
👉#AI tool to analyze a skateboarder's tricks in real-time
More: https://bit.ly/3zbQS3M
🧬Neural Text2Human Generation🧬
👉Text-driven neural human generation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Full-body from a given human pose
✅Hierarchical texture-aware codebook
✅DeepFashion -> 44k Hi-Res images
✅Code and models available!
More: https://bit.ly/3Mdnpt0
🧨EfficientFormers: 1.6ms inference 🧨
👉Transformers as fast as MobileNet? Snap shows it on #iphone!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Low latency on mobile, high performance!
✅Revisiting the design of ViT through latency
✅New dimension-consistent design paradigm
✅EfficientFormers: a new ViT for mobile!
More: https://bit.ly/3MdgW15
🐢 Transformer-Based Sens-Fusion 🐢
👉Updating TransFuser (CVPR21): image + LiDAR representations with self-attention
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Existing approach can't handle traffic 😢
✅Novel multi-modal fusion transformer
✅The new SOTA in driving performance
✅Reducing avg collisions per KM by 48%
✅Insights on current limitations of E2E
More: https://bit.ly/391dmd6
🧘🏻‍♂️YogNet: neural yoga assistant🧘🏻‍♂️
👉Multi-person yoga neural expert for 20 asanas
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅CNNs & reg.LSTMs + 3D-CNNs
✅Multi-person asanas in real-time
✅YAR: dataset for yoga & posture
✅1206 videos, 2D RGB camera
More: https://bit.ly/3NncVbE
🔴 Geogram: geometric algos in C++ 🔴
👉Novel open-source programming library with (research) geometric algorithms in C++
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Geometry Processing from #INRIA
✅30+ papers from SIGGRAPH, etc.
✅Grants: GOODSHAPE & VORPALINE
✅Code (mostly C++) under BSD 3
More: https://bit.ly/3mhS4L7
🍏 Open Source Vision from #Apple 🍏
👉CVNets: open-source (not a joke) lib for neural vision.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅PyTorch-based neural lib. for vision
✅Train 2−4× longer w/ augmentations
✅Plug-and-play components for CV
✅Source code under a custom license
More: https://bit.ly/39d1dSj
🏇🏻Neural Clips by #Nvidia: INSANE 🏇🏻
👉Neural generation with changes in camera viewpoint & content that arises over time 🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel hierarchical generator architecture
✅Temp. receptive field + temporal embed.
✅Multi-res. with super-resolution network
✅SOTA for long clips with motion & changes
✅Code, data & models in August 2022 🏖️
More: https://bit.ly/3zroWsC
⚽ Zero to #Messi with #deeplearning ⚽
👉EA unveils a neural system to learn multiple soccer juggling skills 😍
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learning difficult soccer juggling skills
✅Layer-wise mixture-of-experts architecture
✅Specialization arises naturally
✅Adaptive random walk training strategy
More: https://bit.ly/3mwRaL2
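As a rough illustration of what "layer-wise mixture-of-experts" can mean, the sketch below gives every layer its own gate that blends several expert linear maps. This is a generic MoE layer under assumptions, not EA's architecture: the function names, the tanh activation and the per-layer gating layout are all hypothetical.

```python
import math


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per (row of weights, bias entry)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def moe_layer(x, experts, gate):
    """experts: list of (weights, bias); gate: (weights, bias), one row per expert."""
    mix = softmax(linear(gate[0], gate[1], x))        # blending weights, sum to 1
    outputs = [linear(w, b, x) for w, b in experts]   # every expert sees the input
    blended = [sum(m * out[j] for m, out in zip(mix, outputs))
               for j in range(len(outputs[0]))]
    return [math.tanh(v) for v in blended], mix


def moe_network(x, layers):
    """Stack MoE layers; each layer has its own experts and its own gate."""
    mixes = []
    for experts, gate in layers:
        x, mix = moe_layer(x, experts, gate)
        mixes.append(mix)
    return x, mixes
```

Inspecting the returned `mixes` shows which expert dominates for a given input, which is one way the "specialization" mentioned above could be observed.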
🏖️ HumanNeRF: source code is out! 🏖️
👉Pausing the video at any frame and rendering the subject from arbitrary views!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Synthesizing photorealistic humans
✅Synthesizing details, e.g. cloth & face
✅Volumetric canonical T-pose
✅Skeletal rigid/non-rigid decomposition
More: https://bit.ly/3NEkTNY
🎒 EG3D: source code is out! 🎒
👉#Nvidia just open-sourced EG3D: real-time multi-view faces w/ HQ #3D geometry!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Tri-plane-based 3D GAN framework
✅Pose-correlated attribute (expression)
✅SOTA in uncond. 3D-aware synthesis
✅Source code & models NOW available!
More: https://bit.ly/3aOfHs0
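The "tri-plane" idea can be sketched in a few lines: a 3D point is projected onto three axis-aligned feature planes, each plane is sampled bilinearly, and the three feature vectors are combined. This is a minimal sketch under assumptions, not the released code: coordinates are taken to lie in [0, 1], planes are plain nested lists, summation is used as the aggregation, and the function names are hypothetical.

```python
def bilinear(plane, u, v):
    """Sample plane[row][col] (a feature list) at continuous u, v in [0, 1]."""
    rows, cols = len(plane), len(plane[0])
    x = min(max(u, 0.0), 1.0) * (cols - 1)
    y = min(max(v, 0.0), 1.0) * (rows - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    return [
        (1 - fx) * (1 - fy) * a + fx * (1 - fy) * b
        + (1 - fx) * fy * c + fx * fy * d
        for a, b, c, d in zip(plane[y0][x0], plane[y0][x1],
                              plane[y1][x0], plane[y1][x1])
    ]


def triplane_features(planes, point):
    """Project a 3D point onto the xy, xz and yz planes and sum the samples."""
    x, y, z = point
    xy = bilinear(planes["xy"], x, y)
    xz = bilinear(planes["xz"], x, z)
    yz = bilinear(planes["yz"], y, z)
    return [a + b + c for a, b, c in zip(xy, xz, yz)]
```

The appeal of this layout is that three 2D grids are stored instead of one dense 3D volume; a small decoder network would then turn the summed feature into color and density.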
🔥One Millisecond Backbone. Fire!🔥
👉MobileOne by #Apple: efficient mobile backbone with inference <1 ms on #iPhone12!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅75.9% top-1 accuracy on ImageNet
✅38× faster than MobileFormer net
✅Classification, detection & segmentation
✅Source code & model soon available!
More: https://bit.ly/3tsT7f2
🧨 Scaling Transformers to GigaPixels!🧨
👉Novel ViT called Hierarchical Image Pyramid Transformer (HIPT) -> Scaling to GigaPixels!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Gigapixel whole-slide imaging (WSI)
✅Leveraging natural hier. structure of WSI
✅Self-supervised Hi-Res representations
✅Source code and models available!
More: https://bit.ly/3xLuzkg
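A quick back-of-the-envelope calculation shows why a hierarchy helps at gigapixel scale. The sizes below (16-pixel cells inside 256-pixel patches inside 4096-pixel regions) are assumptions chosen for illustration, and the function names are hypothetical; the point is only to compare the number of self-attention pairs in one flat sequence against attention applied level by level.

```python
def hierarchy_sequence_lengths(region=4096, patch=256, cell=16):
    """Token counts for a flat sequence and for each level of the pyramid."""
    if region % patch or patch % cell:
        raise ValueError("each level must evenly divide the one above it")
    return {
        "flat_tokens": (region // cell) ** 2,
        "cells_per_patch": (patch // cell) ** 2,
        "patches_per_region": (region // patch) ** 2,
    }


def attention_pairs(region=4096, patch=256, cell=16):
    """Token-pair counts (quadratic attention cost) for flat vs hierarchical."""
    sizes = hierarchy_sequence_lengths(region, patch, cell)
    flat = sizes["flat_tokens"] ** 2
    # Level 1: attention inside every patch; level 2: attention across patches.
    hierarchical = (sizes["patches_per_region"] * sizes["cells_per_patch"] ** 2
                    + sizes["patches_per_region"] ** 2)
    return flat, hierarchical
```

With these assumed sizes, a single region is 65,536 cell tokens when flattened, while the hierarchy never attends over more than 256 tokens at once.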
👗BodyMap: Hyper-Detailed Humans👗
👉#META unveils 1st-ever dense continuous correspondence for clothed humans
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅1st-ever dense continuous corresp.
✅HQ fingers, hair, and clothes
✅Novel ViT-based architecture
✅SOTA on DensePose COCO
More: https://bit.ly/39nEPps
🐹 NOAH just open-sourced! 🐹
👉A novel approach to find the optimal design of prompt modules through NAS algos.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NOAH from Neural prOmpt seArcH
✅Parameter-efficient “prompt modules”
✅Efficient NAS-based implementation
✅Better results in transfer, few-shot & domain gen.
More: https://bit.ly/3MKfVhi
🏄🏻‍♀️Neural Super-Resolution in Movies🏄🏻‍♀️
👉Implicit neural representation to get arbitrary spatial resolution & FPS -> Super Resolution!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video as a continuous representation
✅Clips in arbitrary space/time resolution
✅OOD generalization in space-time
✅Source code and models available
More: https://bit.ly/3xsqccf
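To show what "arbitrary space/time resolution" means in practice, the sketch below treats a clip as a function of continuous (t, y, x) coordinates and resamples it on any grid. Important caveat: the learned implicit network is replaced here by plain trilinear interpolation, which is a stand-in and not the paper's method; the video is a nested list of grayscale values and the function names are hypothetical.

```python
def lerp(a, b, w):
    return a + (b - a) * w


def sample_video(video, t, y, x):
    """video[frame][row][col] -> float; t, y, x are continuous coords in [0, 1]."""
    def axis(coord, size):
        pos = min(max(coord, 0.0), 1.0) * (size - 1)
        lo = int(pos)
        hi = min(lo + 1, size - 1)
        return lo, hi, pos - lo

    t0, t1, wt = axis(t, len(video))
    y0, y1, wy = axis(y, len(video[0]))
    x0, x1, wx = axis(x, len(video[0][0]))

    def frame_value(f):
        top = lerp(video[f][y0][x0], video[f][y0][x1], wx)
        bottom = lerp(video[f][y1][x0], video[f][y1][x1], wx)
        return lerp(top, bottom, wy)

    # Blend the two neighbouring frames to get an in-between time step.
    return lerp(frame_value(t0), frame_value(t1), wt)


def resample(video, frames, height, width):
    """Query the continuous video on a new (frames, height, width) grid."""
    def grid(n):
        return [i / (n - 1) if n > 1 else 0.0 for i in range(n)]

    return [[[sample_video(video, t, y, x) for x in grid(width)]
             for y in grid(height)]
            for t in grid(frames)]
```

Because the query grid is decoupled from the stored grid, the same call raises spatial resolution, frame rate, or both; a learned representation would replace `sample_video` with a network evaluated at the same coordinates.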