Time-Aware Neural Voxels
👉TiNeuVox: "NeRF" with time-aware voxel features
Highlights:
✅ Dynamic scene w/ optimizable structure
✅ Temporal information in radiance net
✅ Small/large motion w/ single-res features
✅ 192× faster than previous Hyper-NeRF
More: https://bit.ly/3wR4O08
Neural Anomaly Detection by AWS
👉Ultra-competitive inference and SOTA for both detection and localization
Highlights:
✅ Locally aggregated, mid-level patch features
✅ Maximizing nominal information at test time
✅ Reducing biases towards ImageNet classes
✅ Image-level anomaly AUROC of up to 99.6%
More: https://bit.ly/3t7Ndjg
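For readers new to the metric: image-level AUROC is the probability that a randomly chosen anomalous image gets a higher anomaly score than a randomly chosen nominal one. Below is a minimal sketch of that definition in plain Python. The function name `auroc` and the toy scores are illustrative assumptions, not taken from the paper or its code.

```python
def auroc(scores, labels):
    """AUROC as a pairwise ranking probability.

    scores: anomaly score per image (higher = more anomalous)
    labels: 1 for anomalous, 0 for nominal
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one nominal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous image ranked above the nominal one
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores): one anomalous image scores below a nominal one.
scores = [0.9, 0.8, 0.35, 0.3, 0.1]
labels = [1, 1, 0, 1, 0]
print(auroc(scores, labels))  # about 0.83 (5 of 6 pairs ordered correctly)
```

The pairwise loop is quadratic, which is fine for illustration; a sort-based version is the usual choice for large test sets.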
Project Skate from Google #AI
👉#AI tool to analyze the skateboarder's tricks in real-time
More: https://bit.ly/3zbQS3M
🧬Neural Text2Human Generation🧬
👉Text-driven neural human generation
Highlights:
✅ Full-body from a given human pose
✅ Hierarchical texture-aware codebook
✅ DeepFashion -> 44k Hi-Res images
✅ Code and models available!
More: https://bit.ly/3Mdnpt0
🧨EfficientFormers: 1.6ms inference 🧨
👉Transformers as fast as MobileNet? Snap shows that on #iphone!
Highlights:
✅ Low latency on mobile, high performance!
✅ Revisiting the design of ViT through latency
✅ New dimension-consistent design paradigm
✅ EfficientFormers: a new ViT for mobile!
More: https://bit.ly/3MdgW15
Transformer-Based Sensor Fusion
👉Updating TransFuser (CVPR21): image + LiDAR representations with self-attention
Highlights:
✅ Existing approach can't handle traffic
✅ Novel multi-modal fusion transformer
✅ The new SOTA in driving performance
✅ Reducing avg collisions per km by 48%
✅ Insights on current limitations of E2E
More: https://bit.ly/391dmd6
YogNet: neural yoga assistant
👉Multi-person yoga neural expert for 20 asanas
Highlights:
✅ CNNs & reg. LSTMs + 3D-CNNs
✅ Multi-person asanas in real-time
✅ YAR: dataset for yoga & posture
✅ 1206 videos, 2D RGB camera
More: https://bit.ly/3NncVbE
Geogram: geometric algos in C++
👉Novel open-source programming library with (research) geometric algorithms in C++
Highlights:
✅ Geometry Processing from #INRIA
✅ 30+ papers from SIGGRAPH, etc.
✅ Grants: GOODSHAPE & VORPALINE
✅ Code (mostly C++) under the BSD 3-Clause license
More: https://bit.ly/3mhS4L7
Open Source Vision from #Apple
👉CVNets: open-source (not a joke) lib for neural vision.
Highlights:
✅ PyTorch-based neural lib. for vision
✅ Train 2–4× longer w/ augmentations
✅ Plug-and-play components for CV
✅ Source code under a custom license
More: https://bit.ly/39d1dSj
Neural Clips by #Nvidia: INSANE
👉Neural generation with changes in camera viewpoint & content that arises over time 🤯
Highlights:
✅ Novel hierarchical generator architecture
✅ Temp. receptive field + temporal embed.
✅ Multi-res. with super-resolution network
✅ SOTA on long clips with motion & changes
✅ Code, data & models in August 2022
More: https://bit.ly/3zroWsC
⚽ Zero to #Messi with #deeplearning ⚽
👉EA unveils a neural system to learn multiple soccer juggling skills
Highlights:
✅ Learning difficult soccer juggling skills
✅ Layer-wise mixture-of-experts architecture
✅ Specialization arises naturally
✅ Adaptive random walk training strategy
More: https://bit.ly/3mwRaL2
HumanNeRF: source code is out!
👉Pausing the video at any frame and rendering the subject from arbitrary views!
Highlights:
✅ Synthesizing photorealistic humans
✅ Synthesizing details, i.e., cloth & face
✅ Volumetric canonical T-pose
✅ Skeletal rigid/non-rigid decomposition
More: https://bit.ly/3NEkTNY
EG3D: source code is out!
👉#Nvidia just opened EG3D: real-time multi-view faces w/ HQ #3D geometry!
Highlights:
✅ Tri-plane-based 3D GAN framework
✅ Pose-correlated attribute (expression)
✅ SOTA in uncond. 3D-aware synthesis
✅ Source code & models NOW available!
More: https://bit.ly/3aOfHs0
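To give a feel for the tri-plane idea: a 3D point is projected onto three axis-aligned feature planes (xy, xz, yz), a feature vector is read from each by bilinear interpolation, and the three vectors are summed before a small decoder turns them into color and density. The sketch below covers only the lookup step in plain Python. The names `sample_plane` and `triplane_feature`, the nested-list plane layout, and the [0, 1] coordinate range are assumptions made for illustration, not the EG3D code.

```python
def sample_plane(plane, u, v):
    """Bilinear lookup in a 2D grid of feature vectors.

    plane[row][col] is a list of channels; u indexes columns and v indexes
    rows, both normalized to [0, 1].
    """
    h, w = len(plane), len(plane[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    out = []
    for c in range(len(plane[0][0])):
        top = plane[y0][x0][c] * (1 - fx) + plane[y0][x1][c] * fx
        bottom = plane[y1][x0][c] * (1 - fx) + plane[y1][x1][c] * fx
        out.append(top * (1 - fy) + bottom * fy)
    return out


def triplane_feature(planes, point):
    """Project a point onto the three planes and sum the sampled features."""
    x, y, z = point  # assumed to lie in [0, 1]^3
    f_xy = sample_plane(planes["xy"], x, y)
    f_xz = sample_plane(planes["xz"], x, z)
    f_yz = sample_plane(planes["yz"], y, z)
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]


# Toy 2x2 planes with a single channel each (made-up values).
planes = {
    "xy": [[[0.0], [1.0]], [[2.0], [3.0]]],
    "xz": [[[1.0], [1.0]], [[1.0], [1.0]]],
    "yz": [[[0.0], [0.0]], [[0.0], [0.0]]],
}
print(triplane_feature(planes, (0.5, 0.5, 0.5)))
```

Three 2D planes grow quadratically with resolution, while a dense voxel grid grows cubically, which is where the efficiency of the representation comes from.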
🔥One Millisecond Backbone. Fire!🔥
👉MobileOne by #Apple: efficient mobile backbone with inference <1 ms on #iPhone12!
Highlights:
✅ 75.9% top-1 accuracy on ImageNet
✅ 38× faster than MobileFormer net
✅ Classification, detection & segmentation
✅ Source code & model available soon!
More: https://bit.ly/3tsT7f2
🧨 Scaling Transformers to GigaPixels!🧨
👉Novel ViT called Hierarchical Image Pyramid Transformer (HIPT) -> Scaling to GigaPixels!
Highlights:
✅ Gigapixel whole-slide imaging (WSI)
✅ Leveraging natural hier. structure of WSI
✅ Self-supervised Hi-Res representations
✅ Source code and models available!
More: https://bit.ly/3xLuzkg
BodyMap: Hyper-Detailed Humans
👉#META unveils 1st-ever dense continuous correspondence for clothed humans
Highlights:
✅ 1st-ever dense continuous corresp.
✅ HQ fingers, hair, and clothes
✅ Novel ViT-based architecture
✅ SOTA on DensePose COCO
More: https://bit.ly/39nEPps
NOAH just open-sourced!
👉A novel approach to find the optimal design of prompt modules through NAS algos.
Highlights:
✅ NOAH from Neural prOmpt seArcH
✅ Parameter-efficient “prompt modules”
✅ Efficient NAS-based implementation
✅ Better in transfer, few-shot & domain gen.
More: https://bit.ly/3MKfVhi
Neural Super-Resolution in Movies
👉Implicit neural representation to get arbitrary spatial resolution & FPS -> Super Resolution!
Highlights:
✅ Video as a continuous representation
✅ Clips in arbitrary space/time resolution
✅ OOD generalization in space-time
✅ Source code and models available
More: https://bit.ly/3xsqccf
🧠 Bias in #AI, explained simply 🧠
👉Asking DALL·E Mini to help me show what bias in #AI is
Generated Samples:
✅ Best eng.->men/Caucasians
✅ Best doctors->men/Caucasians
✅ Top CEOs->men/Caucasians
✅ Chef, kitchen->men/Caucasians
✅ Rich People->only Caucasians
✅ Poor People->non-Caucasians
✅ Italian engineers->back in the '30s
✅ Chinese eng.->infrastructures
✅ Italian working->local market
✅ Chinese working->vegetables
✅ Men workers->constructions
✅ Women workers->only office
More: https://bit.ly/3b0UFqd
SAVi++: Segmentation by #Google
👉Novel unsupervised object-centric #AI to predict depth signals from slot-based video representation
Highlights:
✅ Segmenting complex dynamic scenes
✅ Static/Moving objects on naturalistic BG
✅ LiDAR-SAVi: segmenting in the wild
✅ Source code and model available soon!
More: https://bit.ly/3n3hywd
HaGRID: Half Million Hands
👉Russian Sberbank opens HaGRID, an enormous dataset for hand gesture recognition (HGR). "Peace" label is present
Highlights:
✅ 552,992 samples, 18 classes
✅ HD resolution in RGB format
✅ BBox, gesture, leading hands
✅ Dataset/models available
More: https://bit.ly/3n2cd8r