Super-Human Crossword Solver
👉Solving crosswords outperforming best humans
Highlights:
✅ Crossword solving based on NNs
✅ Q&A, structured decoding, local search
✅ Wide domains with perfect accuracy
✅ Large question-answer dataset
More: https://bit.ly/3a3zzqQ

Imagen: far beyond DALL·E 2
👉#Google: unprecedented photorealism and deep level of language understanding
Highlights:
✅ Dynamic thresholding for diffusion sampling
✅ Efficient U-Net, efficient++ variant
✅ DrawBench, new text-to-image benchmark
✅ The new SOTA, COCO FID of 7.27
More: https://bit.ly/3lVtkbz
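
For readers curious about the thresholding highlight in the Imagen post, here is a minimal NumPy sketch of the idea as it is commonly described: at each sampling step, take a high percentile s of the absolute predicted pixel values, and when s exceeds 1, clip the prediction to [-s, s] and divide by s. The function name, the per-sample percentile and the 99.5 default are assumptions for illustration, not Imagen's actual code.

```python
import numpy as np

def dynamic_threshold(x0, percentile=99.5):
    """Clip a batch of predicted images to [-s, s] and rescale by s.

    x0: array of shape (batch, ...) holding the predicted clean images.
    percentile: hypothetical default; treat it as a tunable hyperparameter.
    """
    # One threshold per sample, taken over all of that sample's pixels.
    flat = np.abs(x0).reshape(x0.shape[0], -1)
    s = np.percentile(flat, percentile, axis=1)
    # Never shrink values that already lie inside [-1, 1].
    s = np.maximum(s, 1.0)
    # Reshape so s broadcasts against every non-batch axis.
    s = s.reshape(-1, *([1] * (x0.ndim - 1)))
    return np.clip(x0, -s, s) / s
```

Predictions already inside [-1, 1] pass through unchanged, while saturated ones are pulled back into range instead of being hard-clipped at ±1.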

Tracking over SOTA detectors
👉Lightweight Python lib for real-time 2D object tracking
Highlights:
✅ Layer of tracking over SOTA detectors
✅ Suitable for complex video processing
✅ Source code under BSD 3-Clause
✅ Maintained by Tryolabs team
More: https://bit.ly/3wKtGqg

FCA: #3D Neural Camouflage
👉#3D full-camouflage adversarial patch to fool neural detectors
Highlights:
✅ Attack by diff-neural render
✅ E2E physical adversarial attack
✅ Envs, vehicles & detectors
✅ Source code available!
More: https://bit.ly/38kKyfa

One-Shot Object Pose
👉A novel one-shot object pose estimator
Highlights:
✅ Visual localization pipeline for object pose
✅ Handling novel objects without CAD model
✅ Novel graph attention for 2D-3D matching
✅ Large dataset for one-shot object pose
More: https://bit.ly/3MTogjJ

STEVE: Slot-TransformEr for VidEos
👉STEVE: unsupervised model for object-centric learning in videos
Highlights:
✅ Adoption of a slot decoder (SLATE)
✅ SLATE with slot-level recurrence model
✅ Complex and naturalistic videos
✅ Significantly outperforms previous SOTA
More: https://bit.ly/3PNxxM3

CogVideo: insane text-to-clip
👉CogVideo: 9B-parameters world's first large scale open-source text-to-video
Highlights:
✅ Largest open-source T2C transformer
✅ Finetuning of text-to-image model
✅ Multi-frame-rate hierarchical training
✅ From pretrained model CogView2
More: https://bit.ly/3Gzfl4n

Time-Aware Neural Voxels
👉TiNeuVox: "NeRF" with time-aware voxel features
Highlights:
✅ Dynamic scene w/ optimizable structure
✅ Temporal information in radiance net
✅ Small/large motion w/ single-res of feats
✅ 192× faster than previous Hyper-NeRF
More: https://bit.ly/3wR4O08

Neural Anomaly Detection by AWS
👉Ultra-competitive inference and SOTA for both detection and localization
Highlights:
✅ Locally aggregated, mid-level feats patch
✅ Maximizing nominal information at test time
✅ Reducing biases towards ImageNet classes
✅ Image-level anomaly AUROC of up to 99.6%
More: https://bit.ly/3t7Ndjg
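
As a quick reminder of what an image-level AUROC figure like the one in the anomaly detection post measures, here is a small NumPy sketch: the probability that a randomly chosen anomalous image receives a higher anomaly score than a randomly chosen normal one, with ties counted as half. The function name and the toy scores are hypothetical and do not come from the paper.

```python
import numpy as np

def auroc(scores, labels):
    """Pairwise AUROC; labels are 1 for anomalous images, 0 for normal ones."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]    # anomaly scores of the anomalous images
    neg = scores[~labels]   # anomaly scores of the normal images
    # Compare every anomalous score against every normal score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

# Toy example: every anomalous image scores above every normal one.
print(auroc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))
```

The pairwise form costs O(n_pos × n_neg) memory, which is fine for a sanity check; for large evaluation sets a rank-based implementation is the usual choice.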

Project Skate from Google #AI
👉#AI tool to analyze the skateboarder's tricks in real-time
More: https://bit.ly/3zbQS3M

Neural Text2Human Generation
👉Text-driven neural human generation
Highlights:
✅ Full-body from a given human pose
✅ Hierarchical texture-aware codebook
✅ DeepFashion -> 44k Hi-Res images
✅ Code and models available!
More: https://bit.ly/3Mdnpt0

EfficientFormers: 1.6ms inference
👉Transformers fast as MobileNet? Snap shows that on #iphone!
Highlights:
✅ Low latency on mobile, high performance!
✅ Revisiting the design of ViT through latency
✅ New dimension-consistent design paradigm
✅ EfficientFormers: a new ViT for mobile!
More: https://bit.ly/3MdgW15

Transformer-Based Sensor Fusion
👉Updating TransFuser (CVPR21): image + LiDAR representations with self-attention
Highlights:
✅ Existing approach can't handle traffic
✅ Novel multi-modal fusion transformer
✅ The new SOTA in driving performance
✅ Reducing avg collisions per km by 48%
✅ Insights on current limitations of E2E
More: https://bit.ly/391dmd6

YogNet: neural yoga assistant
👉Multi-person yoga neural expert for 20 asanas
Highlights:
✅ CNNs & reg.LSTMs + 3D-CNNs
✅ Multi-person asanas in real-time
✅ YAR: dataset for yoga & posture
✅ 1206 videos, 2D RGB camera
More: https://bit.ly/3NncVbE

Geogram: geometric algos in C++
👉Novel open-source programming library with (research) geometric algorithms in C++
Highlights:
✅ Geometry Processing from #INRIA
✅ 30+ papers from SIGGRAPH, etc.
✅ Grants: GOODSHAPE & VORPALINE
✅ Code (mostly C++) under BSD 3
More: https://bit.ly/3mhS4L7

Open Source Vision from #Apple
👉CVNets: open-source (not a joke) lib for neural vision.
Highlights:
✅ PyTorch-based neural lib. for vision
✅ Train 2–4× longer w/ augmentations
✅ Plug-and-play components for CV
✅ Source code under a custom license
More: https://bit.ly/39d1dSj

Neural Clips by #Nvidia: INSANE
👉Neural generation with changes in camera viewpoint & content that arises over time
Highlights:
✅ Novel hierarchical generator architecture
✅ Temp. receptive field + temporal embed.
✅ Multi-res. with super-resolution network
✅ SOTA in long clip with motion & changes
✅ Code, data & models in August 2022
More: https://bit.ly/3zroWsC

Zero to #Messi with #deeplearning
👉EA unveils a neural system to learn multiple soccer juggling skills
Highlights:
✅ Learning difficult soccer juggling skills
✅ Layer-wise mixture-of-experts architecture
✅ Specialization arises naturally
✅ Adaptive random walk training strategy
More: https://bit.ly/3mwRaL2

HumanNeRF: source code is out!
👉Pause the video at any frame and render the subject from arbitrary views!
Highlights:
✅ Synthesizing photorealistic humans
✅ Synthesizing details, i.e., cloth & face
✅ Volumetric canonical T-pose
✅ Skeletal rigid/non-rigid decomposition
More: https://bit.ly/3NEkTNY

EG3D: source code is out!
👉#Nvidia just opened EG3D: real time multi-view faces w/ HQ #3D geometry!
Highlights:
✅ Tri-plane-based 3D GAN framework
✅ Pose-correlated attribute (expression)
✅ SOTA in uncond. 3D-aware synthesis
✅ Source code & models NOW available!
More: https://bit.ly/3aOfHs0