VMT: Video Mask Transfiner
Novel highly efficient ViT structure for video instance segmentation.
Highlights:
- HD & more temporally stable masks
- Higher-resolution features for VIS
- Detecting error-prone spatio-temporal regions
- Auto-refinement on training data!
More: https://bit.ly/3RKXtb4

#StableDiffusion + #Dallemini = BOOM!
A #colab notebook that combines Stable Diffusion + DALL-E Mini (Craiyon)
More: https://bit.ly/3TTOshR

VIS - Deformable Transformers
DeVIS: a VIS method with the efficiency and performance of deformable ViT
Highlights:
- Temporal multi-scale deformable attention
- Instance-aware object queries
- Masks: deformable attention + multi-scale feature maps
- Improved multi-cue clip tracking
- SOTA on YouTube-VIS 2021/OVIS
More: https://bit.ly/3TQv1Xc

X-NeRF: Cross-Spectral NeRF
Cross-spectral NeRF from cameras sensitive to different light spectra
Highlights:
- First ever cross-spectral NeRF
- Avoids non-trivial calibration/matching
- Normalized Cross-Device Coordinates
- Novel dataset with RGB, multi-spectral (MS) & infrared (IR) images
More: https://bit.ly/3RqHnUo

TT-GNeRF: generative NeRF for Faces
TT-GNeRF: a novel 3D-aware GAN based on generative NeRF for faces
Highlights:
- ETH + Uni_Trento + #Snap
- DAEM for disentanglement of 3D model
- "Training-as-Init, Optimizing-for-Tuning"
- Consistency++, preserving non-target ROI
- Unsupervised optimization of geometry
More: https://bit.ly/3ARZmMw

SOTA in Arbitrary Shape Text Detection
Novel unified coarse-to-fine Transformer for arbitrary shape text detection
Highlights:
- Coarse-to-fine arbitrary text detection
- Accurate text detection, NO post-process
- Boundary proposal generation mechanism
- Innovative boundary transformer (iterative)
- Boundary energy loss (BEL) for refinement
More: https://bit.ly/3D6Ryt4

Open-Source Self-Driving projects
A free repo with many autonomous vehicle-related projects
Highlights:
- Basic/Advanced Lane/Line Detection
- Driving behavior by training & validating
- Autopilot: predicting steering angle
More: https://bit.ly/3qqJ7RB

K-VIL: Keypoint-based visual imitation
K-VIL: auto-incremental extraction of object-centric task representation.
Highlights:
- Efficient task-relevant keypoints
- Embodiment-independent tasks
- Adaptation of tasks to new scenes
- Input: only a small set of demo clips
- Novel keypoint-based controller
More: https://bit.ly/3eIrxpP

#Selfdriving in the '80s. Damn Romantic
The first self-driving car with people on board, 1986. So slow and lovely.
More: https://bit.ly/3BtRDon

TORAS: SOTA #AI for annotation
TORAS: a web-based, AI-powered, cooperative annotation platform.
Highlights:
- SOTA AI tools -> significant speedup
- "Recipes" to define how to annotate
- Repo with folder structure for storage
- Also on-prem for (commercial) firms
More: https://bit.ly/3L78YI2

MAXIM: Multi-Axis MLP for Vision
#Google open-sources MAXIM, a multi-axis MLP for low-level vision
Highlights:
- Denoising, deblurring, dehazing, etc.
- Multi-axis gated MLP, linear complexity
- Cross gating block, separate features
- SOTA results on several datasets!
More: https://bit.ly/3Dmp8LI

A Survey on Diffusion Models
A comprehensive review of denoising diffusion models in #computervision
Highlights:
- Overview of diffusion models
- Hot trend in generative AI
- A multi-perspective categorization
- Current limitations / new directions
More: https://bit.ly/3RYG5zP

#AI finds where IG photos are taken
Brilliant work by Depoorter, a Belgian artist who deals with #privacy, #AI & #socialmedia
Highlights:
- Recorded open cameras for weeks
- Scraped all #Instagram photos
- Matching Instagram vs. footage
More: https://bit.ly/3eL5dfc

SAMURAI: in-the-wild Shape/Material
#Google SAMURAI: shape, BRDF, per-image pose & illumination. Relightable #3D assets for #AR/#VR.
Highlights:
- Parametrization for varying distances
- Camera multiplex optimization
- Posterior scaling of input images
- Explicit mesh extraction with BRDF
- Code/data available soon -> #NeurIPS
More: https://bit.ly/3BKWgf3

Lang<->Pics in 100+ Languages
#Google PaLI: a unified language-image #AI model that performs tasks in 109 languages
Highlights:
- PaLI: Pathways Language & Image model
- Answering, captioning, reasoning, etc.
- From English to understanding 109 languages
- The new SOTA on several datasets
More: https://bit.ly/3QMslHC

PeRFception: Largest Implicit Representation (IR) Dataset
#Nvidia, a new frontier in data collection via Plenoxels: same info, -96.4% in size.
Highlights:
- POSTECH + NVIDIA + Caltech
- Size: -96.4% from original dataset!
- 2D/3D image/object classification & semantic segmentation
- Ready-to-use pipeline for implicit dataset
More: https://bit.ly/3eW9hJA

CHARL-E: Stable Diffusion in 1 click
CHARL-E packages Stable Diffusion into a simple app.
Highlights:
- No setup, dependencies, or internet
- Images with 1-click on #macbook
- Runs only on M1/M2 processors
- Source code under MIT license
More: https://bit.ly/3xv2z3G

YOLOPv2: Better Driving Perception
YOLOPv2: simultaneous object detection, road segmentation & lane detection
Highlights:
- E2E perception net with better backbone
- Efficient ELAN for reasonable memory
- Stability for adapting to scenarios
- SOTA on BDD100K, +50% faster!
- Source code under MIT license
More: https://bit.ly/3LvYGBh

SegNeXt: new SOTA in Semantic Seg.
SOTA (by a large margin) on ADE20K, Cityscapes, COCO-Stuff, Pascal VOC, Pascal Context, and iSAID
Highlights:
- Novel tailored network architecture
- Spatial attention via multi-scale features
- Conv.-based encoder better than transformers
- SOTA on several datasets (ADE20K, etc.)
More: https://bit.ly/3UrZhrH