IDOL (#CVPR2022 winner): code is out!
IDOL for VIS: outperforming all online/offline methods, the new SOTA!
Highlights:
- Online methods usually trail offline ones by >10 AP
- Online framework based on contrastive learning
- More discriminative instance embeddings
- Fully exploits history for stability
More: https://bit.ly/3dXCDXw
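To make the contrastive-learning highlight concrete, here is a minimal sketch of a generic InfoNCE-style loss on instance embeddings. It is not IDOL's actual loss: the function names, the cosine similarity, and the temperature of 0.1 are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor embedding (hypothetical helper).

    The loss is small when the anchor is close to its positive (the same
    instance in another frame) and far from the negatives (other instances).
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy 2-D embeddings: the first case matches the anchor to the right instance.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
confused = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
print(aligned < confused)
```

Minimizing a loss of this shape pulls embeddings of the same instance together across frames and pushes different instances apart, which is the general idea behind more discriminative embeddings for online association.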
#AIwithPapers: we are 4,000+!
A lot of people joined, and we talked about #StableDiffusion only twice! Can't believe it.
Invite your friends -> https://t.me/AI_DeepLearning
Deep Saliency: driving the attention
Google unveils a family of operators to "drive" human saliency
Highlights:
- Editing images to drive saliency
- Transforms to hide distractors
- Warping operator for distractors
- GAN operator for less-salient alternatives
More: https://bit.ly/3KoQQc2
#3D scene manipulation from 2D
Reconstruct, decompose, manipulate & render 3D scenes in a single pipeline
Highlights:
- Unique 3D, non-occupied space from 2D
- Inverse query algorithm for shapes
- First synthetic dataset for 3D editing
More: https://bit.ly/3RlYhTY
StableFace: Talking Face Generation
Analysis of motion jittering in 3D face generation (audio-in -> video-out)
Highlights:
- Motion jittering analysis for stability
- Gaussian-based adaptive smoothing
- Augmented erosions of neural renderer
- Audio-fused generator for dependency
More: https://bit.ly/3Kt95gI
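As a rough illustration of the smoothing highlight, the sketch below applies plain Gaussian temporal smoothing to a single jittery motion track (say, one landmark coordinate over frames). The adaptive part of the paper's method is not reproduced here; the fixed `sigma`, the edge clamping, and the function names are assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights covering about +/- 3 sigma (sigma > 0)."""
    radius = max(1, int(3 * sigma + 0.5))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_track(values, sigma=1.0):
    """Smooth a per-frame scalar track; indices are clamped at both ends."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(values) - 1
    smoothed = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp to a valid frame
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed


def total_variation(values):
    """Sum of absolute frame-to-frame changes, a crude jitter measure."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


jittery = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(total_variation(smooth_track(jittery)) < total_variation(jittery))
```

A larger `sigma` removes more jitter but also blurs genuine fast motion, which is presumably why an adaptive strength is attractive.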
🧡 Avatarization in the '90s. So romantic 🧡
Making of the first #MortalKombat in the early '90s
More: https://bit.ly/3wTSpJB
Massive Dataset in Virtual Cities
Synthehicle: 17 hours of labeled material from 340 cams across 64 day, rain, dawn & night scenes.
Highlights:
- Multi-target multi-cam tracking
- 2D, 3D, segm. & depth annotations
- Instance, semantic & panoptic segm.
- 340 clips, 64 scenes, 17 hrs, 4M BBs
More: https://bit.ly/3TArHiV
🪨 Controllable #3D Adversarial Face 🪨
#Meta (+CMU) on decoupling identity/expression + granular control over expressions
Highlights:
- Supervised auto-enc. + GAN
- UV texture maps + 3D faces
- Controls expression while preserving identity
- Code under X11 License
More: https://bit.ly/3AVE80q
DALL·E: Outpainting via #NLP
Extending any original image, creating large-scale images in any aspect ratio
Highlights:
- Extending an image beyond its borders
- Visual elements in the same style as the input
- Driving the image "story" in new directions
- Shadows, reflections & textures w/ context
More: https://bit.ly/3eoH8uD
🌪️ TimeLapse++: Video Temporal Pyramid 🌪️
Multi-scale lens to view the passage of time: far beyond a "classic" timelapse
Highlights:
- Inspired by "old-school" spatial pyramids
- Video Spectrogram to go through pyramid
- Months/years of data in a few seconds!
- Multi-temporal freq., no aliasing
More: https://bit.ly/3TKnYPS
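For intuition on the spatial-pyramid analogy, here is a minimal sketch of a pyramid built along the time axis: each level low-pass filters the frames over time and then keeps every second frame. This is the classic Gaussian-pyramid recipe moved to time, not necessarily the paper's construction; the [1, 2, 1]/4 filter, the flat-list frame layout, and the function names are assumptions.

```python
def blur_time(frames):
    """Apply a [1, 2, 1] / 4 low-pass filter along time (ends are clamped).

    Each frame is a flat list of pixel intensities of the same length.
    """
    last = len(frames) - 1
    blurred = []
    for t, current in enumerate(frames):
        previous = frames[max(t - 1, 0)]
        following = frames[min(t + 1, last)]
        blurred.append([(p + 2 * c + f) / 4
                        for p, c, f in zip(previous, current, following)])
    return blurred


def temporal_pyramid(frames, levels):
    """Return a list of frame sequences, each about half as long as the last."""
    pyramid = [frames]
    for _ in range(levels - 1):
        if len(pyramid[-1]) < 2:
            break  # nothing left to subsample
        # Blurring before dropping frames reduces temporal aliasing.
        pyramid.append(blur_time(pyramid[-1])[::2])
    return pyramid


# Eight tiny 2-pixel "frames" whose brightness flickers over time.
clip = [[float(t % 2), 1.0] for t in range(8)]
for level in temporal_pyramid(clip, levels=3):
    print(len(level))
```

Coarser levels summarize longer time spans per frame, which is what lets months of footage play back in seconds without the flicker of naive frame skipping.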
Stable Diffusion Video is out!
A free notebook to generate videos by interpolating the latent space of SD.
Highlights:
- Blueberry to strawberry spaghetti
- Dream items from same prompt
- Morph different prompts (seeds)
- Built on a script by A. Karpathy
More: https://bit.ly/3ey8632
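The latent-interpolation idea can be sketched with spherical linear interpolation (slerp), a common choice for moving between Gaussian noise latents. Whether the notebook does exactly this is an assumption; the toy 2-D vectors stand in for real latents, and `interpolation_path` is a hypothetical helper.

```python
import math


def slerp(t, v0, v1, eps=1e-6):
    """Spherical linear interpolation between two non-zero vectors, t in [0, 1]."""
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)
    if math.sin(omega) < eps:
        # Nearly (anti)parallel vectors: fall back to a straight line.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


def interpolation_path(start, end, num_frames):
    """One latent per video frame, from start to end (num_frames >= 2)."""
    return [slerp(i / (num_frames - 1), start, end) for i in range(num_frames)]


# Toy latents; a real pipeline would decode each one into an image.
path = interpolation_path([1.0, 0.0], [0.0, 1.0], num_frames=5)
print(len(path))
```

Each interpolated latent would be passed to the image generator with the same prompt (or an interpolated prompt embedding), and the decoded images become consecutive video frames. Slerp keeps intermediate latents at a plausible norm, which a straight linear blend does not.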
VMT: Video Mask Transfiner
Novel, highly efficient ViT structure for video instance segmentation.
Highlights:
- HD & more temporally stable masks
- Higher-resolution features for VIS
- Detecting error-prone spatio-temporal regions
- Auto-refinement on training data!
More: https://bit.ly/3RKXtb4
🤯 #StableDiffusion + #Dallemini = BOOM! 🤯
A #colab notebook that combines Stable Diffusion + DALL-E Mini (Craiyon)
More: https://bit.ly/3TTOshR
VIS - Deformable Transformers
DeVIS: VIS method with the efficiency and performance of deformable ViT
Highlights:
- Temp. multi-scale D-Attention
- Instance-aware object queries
- Masks: D-Attention + multi-scale feature maps
- Improved multi-cue clip tracking
- SOTA on YouTube-VIS 2021/OVIS
More: https://bit.ly/3TQv1Xc
X-NeRF: Cross-Spectral NeRF
Cross-Spectral NeRF from cams sensitive to different light spectra
Highlights:
- First ever cross-spectral NeRF
- Avoiding non-trivial calib/match
- Normalized Cross-Device Coords
- Novel dataset w/ RGB, MS, & IR
More: https://bit.ly/3RqHnUo
TT-GNeRF: generative NeRF for Faces
TT-GNeRF: a novel 3D-aware GAN based on generative NeRF for faces
Highlights:
- ETH + Uni_Trento + #Snap 🤯
- DAEM for disentanglement of 3D model
- "Training-as-Init, Optimizing-for-Tuning"
- Consistency++, preserving non-target ROI
- Unsupervised optimization of geometry
More: https://bit.ly/3ARZmMw
SOTA in Arbitrary Shape Text Detection
Novel unified coarse-to-fine Transformer for arbitrary shape text detection
Highlights:
- Coarse-to-fine arbitrary text detection
- Accurate text detection, NO post-process
- Boundary proposal generation mechanism
- Innovative boundary transformer (iterative)
- Boundary energy loss (BEL) for refinement
More: https://bit.ly/3D6Ryt4
Open-Source Self-Driving projects
A free repo with many autonomous vehicle-related projects
Highlights:
- Basic/Advanced Lane/Line Detection
- Driving behavior by training & validating
- Autopilot: predicting steering angle
More: https://bit.ly/3qqJ7RB
🥤 K-VIL: Keypoint-based visual imitation 🥤
K-VIL: auto-incremental extraction of object-centric task representation.
Highlights:
- Efficient task-relevant keypoints
- Embodiment-independent tasks
- Adaptation of tasks to new scenes
- Input: only a small set of demo clips
- Novel keypoint-based controller
More: https://bit.ly/3eIrxpP