NOAH just open-sourced!
👉A novel approach to finding the optimal design of prompt modules via neural architecture search (NAS).
Highlights:
✅ NOAH stands for Neural prOmpt seArcH
✅ Parameter-efficient “prompt modules”
✅ Efficient NAS-based implementation
✅ Better results in transfer learning, few-shot learning & domain generalization
More: https://bit.ly/3MKfVhi
Neural Super-Resolution in Movies
👉Implicit neural representation for video at arbitrary spatial resolution & frame rate -> super-resolution!
Highlights:
✅ Video as a continuous representation
✅ Clips at arbitrary space/time resolution
✅ OOD generalization in space-time
✅ Source code and models available
More: https://bit.ly/3xsqccf
Bias in #AI, explained simply
👉Asking DALL·E mini to help me show what BIAS in #AI looks like
Generated samples:
✅ Best engineers -> men/Caucasians
✅ Best doctors -> men/Caucasians
✅ Top CEOs -> men/Caucasians
✅ Chef, kitchen -> men/Caucasians
✅ Rich people -> only Caucasians
✅ Poor people -> non-Caucasians
✅ Italian engineers -> back in the '30s
✅ Chinese engineers -> infrastructure
✅ Italian working -> local market
✅ Chinese working -> vegetables
✅ Male workers -> construction
✅ Female workers -> office only
More: https://bit.ly/3b0UFqd
SAVi++: Segmentation by #Google
👉Novel unsupervised object-centric #AI that predicts depth signals from a slot-based video representation
Highlights:
✅ Segmenting complex dynamic scenes
✅ Static/moving objects on naturalistic backgrounds
✅ LiDAR-SAVi: segmenting in the wild
✅ Source code and model available soon!
More: https://bit.ly/3n3hywd
HaGRID: Half a Million Hands
👉Russia's Sberbank releases HaGRID, an enormous dataset for hand gesture recognition (HGR). A "Peace" label is present
Highlights:
✅ 552,992 samples, 18 classes
✅ HD resolution in RGB format
✅ Bounding boxes, gesture labels, leading hand
✅ Dataset/models available
More: https://bit.ly/3n2cd8r
🔥 #AIwithPapers: we are 2,900+! 🔥
Cheers from "Black Metal Lady Gaga", drawn by DALL·E mini
Invite your friends -> https://t.me/AI_DeepLearning
Segmentation with INSANE Occlusions
👉CMU unveils WALT: segmentation under severe occlusion, outperforming human-supervised methods.
Highlights:
✅ WALT: Watch & Learn Time-lapse
✅ 4K/1080p street cameras recorded over a year
✅ Outperforms human-supervised methods
✅ Object-occluder-occluded neural layers
✅ Source code under MIT license
More: https://bit.ly/3n7pvjO
Largest Dataset for #autonomousdriving
👉SHIFT: the largest synthetic dataset for #selfdrivingcars. Shifts in cloudiness, rain, fog, time of day, vehicle & pedestrian density 🤯
Highlights:
✅ 4,800+ clips, multi-view sensor suite
✅ Semantic/instance segmentation, monocular/stereo depth
✅ 2D/3D object detection, MOT
✅ Optical flow, point cloud registration
✅ Visual odometry, trajectory & human pose
More: https://bit.ly/3HJBUUT
Big Egocentric Dataset by #Meta
👉Novel dataset to speed up research on egocentric MR/AI
Highlights:
✅ 159 sequences, multiple sensors
✅ Scenarios: cooking, exercising, etc.
✅ “Desktop Activities” via multi-view mocap
✅ Dataset available upon request
More: https://bit.ly/3QDccVW
Transformer-Codebook HD Face Restoration
👉S-Lab unveils CodeFormer: hyper-detailed face restoration from degraded clips
Highlights:
✅ Face restoration as code prediction
✅ Discrete codebook prior in a small proxy space
✅ Controllable transformation for LQ->HQ
✅ Robustness and global coherence
✅ Code and models available soon
More: https://bit.ly/3QEa9B5
Fully Controllable "NeRF" Faces
👉Neural control of pose/expressions from a single portrait video
Highlights:
✅ NeRF-control of the human head
✅ Loss of rigidity by dynamic NeRF
✅ Full 3D control/modelling of faces
✅ No source code or models yet 😢
More: https://bit.ly/3OEjwi7
I M AVATAR: source code is out!
👉Neural implicit head avatars from monocular videos
Highlights:
✅ #3D morphing-based implicit avatar
✅ Detailed geometry/appearance
✅ Differentiable rendering for end-to-end learning from clips
✅ Novel synthetic dataset for evaluation
More: https://bit.ly/3A2yzy9
🗺️ Neural Translation: Image -> Map 🗺️
👉A novel method for instantaneous mapping as a translation problem
Highlights:
✅ Bird’s-eye-view (BEV) map from an image
✅ A restricted data-efficient transformer
✅ Monotonic attention from the language domain
✅ SOTA across several datasets
More: https://bit.ly/39MQ76Z
🥶 E2V-SDE: biggest troll ever? 🥶
👉The E2V-SDE paper (accepted to #CVPR2022) consists of text copied from 10+ previously published papers
Highlights:
✅ Latent ODEs for Irregularly-Sampled Time Series
✅ Stochastic Adversarial Video Prediction
✅ Continuous Latent Process Flows
✅ More papers...
More: https://bit.ly/3bsL8Zw (AUDIO ON!)
🔥🔥 YOLOv6 is out: PURE FIRE! 🔥🔥
👉YOLOv6 is a single-stage object detection framework for industrial applications
Highlights:
✅ Efficient Decoupled Head with SIoU Loss
✅ Hardware-friendly backbone/neck design
✅ 520+ FPS on T4 + TensorRT FP16
✅ Released under the GNU General Public License v3.0
More: https://bit.ly/3OLjncK
BlazePose: Real-Time Human Tracking
👉Novel real-time #3D human landmarks from #google. Suitable for mobile.
Highlights:
✅ MoCap from a single RGB camera on mobile
✅ Avatar, Fitness, #Yoga & AR/VR
✅ Full-body pose from monocular input
✅ Novel 3D ground truth acquisition
✅ Additional hand landmarks
✅ Fully integrated in #MediaPipe
More: https://bit.ly/3uvyiAv
🔥 YOLOv7: YOLO for segmentation 🔥
👉YOLOv7: adding a lot of newer skills to the YOLO architecture family.
Highlights:
✅ YOLOv7, not a successor in the YOLO family!
✅ Framework for detection & segmentation
✅ Applications based on #META detectron2
✅ DETR & ViT detection out of the box
✅ Easy pipeline support through #ONNX
✅ YOLOv4 + instance segmentation in a single stage
✅ The latest YOLOv6 training is supported!
✅ Source code under GPL license.
More: https://bit.ly/3ysSJAp
🔥🔥 HD Dichotomous Segmentation 🔥🔥
👉A new task: highly accurate segmentation of objects in natural images.
Highlights:
✅ 5,000+ HD images + accurate binary masks
✅ IS-Net baseline in high-dim feature spaces
✅ HCE: model vs. human interventions
✅ Source code (should be) available soon
More: https://bit.ly/3ah2BDO
🔥🔥 Neural Segmentation on fire 🔥🔥
👉Novel methods for segmentation with mask calibration. Robustness++ in video object segmentation (VOS).
Highlights:
✅ Study: VOS robustness vs. perturbations
✅ Adaptive object proxy (AOP) aggregation
✅ Fewer errors due to unstable pixel-level matching
✅ Code/models (should be) available soon
More: https://bit.ly/3yhIY6Q