AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🔥One Millisecond Backbone. Fire!🔥

👉MobileOne by #Apple: efficient mobile backbone with inference <1 ms on #iPhone12!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅75.9% top-1 accuracy on ImageNet
✅38× faster than MobileFormer net
✅Classification, detection & segmentation
✅Source code & model soon available!

More: https://bit.ly/3tsT7f2
🧨 Scaling Transformers to GigaPixels!🧨

👉Novel ViT called Hierarchical Image Pyramid Transformer (HIPT) -> Scaling to GigaPixels!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Gigapixel whole-slide imaging (WSI)
✅Leveraging natural hier. structure of WSI
✅Self-supervised Hi-Res representations
✅Source code and models available!

More: https://bit.ly/3xLuzkg
👗BodyMap: Hyper-Detailed Humans👗

👉#META unveils 1st-ever dense continuous correspondence for clothed humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅1st-ever dense continuous corresp.
✅HQ fingers, hair, and clothes
✅Novel ViT-based architecture
✅SOTA on DensePose COCO

More: https://bit.ly/39nEPps
🐹 NOAH just open-sourced! 🐹

πŸ‘‰A novel approach to find the optimal design of prompt modules through NAS algos.

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…NOAH from Neural prOmpt seArcH
βœ…Parameter-efficient β€œprompt modules”
βœ…Efficient NAS-based implementation
βœ…Better than transfer, few-shot & domain gen.

More: https://bit.ly/3MKfVhi
πŸ‘5πŸ‘2πŸ₯°1
🏄🏻‍♀️Neural Super-Resolution in Movies🏄🏻‍♀️

👉Implicit neural representation to get arbitrary spatial resolution & FPS -> Super Resolution!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video as a continuous representation
✅Clips in arbitrary space/time resolution
✅OOD generalization in space-time
✅Source code and models available

More: https://bit.ly/3xsqccf
🧠 Bias in #AI, explained simply 🧠

👉Asking DallE-Mini to help me show what BIAS in #AI is

𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐞𝐝 𝐒𝐚𝐦𝐩𝐥𝐞𝐬:
✅Best eng.->men/Caucasians
✅Best doctors->men/Caucasians
✅Top CEOs->men/Caucasians
✅Chef, kitchen->men/Caucasians
✅Rich People->only Caucasians
✅Poor People->non-Caucasians
✅Italian engineers->back in 30's
✅Chinese eng.->infrastructure
✅Italian working->local market
✅Chinese working->vegetables
✅Men workers->construction
✅Women workers->only office

More: https://bit.ly/3b0UFqd
🦕 SAVi++: Segmentation by #Google 🦕

👉Novel unsupervised object-centric #AI to predict depth signals from slot-based video representation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Segmenting complex dynamic scenes
✅Static/Moving objects on naturalistic BG
✅LiDAR-SAVi: segmenting in the wild
✅Source code and model soon available!

More: https://bit.ly/3n3hywd
✋HaGRID: Half Million Hands👋

👉Russian Sberbank opens HaGRID, an enormous dataset for hand gesture recognition (HGR). "Peace" label is present 🔵🟡

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅552,992 samples, 18 classes
✅HD resolution in RGB format
✅BBox, gesture, leading hands
✅Dataset/models available

More: https://bit.ly/3n2cd8r
🔥 #AIwithPapers: we are 2,900+! 🔥

💙💛 Cheers from "Black Metal Lady Gaga" plotted by DallE-mini 💙💛

😈 Invite your friends -> https://t.me/AI_DeepLearning
Segmentation with INSANE Occlusions

👉CMU unveils WALT: segmenting in severe occlusion scenarios. Outperforms human-supervised methods.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅WALT: Watch & Learn Time-lapse
✅4K/1080p cams on streets over a year
✅Performance over human-supervised
✅Object-occluder-occluded neural layers
✅Source code under MIT license

More: https://bit.ly/3n7pvjO
🐠Largest Dataset for #autonomousdriving🐠

👉SHIFT: largest synthetic dataset for #selfdrivingcars. Shifts in cloud, rain, fog, time of day, vehicle & pedestrian density🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅4,800+ clips, multi-view sensor suite
✅Semantic/instance, mono/stereo depth
✅2D/3D object detection, MOT
✅Optical flow, point cloud registration
✅Visual-Odo, trajectory & human pose

More: https://bit.ly/3HJBUUT
Big Egocentric Dataset by #Meta

👉Novel dataset to speed up research on egocentric MR/AI

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅159 sequences, multiple sensors
✅Scenarios: cooking, exercising, etc.
✅‘Desktop Activities’ via multi-view mocap
✅Dataset available upon request

More: https://bit.ly/3QDccVW
🦋Transf-Codebook HD-Face Restoration🦋

👉S-Lab unveils CodeFormer: hyper-detailed face restoration from degraded clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Face restoration as a code prediction
✅Discrete CB prior in small proxy space
✅Controllable transformation for LQ->HQ
✅Robustness and global coherence
✅Code and models soon available

More: https://bit.ly/3QEa9B5
Fully Controllable "NeRF" Faces

👉Neural control of pose/expressions from single portrait video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NeRF-control of the human head
✅Loss of rigidity by dynamic NeRF
✅3D full control/modelling of faces
✅No source code or models yet 😢

More: https://bit.ly/3OEjwi7
🫀I M AVATAR: source code is out!🫀

👉Neural implicit head avatars from monocular videos

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅#3D morphing-based implicit avatar
✅Detailed Geometry/appearance
✅D-Rendering e2e learning from clips
✅Novel synthetic dataset for evaluation

More: https://bit.ly/3A2yzy9
🗺️Neural Translation Image -> Map🗺️

👉A novel method for instantaneous mapping as a translation problem

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Bird’s-eye-view (BEV) map from image
✅A restricted data-efficient transformer
✅Monotonic attention from lang. domain
✅SOTA across several datasets

More: https://bit.ly/39MQ76Z
🥶 E2V-SDE: biggest troll ever? 🥶

👉E2V-SDE paper (accepted to #CVPR2022) consists of texts copied from 10+ previously published papers 😂

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Latent ODEs for Irregularly-Sampled TS
✅Stochastic Adversarial Video Prediction
✅Continuous Latent Process Flows
✅More papers...

More: https://bit.ly/3bsL8Zw (AUDIO ON!)
🔥🔥YOLOv6 is out: PURE FIRE!🔥🔥

👉YOLOv6 is a single-stage object detection framework for industrial applications

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient Decoupled Head with SIoU Loss
✅Hardware-friendly for Backbone/Neck
✅520+ FPS on T4 + TensorRT FP16
✅Released under GNU General Public License v3.0

More: https://bit.ly/3OLjncK
BlazePose: Real-Time Human Tracking

👉Novel real-time #3D human landmarks from #google. Suitable for mobile.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅MoCap from single RGB on mobile
✅Avatar, Fitness, #Yoga & AR/VR
✅Full body pose from monocular
✅Novel 3D ground truth acquisition
✅Additional hand landmarks
✅Fully integrated in #MediaPipe

More: https://bit.ly/3uvyiAv