AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🪨 Google URF for neural-synthesis 🪨

👉Sequence of RGB + Lidar -> 3D surfaces and novel RGB images synthesized

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Extending Neural Radiance Fields
✅Leveraging asynch. lidar data
✅Addressing exposure variation
✅Leveraging segmentations for sky
✅SOTA #3D reconstruction/synthesis

More: https://bit.ly/3L2vTDb
🔥11👍4👏1🤯1
🚛 AV2: next-gen. self driving 🚛

👉One of the biggest datasets ever for #autonomousdriving

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅1k seq. of multimodal data
✅3D annotations, 26 categories
✅20k lidar & map-aligned pose
✅250k challenging interactions
✅HD Map: 3D lane & crosswalk
✅CC BY-NC-SA 4.0 license

More: https://bit.ly/3trx3lw
🔥3👍1🤯1
🤖CaTGrasp in Clutter from Simulation🤖

👉Task-relevant grasping: trained solely in simulation with synthetic data and self-supervised hand-object interaction

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Novel cat-level, relevant grasping
✅S.S. hand-object-contact
✅Tiny objects from dense clutter
✅Train-simulation -> to real
✅Source code under Apache 2.0

More: https://bit.ly/3L2YVCo
👍1🔥1
🛼 Drive & Segment without Supervision 🛼

👉Learning pixel-wise semantic segmentation from non-curated data collected by cars (cameras + LiDAR) driving around a city

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Cross-modal unsupervised
✅Synchronized LiDAR & RGB
✅Object proposal on LiDAR points
✅SOTA, significant improvements

More: https://bit.ly/3L0wWTW
👍3🔥1🤯1
🌍 NeRF-free Neural Rendering 🌍

👉A simple 2D-only method with a single pass of a neural network

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Synthesis with NO 3D reasoning
✅Autoregressive & masked transf.
✅Pose -> object, object -> pose
✅Attention: branching attention
✅Source code under MIT License

More: https://bit.ly/3JC7unt
🔥3😱2👍1🤩1
🤓👌Hey, TAKE OFF my eyeglasses! 😙👌

👉A novel framework to remove eyeglasses as well as their cast shadows from faces

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Novel mask-guided multi-step network
✅Leveraging 3D synthetic data only
✅Synthetic portraits with supervisions
✅Eyeglasses & shadows simultaneously

More: https://bit.ly/3IvQzlf
👍7🔥1
đŸĨ #AI models/dataset for open surgery đŸĨ

👉Multi-task #AI model/dataset of real-time surgical behaviors, hands, and tools.

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Annotated Videos Open Surgery
✅Largest dataset of open surgical
✅2k clips and 23 procedures
✅12k annotations, 11k+ keypoints
✅Models/Dataset soon available!

More: https://bit.ly/3tvDdkK
👍8🤯1😱1
🥽 #metaverse in 1991 🥽

👉Q: is #VR the technology that developed least in the last 30 years? 🤔

Discussion: https://bit.ly/3txWF07
👍3🤬3🥰1🤯1
🫕NeRFusion: Large-Scale Reconstruction🫕

👉Efficient large-scale reconstruction & photo-realistic rendering

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Frame-by-frame R.F.
✅Neural reconstruction
✅Real-time at 20+ fps
✅SOTA on indoor / objects

More: https://bit.ly/3iyfoCo
🤯7🔥4👍3👏2
☕ORViT for understanding tasks☕

👉ORViT: object-centric approach that extends ViT layers incorporating object representations

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Spatio-temporal through the net
✅''Object-Region Attention''
✅''Object-Dynamics" module
✅Code just released! Apache 2.0

More: https://bit.ly/3wAUavW
🔥5👍3😱2👏1
🪅Insane Neural Sketching from #MIT🪅

👉Line drawing generation as unsupervised image translation with various losses

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Unpaired method for line drawing
✅Geometry loss to predict depth
✅Semantic loss to match CLIP feats
✅SOTA on unpaired translation/generation
✅Code and Models under MIT License

More: https://bit.ly/36JRr8A
🤯7🔥4❤1👍1🥰1👏1😁1
đŸ”ī¸MPS-Net: new SOTA for #3D humanđŸ”ī¸

👉MPS-Net: accurate & temporally coherent 3D human pose/shape from video

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅MoCA: visual cues from motion
✅HAFI to mix past/future feats
✅Stronger temporal correlation
✅SOTA on multiple datasets

More: https://bit.ly/3uAI5EB
🤯9🔥1🥰1😱1
🤿Transfiner: hyper-detailed segmentation🤿

👉Mask Transfiner: #AI for HQ & efficient instance segmentation

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅Transfiner: HQ segmentation
✅HQ seg. via quadtree structure
✅SOTA & extreme details
✅Code under MIT License

More: https://bit.ly/3KVzseM
👍5🔥3🤯1
🥙 DualStyleGAN: SOTA in style transfer 🥙

👉Flexible control of dual styles of face domain and extended artistic portrait domain

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅High-resolution (1024*1024)
✅Intrinsic/extrinsic style path
✅Hierarchical style manipulation
✅Novel progressive fine-tuning
✅Source code under MIT License

More: https://bit.ly/3uS26Xp
👍11🤩4🔥1
🍚 GTR: Global Tracking Transformers 🍚

👉UTexas + Apple: transformer for global multi-object tracking

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅GTR operates on any object
✅Few frames->global trajectories
✅SOTA on detectors for any object
✅Code under Apache License 2.0

More: https://bit.ly/3DiqkxF
🔥7👍2🤯2😱1
🧠E2E Perception for #selfdrivingcars🧠

👉HybridNets: multi-task net with several key optimizations

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅End-to-end perception network
✅Traffic, lane, object detection
✅Drivable segmentation area
✅Real-time on embedded systems
✅Source code under MIT License

More: https://bit.ly/3JMk8Az
👍8❤4👏2🤯1😱1
đŸ›Šī¸Smart Parking with UAVsđŸ›Šī¸

👉A novel methodology to monitor car parking areas in real-time via Drones/UAVs

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅YoloV3 + DeepSort tracker
✅Vehicle detection/tracking
✅Occupancy estimation via RT
✅Four blocks, unique pipeline

More: https://bit.ly/3iJD8nm
❤8👍5🥰1🤯1
👕 Detecting Events via #AI 👕

👉Localizing object states & corresponding state-modifying actions

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅SS-learning state-modifying
✅Noise adaptive weighting
✅ChangeIt: 2.6k+ hrs , 34k+ changes
✅Dataset, code, and model!

More: https://bit.ly/3uBwxkj
👍7🤯1
🌈🌈 Interactive Neural Labelling 🌈🌈

👉Dense labelling of geometry, color & semantics via #3D neural field

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅No training data
✅Dense labeling
✅Classes on the fly
✅Labelling at a scale

More: https://bit.ly/36Y0faQ
🔥4👍1🤯1😱1
â™Ÿī¸Neural RGB-D Reconstructionâ™Ÿī¸

👉Novel approach for #3D reconstruction that mixes implicit surface representations with NeRFs

𝐇đĸ𝐠𝐡đĨđĸ𝐠𝐡𝐭đŦ:
✅RGB-D based reconstruction
✅Leveraging color & depth
✅Depth into the NeRF
✅Pose & camera refinement

More: https://bit.ly/3iN6e54
🔥5👍2🤯2🤩1