AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

🔥 MinVIS, a new SOTA is out 🔥

👉#Nvidia MinVIS: no video-based architectures or training procedures 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Video architecture/training not required
✅MinVIS outperforms the previous SOTA
✅Occluded VIS (OVIS): >10% improvement
✅1% of labeled frames: on par with or better than fully-supervised SOTA
More: https://bit.ly/3pcYzk1

🔥🔥MultiNeRF: three NeRFs are out!🔥🔥

👉Google releases the code of three #cvpr2022 papers: Mip-NeRF 360, Ref-NeRF, RawNeRF

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Paper_1: Mip-NeRF 360
✅Paper_2: Ref-NeRF
✅Paper_3: NeRF in the Dark (RawNeRF)

More: https://bit.ly/3QjpRRc

☀️LocoProp: Neural Layers Composition☀️

👉Google AI unveils LocoProp: a novel neural paradigm for modular composition of layers.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Backprop++ via Local Loss Optimization
✅Per-layer weight regularizer, target output, loss
✅Multiple local updates via first-order opt.
✅Superior performance and efficiency

More: https://bit.ly/3Q40YJn

🔥PCVOS: clip-wise mask VOS🔥

👉PCVOS: new semi-supervised video object segmentation method

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Reformulating semi-supervised VOS
✅Novel per-clip inference perspective
✅Clip-wise operation on intra-clip correlation
✅PCVOS: model for per-clip inference
✅New SOTA on multiple benchmarks

More: https://bit.ly/3vJtmbz

Open-World Object Detection via ViT

👉Google unveils OWL-ViT: open-vocabulary detector based on ViTs 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ViTs for Open-World Localization
✅Image-level to open-vocabulary detection
✅SOTA one-shot (image-conditioned) detection

More: https://bit.ly/3Sy3jOj

🎹🎹 Learning Piano in #AR 🎹🎹

👉PianoVision (on #META #Quest2) accelerates piano learning via Passthrough #AR & hand tracking

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Sheet Insight to learn sight-reading
✅MIDI keyboard connectivity
✅Air piano when no physical piano is available
✅Multiplayer Music Instruction
✅PianoVision Music Hall in #VR

More: https://bit.ly/3zYvwGX

🧊EPro-PnP: Perspective-n-Points Detection🧊

👉EPro-PnP: probabilistic PnP layer for general e2e pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Probabilistic PnP for general e2e pose
✅Top-tier 6DoF pose when plugged into CDPN
✅Deformable correspondences for accurate detection
✅2D-3D correspondences learned from scratch

More: https://bit.ly/3BNPXYr

🥇#NVIDIA wins SIGGRAPH's Best Paper🥇

👉Instant #NeRF wins a Best Paper award at SIGGRAPH 2022!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Speed-up of several orders of magnitude
✅HQ neural primitives in a matter of secs
✅Render in tens of milliseconds at 1080p
✅Source code and resources available!

More: https://bit.ly/3Qt8c9D

🪰 EasyMocap: Open Neural Mocap 🪰

👉EasyMocap: open-source marker-less mocap with novel view synthesis from RGB

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬 (of last paper added):
✅Editable free-viewpoint video
✅Layered neural representation of humans
✅Multi-person scenes split into instances, weakly-supervised
✅HQ neural representation of humans
✅Correcting camera errors via human poses

More: https://bit.ly/3p6lUDO

🎰 Texturify: Neural Textures Generator 🎰

👉A step towards automated content creation: HQ textures directly on the surface of 3D objects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅TUM + Max Planck + Apple 🍏
✅Realistic, HQ textures from 2D pics
✅3D shape geometry, no 3D supervision
✅3D-aware surface-based generation net

More: https://bit.ly/3BW7UUU

🍨 Scaling Neural Indoor Scene Rendering 🍨

👉Neural scene rendering for indoor scenes: scalable in both training and rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Neural scene rendering for indoor scenes
✅#3D scene split into tiles with MLPs to scale up
✅Parallel training of tile-based MLPs
✅View-indep. components (via surf-MLP)

More: https://bit.ly/3bH94IX

🔥Stable Diffusion on clips. INSANE🔥

👉The most advanced latent text-to-image DM. #RunwayML just announced it is going to apply it to clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Latent DM trained on 512x512 images from LAION-5B
✅Frozen CLIP ViT-L/14 text encoder
✅Lightweight, runs on a 10GB GPU
✅Checkpoints only for research

More: https://bit.ly/3QfkRx3

🐍 Implicitron: "democratizing" NeRF 🐍

👉#META opens a novel framework for NeRF-world in #PyTorch3D #pytorch

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Implicit representations (NeRF) / Render
✅RaySampler/PointSampler & more
✅NeRF’s MLP, IDR’s FF, SRN, etc.
✅Renderers: MEAR, LSTMRenderer, etc.

More: https://bit.ly/3bPyJPJ

🧰 FGT: flow-guided inpainting 🧰

👉#Microsoft (+USTC) unveils FGT: flow-guided ViT for video inpainting 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Optical flow fed into transformer for better attention
✅Flow completion net w/ local feats.
✅Dual perspective spatial MHSA
✅Local attention with global content

More: https://bit.ly/3pk5J5S

🍏NeuMan: Human NeRF in the wild🍏

👉#Apple opens NeuMan: novel human poses/views from just a single in-the-wild video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅No extra devices/annotations
✅Both Human (novel poses) + Scene
✅E2E SMPL optimization + error-corr.
✅Applications such as "telegathering"

More: https://bit.ly/3K4iTO6

🥑 CLIP-based Neural Style Transfer 🥑

👉From #Nvidia, a novel method for transferring style to a #3D object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Texture style for 3D by CLIP-ResNet50
✅Nearest-neighbor feature matching loss
✅CLIP-based loss extraction of textures
✅NNFM for multiple style pics / control
✅No source code or models available 😢

More: https://bit.ly/3c32dK5

🔥 KeypointNeRF: code is out! 🔥

👉KeypointNeRF by #Meta: "NeRF"-avatars

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Generalizable NeRF for virtual avatar
✅Sparse 3D keypoints for SOTA avatar
✅Novel unseen subjects from 2-3 views
✅"iPhone" captures for #metaverse

More: https://bit.ly/3pyl17e

🥭Massive GTA-V human dataset🥭

👉GTA-Human: outperforming SOTA with purely synthetic training.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅600+ subjects varying in gender, age, ethnicity & clothing
✅20,000+ clips, variety of human activities
✅6 categories of location, different BGs
✅Occlusions, lighting, and weather system

More: https://bit.ly/3wpZyRD