AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
☀️LocoProp: Neural Layers Composition☀️

👉Google AI unveils LocoProp: a novel paradigm that treats a neural network as a modular composition of layers.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Backprop++ via Local Loss Optimization
Per-layer weight regularizer, target output, and loss
Multiple local updates via first-order optimizers
Superior performance and efficiency

More: https://bit.ly/3Q40YJn
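
The highlights above (per-layer regularizer, target output, loss, and several local first-order steps) can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: it assumes a single linear layer, a squared loss against an already given target output, and a squared-distance weight regularizer. The name `local_layer_update` and the parameters `reg`, `lr`, and `steps` are hypothetical.

```python
def local_layer_update(W, x, target, reg=1.0, lr=0.1, steps=5):
    """Run a few gradient steps on a purely local, per-layer objective:

        0.5 * ||W x - target||^2 + 0.5 * reg * ||W - W0||^2

    W0 is the weight matrix at the start of the update (assumed form of
    the regularizer). W is a list of rows, x and target are lists.
    """
    W0 = [row[:] for row in W]   # anchor for the weight regularizer
    W = [row[:] for row in W]    # work on a copy, leave the input untouched
    for _ in range(steps):
        # Forward pass of the linear layer.
        out = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        err = [o - t for o, t in zip(out, target)]
        # Gradient of the local objective with respect to each weight.
        for i, row in enumerate(W):
            for j in range(len(row)):
                grad = err[i] * x[j] + reg * (row[j] - W0[i][j])
                row[j] -= lr * grad
    return W
```

Calling `local_layer_update([[0.0, 0.0]], [1.0, 2.0], [1.0])` moves the layer output toward the target while the regularizer keeps the weights close to where they started.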
🔥PCVOS: clip-wise mask VOS🔥

👉PCVOS: new semi-supervised video object segmentation method

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Reformulating semi-supervised VOS
Novel per-clip inference perspective
Clip-wise operations within each clip
PCVOS: model for per-clip inference
New SOTA on multiple benchmarks

More: https://bit.ly/3vJtmbz
🍑 Open-World Object Detection via ViT 🍑

👉Google unveils OWL-ViT: open-vocabulary detector based on ViTs 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ViTs for Open-World Localization
From image-level pretraining to open-vocabulary detection
SOTA one-shot (image-conditioned) detection

More: https://bit.ly/3Sy3jOj
🎹🎹 Learning Piano in #AR 🎹🎹

👉PianoVision (on #META #Quest2) accelerates piano learning via Passthrough #AR & hand tracking

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Sheet Insight to learn sight-reading
MIDI keyboard connectivity
Air piano when no physical piano is available
Multiplayer Music Instruction
PianoVision Music Hall in #VR

More: https://bit.ly/3zYvwGX
🧊EPro-PnP: Persp-n-Points Detection🧊

👉EPro-PnP: probabilistic PnP layer for general e2e pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Probabilistic PnP for general e2e pose
Top-tier in 6DoF by inserting into CDPN
Accurate 3D detection via deformable correspondences
2D-3D corresp. learned from scratch

More: https://bit.ly/3BNPXYr
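
To make "probabilistic PnP" a bit more concrete, here is a toy sketch, not the authors' code. It turns weighted reprojection errors into a probability distribution over a handful of candidate poses instead of returning a single best pose. Big simplifications, all assumptions: poses are translation-only, the camera is an ideal pinhole, and the names `project` and `pose_posterior` are hypothetical.

```python
import math

def project(point, translation, focal=1.0):
    """Pinhole projection of a 3D point after a translation-only pose."""
    x, y, z = (p + t for p, t in zip(point, translation))
    return (focal * x / z, focal * y / z)

def pose_posterior(points_3d, points_2d, weights, candidate_translations):
    """Probability of each candidate pose given 2D-3D correspondences.

    The log-likelihood of a pose is minus half the weighted sum of squared
    reprojection errors; a softmax over the candidates normalizes it.
    """
    log_liks = []
    for t in candidate_translations:
        cost = 0.0
        for X, uv, w in zip(points_3d, points_2d, weights):
            u, v = project(X, t)
            cost += w * ((u - uv[0]) ** 2 + (v - uv[1]) ** 2)
        log_liks.append(-0.5 * cost)
    # Numerically stable softmax.
    m = max(log_liks)
    exps = [math.exp(l - m) for l in log_liks]
    total = sum(exps)
    return [e / total for e in exps]
```

The candidate whose reprojection matches the observed 2D points best gets the largest probability, and ambiguous inputs spread the mass across several poses.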
🥇#NVIDIA wins SIGGRAPH's Best Paper🥇

👉Instant #NeRF awarded a Best Paper at SIGGRAPH 2022!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Speed-up of several orders of magnitude
HQ neural primitives in a matter of secs
Render in tens of milliseconds at 1080p
Source code and resources available!

More: https://bit.ly/3Qt8c9D
🪰 EasyMocap: Open Neural Mocap 🪰

👉EasyMocap: open-source marker-less mocap with novel view synthesis from RGB

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬 (of last paper added):
Editable free-viewpoint video
Layered neural representation of humans
Multi-person scenes split into instances, weakly supervised
HQ neural representation of humans
Camera errors addressed using human poses

More: https://bit.ly/3p6lUDO
🎰 Texturify: Neural Textures Generator 🎰

👉A step towards automated content creation: HQ textures directly on the surface of a 3D object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
TUM + Max Planck + Apple 🍏
Realistic, HQ textures from 2D pics
3D shape geometry, no 3D supervision
3D-aware surface-based generation net

More: https://bit.ly/3BW7UUU
🍨 Scaling Neural Indoor Scene 🍨

👉Neural rendering for indoor scenes: scalable in both training and rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Neural scene rendering for indoor scenes
#3D scene split into tiles, each with its own MLP, to scale up
Parallel training of tile-based MLPs
View-indep. components (via surf-MLP)

More: https://bit.ly/3bH94IX
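
The tiling idea in the highlights can be illustrated with a small sketch. It assumes a uniform axis-aligned grid, which may differ from how the paper actually partitions the scene; `tile_index` and `group_by_tile` are hypothetical names. Sample points are bucketed by tile, so each bucket could be handed to its own small MLP and trained in parallel.

```python
import math
from collections import defaultdict

def tile_index(point, tile_size):
    """Integer grid coordinates of the tile containing a 3D point."""
    return tuple(math.floor(c / tile_size) for c in point)

def group_by_tile(points, tile_size):
    """Bucket 3D sample points by tile.

    Each bucket is independent of the others, which is what makes
    per-tile models easy to train in parallel.
    """
    tiles = defaultdict(list)
    for p in points:
        tiles[tile_index(p, tile_size)].append(p)
    return dict(tiles)
```

With `tile_size=1.0`, the points `(0.5, 0.5, 0.5)` and `(0.9, 0.1, 0.3)` land in the same tile, while `(1.5, 0.2, 0.1)` goes to its neighbor.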
🔥Stable Diffusion on clips. INSANE🔥

👉The most advanced latent text-to-image DM. #RunwayML just announced it is going to apply it to clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Latent DM trained on 512x512 images from LAION-5B
Frozen CLIP ViT-L/14 text encoder
Lightweight, runs on a 10GB-GPU
Checkpoints only for research

More: https://bit.ly/3QfkRx3
🐍 Implicitron: "democratizing" NeRF🐍

👉#META open-sources a new framework for the NeRF world in #PyTorch3D #pytorch

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Implicit representations (NeRF) / Render
RaySampler/PointSampler & more
NeRF’s MLP, IDR’s FF, SRN, etc.
Renderers: MEAR, LSTMRenderer, etc.

More: https://bit.ly/3bPyJPJ
🧰 FGT: flow-guided inpainting 🧰

👉#Microsoft (+USTC) unveils FGT: flow-guided ViT for video inpainting 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Optical flow fed into the transformer for better attention
Flow completion net w/ local feats.
Dual perspective spatial MHSA
Local attention with global content

More: https://bit.ly/3pk5J5S
🍏NeuMan: Human NeRF in the wild🍏

👉#Apple open-sources a method to render novel human poses/views from just a single in-the-wild video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
No extra devices/annotations
Both Human (novel poses) + Scene
E2E SMPL optimization + error-corr.
Applications such as "telegathering"

More: https://bit.ly/3K4iTO6
🥑 CLIP-based Neural Style Transfer 🥑

👉From #Nvidia, a novel method for transferring style to a #3D object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Texture style for 3D by CLIP-ResNet50
Nearest-neighbor feature matching loss
CLIP-based loss extraction of textures
NNFM for multiple style pics / control
No source code or models available 😒

More: https://bit.ly/3c32dK5
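
Since no code is available, here is a rough sketch of what a nearest-neighbor feature matching (NNFM) loss usually looks like. It is not the authors' implementation: it assumes features are already extracted as plain vectors and uses cosine distance, and the names `cosine_distance` and `nnfm_loss` are hypothetical. Each rendered feature is matched to its closest style feature, and the distances are averaged.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nnfm_loss(rendered_feats, style_feats):
    """Mean distance from each rendered feature to its nearest style feature."""
    total = 0.0
    for f in rendered_feats:
        total += min(cosine_distance(f, s) for s in style_feats)
    return total / len(rendered_feats)
```

Passing features from several style pictures in `style_feats` lets every rendered feature pick its best match across all of them, which is one plausible reading of the "multiple style pics" highlight.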
🔥 KeypointNeRF: code is out! 🔥

👉KeypointNeRF by #Meta: "NeRF"-avatars

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Generalizable NeRF for virtual avatar
Sparse 3D keypoints for SOTA avatar
Novel unseen subjects from 2-3 views
"iPhone" captures for #metaverse

More: https://bit.ly/3pyl17e
🥭Massive GTA-V human dataset🥭

👉GTA-Human: outperforming SOTA with purely synthetic training.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
600+ subjects varying in gender, age, ethnicity & clothing
20,000+ clips, variety of human activities
6 categories of location, different BGs
Occlusions, lighting, and weather system

More: https://bit.ly/3wpZyRD
🍈DeepBillboards: old-school trick for #VR🍈

👉DeepBillboards models a 3D object implicitly using a neural net conditioned on the user’s viewing direction

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
#Google Brain +Tsukuba + Tokyo
Rendering at higher res., improving #VR
NeRF into interactive VR with accuracy++
NeRF (or any others) directly in #Unity

More: https://bit.ly/3CsTQ5y
🌐RelPose: Probabilistic Relative Pose🌐

👉A novel method for a core component of #SLAM / NeRF-powered apps.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Core component of SfM/SLAM
Pre-processing for neural rendering (NeRF)
Energy-based model over relative rotations
SOTA on both seen/unseen objects

More: https://bit.ly/3T60TXw