AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🍜Ref-NeRF for extreme realism🍜

👉Ref-NeRF: reflected radiance & structures via collection of spatially-varying scene properties

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Realism and accuracy
Replacing NeRF’s params
Regularization of volume density
Integrated Directional Encoding

More: https://bit.ly/3tTlS5l
👍4🤯2🔥1
🦧OFA for all: Cross, Vision, Language🦧

👉Unified multimodal model for image generation, visual grounding, etc.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Sequence-to-sequence learning
Image Captioning / Generation
Visual Grounding / Classification
Text-to-Image Generation
Visual Question Answering

More: https://bit.ly/3wSTGlc
👍7🤯6👏1
🍿Old Films Back to Life with #AI🍿

👉Recurrent transformer network (RTN) to restore heavily degraded old films

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Transformer blocks for spatial restoration
Knowledge from adjacent frames
Color from keyframes to whole clip
Source code available in days!

More: https://bit.ly/3wZbV8y
12👍2🤯1
🍊Neural Head #Avatars from RGB🍊

👉Novel neural representation for animatable head avatar

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel articulated human head
Full-geometry reconstruction
Differentiable optimization pipeline
Disentanglement of shape/color

More: https://bit.ly/3DxUGMI
🔥3🤯2😱1
🌶️ MyStyle: personal generative #AI 🌶️

👉Personalized deep generation with a few shots of a person

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Small set of portraits (∼100)
Local, low-dim, personal manifold
Personal #AI for ill-posed tasks
SOTA vs. previous few-shot methods

More: https://bit.ly/3wWMwMu
🔥5👍4🤯1
🦆 GAN + Dense Map 🦆

👉CoordGAN: structure-texture disentangled GAN with dense correspondence map

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel coordinate space
Warping to learn coordinate
Encoder for structure representation
HQ structure/texture editable images

More: https://bit.ly/3DOlOaB
🤯42🔥2👏1
Unified shape & non-rigid motion

👉CaDeX: SOTA in both shape & non-rigid motion

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Canonical Deformation Coordinate Space
Shape + non-rigid motion representation
Factorization of deformation homeomorphisms
Cycle consistency, topology & volume
SOTA in modelling deformable objects

More: https://bit.ly/3NM5NX1
4🤯1😱1
📸 ~6 BILLION CLIP-filtered pairs 📸

👉A dataset 14x larger than the previous largest openly accessible image-text dataset in the world.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
2.3B English image-text pairs
2.2B from 100+ other languages
1.3B language not detected
KNN index for quick search
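
A minimal sketch of the two ideas above, CLIP-style similarity filtering and nearest-neighbour search, in plain Python. Everything here is illustrative: the 2-D vectors stand in for real CLIP embeddings, the `threshold` value is a made-up example rather than the dataset's actual cutoff, and the brute-force `knn_search` only mimics what a real (approximate) KNN index does at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat degenerate vectors as unrelated
    return dot / (norm_a * norm_b)


def clip_filter(pairs, threshold):
    """Keep image-text pairs whose embeddings agree at least `threshold`."""
    return [
        p for p in pairs
        if cosine_similarity(p["image_emb"], p["text_emb"]) >= threshold
    ]


def knn_search(query_emb, pairs, k):
    """Brute-force k-nearest-neighbour search over image embeddings."""
    ranked = sorted(
        pairs,
        key=lambda p: cosine_similarity(query_emb, p["image_emb"]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: hypothetical embeddings, not produced by any real model.
PAIRS = [
    {"caption": "a cat", "image_emb": [1.0, 0.0], "text_emb": [0.9, 0.1]},
    {"caption": "buy now", "image_emb": [0.0, 1.0], "text_emb": [1.0, 0.0]},
    {"caption": "a dog", "image_emb": [0.6, 0.8], "text_emb": [0.5, 0.9]},
]

kept = clip_filter(PAIRS, threshold=0.3)  # hypothetical threshold
nearest = knn_search([1.0, 0.1], kept, k=1)
```

The mismatched "buy now" pair is dropped by the filter, and the query vector lands closest to the "a cat" image embedding.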

More: https://bit.ly/3LFhKvT
3🤯1
🥮 PP-YOLOE: e-version of YOLO 🥮

👉 SOTA object detector up to 149+ FPS!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Optimized PP-YOLOv2
S/M/L/XL for different scenarios
149+ FPS, with TensorRT & FP16
Source code & models available

More: https://bit.ly/3x454uy
🔥5👍3👏1🤯1🤩1
🧙 HD synthesis with LDM 🧙

👉Low-cost DM via latent space of powerful pretrained autoencoders

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Hi-res synthesis of megapixel images
Synthesis, inpainting, stochastic SR
Large, consistent images of ∼1024px
General conditioning via cross-attention
Code licensed under MIT License
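
To make "conditioning via cross-attention" concrete, here is a rough single-head sketch in plain Python: queries come from the latent features, keys and values from the conditioning tokens (e.g. text). It is an assumption-laden toy, not the LDM code: the learned projection matrices are omitted (treated as identity) and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: latent features, shape [n_q][d]
    keys:    conditioning tokens, shape [n_kv][d]
    values:  conditioning tokens, shape [n_kv][d_v]
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs


# Identical keys give uniform weights, so the output is the mean of the values.
mean_out = cross_attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]],
                           [[0.0, 2.0], [4.0, 6.0]])

# A query strongly aligned with the first key pulls almost only its value.
peaked_out = cross_attention([[10.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]],
                             [[1.0, 0.0], [0.0, 1.0]])
```

Because the conditioning enters only through keys and values, the same mechanism works for any token sequence, which is what makes the conditioning "general".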

More: https://bit.ly/3LIVOzS
🔥6👍3🤩1
🎩 SinNeRF: Single Image NeRF 🎩

👉Neural Radiance Field from a single view only

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
UT Austin + UIUC + UOregon + Picsart AI
"Looking only once" approach
Semi-supervised learning process
Geometry/semantic pseudo-labels
SOTA in novel-view synthesis

More: https://bit.ly/3ujMZqF
👍7🔥2👏1🤯1
🔥 Transformer-based Tracking 🔥

👉Tracker via Transformer-based model prediction module

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Tracking by Transformer prediction
Extending model predictor for BBs
SOTA on three public benchmarks
Code/models under GNU GPL 3.0 license

More: https://bit.ly/3ucYvUI
🔥9🤯2🥰1
👗 In-The-Wild Virtual Try-On 👗

👉StyleGAN-based architecture for appearance flow estimation in VTON application

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Global appearance flow estimation
Robust to person/garment misalignments
"In-the-wild": person with natural poses
Code under CC BY-NC-SA 4.0 license

More: https://bit.ly/3LPR9wl
👏63🔥1🤔1🤯1
🎇DALL·E 2 just announced!🎇

👉DALL·E 2 to create realistic images and art from natural language

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
More realistic/accurate, 4x res.
Better caption matching
Not available yet, waiting list!

More: https://bit.ly/3j9v3bR
🔥12🤯5👍2🤩1
👋Forecasting interactions via attention👋

👉Predicting the hand motion trajectory and the future contact points on the next active object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Object-Centric Transformer (OCT)
Self-attention Transformer mechanism
Framework to handle uncertainty
SOTA on Epic-Kitchens and EGTEA

More: https://bit.ly/3v3PpbI
👍4🔥2👏1🤔1🤯1
🍇SmeLU: Smooth Activation Function🍇

👉Google unveils a new smooth activation function: easy to implement, cheap & less error-prone

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Smooth to mitigate irreproducibility
Cheap function, better than GELU/Swish
0-1 slope through quadratic middle region
SmeLU as convolution of ReLU with a box function
Best reproducibility-accuracy tradeoff
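
The highlights above can be sketched in a few lines of plain Python. This assumes the usual piecewise form of SmeLU with a half-width parameter, here called `beta` (the name and the default of 1.0 are assumptions for illustration): zero on the left, identity on the right, and a quadratic in between whose slope rises from 0 to 1. The second function, a hypothetical helper, numerically checks the "ReLU convolved with a box" view.

```python
def smelu(x: float, beta: float = 1.0) -> float:
    """Smooth ReLU with a quadratic transition region of half-width beta."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0  # flat region, slope 0
    if x >= beta:
        return float(x)  # identity region, slope 1
    # Quadratic middle region: slope rises linearly from 0 to 1.
    return (x + beta) ** 2 / (4.0 * beta)


def relu_box_convolution(x: float, beta: float = 1.0, steps: int = 2000) -> float:
    """Convolve ReLU with a box of width 2*beta and unit area (midpoint rule)."""
    width = 2.0 * beta
    dt = width / steps
    total = 0.0
    for i in range(steps):
        t = -beta + (i + 0.5) * dt  # midpoint of the i-th slice of the box
        total += max(x - t, 0.0) * dt
    return total / width


left = smelu(-2.0)        # deep in the flat region
middle = smelu(0.0)       # centre of the quadratic region
right = smelu(2.0)        # deep in the identity region
gap = abs(smelu(0.3, beta=0.5) - relu_box_convolution(0.3, beta=0.5))
```

With `beta = 1.0` the three regions give 0.0, 0.25 and 2.0, and the numerical convolution agrees with the closed form up to integration error.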

More: https://bit.ly/3xcskXm
😱8👍41🔥1😁1🤯1
📍Hyper-Dense Landmarks at 150FPS📍

👉#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Accurately predicts 10× as many landmarks as usual
Synthetic data, perfect annotations
No appearance/lighting modeling, no differentiable rendering
#3D @150+FPS with a single CPU thread
SOTA in monocular 3D reconstruction

More: https://bit.ly/37pQS40
👍6🔥4🤯1
☀️SunStage: Selfie with the Sun☀️

👉Accurate/tailored reconstruction of facial geometry/reflectance

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel personalized scanning
Disentanglement of scene params
Geometry, materials, lighting, poses
Photorealistic with a single selfie video

More: https://bit.ly/36W1Oqx
🔥3👏2🥰1
📫 Generative Neural Avatars 📫

👉3D shapes of people in a variety of garments with corresponding skinning weights

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ETH + Uni-Tübingen + Max Planck
Animatable #3D human in garment
Directly from raw posed 3D scans
No canonical shapes, registration, or manual skinning weights
Geometric detail in clothing deformation

More: https://bit.ly/3M7mCdB
👏3🔥2👍1
🗨️Conversational program synthesis🗨️

👉Conversational synthesis to translate English into executable code

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Conversational program synthesis
New multi-turn programming benchmark
Open-source custom library: JAXFORMER
Source code under BSD-3 license

More: https://bit.ly/3jjWWhk
🤯4🥰2🔥1😱1