AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
๐ŸฟOld Films Back to Life with #AI๐Ÿฟ

๐Ÿ‘‰Recurrent transformer network (RTN) to restore heavily degraded old films

๐‡๐ข๐ ๐ก๐ฅ๐ข๐ ๐ก๐ญ๐ฌ:
โœ…Transformer blocks for spatial
โœ…Knowledge from adjacent frames
โœ…Color from keyframes to whole clip
โœ…Source code available in days!

More: https://bit.ly/3wZbV8y
โค12๐Ÿ‘2๐Ÿคฏ1
This media is not supported in your browser
VIEW IN TELEGRAM
๐ŸŠNeural Head #Avatars from RGB๐ŸŠ

๐Ÿ‘‰Novel neural representation for animatable head avatar

๐‡๐ข๐ ๐ก๐ฅ๐ข๐ ๐ก๐ญ๐ฌ:
โœ…Novel articulated human head
โœ…Full-geometry reconstruction
โœ…Differentiable optimization pipeline
โœ…Disentanglement of shape/color

More: https://bit.ly/3DxUGMI
๐ŸŒถ๏ธ MyStyle: personal generative #AI ๐ŸŒถ๏ธ

๐Ÿ‘‰Personalized deep generation with a few shots of a person

๐‡๐ข๐ ๐ก๐ฅ๐ข๐ ๐ก๐ญ๐ฌ:
โœ…Small set of portraits(โˆผ100)
โœ…Local, low-dim, personal manifold
โœ…Personal #AI for ill-posed tasks
โœ…SOTA vs. previous few-shots

More: https://bit.ly/3wWMwMu
🦆 GAN + Dense Map 🦆

👉CoordGAN: structure-texture disentangled GAN with dense correspondence map

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel coordinate space
✅Warping to learn coordinate
✅Encoder for structure representation
✅HQ structure/texture editable images

More: https://bit.ly/3DOlOaB
⚓Unified shape & non-rigid motion⚓

👉CaDeX: SOTA in both shape & non-rigid motion

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Canonical Deformation Coordinate Space
✅Shape + non-rigid motion representation
✅Factorization of def-homeomorphisms
✅Cycle consistency, topology & volume
✅SOTA in modelling deformable objects

More: https://bit.ly/3NM5NX1
โค4๐Ÿคฏ1๐Ÿ˜ฑ1
📸 ~6 BILLION CLIP-filtered pairs 📸

👉A dataset 14x larger than the previous largest openly accessible image-text dataset in the world.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅2.3B English image-text pairs
✅2.2B from 100+ other languages
✅1.3B language not detected
✅KNN index for quick search

More: https://bit.ly/3LFhKvT
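
A minimal sketch of the two ideas named in this post: keeping only image-text pairs whose embeddings agree (CLIP filtering), and nearest-neighbor search over embeddings. It is a brute-force NumPy illustration, not the dataset's actual tooling. The function names, the toy 2-D vectors, and the threshold value are all assumptions made for the example.

```python
import numpy as np

def normalize(vectors):
    """L2-normalize each row so a dot product equals cosine similarity."""
    v = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    return v / np.clip(norms, 1e-12, None)

def clip_filter(image_emb, text_emb, threshold):
    """Boolean mask of pairs whose image/text cosine similarity is >= threshold.

    The threshold is a free parameter here (hypothetical value in the usage below).
    """
    sims = np.sum(normalize(image_emb) * normalize(text_emb), axis=1)
    return sims >= threshold

def knn_search(index, query, k=3):
    """Brute-force k nearest neighbors by cosine similarity.

    `index` must already be row-normalized (see `normalize`).
    """
    q = normalize(np.asarray(query, dtype=float)[None, :])[0]
    sims = index @ q
    top = np.argsort(-sims)[:k]
    return top, sims[top]

# Toy usage with made-up 2-D "embeddings".
index = normalize([[1, 0], [0, 1], [1, 1], [-1, 0]])
ids, scores = knn_search(index, [1.0, 0.1], k=2)
mask = clip_filter([[1, 0], [0, 1]], [[1, 0], [1, 0]], threshold=0.5)
```

At billions of vectors a linear scan like this is impractical, which is why a prebuilt approximate KNN index is shipped with the dataset.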
โค3๐Ÿคฏ1
This media is not supported in your browser
VIEW IN TELEGRAM
🥮 PP-YOLOE: e-version of YOLO 🥮

👉 SOTA object detector up to 149+ FPS!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Optimized PP-YOLOv2
✅S/M/L/XL for different scenarios
✅149+ FPS, with TensorRT & FP16
✅Source code & models available

More: https://bit.ly/3x454uy
🧙 HD synthesis with LDM 🧙

👉Low-cost diffusion models (DM) via latent space of powerful pretrained autoencoders

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hi-res synthesis of megapixel images
✅Synthesis, inpainting, stochastic SR
✅Large, consistent images of ∼1024px
✅General conditioning via cross-attention
✅Code licensed under MIT License

More: https://bit.ly/3LIVOzS
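
A minimal sketch of what "conditioning via cross-attention" means in general: queries come from the image latents, keys and values come from the conditioning tokens (for example a text encoding). This is a single-head NumPy illustration under assumed shapes and random weights, not the released LDM code.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latent, cond, w_q, w_k, w_v):
    """Single-head cross-attention.

    latent: (n_latent, d_latent) flattened latent positions (queries)
    cond:   (n_cond, d_cond) conditioning tokens (keys and values)
    Returns the attended features and the attention weights.
    """
    q = latent @ w_q                      # (n_latent, d)
    k = cond @ w_k                        # (n_cond, d)
    v = cond @ w_v                        # (n_cond, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)    # each latent position attends over the tokens
    return weights @ v, weights

# Toy usage: 16 latent positions, 5 conditioning tokens (all sizes are assumptions).
rng = np.random.default_rng(0)
latent = rng.normal(size=(16, 8))
cond = rng.normal(size=(5, 12))
w_q = rng.normal(size=(8, 4))
w_k = rng.normal(size=(12, 4))
w_v = rng.normal(size=(12, 4))
out, weights = cross_attention(latent, cond, w_q, w_k, w_v)
```

Because only the keys and values depend on the conditioning input, the same layer can take text, layouts, or other token sequences once they are encoded.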
🎩 SinNeRF: Single Image NeRF 🎩

👉Neural Radiance Field via single view only

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅UT Austin + UIUC + UOregon + Picsart AI
✅"Looking only once" approach
✅Semi-supervised learning process
✅Geometry/semantic pseudo-labels
✅SOTA in novel-view synthesis

More: https://bit.ly/3ujMZqF
๐Ÿ‘7๐Ÿ”ฅ2๐Ÿ‘1๐Ÿคฏ1
This media is not supported in your browser
VIEW IN TELEGRAM
🔥 Transformer-based Tracking 🔥

👉Tracker via Transformer-based model prediction module

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Tracking by Transformer prediction
✅Extending model predictor for bounding boxes
✅SOTA on three public benchmarks
✅Code/models under GNU GPL 3.0

More: https://bit.ly/3ucYvUI
👗 In-The-Wild Virtual Try-On 👗

👉StyleGAN-based architecture for appearance flow estimation in VTON application

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Global appearance flow estimation
✅Robust to person/garment misalignment
✅"In-the-wild": person with natural poses
✅Code under CC BY-NC-SA 4.0 license

More: https://bit.ly/3LPR9wl
๐Ÿ‘6โค3๐Ÿ”ฅ1๐Ÿค”1๐Ÿคฏ1
This media is not supported in your browser
VIEW IN TELEGRAM
🎇DALL·E 2 just announced!🎇

👉DALL·E 2 to create realistic images and art from natural language

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅More realistic/accurate, 4x res.
✅Better caption matching
✅Not available yet, waiting list!

More: https://bit.ly/3j9v3bR
👋Forecasting interactions via attention👋

👉Predicting the hand motion trajectory and the future contact points on the next active object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Object-Centric Transformer (OCT)
✅Self-attention Transformer mechanism
✅Framework to handle uncertainty
✅SOTA on Epic-Kitchens and EGTEA

More: https://bit.ly/3v3PpbI
๐Ÿ‘4๐Ÿ”ฅ2๐Ÿ‘1๐Ÿค”1๐Ÿคฏ1
This media is not supported in your browser
VIEW IN TELEGRAM
๐Ÿ‡SmeLU: Smooth Activation Function๐Ÿ‡

๐Ÿ‘‰Google unveils a new smooth activation function: easy to implement, cheap & less error-prone

๐‡๐ข๐ ๐ก๐ฅ๐ข๐ ๐ก๐ญ๐ฌ:
โœ…Smooth to mitigate irreproducibility
โœ…Cheap function, better than GELU/Swish
โœ…0-1 slope through quadratic middle region
โœ…SmeLU as convolution of ReLU with box
โœ…Best reproducibility-accuracy tradeoff

More: https://bit.ly/3xcskXm
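
A minimal NumPy sketch of the function described in this post, assuming the usual piecewise form with half-width beta: 0 for x <= -beta, (x + beta)^2 / (4 * beta) in the middle, and x for x >= beta. The helper `relu_box_average` is a hypothetical name added here to check numerically that averaging ReLU over a box of width 2 * beta gives the same values.

```python
import numpy as np

def smelu(x, beta=1.0):
    """Smooth ReLU: zero on the left, identity on the right, quadratic in between."""
    x = np.asarray(x, dtype=float)
    quad = (x + beta) ** 2 / (4.0 * beta)
    return np.where(x <= -beta, 0.0, np.where(x >= beta, x, quad))

def smelu_grad(x, beta=1.0):
    """Derivative of SmeLU: rises linearly from 0 at -beta to 1 at +beta."""
    x = np.asarray(x, dtype=float)
    return np.clip((x + beta) / (2.0 * beta), 0.0, 1.0)

def relu_box_average(x, beta=1.0, n=20001):
    """Numerical average of ReLU(x - t) for t on a uniform grid over [-beta, beta].

    This approximates the convolution of ReLU with a normalized box of width 2 * beta.
    """
    t = np.linspace(-beta, beta, n)
    return float(np.mean(np.maximum(x - t, 0.0)))

# Toy usage: evaluate on a small grid around the quadratic region.
xs = np.array([-2.0, -1.0, 0.0, 1.0, 3.0])
ys = smelu(xs, beta=1.0)
```

The value and the slope both match at x = -beta and x = +beta, so the function has no kink, unlike plain ReLU at 0.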
๐Ÿ“Hyper-Dense Landmarks at 150FPS๐Ÿ“

๐Ÿ‘‰#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.

๐‡๐ข๐ ๐ก๐ฅ๐ข๐ ๐ก๐ญ๐ฌ:
โœ…Accurate 10ร— as many landmarks as usual
โœ…Synthetic data, perfect annotations
โœ…NO appearance, light, diff-rendering
โœ…#3D @150+FPS with a single CPU thread
โœ…SOTA in monocular 3D reconstruction

More: https://bit.ly/37pQS40
๐Ÿ‘6๐Ÿ”ฅ4๐Ÿคฏ1
This media is not supported in your browser
VIEW IN TELEGRAM
โ˜€๏ธSunStage: Selfie with the Sunโ˜€๏ธ

๐Ÿ‘‰Accurate/tailored reconstruction of facial geometry/reflectance

๐‡๐ข๐ ๐ก๐ฅ๐ข๐ ๐ก๐ญ๐ฌ:
โœ…Novel personalized scanning
โœ…Disentanglement of scene params
โœ…Geometry, materials, lighting, poses
โœ…Photorealistic with a single selfie video

More: https://bit.ly/36W1Oqx
📫 Generative Neural Avatars 📫

👉3D shapes of people in a variety of garments with corresponding skinning weight

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + Uni-Tübingen + Max Planck
✅Animatable #3D human in garment
✅Directly from raw posed 3D scans
✅NO canonical, registration, manual w.
✅Geometric detail in clothing deformation

More: https://bit.ly/3M7mCdB
๐Ÿ‘3๐Ÿ”ฅ2๐Ÿ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
๐Ÿ—จ๏ธConversational program synthesis๐Ÿ—จ๏ธ

๐Ÿ‘‰Conversational synthesis to translate English into executable code

๐‡๐ข๐ ๐ก๐ฅ๐ข๐ ๐ก๐ญ๐ฌ:
โœ…Conversational program synthesis
โœ…New multi-turn progr.benchmark
โœ…Open Custom library: JAXFORMER
โœ…Source code under BSD-3 license

More: https://bit.ly/3jjWWhk
🧯Long Video Diffusion Models🧯

👉#Google unveils a novel diffusion model for video generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Straightforward extension of 2D UNet
✅Longer by new conditional generation
✅SOTA in unconditional generation

More: https://bit.ly/35Y2rzg
🚙 AutoRF: #3D objects in-the-wild 🚙

👉From #Meta: #3D object from just a single, in-the-wild image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel view synthesis from in-the-wild
✅Normalized, object-centric representation
✅Disentangling shape, appearance & pose
✅Exploiting BBs & panoptic segmentation
✅Shape/appearance properties for objects

More: https://bit.ly/3O4ONeQ