AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

🦓 Hyper-Fast Refinement 🦓

👉SharpContour: novel contour-based boundary refinement for instance segmentation

Highlights:
✅Instance-aware Point Classifier
✅Deforming by discrete updating
✅Estimating offsets independently
✅Source code soon available!

More: https://bit.ly/3qL04GY

🥗 Neural Mesh via Text only 🥗

👉Zero-shot generation of a 3D model using only a target text prompt

Highlights:
✅ZS 3D model with text only
✅ZS text-guided generation
✅Meshes with texture/normal
✅Differentiable LLS implementation

More: https://bit.ly/3u0qnvb

🪆#3D, Materials, and Lighting from 2D🪆

👉Nvidia: topology, materials & environment-map lighting jointly from 2D images. INSANE 😮

Highlights:
✅Topology, materials and lighting
✅Meshes with materials/lighting
✅Compact volumetric texturing
✅Differentiable all-frequency lighting
✅Code under #NVIDIA License

More: https://bit.ly/3IUoF2t

🍜Ref-NeRF for extreme realism🍜

👉Ref-NeRF: reflected radiance & structures via collection of spatially-varying scene properties

Highlights:
✅Realism and accuracy
✅Replacing NeRF's params
✅Regularization of volume density
✅Integrated Directional Encoding

More: https://bit.ly/3tTlS5l
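
The post gives no formulas, so here is a minimal sketch of what "reflected radiance" and "replacing NeRF's params" could mean in practice. It assumes the radiance is queried along the mirror reflection of the view direction about the surface normal. The function name `reflect` and the toy vectors are hypothetical and are not taken from the Ref-NeRF code.

```python
def reflect(view_dir, normal):
    """Mirror view_dir about a unit-length normal: r = 2 (v . n) n - v.

    view_dir points from the surface toward the camera (assumed convention).
    """
    dot = sum(v * n for v, n in zip(view_dir, normal))
    return tuple(2.0 * dot * n - v for v, n in zip(view_dir, normal))


up = (0.0, 0.0, 1.0)                  # toy surface normal
print(reflect((0.0, 0.0, 1.0), up))   # (0.0, 0.0, 1.0): head-on view reflects onto itself
print(reflect((1.0, 0.0, 1.0), up))   # (-1.0, 0.0, 1.0): tangential component flips
```

A directional encoding would then be applied to the reflected vector instead of the raw view direction; that part is not sketched here.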

🦧OFA for all: Cross-modal, Vision, Language🦧

👉Unified multimodal model for image generation, visual grounding, etc.

Highlights:
✅Sequence-to-sequence learning
✅Image Captioning / Generation
✅Visual Grounding / Classification
✅Text-to-Image Generation
✅Visual Question Answering

More: https://bit.ly/3wSTGlc

🍿Old Films Back to Life with #AI🍿

👉Recurrent transformer network (RTN) to restore heavily degraded old films

Highlights:
✅Transformer blocks for spatial restoration
✅Knowledge from adjacent frames
✅Color from keyframes to whole clip
✅Source code available in days!

More: https://bit.ly/3wZbV8y

Neural Head #Avatars from RGB

👉Novel neural representation for animatable head avatar

Highlights:
✅Novel articulated human head
✅Full-geometry reconstruction
✅Differentiable optimization pipeline
✅Disentanglement of shape/color

More: https://bit.ly/3DxUGMI

🌶️ MyStyle: personal generative #AI 🌶️

👉Personalized deep generation with a few shots of a person

Highlights:
✅Small set of portraits (~100)
✅Local, low-dim, personal manifold
✅Personal #AI for ill-posed tasks
✅SOTA vs. previous few-shot methods

More: https://bit.ly/3wWMwMu

🦆 GAN + Dense Map 🦆

👉CoordGAN: structure-texture disentangled GAN with dense correspondence map

Highlights:
✅Novel coordinate space
✅Warping to learn coordinate
✅Encoder for structure representation
✅HQ structure/texture editable images

More: https://bit.ly/3DOlOaB

⚓Unified shape & non-rigid motion⚓

👉CaDeX: SOTA in both shape & non-rigid motion

Highlights:
✅Canonical Deformation Coordinate Space
✅Shape + non-rigid motion representation
✅Factorization of deformation homeomorphisms
✅Cycle consistency, topology & volume
✅SOTA in modelling deformable objects

More: https://bit.ly/3NM5NX1

📸 ~6 BILLION CLIP-filtered pairs 📸

👉A dataset 14x bigger than the previously biggest openly accessible image-text dataset in the world.

Highlights:
✅2.3B English image-text pairs
✅2.2B from 100+ other languages
✅1.3B language not detected
✅KNN index for quick search

More: https://bit.ly/3LFhKvT
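
To make "CLIP-filtered" and "KNN index" concrete, here is a small sketch under stated assumptions: pairs are kept when the cosine similarity between image and text embeddings clears a threshold, and search is brute-force nearest neighbours. The 2-D vectors, the 0.5 threshold, and the helper names (`cosine`, `clip_filter`, `knn`) are made up for illustration; a real index over billions of pairs would use approximate search. The last lines only check the arithmetic quoted in the post.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_filter(pairs, threshold):
    """Keep (image_vec, text_vec, caption) triples whose similarity clears the threshold."""
    return [p for p in pairs if cosine(p[0], p[1]) >= threshold]


def knn(query, pairs, k):
    """Brute-force k nearest neighbours, ranked by similarity to the image embedding."""
    return sorted(pairs, key=lambda p: cosine(query, p[0]), reverse=True)[:k]


# Toy embeddings (hypothetical values, not real CLIP outputs).
pairs = [
    ([1.0, 0.0], [0.9, 0.1], "a zebra"),    # image and text agree
    ([0.0, 1.0], [1.0, 0.0], "unrelated"),  # similarity 0, filtered out
    ([0.6, 0.8], [0.5, 0.9], "a duck"),     # image and text agree
]
kept = clip_filter(pairs, threshold=0.5)    # 0.5 is an arbitrary toy threshold
nearest = knn([1.0, 0.1], kept, k=1)

# Subset sizes quoted in the post, in billions of pairs.
total_billion = round(2.3 + 2.2 + 1.3, 1)                 # 5.8, i.e. "~6 billion"
implied_previous_billion = round(total_billion / 14, 2)   # 0.41, from the "14x" claim
```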

🥮 PP-YOLOE: e-version of YOLO 🥮

👉 SOTA object detector up to 149+ FPS!

Highlights:
✅Optimized PP-YOLOv2
✅S/M/L/XL for different scenarios
✅149+ FPS, with TensorRT & FP16
✅Source code & models available

More: https://bit.ly/3x454uy

🧙 HD synthesis with LDM 🧙

👉Low-cost DM via latent space of powerful pretrained autoencoders

Highlights:
✅Hi-res synthesis of megapixel
✅Synthesis, inpainting, stochastic SR
✅Large, consistent images of ~1024px
✅General conditioning via cross-attention
✅Code licensed under MIT License

More: https://bit.ly/3LIVOzS
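
As a rough illustration of "conditioning via cross-attention", the sketch below lets latent tokens (queries) attend over conditioning tokens (keys and values), for example text embeddings. It is single-head scaled dot-product attention with no learned projections, which is a simplifying assumption; the function names and toy inputs are hypothetical and not taken from the LDM code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """For each query, return the attention-weighted average of the values.

    queries come from the latent being denoised; keys/values come from the
    conditioning signal. Scores are scaled by sqrt(key dimension).
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return outputs


# One latent token attending over two conditioning tokens with equal scores.
mixed = cross_attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]])
print(mixed)  # [[1.0, 1.0]]: equal weights average the two values
```

Because the output is always a weighted average of the conditioning values, the condition can steer every spatial position of the latent without changing its shape.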

🎩 SinNeRF: Single Image NeRF 🎩

👉Neural Radiance Field from a single view only

Highlights:
✅UT Austin + UIUC + UOregon + Picsart AI
✅"Looking only once" approach
✅Semi-supervised learning process
✅Geometry/semantic pseudo-labels
✅SOTA in novel-view synthesis

More: https://bit.ly/3ujMZqF

🔥 Transformer-based Tracking 🔥

👉Tracker via Transformer-based model prediction module

Highlights:
✅Tracking by Transformer prediction
✅Extending model predictor for BBs
✅SOTA on three public benchmarks
✅Code/models under GNU License 3.0

More: https://bit.ly/3ucYvUI

👗 In-The-Wild Virtual Try-On 👗

👉StyleGAN-based architecture for appearance flow estimation in VTON applications

Highlights:
✅Global appearance flow estimation
✅Robust to person/garment misalignment
✅"In-the-wild": person with natural poses
✅Code under CC BY-NC-SA 4.0 license

More: https://bit.ly/3LPR9wl

🎇DALL·E 2 just announced!🎇

👉DALL·E 2 to create realistic images and art from natural language

Highlights:
✅More realistic/accurate, 4x res.
✅Better caption matching
✅Not available yet, waiting list!

More: https://bit.ly/3j9v3bR

👋Forecasting interactions via attention👋

👉Predicting the hand motion trajectory and the future contact points on the next active object

Highlights:
✅Object-Centric Transformer (OCT)
✅Self-attention Transformer mechanism
✅Framework to handle uncertainty
✅SOTA on Epic-Kitchens and EGTEA

More: https://bit.ly/3v3PpbI

SmeLU: Smooth Activation Function

👉Google unveils a new smooth activation function: easy to implement, cheap & less error-prone

Highlights:
✅Smooth to mitigate irreproducibility
✅Cheap function, better than GELU/Swish
✅0-1 slope through quadratic middle region
✅SmeLU as convolution of ReLU with box
✅Best reproducibility-accuracy tradeoff

More: https://bit.ly/3xcskXm
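
The "quadratic middle region" can be written down in a few lines. This is a sketch assuming a half-width parameter called `beta` here: the function is 0 to the left of -beta, the identity to the right of +beta, and the quadratic (x + beta)^2 / (4 * beta) in between, so the slope rises linearly from 0 to 1. That middle piece is also what results from convolving ReLU with a box of width 2 * beta and unit area. The names `smelu` and `smelu_grad` are illustrative, not an official API.

```python
def smelu(x: float, beta: float = 1.0) -> float:
    """Smooth ReLU: flat, then quadratic on [-beta, beta], then linear."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return float(x)
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_grad(x: float, beta: float = 1.0) -> float:
    """Derivative of smelu: 0, then a linear ramp from 0 to 1, then 1."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)


print(smelu(-2.0), smelu(0.0), smelu(2.0))  # 0.0 0.25 2.0
print(smelu_grad(0.0))                      # 0.5, halfway up the 0-1 slope
```

Both pieces meet at x = beta with value beta and slope 1, so the function and its first derivative are continuous; only the second derivative jumps.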

Hyper-Dense Landmarks at 150FPS

👉#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.

Highlights:
✅Accurately predicts 10× as many landmarks as usual
✅Synthetic data, perfect annotations
✅NO appearance, light, diff-rendering
✅#3D @150+FPS with a single CPU thread
✅SOTA in monocular 3D reconstruction

More: https://bit.ly/37pQS40