AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
All the AI, with papers. Fresh daily updates on Deep Learning, Machine Learning, and Computer Vision (with papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
⚓Unified shape & non-rigid motion⚓

👉CaDeX: SOTA in both shape & non-rigid motion

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Canonical Deformation Coordinate Space
✅Shape + non-rigid motion representation
✅Deformation factorized via homeomorphisms
✅Cycle consistency, topology & volume preservation
✅SOTA in modelling deformable objects

More: https://bit.ly/3NM5NX1
❤4🤯1😱1
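
The factorization and cycle-consistency highlights above can be illustrated with a toy example. This is only a sketch under strong assumptions: each frame's canonical map is a hypothetical per-axis affine bijection (the names `make_canonical_map` and `deform` are made up here), standing in for whatever learned invertible map the paper actually uses. The point is that routing every deformation through a shared canonical space makes the round trip exact by construction.

```python
# Hypothetical stand-in: each frame's canonical map is an invertible
# per-axis affine map (scale != 0), NOT the learned network from the paper.

def make_canonical_map(scale, shift):
    """Return (forward, inverse) for a per-axis affine bijection."""
    if any(s == 0 for s in scale):
        raise ValueError("scale entries must be non-zero for invertibility")

    def forward(point):
        # frame space -> canonical space
        return tuple(s * p + t for s, p, t in zip(scale, point, shift))

    def inverse(point):
        # canonical space -> frame space
        return tuple((p - t) / s for s, p, t in zip(scale, point, shift))

    return forward, inverse


def deform(point, source_map, target_map):
    """Move a point from the source frame to the target frame via canonical space."""
    to_canonical, _ = source_map
    _, from_canonical = target_map
    return from_canonical(to_canonical(point))


frame_a = make_canonical_map(scale=(2.0, 0.5, 1.0), shift=(0.1, -0.3, 0.0))
frame_b = make_canonical_map(scale=(1.5, 2.0, 0.25), shift=(-1.0, 0.2, 0.4))
frame_c = make_canonical_map(scale=(0.5, 1.0, 3.0), shift=(0.0, 0.7, -0.2))

x_a = (0.2, -0.7, 1.1)
x_b = deform(x_a, frame_a, frame_b)        # frame a -> frame b
x_back = deform(x_b, frame_b, frame_a)     # frame b -> frame a (cycle)
x_c_direct = deform(x_a, frame_a, frame_c)
x_c_via_b = deform(x_b, frame_b, frame_c)  # a -> b -> c
```

Because every deformation is a composition of a map and an inverse map, `x_back` recovers `x_a` and the two routes to frame c agree, up to floating-point error.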
📸 ~6 BILLION CLIP-filtered pairs 📸

👉A dataset 14x larger than the previous largest openly accessible image-text dataset in the world.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅2.3B English image-text pairs
✅2.2B from 100+ other languages
✅1.3B with no language detected
✅KNN index for quick search

More: https://bit.ly/3LFhKvT
❤3🤯1
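
To make "CLIP-filtered" and "KNN index" concrete, here is a minimal sketch assuming the filter keeps a pair when the cosine similarity between its image and text embeddings meets a threshold. The toy 3-D vectors, the `THRESHOLD` value, and the function names are all hypothetical, and the brute-force search below only stands in for a real (typically approximate) nearest-neighbour index.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def clip_filter(pairs, threshold):
    """Keep (image_vec, text_vec, caption) triples whose similarity meets the threshold."""
    return [p for p in pairs if cosine_similarity(p[0], p[1]) >= threshold]


def knn_search(query, index, k):
    """Brute-force k nearest neighbours over image vectors, most similar first."""
    scored = [(cosine_similarity(query, image_vec), caption)
              for image_vec, _, caption in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [caption for _, caption in scored[:k]]


# Toy 3-D "embeddings"; real CLIP embeddings have hundreds of dimensions.
pairs = [
    ((1.0, 0.0, 0.0), (0.9, 0.1, 0.0), "a red car"),
    ((0.0, 1.0, 0.0), (0.1, 0.9, 0.1), "a green tree"),
    ((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), "mismatched caption"),
]
THRESHOLD = 0.3  # hypothetical value chosen for this toy data
kept = clip_filter(pairs, THRESHOLD)
nearest = knn_search((0.8, 0.2, 0.0), kept, k=1)
```

Here the mismatched pair is dropped by the filter, and the query lands on the caption whose image vector points in nearly the same direction.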
🥮 PP-YOLOE: e-version of YOLO 🥮

👉 SOTA object detector running at 149+ FPS!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Optimized PP-YOLOv2
✅S/M/L/XL models for different scenarios
✅149+ FPS with TensorRT & FP16
✅Source code & models available

More: https://bit.ly/3x454uy
🔥5👍3👏1🤯1🤩1
🧙 HD synthesis with LDM 🧙

👉Low-cost diffusion models (DMs) in the latent space of powerful pretrained autoencoders

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Hi-res synthesis of megapixel images
✅Synthesis, inpainting, stochastic SR
✅Large, consistent images of ~1024px
✅General conditioning via cross-attention
✅Code licensed under MIT License

More: https://bit.ly/3LIVOzS
🔥6👍3🤩1
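
As a rough illustration of "conditioning via cross-attention": queries come from the image latents, while keys and values come from the conditioning input (for example, text tokens). The sketch below is plain scaled dot-product attention with the learned projection matrices left out; the function names and toy numbers are assumptions for illustration, not the released code.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over the conditioning tokens."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output channel at a time.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs


# Toy data: two latent "pixels" (queries) and two conditioning tokens.
latent_queries = [[1.0, 0.0], [0.0, 1.0]]
cond_keys = [[1.0, 0.0], [0.0, 1.0]]
cond_values = [[10.0, 0.0], [0.0, 10.0]]
attended = cross_attention(latent_queries, cond_keys, cond_values)

# With a single conditioning token, every query simply receives its value.
single = cross_attention(latent_queries, [[1.0, 0.0]], [[3.0, 4.0]])
```

Each latent position ends up with a mix of the conditioning values, weighted toward the token whose key matches its query best.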
🎩 SinNeRF: Single Image NeRF 🎩

👉Neural Radiance Field from a single view only

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅UT Austin + UIUC + UOregon + Picsart AI
✅"Looking only once" approach
✅Semi-supervised learning process
✅Geometry/semantic pseudo-labels
✅SOTA in novel-view synthesis

More: https://bit.ly/3ujMZqF
👍7🔥2👏1🤯1
🔥 Transformer-based Tracking 🔥

👉Tracker via Transformer-based model prediction module

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Tracking by Transformer prediction
✅Model predictor extended to bounding boxes (BBs)
✅SOTA on three public benchmarks
✅Code/models under GNU GPL 3.0

More: https://bit.ly/3ucYvUI
🔥9🤯2🥰1
👗 In-The-Wild Virtual Try-On 👗

👉StyleGAN-based architecture for appearance flow estimation in virtual try-on (VTON)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Global appearance flow estimation
✅OK with person/garment misalignments
✅"In-the-wild": people in natural poses
✅Code under CC BY-NC-SA 4.0 license

More: https://bit.ly/3LPR9wl
👍6❤3🔥1🤔1🤯1
🎇DALL·E 2 just announced!🎇

👉DALL·E 2 creates realistic images and art from natural language

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅More realistic/accurate, 4x resolution
✅Better caption matching
✅Not available yet, waiting list!

More: https://bit.ly/3j9v3bR
🔥12🤯5👍2🤩1
👋Forecasting interactions via attention👋

👉Predicting the hand motion trajectory and the future contact points on the next active object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Object-Centric Transformer (OCT)
✅Self-attention Transformer mechanism
✅Framework to handle uncertainty
✅SOTA on Epic-Kitchens and EGTEA

More: https://bit.ly/3v3PpbI
👍4🔥2👏1🤔1🤯1
SmeLU: Smooth Activation Function

👉Google unveils a new smooth activation function: easy to implement, cheap & less error-prone

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Smooth to mitigate irreproducibility
✅Cheap function, better than GELU/Swish
✅Slope goes from 0 to 1 through a quadratic middle region
✅SmeLU as a convolution of ReLU with a box function
✅Best reproducibility-accuracy tradeoff

More: https://bit.ly/3xcskXm
😱8👍4❤1🔥1😁1🤯1
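
The highlights above map onto a short piecewise definition: zero on the left, identity on the right, and a quadratic in between whose slope rises from 0 to 1. The sketch below assumes the commonly cited form with half-width `beta` (a parameter name chosen here), and adds a hypothetical helper, `relu_box_average`, that numerically checks the "ReLU convolved with a box" reading.

```python
def relu(x):
    return max(0.0, x)


def smelu(x, beta=1.0):
    """Smooth ReLU: 0 for x <= -beta, x for x >= beta, quadratic in between."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return float(x)
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_slope(x, beta=1.0):
    """Derivative of smelu: rises linearly from 0 to 1 across [-beta, beta]."""
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)


def relu_box_average(x, beta=1.0, steps=2000):
    """Average ReLU(x - t) over t in [-beta, beta] (midpoint rule), i.e. a
    convolution of ReLU with a unit-area box of width 2*beta."""
    width = 2.0 * beta / steps
    total = 0.0
    for i in range(steps):
        t = -beta + (i + 0.5) * width
        total += relu(x - t)
    return total / steps


samples = [-2.0, -0.5, 0.0, 0.3, 2.0]
closed_form = [smelu(x) for x in samples]
numeric = [relu_box_average(x) for x in samples]
```

The two lists agree to within the integration error, which is a quick sanity check that the quadratic piece really is the box-smoothed ReLU.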
📍Hyper-Dense Landmarks at 150FPS📍

👉#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Accurately predicts 10× as many landmarks as usual
✅Synthetic data, perfect annotations
✅NO appearance or lighting modeling, NO differentiable rendering
✅#3D @150+FPS on a single CPU thread
✅SOTA in monocular 3D reconstruction

More: https://bit.ly/37pQS40
👍6🔥4🤯1
☀️SunStage: Selfie with the Sun☀️

👉Accurate/tailored reconstruction of facial geometry/reflectance

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel personalized scanning
✅Disentanglement of scene params
✅Geometry, materials, lighting, poses
✅Photorealistic results from a single selfie video

More: https://bit.ly/36W1Oqx
🔥3👍2🥰1
📫 Generative Neural Avatars 📫

👉3D shapes of people in a variety of garments with corresponding skinning weights

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + Uni-Tübingen + Max Planck
✅Animatable #3D humans in garments
✅Directly from raw posed 3D scans
✅NO canonical shapes, registration, or manual weights
✅Geometric detail in clothing deformation

More: https://bit.ly/3M7mCdB
👍3🔥2👏1
🗨️Conversational program synthesis🗨️

👉Conversational synthesis to translate English into executable code

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Conversational program synthesis
✅New multi-turn programming benchmark
✅Open custom library: JAXFORMER
✅Source code under BSD-3 license

More: https://bit.ly/3jjWWhk
🤯4🥰2🔥1😱1
🧯Long Video Diffusion Models🧯

👉#Google unveils a novel diffusion model for video generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Straightforward extension of 2D UNet
✅Longer videos via a new conditional generation technique
✅SOTA in unconditional generation

More: https://bit.ly/35Y2rzg
🔥4🎉2🤩1
🚙 AutoRF: #3D objects in-the-wild 🚙

👉From #Meta: #3D object from just a single, in-the-wild image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel view synthesis from in-the-wild images
✅Normalized, object-centric representation
✅Disentangling shape, appearance & pose
✅Exploiting BBs & panoptic segmentation
✅Shape/appearance properties for objects

More: https://bit.ly/3O4ONeQ
🤯7😱2🔥1
🌠GAN-based Darkest Dataset🌠

👉Berkeley + #Intel announce the first photorealistic dataset under starlight (no moon, <0.001 lx)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅"Darkest" dataset ever seen
✅Moonless, no external illumination
✅GAN-tuned physics-based model
✅Clips with dancing, volleyball, flags...

More: https://bit.ly/3LXxMkN
👍3🤯2🔥1
🤖Populating with digital humans🤖

👉ETHZ unveils GAMMA to populate #3D scenes with digital humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅GenerAtive Motion primitive MArkers
✅Realistic, controllable, infinite motions
✅Tree-based search to preserve quality
✅SOTA in realistic/controllable motion

More: https://bit.ly/3OgY4AG
😱5👍4🔥2👏1🤯1🤩1
🔥#AIwithPapers: we are ~2,000!🔥

💙💛 Simply amazing. Thank you all 💙💛

😈 Invite your friends -> https://t.me/AI_DeepLearning
❤18🔥8🥰4👍3
😼GARF: Gaussian Activated NeRF😼

👉GARF: Gaussian Activated Radiance Fields for high-fidelity reconstruction and pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅NeRF from imperfect camera poses
✅NO hyper-parameter tuning/initialization
✅Theoretical insight on Gaussian activation
✅Unlocking NeRF for real-world application?

More: https://bit.ly/36bvdfU
👍4🤩2❤1👏1🤯1
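
For readers unfamiliar with the idea, a Gaussian activation replaces the usual ReLU with a bell-shaped response to the pre-activation. The sketch below assumes the common form exp(-x^2 / (2 * sigma^2)); the exact parameterization in the paper may differ, and `sigma`, the weights, and the function names are hypothetical values picked for illustration. The layer takes a raw 3D coordinate as input.

```python
import math


def gaussian_activation(x, sigma=1.0):
    """Bell-shaped activation: 1 at x = 0, decaying symmetrically toward 0."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def dense_gaussian_layer(inputs, weights, biases, sigma=1.0):
    """One fully connected layer followed by the Gaussian activation."""
    outputs = []
    for row, bias in zip(weights, biases):
        pre_activation = sum(w * v for w, v in zip(row, inputs)) + bias
        outputs.append(gaussian_activation(pre_activation, sigma))
    return outputs


# Hypothetical weights: 3-D point in, 2 hidden units out.
point = (0.5, -0.2, 0.1)
weights = [
    [1.0, 0.0, -1.0],
    [0.5, 2.0, 0.0],
]
biases = [0.0, 0.1]
hidden = dense_gaussian_layer(point, weights, biases, sigma=0.5)
```

Smaller `sigma` values make each unit respond to a narrower band of inputs, so that one knob trades off smoothness against fine detail in this toy setting.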