AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🔥OmniBenchmark: CV beyond ImageNet🔥

👉 21 realms, 7,000+ concepts and 1M+ images. Far beyond ImageNet!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅vs. ImageNet: 2.5x realms, 9x concepts
✅Conciseness: no concept overlapping
✅ReCo: Relational Contrastive Learning
✅New supervised contrastive learning SOTA

More: https://bit.ly/3RJRKU0
💣 HD Neural Avatar @130FPS 💣

👉Samsung unveils MegaPortraits: novel one-shot creation of HD neural human avatars

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅One-shot neural avatars, SOTA up to 512p
✅"Upgrading" to megapixel via more pics
✅First Neural Head Avatars in HD
✅Up to 130 FPS via #GPU

More: https://bit.ly/3oboWWT
🦚 TimeLens++: Event-based Interpolation 🦚

👉Novel event-based interpolation with non-linear flow & multi-scale fusion

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel motion spline estimator
✅Non-linear continuous event/frames flow
✅Multi-feature fusion, gated compression
✅Novel hybrid dataset with 100+ videos

More: https://bit.ly/3yJyY6g
🪰NUWA-Infinity is out!🪰

👉∞ generation by #Microsoft: arbitrarily-sized HD images and long videos 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unconditional Image Gen.
✅Text-to-Image/Text-to-Clip
✅Animation / Out-painting
✅Hi-res, arbitrarily long clips
✅NCP for patches caching

More: https://bit.ly/3zmBf9f
🔥 #AIwithPapers: we are 3,500+! 🔥

💙💛 Ready for YOLO 10, 11, π, ∞, Ψ, and more? The more we are, the faster we catch'em all 💙💛

😈 Invite your friends -> https://t.me/AI_DeepLearning
🎷🎷OMNI3D: #3D Objects in the Wild🎷🎷

👉#3D detection: 234k images, 3M+ instances & 97 categories

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅OMNI3D from publicly released datasets
✅234k pics, 3M+ annotations with 3D box
✅97 categories such as sofa, table, cars
✅Fast (450x) and exact algorithm for IoU
✅Cube R-CNN: novel 3D object detector

More: https://bit.ly/3cznjzG
👹Multiface Neural Rendering 👹

👉A new multi-view, hi-res dataset collected at #META Reality Labs for neural face rendering

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mugsy, large scale multi-cam apparatus
✅High-Res sync facial performance
✅Closing the gap in accessing HQ data
✅Suitable for #VR & #mixedreality

More: https://bit.ly/3b6XfeL
💄DEVIANT: SOTA in mono-3D detection💄

👉A novel Depth EquiVarIAnt NeTwork for 3D monocular detection in the wild

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Michigan + #Meta + Ford 🤯
✅Depth-equi. + scale equiv. steerable
✅New SOTA on KITTI & Waymo
✅Ok cross-dataset -> generalization

More: https://bit.ly/3OEFtgK
🧱 Assembling #LEGO with #AI 🧱

👉Translating step-by-step assembly manuals created by humans into machine-interpretable instructions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Stanford + MIT + #Google 🤯
✅MEPNet: Manual-to-Executable-Plan Net
✅Manual to machine-executable plan
✅2D manual - 3D geometric shape
✅Reasoning on 3D alignments of legos

More: https://bit.ly/3PCwn5C
🎃New SOTA in UDA Semantic Seg.🎃

👉HRDA: multi-res Unsupervised Domain Adaptive Semantic Seg. -> SOTA

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + MPG + KU Leuven 🤯
✅HRDA: multi-res approach for UDA
✅Manageable GPU memory footprint
✅Small objects & fine segmentation detail
✅New SOTA on GTA and Synthia datasets

More: https://bit.ly/3cKtDEp
⚗️ SemAbs: 3D Scene Understanding ⚗️

👉Framework that equips 2D Vision-Language Models (VLMs) with new 3D spatial capabilities

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅2D VLMs with 3D reasoning skills
✅ViTs Efficient MS Relevancy Extraction
✅Novel Open-World understanding tasks
✅Completing partially observed objects
✅Finding hidden objects from language

More: https://bit.ly/3PYYk7d
🦚 TinyCD: Neural Change Detection 🦚

👉TinyCD: new SOTA in change detection with up to 150x fewer parameters.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA with up to 150X fewer params
✅Mixing blocks for s.t. cross-correlation
✅PW-MLP for pixel wise classification
✅MAMB: novel block for skip connection

More: https://bit.ly/3zFEngk
❀16πŸ‘2πŸ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
🦊 3D-Aware "StyleGANv2" version 🦊

👉Upgrading StyleGANv2 into a novel 3D-aware GAN with just a minimal set of changes🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅MPI-like 3D-aware GAN w/ single-view
✅GMPI: generative multiplane image
✅2D GAN 3D-aware with minimal changes
✅Encoding 3D-aware inductive biases

More: https://bit.ly/3OJ5gnS
📺 NeRF-ing "The Big Bang Theory" 📺

👉Berkeley unveils an approach for accurate estimation of actor's 3D pose & location

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Input: images across the whole season
✅3D context (i.e. cams, structure, body)
✅Integrating context in 3D estimation
✅Re-ID, gaze, cinematography, pic editing
✅Knock, Knock, Penny!

More: https://bit.ly/3OLuaUb
🎩ShAPO: SOTA in object understanding🎩

👉Joint multi-object detection, #3D texture, 6D object pose & size estimation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Disentangled shape & appearance
✅Efficient octree-based differentiable
✅Object-centric understanding pipeline
✅Detection, reconstruction, 6D & size
✅SOTA in reconstruction & pose est.

More: https://bit.ly/3oHN5EQ
🏙️ CityNeRF: Neural Rendering of City Scenes 🏙️

👉Progressive NeRF model and training set on city-scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅BungeeNeRF: novel progressive NeRF
✅Details on drastically varied scales
✅Growing with residual block structure
✅Inclusive multi-level data supervision

More: https://bit.ly/3cS9vk7
🍦🍦 Rewriting Geometry of GAN 🍦🍦

πŸ‘‰Drive GAN synthesizing many unseen objects with the desired shape

𝐇𝐒𝐠𝐑π₯𝐒𝐠𝐑𝐭𝐬:
βœ…User-friendly "warping" with geometry
βœ…Low-rank update to layer for editing
βœ…Latent augmentation based on style-mix
βœ…Endless objects with defined changes
βœ…Latent space interpolation, image editing

More: https://bit.ly/3zIfOj8