AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
#3D scene manipulation from 2D

👉Reconstruct, decompose, manipulate & render 3D scenes in a single pipeline

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unique 3D, non-occupied space from 2D
✅Inverse query algorithm for shapes
✅First synthetic dataset for 3D editing

More: https://bit.ly/3RlYhTY

StableFace: Talking Face Generation

👉Analysis on motion jittering in 3D face generation (audio-in -> video-out)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Motion jittering analysis for stability
✅Gaussian-based adaptive smoothing
✅Augmented erosions of neural renderer
✅Audio-fused generator for dependency

More: https://bit.ly/3Kt95gI

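A rough idea of what Gaussian smoothing does to jittery motion: the sketch below blurs per-frame motion coefficients along time with a fixed-width Gaussian kernel. It is only an illustration, not StableFace's module; the adaptive part (choosing the strength per frame) is not reproduced, and `gaussian_kernel`, `smooth_motion`, and the toy sine/cosine trajectory are made-up names and data.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    if radius is None:
        radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def smooth_motion(coeffs, sigma=2.0):
    """Smooth a (num_frames, num_coeffs) array along the time axis.

    Assumption: a single fixed sigma for all frames (no adaptivity).
    """
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # Repeat the first/last frame so the output keeps the same length.
    padded = np.pad(coeffs, ((radius, radius), (0, 0)), mode="edge")
    columns = [
        np.convolve(padded[:, j], kernel, mode="valid")
        for j in range(coeffs.shape[1])
    ]
    return np.stack(columns, axis=1)

# Toy trajectory: a smooth motion plus frame-to-frame jitter.
frames = np.linspace(0.0, 2.0 * np.pi, 100)
clean = np.stack([np.sin(frames), np.cos(frames)], axis=1)
jittery = clean + 0.05 * np.random.default_rng(0).standard_normal(clean.shape)
smoothed = smooth_motion(jittery, sigma=2.0)
```

A larger `sigma` removes more jitter but also flattens fast, legitimate motion, which is the trade-off an adaptive scheme would try to balance.
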
🧡 Avatarization in the 90's. So Romantic 🧡

👉Making of the first #MortalKombat in the early 90's

More: https://bit.ly/3wTSpJB

🚗 Massive Dataset in Virtual Cities 🚗

👉Synthehicle: 17 hours of labeled material from 340 cams in 64 day, rain, dawn & night scenes.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Multi-target multi-cam tracking
✅2D, 3D, segm. & depth annotations
✅Instance, semantic & panoptic segm.
✅340 clips, 64 scenes, 17 hrs, 4M BBs

More: https://bit.ly/3TArHiV

🪨Controllable #3D Adversarial Face🪨

👉#Meta (+CMU) on decoupling identity/expression + granular control over expressions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Supervised auto-enc. + GAN
✅UV texture maps + 3D faces
✅Control expression, saving ID
✅Code under X11 License

More: https://bit.ly/3AVE80q

🥑 DALL·E: Outpainting via #NLP 🥑

👉Extending any original image, creating large-scale images in any aspect ratio

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Extending an image beyond its borders
✅Visual elements in the same style as the input
✅Driving the image "story" in new directions
✅Shadows, reflections & textures w/ context

More: https://bit.ly/3eoH8uD

🌪️ TimeLapse++: Video Temporal Pyramid 🌪️

👉Multi-scale lens to view the passage of time: far beyond a "classic" timelapse

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Inspired by "old-school" spatial pyramids
✅Video Spectrogram to go through pyramid
✅Months/years of data in a few seconds!
✅Multi-temporal freq., no aliasing

More: https://bit.ly/3TKnYPS

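To make the "spatial pyramid, but along time" idea concrete, here is a minimal sketch that low-pass filters a clip along the time axis before dropping every second frame, which is the classic way to subsample without aliasing. It is an assumption-laden illustration, not the paper's construction: the binomial kernel, the function names, and the random test clip are all made up for this example.

```python
import numpy as np

# 5-tap binomial low-pass filter, as used in classic image pyramids.
BINOMIAL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def temporal_downsample(video):
    """Blur a (T, H, W) clip along time, then keep every second frame."""
    pad = len(BINOMIAL) // 2
    pad_width = ((pad, pad),) + ((0, 0),) * (video.ndim - 1)
    padded = np.pad(video, pad_width, mode="edge")
    num_frames = video.shape[0]
    # Weighted sum of time-shifted copies == 1D convolution along axis 0.
    blurred = sum(w * padded[i:i + num_frames] for i, w in enumerate(BINOMIAL))
    return blurred[::2]

def temporal_pyramid(video, levels):
    """Return [full rate, 1/2 rate, 1/4 rate, ...] versions of the clip."""
    pyramid = [video.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(temporal_downsample(pyramid[-1]))
    return pyramid

video = np.random.default_rng(0).random((64, 8, 8))
pyramid = temporal_pyramid(video, levels=4)
print([level.shape[0] for level in pyramid])  # [64, 32, 16, 8]
```

Each level covers the same time span with half as many frames, so slow changes survive at the coarse levels while fast flicker is averaged out.
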
Stable Diffusion Video is out!

👉A free notebook to generate videos by interpolating the latent space of SD.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Blueberry to strawberry spaghetti
✅Dream items from same prompt
✅Morph different prompts (seeds)
✅Built on a script by A. Karpathy

More: https://bit.ly/3ey8632

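Latent-space walks like this are commonly done with spherical linear interpolation (slerp) between two noise tensors, one per seed. The sketch below shows only that interpolation step on NumPy arrays standing in for SD latents; it does not call any diffusion model, and `slerp`, `latent_walk`, and the `(4, 64, 64)` latent shape are assumptions for illustration rather than the notebook's actual code.

```python
import numpy as np

def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical linear interpolation between two same-shaped arrays."""
    a = v0.ravel()
    b = v1.ravel()
    dot = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    if abs(dot) > dot_threshold:
        # Nearly parallel vectors: plain linear interpolation is stable.
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1

def latent_walk(start, end, num_frames):
    """One interpolated latent per video frame, from start to end."""
    return [slerp(t, start, end) for t in np.linspace(0.0, 1.0, num_frames)]

rng = np.random.default_rng(0)
start = rng.standard_normal((4, 64, 64))  # stand-in for the latent of seed A
end = rng.standard_normal((4, 64, 64))    # stand-in for the latent of seed B
frames = latent_walk(start, end, num_frames=8)
```

In a real pipeline each element of `frames` would be decoded into one image; slerp is usually preferred over a straight line because it keeps the intermediate latents at a norm similar to the Gaussian noise the model expects.
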
🦎 VMT: Video Mask Transfiner 🦎

👉Novel highly efficient ViT structure for video instance segmentation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅HD & more temporally stable mask
✅Higher resolution features for VIS
✅Detecting error-prone spatio-temporal regions
✅Auto-refinement on training data!

More: https://bit.ly/3RKXtb4

🤯 #StableDiffusion + #Dallemini = BOOM! 🤯

👉A #colab notebook that combines Stable Diffusion + DALL-E Mini (Craiyon)

More: https://bit.ly/3TTOshR

VIS - Deformable Transformers

👉DeVIS: VIS method with efficiency and performance of deformable ViT

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Temp. multi-scale D-Attention
✅Instance-aware object queries
✅Mask: DA + multi-scale feats map
✅Improved multi-cue clip tracking
✅SOTA on YouTube-VIS 2021/OVIS

More: https://bit.ly/3TQv1Xc

🌈 X-NeRF: Cross-Spectral NeRF 🌈

👉Cross-Spectral NeRF from cams with different light spectrums

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅First ever cross-spectral NeRF
✅Avoiding non-trivial calib/match
✅Normalized Cross-Device Coords
✅Novel dataset w/ RGB, MS, & IR

More: https://bit.ly/3RqHnUo

👹TT-GNeRF: generative NeRF for Faces👹

👉TT-GNeRF: a novel 3D-aware GAN based on generative NeRF for faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + Uni_Trento + #Snap 🤯
✅DAEM for disentanglement of 3D model
✅"Training-as-Init, Optimizing-for-Tuning"
✅Consistency++, preserving non-target ROI
✅Unsupervised optimization of geometry

More: https://bit.ly/3ARZmMw

🎪 SOTA in Arbitrary Shape Text Detection 🎪

👉Novel unified coarse-to-fine Transformer for arbitrary shape text detection

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Coarse-to-fine arbitrary text detection
✅Accurate text detection, NO post-process
✅Boundary proposal generation mechanism
✅Innovative boundary transformer (iterative)
✅Boundary energy loss (BEL) for refinement

More: https://bit.ly/3D6Ryt4

Open-Source Self-Driving projects

👉A free repo with many autonomous vehicle-related projects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Basic/Advanced Lane/Line Detection
✅Driving behavior by training & validating
✅Autopilot: predicting steering angle

More: https://bit.ly/3qqJ7RB

🥤K-VIL: Keypoint-based visual imitation🥤

👉K-VIL: auto-incremental extraction of object-centric task representation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient task-relevant keypoints
✅Embodiment-independent tasks
✅Adaptation of tasks to new scenes
✅Input: only a small set of demo clips
✅Novel keypoint-based controller

More: https://bit.ly/3eIrxpP

💜 #Selfdriving in the 80's. Damn Romantic 💜

👉The first self-driving car with people on board, 1986. So slow and lovely.

More: https://bit.ly/3BtRDon

TORAS: SOTA #AI for annotation

👉TORAS: web-based AI-powered, cooperative, annotation platform.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA AI tools -> significant speedup
✅"Recipes" to define how to annotate
✅Repo with folder structure for storage
✅Also on-prem for (commercial) firms

More: https://bit.ly/3L78YI2

💮MAXIM: Multi-Axis MLP for Vision💮

👉#Google opens MAXIM, a multi-axis MLP for low-level vision

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Denoising, deblurring, dehazing, etc
✅Multi-axis gated MLP, linear complexity
✅Cross gating block, separate features
✅SOTA results on several datasets!

More: https://bit.ly/3Dmp8LI

🔥 A Survey on Diffusion Models 🔥

👉A comprehensive review of denoising diffusion models in #computervision 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Overview on diffusion models
✅Hot trend for the generative AI
✅A multi-perspective categorization
✅Current limitations / new directions

More: https://bit.ly/3RYG5zP

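For readers new to the topic, the forward (noising) half of a denoising diffusion model fits in a few lines: with a variance schedule beta_t and alpha_bar_t defined as the running product of (1 - beta_s), a noisy sample is x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. The sketch below is a generic DDPM-style illustration, not code from the survey; the linear schedule values, the 1000 steps, and the random stand-in image are assumptions chosen for the example.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly increasing noise variances, one per diffusion step."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from N(sqrt(alpha_bar[t]) * x0, (1 - alpha_bar[t]) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # fraction of signal variance kept at step t

rng = np.random.default_rng(0)
x0 = rng.random((3, 32, 32))  # stand-in for a normalized image
x_early, _ = forward_diffuse(x0, 10, alpha_bar, rng)   # still close to x0
x_late, _ = forward_diffuse(x0, 999, alpha_bar, rng)   # almost pure noise
```

Training a denoiser then amounts to predicting `noise` from `x_t` and `t`; the reverse, generative pass is where the methods covered by such a survey differ most.
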