AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
236 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🦎 VMT: Video Mask Transfiner 🦎

👉 Novel, highly efficient ViT structure for video instance segmentation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ HD & more temporally stable masks
✅ Higher-resolution features for VIS
✅ Detecting error-prone spatio-temporal regions
✅ Auto-refinement on training data!

More: https://bit.ly/3RKXtb4
🤯 #StableDiffusion + #Dallemini = BOOM! 🤯

👉 A #colab notebook that combines Stable Diffusion + DALL-E Mini (Craiyon)

More: https://bit.ly/3TTOshR
VIS - Deformable Transformers

👉 DeVIS: a VIS method with the efficiency and performance of deformable ViT

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Temporal multi-scale deformable attention
✅ Instance-aware object queries
✅ Mask: deformable attention + multi-scale feature maps
✅ Improved multi-cue clip tracking
✅ SOTA on YouTube-VIS 2021/OVIS

More: https://bit.ly/3TQv1Xc
🌈 X-NeRF: Cross-Spectral NeRF 🌈

👉 Cross-spectral NeRF from cameras capturing different light spectra

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ First-ever cross-spectral NeRF
✅ Avoids non-trivial calibration/matching
✅ Normalized Cross-Device Coords
✅ Novel dataset with RGB, multispectral (MS) & infrared (IR)

More: https://bit.ly/3RqHnUo
👹 TT-GNeRF: Generative NeRF for Faces 👹

👉 TT-GNeRF: a novel 3D-aware GAN based on generative NeRF for faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ ETH + Uni_Trento + #Snap 🤯
✅ DAEM for disentanglement of 3D model
✅ "Training-as-Init, Optimizing-for-Tuning"
✅ Consistency++, preserving non-target ROI
✅ Unsupervised optimization of geometry

More: https://bit.ly/3ARZmMw
🎪 SOTA in Arbitrary Shape Text Detection 🎪

👉 Novel unified coarse-to-fine Transformer for arbitrary shape text detection

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Coarse-to-fine arbitrary text detection
✅ Accurate text detection, NO post-processing
✅ Boundary proposal generation mechanism
✅ Innovative boundary transformer (iterative)
✅ Boundary energy loss (BEL) for refinement

More: https://bit.ly/3D6Ryt4
Open-Source Self-Driving Projects

👉 A free repo with many autonomous vehicle-related projects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Basic/Advanced Lane/Line Detection
✅ Driving behavior by training & validating
✅ Autopilot: predicting steering angle

More: https://bit.ly/3qqJ7RB
🥤 K-VIL: Keypoint-based Visual Imitation 🥤

👉 K-VIL: auto-incremental extraction of object-centric task representation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Efficient task-relevant keypoints
✅ Embodiment-independent tasks
✅ Adaptation of tasks to new scenes
✅ Input: only a small set of demo clips
✅ Novel keypoint-based controller

More: https://bit.ly/3eIrxpP
💜 #Selfdriving in the '80s. Damn Romantic 💜

👉 The first self-driving car with people on board, 1986. So slow and lovely.

More: https://bit.ly/3BtRDon
🏵️ TORAS: SOTA #AI for annotation 🏵️

👉 TORAS: web-based, AI-powered, cooperative annotation platform.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ SOTA AI tools -> significant speedup
✅ "Recipes" to define how to annotate
✅ Repo with folder structure for storage
✅ Also on-prem for (commercial) firms

More: https://bit.ly/3L78YI2
💮 MAXIM: Multi-Axis MLP for Vision 💮

👉 #Google open-sources MAXIM, a multi-axis MLP for low-level vision

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Denoising, deblurring, dehazing, etc.
✅ Multi-axis gated MLP, linear complexity
✅ Cross-gating block, separate features
✅ SOTA results on several datasets!

More: https://bit.ly/3Dmp8LI
🔥 A Survey on Diffusion Models 🔥

👉 A comprehensive review of denoising diffusion models in #computervision 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Overview of diffusion models
✅ Hot trend in generative AI
✅ A multi-perspective categorization
✅ Current limitations / new directions

More: https://bit.ly/3RYG5zP
#AI finds where IG photos are taken

👉 Brilliant work by Depoorter, a Belgian artist who works on #privacy, #AI & #socialmedia

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Recorded open cameras for weeks
✅ Scraped all #Instagram photos
✅ Matching Instagram vs. footage

More: https://bit.ly/3eL5dfc
🈯 SAMURAI: in-the-wild Shape/Material 🈯

👉 #Google SAMURAI: shape, BRDF, per-image pose & illumination. Relightable #3D assets for #AR/#VR.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Parametrization for varying distances
✅ Camera multiplex optimization
✅ Posterior scaling of input images
✅ Explicit mesh extraction with BRDF
✅ Code/data soon available -> #NeurIPS

More: https://bit.ly/3BKWgf3
🟨 Lang<->Pics in 100+ Languages 🟨

👉 #Google PaLI: unified language-image #AI to perform tasks in 109 languages 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ PaLI: Pathways Language & Image model
✅ Answering, captioning, reasoning, etc.
✅ From English to 109-language understanding
✅ The new SOTA on several datasets

More: https://bit.ly/3QMslHC
PeRFception: Largest Implicit Representation Dataset

👉 #Nvidia, a new frontier in data collection via Plenoxels: same info, -96.4% in size.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ POSTECH + NVIDIA + Caltech = 🤯
✅ Size: -96.4% from original dataset!
✅ 2D/3D image/object class/semantic
✅ Ready-to-use pipeline for implicit dataset

More: https://bit.ly/3eW9hJA
CHARL-E: Stable Diffusion in 1 click

👉 CHARL-E packages Stable Diffusion into a simple app.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ No setup, dependencies, or internet
✅ Images with 1-click on #macbook
✅ Suitable only for M1/M2 processors
✅ Source code under MIT license

More: https://bit.ly/3xv2z3G
YOLOPv2: Better Driving Perception

👉 YOLOPv2: simultaneous object detection, road segmentation & lane detection

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ E2E perception net with better backbone
✅ Efficient ELAN for reasonable memory
✅ Stability for adapting to scenarios
✅ SOTA on BDD100K, +50% faster!
✅ Source code under MIT license

More: https://bit.ly/3LvYGBh
SegNeXt: new SOTA in Semantic Seg.

👉 SOTA (by large margin) on ADE20K, Cityscapes, COCO-Stuff, Pascal VOC, Pascal Context, and iSAID 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ Novel tailored network architecture
✅ Spatial attention via multi-scale feats
✅ Encoder + conv. better than transformers
✅ SOTA on several datasets (ADE20K, etc.)

More: https://bit.ly/3UrZhrH