AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🎫 100% Mask-Free VIS 🎫

👉 ETH Zürich unveils MaskFreeVIS: novel, high-performing video instance segmentation (VIS) without any mask annotations.

😎 Review https://bit.ly/3Wg7CQB
😎 Paper arxiv.org/pdf/2303.15904.pdf
😎 Project www.vis.xyz/pub/maskfreevis/
😎 Code github.com/SysCV/maskfreevis
🀄 Drag-GAN: user-friendly image manipulation 🀄

👉 Manually deform (real and generated) images to control pose, shape, expression, and layout.

😎 Review https://bit.ly/3BFyXlR
😎 Paper arxiv.org/pdf/2305.10973.pdf
😎 Project vcai.mpi-inf.mpg.de/projects/DragGAN
😎 Code github.com/XingangPan/DragGAN
🗺️ AI-generated stereotypical men 🗺️

👉 A thread on generating a stereotypical person from each of 15 countries around the world. And yes, Italians love pizza.

😎 More https://bit.ly/3oo0t4c
AVOS Multiscale Encoder-Decoder ViT

👉 MED-VT: the world's first Multiscale Encoder-Decoder Video Transformer for AVOS.

😎 Review https://bit.ly/3MohFi1
😎 Paper arxiv.org/pdf/2304.05930.pdf
😎 Project rkyuca.github.io/medvt
😎 Code github.com/rkyuca/medvt
🌊 Neural Dynamic Image-Based Rendering 🌊

👉 DynIBaR: synthesizing novel views from a monocular video depicting a complex dynamic scene.

😎 Review https://t.ly/90Kw
😎 Paper arxiv.org/pdf/2211.11082.pdf
😎 Project https://dynibar.github.io/
😎 Code github.com/google/dynibar
Open Semantic Segmentation

👉 SSSegmentation: an open-source supervised semantic segmentation toolbox based on #PyTorch

😎 Review https://t.ly/ZE9q
😎 Paper arxiv.org/pdf/2305.17091.pdf
😎 Code github.com/SegmentationBLWX/sssegmentation
🎗️ 4D Humans with Transformers 🎗️

👉 A novel approach to reconstructing and tracking humans (even in unusual poses)

😎 Review https://t.ly/XGv_
😎 Paper arxiv.org/pdf/2305.20091.pdf
😎 Project shubham-goel.github.io/4dhumans/#
😎 Code github.com/shubham-goel/4D-Humans
🗽 Neuralangelo Digital Twins. INSANE 🗽

👉 A novel framework from #Nvidia for high-fidelity 3D digital twins.

😎 Review https://t.ly/rxoF4
😎 Project research.nvidia.com/labs/dir/neuralangelo
😎 Paper research.nvidia.com/labs/dir/neuralangelo/paper.pdf
🦜 ColorDiffuser: Text-to-Video Colorization 🦜

👉 HK University unveils ColorDiffuser: adapting a pre-trained text-to-image latent diffusion model for video colorization

😎 Review https://t.ly/XGv_
😎 Paper arxiv.org/pdf/2306.01732.pdf
😎 Project colordiffuser.github.io/
😎 Code github.com/ColorDiffuser/ColorDiffuser
🌻 Extending Mona Lisa with AI 🌻

👉 A Reddit user extends the Mona Lisa painting with #Photoshop AI. The result is surprising.

😎 More https://t.ly/j_2r
Segment Anything in HQ

👉 HQ-SAM: equips SAM to segment objects accurately while keeping its promptable design, efficiency, and zero-shot generalizability

😎 Review https://t.ly/GxX5B
😎 Paper arxiv.org/pdf/2306.01567.pdf
😎 Models github.com/SysCV/SAM-HQ
🌈 Track Everything Everywhere 🌈

👉 #Google unveils OmniMotion: full-length motion tracking for every pixel in every frame of a video.

😎 Review https://t.ly/Krvw
😎 Paper arxiv.org/pdf/2306.05422.pdf
😎 Project omnimotion.github.io/
😎 Demo omnimotion.github.io/#interactive_demo
😎 Code github.com/qianqianwang68/omnimotion
👁️ Scene Five: Through Her Eyes 👁️

👉 #3D reconstruction of the scene a person is observing, using only the reflections in their eyes

😎 Review https://t.ly/uBO6
😎 Paper arxiv.org/pdf/2306.09348.pdf
😎 Project https://world-from-eyes.github.io/
🧿 NeRF-Supervised Deep Stereo 🧿

👉 A novel, pioneering pipeline for training deep stereo networks WITH NO ground truth

😎 Review https://t.ly/c7j-
😎 Project nerfstereo.github.io/
😎 Dataset https://amsacta.unibo.it/id/eprint/7218/
😎 Code github.com/fabiotosi92/NeRF-Supervised-Deep-Stereo
😎 Paper https://openaccess.thecvf.com/content/CVPR2023/papers/Tosi_NeRF-Supervised_Deep_Stereo_CVPR_2023_paper.pdf
🫣 Text-Guided Adversarial Makeup 🫣

👉 Novel facial privacy protection via adversarial latent codes. Makeup vs. Face Recognition.

😎 Review https://t.ly/pBCP
😎 Paper arxiv.org/pdf/2306.10008.pdf
😎 Code github.com/fahadshamshad/Clip2Protect
🦷 Few-Shot Geometry-Aware Keypoints 🦷

👉 UBC (+Flawless AI) unveils the new SOTA in semantic keypoint localization. Suitable for faces, animals, cars, mouths, teeth & more

😎 Review https://t.ly/-0qN
😎 Paper arxiv.org/pdf/2303.17216.pdf
😎 Project xingzhehe.github.io/FewShot3DKP/
🚔 Fooling Neural Forensic Classifiers 🚔

👉 Adversarial faces that fool forensic classifiers while remaining undetectable to humans

😎 Review https://t.ly/33Cc
😎 Paper arxiv.org/pdf/2306.13091.pdf
😎 Project koushiksrivats.github.io/face_attribute_attack
😎 Code github.com/koushiksrivats/face_attribute_attack
PanoHead: 3D Full-Head Synthesis

👉 #ByteDance (+UW-M) unveils PanoHead: 360° view-consistent portraits from a single-view image

😎 Review https://t.ly/MrLNR
😎 Paper arxiv.org/pdf/2303.13071.pdf
😎 Project sizhean.github.io/panohead
😎 Code github.com/sizhean/panohead
🔮 SAM-PT: Segment Anything + Tracking 🔮

👉 SAM-PT is the first method to utilize sparse point propagation for Video Object Segmentation (VOS).

😎 Review https://t.ly/QLMG
😎 Paper arxiv.org/pdf/2307.01197.pdf
😎 Project www.vis.xyz/pub/sam-pt/
😎 Code github.com/SysCV/sam-pt