AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

DE-ViT: detecting everything via DINOv2

👉DE-ViT: open-set object detector based on a DINOv2 backbone. It's the new SOTA on the COCO & LVIS datasets

😎Review https://t.ly/_DAmt
😎Paper arxiv.org/pdf/2309.12969.pdf
😎Code https://github.com/mlzxy/devit

🛵CoTracker: fast transformer-tracker🛵

👉META's CoTracker is a fast transformer-based model that can track any point in a video

😎Review https://t.ly/M36A_
😎Paper arxiv.org/pdf/2307.07635.pdf
😎Project https://co-tracker.github.io/
😎Code github.com/facebookresearch/co-tracker

🌬️ Neural Blowing in Still Photos 🌬️

👉 A novel approach to animate human hair (and clothes) in still portraits

😎Review https://t.ly/HKG0t
😎Paper arxiv.org/pdf/2309.14207.pdf
😎Project nevergiveu.github.io/AutomaticHairBlowing

🌮 OW Indoor Segmentation 🌮

👉3D-OWIS is a novel open-world 3D indoor instance segmentation method (with an auto-labeling scheme) to separate known/unknown category labels

😎Review https://t.ly/-7ALf
😎Paper arxiv.org/pdf/2309.14338.pdf
😎Code github.com/aminebdj/3D-OWIS

🧱 Generating Scenes from Touch 🧱

👉#AI for synthesizing images from tactile signals (and vice versa), applied to a number of visuo-tactile synthesis tasks

😎Review https://t.ly/Gxr0L
😎Paper https://arxiv.org/pdf/2309.15117.pdf
😎Project https://fredfyyang.github.io/vision-from-touch
😎Code https://github.com/fredfyyang/vision-from-touch

☕Decaf: 3D Face-Hand Interactions☕

👉The first learning-based MoCap to track human hands interacting with human faces in #3D from single monocular RGB videos

😎Review https://t.ly/070Tj
😎Paper arxiv.org/pdf/2309.16670.pdf
😎Project vcai.mpi-inf.mpg.de/projects/Decaf

🌱 Making LLaMA See and Draw 🌱

👉Tencent #AI planted a SEED of Vision in a Large Language Model. Making LLaMA see 'n' draw stuff.

😎Review https://t.ly/QiCAv
😎Paper arxiv.org/pdf/2310.01218.pdf
😎Code github.com/AILab-CVC/SEED

🔥Visual-Math Q&A: MathVista is out! 🔥

👉 MathVista is the ultimate benchmark designed to amalgamate challenges from diverse mathematical and visual tasks

😎Review https://t.ly/yfqHZ
😎Paper https://arxiv.org/pdf/2310.02255.pdf
😎Project https://mathvista.github.io/
😎Code github.com/lupantech/MathVista

💚💙 Where Is OpenCV 5? 💙💚

👉On October 24th, the organization is launching a crowdfunding campaign to raise funds for #OpenCV 5 development.

👆Me in 2008 during my thesis work on face tracking; up to 50x faster than the previous SOTA. No chance of doing it without the OpenCV library and the support of the community.

🔥Support #OpenCV 5 to create the next generation of researchers and scientists. Spread the word: https://t.ly/UTukV

🏊SwimXYZ: Synthetic Swim🏊

👉SwimXYZ: synthetic dataset for swimming, monocular videos annotated with ground truth 2D and 3D joints

😎Review https://t.ly/F-rdF
😎Paper arxiv.org/pdf/2310.04360.pdf
😎Data g-fiche.github.io/research-pages/swimxyz

📊 TextPSG: PSG from Text 📊

👉A novel problem in #AI: Panoptic Scene Graph Generation from Purely Textual Descriptions (Caption-to-PSG)

😎Review https://t.ly/UXEmk
😎Paper arxiv.org/pdf/2310.07056.pdf
😎Project vis-www.cs.umass.edu/TextPSG
😎Code github.com/chengyzhao/TextPSG

🙋 Full Human Motion 🙋

👉OmniControl by Google is a novel framework for text-conditioned human motion generation based on a diffusion process

😎Review https://t.ly/F_0Ov
😎Paper arxiv.org/pdf/2310.08580.pdf
😎Project neu-vi.github.io/omnicontrol/

🦹‍♀️ Snap's Hyper-Realistic Human 🦹‍♀️

👉New diffusion-based #AI by Snap that generates in-the-wild human images with hyper-realism. Swipe the gallery, NUTS!👇

😎Gallery https://t.ly/cG74X
😎Paper arxiv.org/pdf/2310.08579.pdf
😎Project snap-research.github.io/HyperHuman
😎Code github.com/snap-research/HyperHuman

👗AG3D clothed avatar from 2D👗

👉The new SOTA in adversarial generation of realistic 3D people

😎Review https://t.ly/vnJO7
😎Project https://zj-dong.github.io/AG3D
😎Code https://github.com/zj-dong/AG3D
😎Paper zj-dong.github.io/AG3D/assets/paper.pdf

🌱Pose-Format: All-in-One Pose🌱

👉 Pose-format: a comprehensive toolkit designed for human pose: unified, flexible, and easy-to-use

😎Review https://t.ly/rFrhq
😎Paper arxiv.org/pdf/2310.09066.pdf
😎Code github.com/sign-language-processing/pose

😻 CatFLW: Cat Neural Landmarks 😻

👉Convolutional neural network-based landmark detection model for cat faces

😎Review https://t.ly/Y3mQ8
😎Paper arxiv.org/pdf/2305.04232.pdf
😎Dataset www.tech4animals.org/catflw

4K4D: Real-Time 4D at 4K

👉THE new SOTA in view synthesis of dynamic 3D scenes at 4K. 30x faster, up to 400 FPS. Nuts!

😎Review https://t.ly/6ddQh
😎Paper arxiv.org/pdf/2310.11448.pdf
😎Project zju3dv.github.io/4k4d/
😎Code github.com/zju3dv/4K4D

🛣️ Holistic Parking Detection (YOLO) 🛣️

👉 One-step Holistic Parking Slot Network: a tailor-made adaptation of the YOLOv4 algorithm for all-shaped parking slot detection

😎Review https://t.ly/2l4ZG
😎Paper arxiv.org/pdf/2310.11629.pdf

Cutie: VOS with heavy occlusions

👉Cutie: novel VOS for challenging scenarios with heavy occlusions & distractors

😎Review https://t.ly/W3FR-
😎Paper arxiv.org/pdf/2310.12982.pdf
😎Project https://hkchengrex.com/Cutie
😎Code https://github.com/hkchengrex/Cutie

🧡 Rotoscoping Prince Of Persia (1985) 🧡

👉 Rare footage shot for the animation of Prince of Persia (1989). Damn Romantic.

😎 More https://t.ly/xJife