AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
95 photos
235 videos
11 files
1.26K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
🪷 Diffusive Consistent Video Editing 🪷

👉 Weizmann Institute of Science unveils TokenFlow, a novel framework that leverages a text-to-image diffusion model for text-driven video editing

😎Review https://t.ly/ru8km
😎Paper arxiv.org/pdf/2307.10373.pdf
😎Project diffusion-tokenflow.github.io
😎Code github.com/omerbt/TokenFlow
🔥🔥 #META's DINOv2 is now commercial! 🔥🔥

👉Universal features for image classification, instance retrieval, video understanding, depth estimation & semantic segmentation. Now suitable for commercial use.

😎Review https://t.ly/LNrGy
😎Paper arxiv.org/pdf/2304.07193.pdf
😎Code github.com/facebookresearch/dinov2
😎Demo dinov2.metademolab.com/
🧄FreeMan: towards #3D Humans 🧄

👉FreeMan: the first large-scale, real-world, multi-view dataset for #3D human pose estimation. 11M frames!

😎Review https://t.ly/ICxpA
😎Paper arxiv.org/pdf/2309.05073.pdf
😎Project wangjiongw.github.io/freeman
🦊 MagiCapture: HD Multi-Concept Portrait 🦊

👉KAIST unveils MagiCapture: integrating subject and style concepts to generate high-resolution portrait images using just a few subject and style references

😎Review https://t.ly/c9rOo
😎Paper https://arxiv.org/pdf/2309.06895.pdf
Dynamic NeRFs for Soccer

👉SoccerNeRF: a first attempt at applying "cheap" NeRFs to football, reconstructing soccer replays in space and time.

😎Review https://t.ly/Ywcvk
😎Paper arxiv.org/pdf/2309.06802.pdf
😎Project https://soccernerfs.isach.be/
😎Code github.com/iSach/SoccerNeRFs
☢️ GlueStick: Graph Neural Matching ☢️

👉GlueStick is a joint deep matcher for points and lines that leverages the connectivity information between nodes to better glue them together

😎Review https://t.ly/Atxqo
😎Paper arxiv.org/pdf/2304.02008.pdf
😎Code https://github.com/cvg/GlueStick
🫀CPR-Coach: Neural Cardiopulmonary Resuscitation🫀

👉CPR-Coach: fine-grained action recognition in cardiopulmonary resuscitation

😎Review https://t.ly/Qbg4K
😎Paper arxiv.org/pdf/2309.11718.pdf
😎Code github.com/Shunli-Wang/CPR-Coach
😎Project shunli-wang.github.io/CPR-Coach
🧪 NeuralLabeling with NeRF 🧪

👉Annotating a scene by generating segmentation masks, affordance maps, 2D bounding boxes, 3D BB, 6DOF poses, depth & meshes.

😎Review https://t.ly/1GPsj
😎Paper arxiv.org/pdf/2309.11966.pdf
😎Code github.com/FlorisE/neural-labeling
😎Project florise.github.io/neural_labeling_web
🍟 DE-ViT: detecting everything via DINOv2 🍟

👉DE-ViT: an open-set object detector based on a DINOv2 backbone. It's the new SOTA on the COCO & LVIS datasets

😎Review https://t.ly/_DAmt
😎Paper arxiv.org/pdf/2309.12969.pdf
😎Code https://github.com/mlzxy/devit
🛵CoTracker: fast transformer-tracker🛵

👉META's CoTracker is a fast transformer-based model that can track any point in a video

😎Review https://t.ly/M36A_
😎Paper arxiv.org/pdf/2307.07635.pdf
😎Project https://co-tracker.github.io/
😎Code github.com/facebookresearch/co-tracker
🌬️ Neural Blowing in Still Photos 🌬️

👉 A novel approach to animate human hair (and clothes) in still portraits

😎Review https://t.ly/HKG0t
😎Paper arxiv.org/pdf/2309.14207.pdf
😎Project nevergiveu.github.io/AutomaticHairBlowing
🌮 OW Indoor Segmentation 🌮

👉3D-OWIS is a novel open-world 3D indoor instance segmentation method (with auto-labeling scheme) to separate known/unknown category labels

😎Review https://t.ly/-7ALf
😎Paper arxiv.org/pdf/2309.14338.pdf
😎Code github.com/aminebdj/3D-OWIS
🧱 Generating Scenes from Touch 🧱

👉#AI for synthesizing images from tactile signals (and vice versa), applied to a number of visuo-tactile synthesis tasks

😎Review https://t.ly/Gxr0L
😎Paper https://arxiv.org/pdf/2309.15117.pdf
😎Project https://fredfyyang.github.io/vision-from-touch
😎Code https://github.com/fredfyyang/vision-from-touch
Decaf: 3D Face-Hand Interactions

👉The first learning-based MoCap to track human hands interacting with human faces in #3D from single monocular RGB videos

😎Review https://t.ly/070Tj
😎Paper arxiv.org/pdf/2309.16670.pdf
😎Project vcai.mpi-inf.mpg.de/projects/Decaf
🌱 Making LLaMA See and Draw 🌱

👉Tencent #AI planted a SEED of Vision in Large Language Model. Making LLaMA see 'n' draw stuff.

😎Review https://t.ly/QiCAv
😎Paper arxiv.org/pdf/2310.01218.pdf
😎Code github.com/AILab-CVC/SEED
🔥Visual-Math Q&A: MathVista is out! 🔥

👉 MathVista is the ultimate benchmark designed to amalgamate challenges from diverse mathematical and visual tasks

😎Review https://t.ly/yfqHZ
😎Paper https://arxiv.org/pdf/2310.02255.pdf
😎Project https://mathvista.github.io/
😎Code github.com/lupantech/MathVista
💚💙 Where Is OpenCV 5? 💙💚

👉On October 24th, the organization is launching a crowdfunding campaign to raise funds for #OpenCV 5 development.

👆Me in 2008 during my thesis work on face tracking: up to 50x faster than the previous SOTA. There would have been no chance of doing it without the OpenCV library and support from the community.

🔥Support #OpenCV 5 to create the next generation of researchers and scientists. Spread the word: https://t.ly/UTukV
🏊SwimXYZ: Synthetic Swim🏊

👉SwimXYZ: synthetic dataset for swimming, monocular videos annotated with ground truth 2D and 3D joints

😎Review https://t.ly/F-rdF
😎Paper arxiv.org/pdf/2310.04360.pdf
😎Data g-fiche.github.io/research-pages/swimxyz
📊 TextPSG: PSG from Text 📊

👉A novel problem in #AI: Panoptic Scene Graph Generation from Purely Textual Descriptions (Caption-to-PSG)

😎Review https://t.ly/UXEmk
😎Paper arxiv.org/pdf/2310.07056.pdf
😎Project vis-www.cs.umass.edu/TextPSG
😎Code github.com/chengyzhao/TextPSG
🙋 Full Human Motion 🙋

👉OmniControl by Google is a novel framework for text-conditioned human motion generation based on a diffusion process

😎Review https://t.ly/F_0Ov
😎Paper arxiv.org/pdf/2310.08580.pdf
😎Project neu-vi.github.io/omnicontrol/