AI with Papers - Artificial Intelligence & Deep Learning

⛈️ SMITE: SEGMENT IN TIME ⛈️

👉SFU unveils SMITE, a novel method that, given only one or a few fine-grained segmentation references, segments unseen videos consistently with those references. Dataset & Code (under Apache 2.0) announced 💙

👉Review https://t.ly/w6aWJ
👉Paper arxiv.org/pdf/2410.18538
👉Project segment-me-in-time.github.io/
👉Repo github.com/alimohammadiamirhossein/smite

🤯11❤4🤩1

8.33K views, edited 10:49

AI with Papers - Artificial Intelligence & Deep Learning

🫐 Blendify: #Python + Blender 🫐

👉Lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender. It simplifies data augmentation & synthesis. Source Code released💙

👉Review https://t.ly/l0crA
👉Paper https://arxiv.org/pdf/2410.17858
👉Code https://virtualhumans.mpi-inf.mpg.de/blendify/

🤩13👍4🔥4❤2👏1

7.74K views, edited 07:53

AI with Papers - Artificial Intelligence & Deep Learning

🔥 D-FINE: new SOTA Detector 🔥

👉D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available 💙

👉Review https://t.ly/aw9fN
👉Paper https://arxiv.org/pdf/2410.13842
👉Code https://github.com/Peterande/D-FINE

❤16👍3👏1🤯1

7.93K views, 08:00

AI with Papers - Artificial Intelligence & Deep Learning

🔫 Free-Moving Reconstruction 🔫 👉EPFL (+#MagicLeap) unveils a novel approach for reconstructing a free-moving object from a monocular RGB clip. It allows free interaction with objects in front of a moving camera without relying on any prior, and optimizes the sequence globally…

🔥🔥 The code is out 🔥🔥

👉Code https://github.com/HaixinShi/fmov_pose

GitHub - HaixinShi/fmov_pose: the official repo for the implementation of Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera (AAAI 2025).

👍1

8.56K views, 08:13

AI with Papers - Artificial Intelligence & Deep Learning

🍜 REM: Segment What You Describe 🍜

👉REM is a framework for segmenting concepts in video that can be described through natural language. Suitable for rare & non-object dynamic concepts, such as waves, smoke, etc. Code & Data announced 💙

👉Review https://t.ly/OyVtV
👉Paper arxiv.org/pdf/2410.23287
👉Project https://miccooper9.github.io/projects/ReferEverything/

🔥18❤4👍3🤩2🤯1😍1

9.67K views, edited 07:30

AI with Papers - Artificial Intelligence & Deep Learning

☀️ Universal Relightable Avatars ☀️

👉#Meta unveils URAvatar: photorealistic & relightable avatars from a phone scan with unknown illumination. Stunning results!

👉Review https://t.ly/U-ESX
👉Paper arxiv.org/pdf/2410.24223
👉Project junxuan-li.github.io/urgca-website

❤11🔥5⚡1👍1

7.75K views, 07:39

AI with Papers - Artificial Intelligence & Deep Learning

🏣 CityGaussianV2: Large-Scale City 🏣

👉A novel approach for large-scale scene reconstruction that addresses critical challenges related to geometric accuracy and efficiency: 10x compression, 25% faster & 50% less memory! Source code released💙

👉Review https://t.ly/Xgn59
👉Paper arxiv.org/pdf/2411.00771
👉Project dekuliutesla.github.io/CityGaussianV2/
👉Code github.com/DekuLiuTesla/CityGaussian

👍15🔥9❤2👏1

8.36K views, 07:22

AI with Papers - Artificial Intelligence & Deep Learning

💪 Muscles in Time Dataset 💪

👉Muscles in Time (MinT) is a large-scale synthetic muscle activation dataset. MinT contains 9+ hours of simulation data covering 227 subjects and 402 simulated muscle strands. Code & Dataset available soon 💙

👉Review https://t.ly/108g6
👉Paper arxiv.org/pdf/2411.00128
👉Project davidschneider.ai/mint
👉Code github.com/simplexsigil/MusclesInTime

🔥8❤3👍3

7.39K views, 08:24

AI with Papers - Artificial Intelligence & Deep Learning

🧠 Single Neuron Reconstruction 🧠

👉SIAT unveils NeuroFly, a framework for large-scale single neuron reconstruction. It formulates neuron reconstruction as a streamlined 3-stage workflow: automatic segmentation, connection, and manual proofreading. Bridging computer vision and neuroscience 💙

👉Review https://t.ly/Y5Xu0
👉Paper https://arxiv.org/pdf/2411.04715
👉Repo github.com/beanli161514/neurofly

❤4🔥1🤩1

7.5K views, 09:30

AI with Papers - Artificial Intelligence & Deep Learning

🫠 X-Portrait 2: SOTA(?) Portrait Animation 🫠

👉ByteDance unveils a preview of X-Portrait2, a new SOTA expression encoder model that implicitly encodes every minuscule expression from the input by training on large-scale datasets. Impressive results, but no paper or code announced.

👉Review https://t.ly/8Owh9 [UPDATE]
👉Paper ?
👉Project byteaigc.github.io/X-Portrait2/
👉Repo ?

🔥13🤯5👍4❤1👏1

7.71K views, edited 10:43

AI with Papers - Artificial Intelligence & Deep Learning

❄️Don’t Look Twice: ViT by RLT❄️

👉CMU unveils RLT, which speeds up video transformers with an approach inspired by run-length encoding for data compression. It speeds up training and reduces the token count by up to 80%! Source Code announced 💙

👉Review https://t.ly/ccSwN
👉Paper https://lnkd.in/d6VXur_q
👉Project https://lnkd.in/d4tXwM5T
👉Repo TBA
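
👉Sketch: a minimal, hypothetical illustration of the run-length idea behind the post above, not the authors' implementation. It assumes each frame is already split into patches (plain lists of floats here) and that a patch "repeats" when its mean absolute difference from the same patch in the previous frame stays under a threshold. The names `run_length_tokenize` and `mean_abs_diff` and the threshold value are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Patch = Sequence[float]
Frame = Sequence[Patch]


def mean_abs_diff(a: Patch, b: Patch) -> float:
    """Mean absolute difference between two patches of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def run_length_tokenize(
    frames: Sequence[Frame], threshold: float = 0.1
) -> List[Tuple[int, int, int]]:
    """Collapse temporally repeated patches into (start_frame, patch_index, run_length).

    A run at a given patch position ends when the patch differs from the
    previous frame's patch by more than `threshold`, or when the clip ends.
    """
    tokens: List[Tuple[int, int, int]] = []
    if not frames:
        return tokens
    num_frames = len(frames)
    for p in range(len(frames[0])):
        start = 0
        for t in range(1, num_frames + 1):
            # Short-circuit keeps frames[t] from being read past the end.
            run_ended = (
                t == num_frames
                or mean_abs_diff(frames[t][p], frames[t - 1][p]) > threshold
            )
            if run_ended:
                tokens.append((start, p, t - start))
                start = t
    tokens.sort()
    return tokens


# Toy clip: 4 frames x 2 patches. Patch 0 is static; patch 1 changes at frame 2.
clip = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0]],
    [[0.0, 0.0], [5.0, 5.0]],
]
tokens = run_length_tokenize(clip)
print(tokens)       # [(0, 0, 4), (0, 1, 2), (2, 1, 2)]
print(len(tokens))  # 3 tokens instead of 4 * 2 = 8
```

In this toy case the static patch becomes a single token with a run length of 4, which is where the token savings come from; how the real model feeds run lengths to the transformer is not described in the post.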

🔥9👍3❤1🤩1

7.86K views, edited 13:47

AI with Papers - Artificial Intelligence & Deep Learning

🐔SeedEdit: foundational T2I🐔

👉ByteDance unveils a novel T2I foundational model capable of delivering stable, high-aesthetic image edits which maintain image quality through unlimited rounds of editing instructions. No code announced but a Demo is online💙

👉Review https://t.ly/hPlnN
👉Paper https://arxiv.org/pdf/2411.06686
👉Project team.doubao.com/en/special/seededit
🤗Demo https://huggingface.co/spaces/ByteDance/SeedEdit-APP

🔥10❤6🤩1

7.59K views, edited 07:51

AI with Papers - Artificial Intelligence & Deep Learning

🔥 4 NanoSeconds inference 🔥

👉LogicTreeNet: a convolutional differentiable logic gate network (LGN) with logic gate tree kernels, bringing computer vision to differentiable LGNs. Up to 6100% smaller than SOTA, with inference in 4 nanoseconds!

👉Review https://t.ly/GflOW
👉Paper https://lnkd.in/dAZQr3dW
👉Full clip https://lnkd.in/dvDJ3j-u
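
👉Sketch: a rough illustration of what "differentiable logic gate" usually means, not LogicTreeNet's code. It assumes the common relaxation in which each two-input gate is a softmax-weighted mixture of real-valued versions of the 16 Boolean functions during training, and is replaced by its single most likely gate at inference. The convolutional tree kernels from the post are not modeled, and the names `soft_gate` and `hard_gate` are hypothetical.

```python
import math
from typing import Callable, List, Sequence

# Real-valued relaxations of the 16 two-input Boolean functions, inputs in [0, 1].
GATES: List[Callable[[float, float], float]] = [
    lambda a, b: 0.0,                      # FALSE
    lambda a, b: a * b,                    # a AND b
    lambda a, b: a - a * b,                # a AND NOT b
    lambda a, b: a,                        # a
    lambda a, b: b - a * b,                # NOT a AND b
    lambda a, b: b,                        # b
    lambda a, b: a + b - 2 * a * b,        # a XOR b
    lambda a, b: a + b - a * b,            # a OR b
    lambda a, b: 1 - (a + b - a * b),      # a NOR b
    lambda a, b: 1 - (a + b - 2 * a * b),  # a XNOR b
    lambda a, b: 1 - b,                    # NOT b
    lambda a, b: 1 - b + a * b,            # a OR NOT b
    lambda a, b: 1 - a,                    # NOT a
    lambda a, b: 1 - a + a * b,            # NOT a OR b
    lambda a, b: 1 - a * b,                # a NAND b
    lambda a, b: 1.0,                      # TRUE
]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(a: float, b: float, logits: Sequence[float]) -> float:
    """Training-time gate: softmax-weighted mixture of all 16 relaxed gates."""
    weights = softmax(logits)
    return sum(w * gate(a, b) for w, gate in zip(weights, GATES))


def hard_gate(a: float, b: float, logits: Sequence[float]) -> float:
    """Inference-time gate: only the gate with the largest logit is evaluated."""
    best = max(range(len(GATES)), key=lambda i: logits[i])
    return GATES[best](a, b)


# A gate whose logits strongly prefer XOR (index 6).
xor_logits = [0.0] * 16
xor_logits[6] = 50.0
print(soft_gate(1.0, 0.0, xor_logits))
print(hard_gate(1.0, 1.0, xor_logits))
```

Because the trained network reduces to plain logic gates at inference, it can be mapped to hardware, which is the usual explanation for nanosecond-scale latencies of this kind.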

🔥29🤯12👍1🤩1

8.22K views, 07:54

AI with Papers - Artificial Intelligence & Deep Learning

🛥️ Global Tracklet Association MOT 🛥️

👉A novel universal, model-agnostic method designed to refine and enhance tracklet association for single-camera MOT. Suitable for datasets such as SportsMOT, SoccerNet & similar. Source code released💙

👉Review https://t.ly/gk-yh
👉Paper https://lnkd.in/dvXQVKFw
👉Repo https://lnkd.in/dEJqiyWs

👍10🔥4❤2

8.24K views, 07:32

AI with Papers - Artificial Intelligence & Deep Learning

🧶 MagicQuill: super-easy Diffusion Editing 🧶

👉MagicQuill is a novel system designed to support users in smart editing of images. Robust UI/UX (e.g., inserting/erasing objects, colors, etc.) under a multimodal LLM to anticipate user intentions in real time. Code & Demos released 💙

👉Review https://t.ly/hJyLa
👉Paper https://arxiv.org/pdf/2411.09703
👉Project https://magicquill.art/demo/
👉Repo https://github.com/magic-quill/magicquill
👉Demo https://huggingface.co/spaces/AI4Editing/MagicQuill

🤩7🔥4❤3👍2

8.85K views, edited 13:47

AI with Papers - Artificial Intelligence & Deep Learning

🧰 EchoMimicV2: Semi-body Human 🧰

👉Alipay (ANT Group) unveils EchoMimicV2, a novel SOTA method for half-body human animation via APD-Harmonization. See clip with audio (ZH/ENG). Code & Demo announced💙

👉Review https://t.ly/enLxJ
👉Paper arxiv.org/pdf/2411.10061
👉Project antgroup.github.io/ai/echomimic_v2/
👉Repo-v2 github.com/antgroup/echomimic_v2
👉Repo-v1 https://github.com/antgroup/echomimic

❤5🔥5👏2

9.09K views, 10:31

AI with Papers - Artificial Intelligence & Deep Learning

⚔️SAMurai: SAM for Tracking⚔️

👉The University of Washington unveils SAMURAI, an enhanced adaptation of SAM 2 specifically designed for visual object tracking. New SOTA! Code under Apache 2.0💙

👉Review https://t.ly/yGU0P
👉Paper https://arxiv.org/pdf/2411.11922
👉Repo https://github.com/yangchris11/samurai
👉Project https://yangchris11.github.io/samurai/

🔥20❤6😍2⚡1👏1🤯1

8.5K views, 07:40

AI with Papers - Artificial Intelligence & Deep Learning

🦖Dino-X: Unified Obj-Centric LVM🦖

👉Unified vision model for Open-World Detection, Segmentation, Phrase Grounding, Visual Counting, Pose, Prompt-Free Detection/Recognition, Dense Caption, & more. Demo & API announced 💙

👉Review https://t.ly/CSQon
👉Paper https://lnkd.in/dc44ZM8v
👉Project https://lnkd.in/dehKJVvC
👉Repo https://lnkd.in/df8Kb6iz

🔥12🤯8❤4👍3🤩1

8.46K views, edited 09:09

AI with Papers - Artificial Intelligence & Deep Learning

🌎All Languages Matter: LMMs vs. 100 Lang.🌎

👉ALM-Bench aims to assess the next generation of massively multilingual multimodal models in a standardized way, pushing the boundaries of LMMs towards better cultural understanding and inclusivity. Code & Dataset 💙

👉Review https://t.ly/VsoJB
👉Paper https://lnkd.in/ddVVZfi2
👉Project https://lnkd.in/dpssaeRq
👉Code https://lnkd.in/dnbaJJE4
👉Dataset https://lnkd.in/drw-_95v

❤3👍1👏1🤩1

7.31K views, 08:27
