AI with Papers - Artificial Intelligence & Deep Learning
All the AI, with papers. Fresh updates every day on Deep Learning, Machine Learning, and Computer Vision (with papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
๐Ÿœ REM: Segment What You Describe ๐Ÿœ

๐Ÿ‘‰REM is a framework for segmenting concepts in video that can be described via LLM. Suitable for rare & non-object dynamic concepts, such as waves, smoke, etc. Code & Data announced ๐Ÿ’™

๐Ÿ‘‰Review https://t.ly/OyVtV
๐Ÿ‘‰Paper arxiv.org/pdf/2410.23287
๐Ÿ‘‰Project https://miccooper9.github.io/projects/ReferEverything/
โ˜€๏ธ Universal Relightable Avatars โ˜€๏ธ

๐Ÿ‘‰#Meta unveils URAvatar, photorealistic & relightable avatars from phone scan with unknown illumination. Stunning results!

๐Ÿ‘‰Review https://t.ly/U-ESX
๐Ÿ‘‰Paper arxiv.org/pdf/2410.24223
๐Ÿ‘‰Project junxuan-li.github.io/urgca-website
โค11๐Ÿ”ฅ5โšก1๐Ÿ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
๐Ÿฃ CityGaussianV2: Large-Scale City ๐Ÿฃ

๐Ÿ‘‰A novel approach for large-scale scene reconstruction that addresses critical challenges related to geometric accuracy and efficiency: 10x compression, 25% faster & -50% memory! Source code released๐Ÿ’™

๐Ÿ‘‰Review https://t.ly/Xgn59
๐Ÿ‘‰Paper arxiv.org/pdf/2411.00771
๐Ÿ‘‰Project dekuliutesla.github.io/CityGaussianV2/
๐Ÿ‘‰Code github.com/DekuLiuTesla/CityGaussian
๐Ÿ‘15๐Ÿ”ฅ9โค2๐Ÿ‘1
This media is not supported in your browser
VIEW IN TELEGRAM
💪 Muscles in Time Dataset 💪

👉Muscles in Time (MinT) is a large-scale synthetic muscle-activation dataset: 9+ hours of simulation data covering 227 subjects and 402 simulated muscle strands. Code & dataset available soon 💙

👉Review https://t.ly/108g6
👉Paper arxiv.org/pdf/2411.00128
👉Project davidschneider.ai/mint
👉Code github.com/simplexsigil/MusclesInTime
🧠 Single Neuron Reconstruction 🧠

👉SIAT unveils NeuroFly, a framework for large-scale single-neuron reconstruction. It formulates the task as a streamlined three-stage workflow: automatic segmentation, connection, and manual proofreading. Bridging computer vision and neuroscience 💙

👉Review https://t.ly/Y5Xu0
👉Paper https://arxiv.org/pdf/2411.04715
👉Repo github.com/beanli161514/neurofly
โค4๐Ÿ”ฅ1๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
🫠 X-Portrait 2: SOTA(?) Portrait Animation 🫠

👉ByteDance unveils a preview of X-Portrait 2, a new SOTA expression encoder that implicitly captures every minute expression from the input, trained on large-scale datasets. Impressive results, but no paper or code announced.

👉Review https://t.ly/8Owh9 [UPDATE]
👉Paper ?
👉Project byteaigc.github.io/X-Portrait2/
👉Repo ?
โ„๏ธDonโ€™t Look Twice: ViT by RLTโ„๏ธ

๐Ÿ‘‰CMU unveils RLT: speeding up the video transformers inspired by run-length encoding for data compression. Speed the training up and reducing the token count by up to 80%! Source Code announced ๐Ÿ’™

๐Ÿ‘‰Review https://t.ly/ccSwN
๐Ÿ‘‰Paper https://lnkd.in/d6VXur_q
๐Ÿ‘‰Project https://lnkd.in/d4tXwM5T
๐Ÿ‘‰Repo TBA
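For intuition, the run-length idea can be sketched as: drop any patch token that is (nearly) identical to the token at the same spatial position in the previous frame, keeping positions so the transformer can still embed them. A minimal numpy sketch, assuming pre-extracted patch tokens; the function name, threshold, and token layout are illustrative, not the paper's implementation:

```python
import numpy as np

def run_length_tokenize(video, threshold=0.1):
    """Drop temporally static patch tokens, RLE-style.

    video: (T, N, D) array of T frames x N patch tokens x D dims.
    A token at time t is kept only if it differs from the token at
    the same spatial position in the previous frame by more than
    `threshold` (mean absolute difference). Returns the kept tokens
    and their (t, n) positions; the real method also records run
    lengths so kept tokens know how long they persist.
    """
    T, N, D = video.shape
    kept_tokens, kept_pos = [], []
    for n in range(N):
        for t in range(T):
            if t == 0:
                keep = True  # always keep the first frame's tokens
            else:
                diff = np.abs(video[t, n] - video[t - 1, n]).mean()
                keep = diff > threshold
            if keep:
                kept_tokens.append(video[t, n])
                kept_pos.append((t, n))
    return np.stack(kept_tokens), kept_pos

# A toy clip whose second half is frozen: static tokens get pruned.
rng = np.random.default_rng(0)
frame = rng.normal(size=(4, 8))            # N=4 patches, D=8 dims
moving = rng.normal(size=(4, 4, 8))        # 4 frames of motion
frozen = np.repeat(frame[None], 4, 0)      # 4 identical frames
video = np.concatenate([moving, frozen])   # T=8

tokens, pos = run_length_tokenize(video, threshold=0.1)
```

On this toy clip, 12 of the 32 tokens (the repeated frozen frames) are pruned before ever reaching attention, which is where the training and inference savings come from.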
๐Ÿ”SeedEdit: foundational T2I๐Ÿ”

๐Ÿ‘‰ByteDance unveils a novel T2I foundational model capable of delivering stable, high-aesthetic image edits which maintain image quality through unlimited rounds of editing instructions. No code announced but a Demo is online๐Ÿ’™

๐Ÿ‘‰Review https://t.ly/hPlnN
๐Ÿ‘‰Paper https://arxiv.org/pdf/2411.06686
๐Ÿ‘‰Project team.doubao.com/en/special/seededit
๐Ÿค—Demo https://huggingface.co/spaces/ByteDance/SeedEdit-APP
🔥 4-Nanosecond Inference 🔥

👉LogicTreeNet: convolutional differentiable logic gate networks with logic-gate tree kernels, bringing computer vision into differentiable LGNs. Up to 61× smaller than SOTA, with inference in 4 nanoseconds!

👉Review https://t.ly/GflOW
👉Paper https://lnkd.in/dAZQr3dW
👉Full clip https://lnkd.in/dvDJ3j-u
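For intuition: a differentiable logic gate relaxes boolean ops to real values in [0, 1] and learns a softmax mixture over candidate ops; at inference the gate hardens to its argmax op and runs as a pure boolean circuit, which is what makes nanosecond-scale inference possible. A toy numpy sketch, assuming a four-op gate (the actual networks learn over all 16 two-input boolean ops inside tree-shaped convolutional kernels):

```python
import numpy as np

def soft_gates(a, b):
    """Real-valued relaxations of binary logic ops (inputs in [0,1])."""
    return np.stack([
        a * b,              # AND
        a + b - a * b,      # OR
        a + b - 2 * a * b,  # XOR
        1.0 - a * b,        # NAND
    ])

def diff_logic_gate(a, b, logits):
    """A differentiable gate: softmax mixture over candidate ops.

    During training the mixture weights (logits) are learned by
    gradient descent; at inference the gate is hardened to the
    single argmax op, so it runs as a plain boolean gate.
    """
    w = np.exp(logits - logits.max())
    w = w / w.sum()
    return (w[:, None] * soft_gates(a, b)).sum(0)

a = np.array([0.0, 0.0, 1.0, 1.0])  # all four input combinations
b = np.array([0.0, 1.0, 0.0, 1.0])

# A gate whose logits strongly prefer XOR behaves like hard XOR.
logits = np.array([-10.0, -10.0, 10.0, -10.0])
out = diff_logic_gate(a, b, logits)
```

Because every op in `soft_gates` is differentiable in `a` and `b`, gradients flow through whole trees of such gates during training, yet the deployed network is just wired logic.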
๐Ÿ›ฅ๏ธ Global Tracklet Association MOT ๐Ÿ›ฅ๏ธ

๐Ÿ‘‰A novel universal, model-agnostic method designed to refine and enhance tracklet association for single-camera MOT. Suitable for datasets such as SportsMOT, SoccerNet & similar. Source code released๐Ÿ’™

๐Ÿ‘‰Review https://t.ly/gk-yh
๐Ÿ‘‰Paper https://lnkd.in/dvXQVKFw
๐Ÿ‘‰Repo https://lnkd.in/dEJqiyWs
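The model-agnostic flavor of tracklet association can be illustrated with a greedy merge: fuse tracklets whose time spans are disjoint (one object cannot appear twice at once) and whose appearance embeddings are similar. A hypothetical sketch of that post-processing idea, not the paper's algorithm:

```python
import numpy as np

def associate_tracklets(tracklets, sim_thresh=0.8):
    """Greedy, detector-agnostic tracklet merging via union-find.

    tracklets: list of dicts with 'emb' (unit-norm mean appearance
    embedding) and 'span' = (start_frame, end_frame). Candidate
    pairs must have disjoint spans and cosine similarity above
    sim_thresh; best matches are merged first. Returns ID groups.
    """
    n = len(tracklets)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            si, sj = tracklets[i]['span'], tracklets[j]['span']
            disjoint = si[1] < sj[0] or sj[1] < si[0]
            sim = float(tracklets[i]['emb'] @ tracklets[j]['emb'])
            if disjoint and sim > sim_thresh:
                pairs.append((sim, i, j))
    for _, i, j in sorted(pairs, reverse=True):  # best matches first
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

unit = lambda v: np.array(v, float) / np.linalg.norm(v)
tracks = [
    {'emb': unit([1, 0.1]), 'span': (0, 50)},    # player exits view...
    {'emb': unit([1, 0.0]), 'span': (60, 120)},  # ...and re-appears
    {'emb': unit([0, 1.0]), 'span': (0, 120)},   # a different player
]
groups = associate_tracklets(tracks)  # [[0, 1], [2]]
```

Because the merge only consumes tracklet outputs (boxes, spans, embeddings), it can sit on top of any single-camera tracker, which is the sense in which such refinement is "model-agnostic".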
๐Ÿ‘10๐Ÿ”ฅ4โค2
This media is not supported in your browser
VIEW IN TELEGRAM
🧶 MagicQuill: super-easy Diffusion Editing 🧶

👉MagicQuill is a novel system for smart image editing: a robust UI/UX (e.g., inserting/erasing objects, editing colors, etc.) backed by a multimodal LLM that anticipates user intentions in real time. Code & demos released 💙

👉Review https://t.ly/hJyLa
👉Paper https://arxiv.org/pdf/2411.09703
👉Project https://magicquill.art/demo/
👉Repo https://github.com/magic-quill/magicquill
👉Demo https://huggingface.co/spaces/AI4Editing/MagicQuill
🧰 EchoMimicV2: Semi-Body Human Animation 🧰

👉Alipay (Ant Group) unveils EchoMimicV2, novel SOTA half-body human animation via APD-Harmonization. See the clip with audio (ZH/ENG). Code & demo announced 💙

👉Review https://t.ly/enLxJ
👉Paper arxiv.org/pdf/2411.10061
👉Project antgroup.github.io/ai/echomimic_v2/
👉Repo-v2 github.com/antgroup/echomimic_v2
👉Repo-v1 https://github.com/antgroup/echomimic
โค5๐Ÿ”ฅ5๐Ÿ‘2
This media is not supported in your browser
VIEW IN TELEGRAM
โš”๏ธSAMurai: SAM for Trackingโš”๏ธ

๐Ÿ‘‰UWA unveils SAMURAI, an enhanced adaptation of SAM 2 specifically designed for visual object tracking. New SOTA! Code under Apache 2.0๐Ÿ’™

๐Ÿ‘‰Review https://t.ly/yGU0P
๐Ÿ‘‰Paper https://arxiv.org/pdf/2411.11922
๐Ÿ‘‰Repo https://github.com/yangchris11/samurai
๐Ÿ‘‰Project https://yangchris11.github.io/samurai/
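SAMURAI's key addition over vanilla SAM 2 is motion-aware tracking. The motion side can be illustrated with a SORT-style constant-velocity Kalman filter that scores candidate boxes by closeness to the predicted motion instead of blindly trusting the highest-score mask. A minimal sketch; the state layout, noise parameters, and candidate scoring are illustrative, not the paper's exact formulation:

```python
import numpy as np

class BoxKalman:
    """Constant-velocity Kalman filter over a box (cx, cy, w, h).

    State: [cx, cy, w, h, vx, vy]. Predict each frame, update with
    the matched observation; the prediction gives a motion prior
    for choosing among competing candidates (e.g. mask proposals).
    """
    def __init__(self, box, q=1e-2, r=1e-1):
        self.x = np.array([*box, 0.0, 0.0])
        self.P = np.eye(6)
        self.F = np.eye(6)
        self.F[0, 4] = 1.0   # cx += vx
        self.F[1, 5] = 1.0   # cy += vy
        self.H = np.eye(4, 6)
        self.Q = q * np.eye(6)
        self.R = r * np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]

    def update(self, box):
        y = np.asarray(box, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

def pick_candidate(pred_box, candidates):
    """Choose the candidate box nearest the motion prediction."""
    d = [np.linalg.norm(np.asarray(c, float) - pred_box) for c in candidates]
    return int(np.argmin(d))

# Track an object drifting right at 2 px/frame, then disambiguate
# between a distractor at cx=5 and the true object near cx=22.
kf = BoxKalman([10, 20, 8, 8])
for t in range(1, 6):
    kf.predict()
    kf.update([10 + 2 * t, 20, 8, 8])
pred = kf.predict()
idx = pick_candidate(pred, [[5, 20, 8, 8], [22, 20, 8, 8]])
```

The motion prior is what keeps a tracker from latching onto a similar-looking distractor during occlusion or crowding, which is exactly the failure mode tracking-oriented SAM adaptations target.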
🦖 DINO-X: Unified Object-Centric LVM 🦖

👉A unified vision model for open-world detection, segmentation, phrase grounding, visual counting, pose estimation, prompt-free detection/recognition, dense captioning, & more. Demo & API announced 💙

👉Review https://t.ly/CSQon
👉Paper https://lnkd.in/dc44ZM8v
👉Project https://lnkd.in/dehKJVvC
👉Repo https://lnkd.in/df8Kb6iz
🌎 All Languages Matter: LMMs vs. 100 Lang. 🌎

👉ALM-Bench aims to assess the next generation of massively multilingual multimodal models in a standardized way, pushing the boundaries of LMMs towards better cultural understanding and inclusivity. Code & dataset 💙

👉Review https://t.ly/VsoJB
👉Paper https://lnkd.in/ddVVZfi2
👉Project https://lnkd.in/dpssaeRq
👉Code https://lnkd.in/dnbaJJE4
👉Dataset https://lnkd.in/drw-_95v
โค3๐Ÿ‘1๐Ÿ‘1๐Ÿคฉ1
This media is not supported in your browser
VIEW IN TELEGRAM
🦙 EdgeCape: SOTA Agnostic Pose 🦙

👉EdgeCape sets a new SOTA in Category-Agnostic Pose Estimation (CAPE): locating keypoints across diverse object categories using only one or a few annotated support images. Source code released 💙

👉Review https://t.ly/4TpAs
👉Paper https://arxiv.org/pdf/2411.16665
👉Project https://orhir.github.io/edge_cape/
👉Code https://github.com/orhir/EdgeCape
🛟 StableAnimator: ID-Aware Humans 🛟

👉StableAnimator: the first end-to-end ID-preserving diffusion framework for HQ human animation videos without any post-processing. Input: a single image + a sequence of poses. Insane results!

👉Review https://t.ly/JDtL3
👉Paper https://arxiv.org/pdf/2411.17697
👉Project francis-rings.github.io/StableAnimator/
👉Code github.com/Francis-Rings/StableAnimator
๐Ÿ‘12โค3๐Ÿคฏ2๐Ÿ”ฅ1
This media is not supported in your browser
VIEW IN TELEGRAM
🧶 SOTA Track-by-Propagation 🧶

👉SambaMOTR is a novel end-to-end model (based on Samba) that exploits long-range dependencies and interactions between tracklets to handle complex motion patterns and occlusions. Code in Jan. '25 💙

👉Review https://t.ly/QSQ8L
👉Paper arxiv.org/pdf/2410.01806
👉Project sambamotr.github.io/
👉Repo https://lnkd.in/dRDX6nk2
โค5๐Ÿ”ฅ2๐Ÿคฏ1
This media is not supported in your browser
VIEW IN TELEGRAM
👺 HiFiVFS: Extreme Face Swapping 👺

👉HiFiVFS: HQ face-swapping videos even in extremely challenging scenarios (occlusion, makeup, lighting, extreme poses, etc.). Impressive results, but no code announced 😢

👉Review https://t.ly/ea8dU
👉Paper https://arxiv.org/pdf/2411.18293
👉Project https://cxcx1996.github.io/HiFiVFS
🔥 Video Depth without Video Models 🔥

👉RollingDepth: turning a single-image latent diffusion model (LDM) into the new SOTA video depth estimator. It works better than dedicated video-depth models 🤯 Code under Apache 💙

👉Review https://t.ly/R4LqS
👉Paper https://arxiv.org/pdf/2411.19189
👉Project https://rollingdepth.github.io/
👉Repo https://github.com/prs-eth/rollingdepth
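The stitching step behind this kind of snippet-based video depth can be illustrated as follows: a single-image depth model produces depth that is only defined up to an affine (scale + shift) transform, so per-snippet predictions are co-aligned by least squares on their overlapping frames. A toy pairwise sketch under that assumption (RollingDepth itself uses dilated snippets and a global robust alignment, not this simple chain):

```python
import numpy as np

def align_snippets(snippets, overlap):
    """Stitch affine-invariant per-snippet depth into one track.

    snippets: list of snippets, each a list of 2-D depth frames.
    The first snippet fixes the global frame; for each following
    snippet we solve least-squares scale s and shift t so that its
    first `overlap` frames agree with the frames already stitched,
    then append the remainder transformed by (s, t).
    """
    out = [f.copy() for f in snippets[0]]
    for snip in snippets[1:]:
        a = np.concatenate([f.ravel() for f in out[-overlap:]])
        b = np.concatenate([f.ravel() for f in snip[:overlap]])
        A = np.stack([b, np.ones_like(b)], 1)   # solve s*b + t ≈ a
        (s, t), *_ = np.linalg.lstsq(A, a, rcond=None)
        out.extend(s * f + t for f in snip[overlap:])
    return out

# Toy scene: 5 ground-truth depth frames; the second snippet comes
# back in a different affine frame (scale 2, shift 3) and must be
# re-aligned using the single shared frame.
gt = [np.arange(4.0).reshape(2, 2) + i for i in range(5)]
snip1 = [f.copy() for f in gt[:3]]
snip2 = [2.0 * f + 3.0 for f in gt[2:]]
aligned = align_snippets([snip1, snip2], overlap=1)
```

Here the one overlapping frame is enough to recover the (0.5, -1.5) correction exactly; with noisy real predictions, longer overlaps and robust fitting take the place of this exact solve.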