AI with Papers - Artificial Intelligence & Deep Learning
15K subscribers
96 photos
238 videos
11 files
1.27K links
All the AI with papers. Every day fresh updates on Deep Learning, Machine Learning, and Computer Vision (with Papers).

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/
☀️ GS + Depth = SOTA ☀️

👉DepthSplat, the new SOTA in depth estimation & novel view synthesis. The key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source Code to be released soon💙

👉Review https://t.ly/87HuH
👉Paper arxiv.org/abs/2410.13862
👉Project haofeixu.github.io/depthsplat/
👉Code github.com/cvg/depthsplat
🔥BitNet: code of 1-bit LLM released🔥

👉BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. It introduces BitLinear as a drop-in replacement for the nn.Linear layer, so that 1-bit weights can be trained from scratch. Source Code just released 💙

👉Review https://t.ly/3G2LA
👉Paper arxiv.org/pdf/2310.11453
👉Code https://lnkd.in/duPADJVb
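To make the BitLinear idea concrete, here is a minimal forward-only sketch in plain Python. It assumes the weight binarization described in the BitNet paper (zero-center the weights, take the sign, rescale the output by the mean absolute weight). The function names `binarize_weights` and `bit_linear` are hypothetical, and the sketch leaves out the activation quantization, normalization and straight-through gradient that training needs. It is not the released code.

```python
def binarize_weights(weights):
    """Map a weight matrix to {-1, +1} and return it with a rescaling factor."""
    flat = [w for row in weights for w in row]
    alpha = sum(flat) / len(flat)                 # mean weight, used to zero-center
    beta = sum(abs(w) for w in flat) / len(flat)  # mean absolute weight, used to rescale
    signs = [[1 if w - alpha > 0 else -1 for w in row] for row in weights]
    return signs, beta


def bit_linear(x, weights):
    """Forward pass y = beta * (sign(W - alpha) @ x) for a single input vector x."""
    signs, beta = binarize_weights(weights)
    return [beta * sum(s * xi for s, xi in zip(row, x)) for row in signs]


# Toy example: a 2x3 full-precision weight matrix and one input vector.
W = [[0.5, -0.25, 0.75],
     [-1.0, 0.25, 0.5]]
x = [1.0, 2.0, 3.0]

signs, beta = binarize_weights(W)
y = bit_linear(x, W)
print(signs)  # the 1-bit weights
print(y)      # rescaled output
```

The matrix product only needs additions and subtractions, because every weight is +1 or -1. The single float `beta` restores the output scale.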
🧿 Look Ma, no markers 🧿

👉#Microsoft unveils the first technique for marker-free, HQ reconstruction of the COMPLETE human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Impressive results! Repo for training & Dataset released💙

👉Review https://t.ly/5fN0g
👉Paper arxiv.org/pdf/2410.11520
👉Project microsoft.github.io/SynthMoCap/
👉Repo github.com/microsoft/SynthMoCap
🪁 PL2Map: efficient neural 2D-3D 🪁

👉PL2Map is a novel neural network tailored for efficient representation of complex point & line maps, which offer a natural representation of 2D-3D correspondences

👉Review https://t.ly/D-bVD
👉Paper arxiv.org/pdf/2402.18011
👉Project https://thpjp.github.io/pl2map
👉Code https://github.com/ais-lab/pl2map
🌻 Plant Camouflage Detection🌻

👉PlantCamo Dataset is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source Code released 💙

👉Review https://t.ly/pYFX4
👉Paper arxiv.org/pdf/2410.17598
👉Code github.com/yjybuaa/PlantCamo
⛈️ SMITE: SEGMENT IN TIME ⛈️

👉SFU unveils SMITE: a novel AI that, given only one or a few fine-grained segmentation references, is able to segment different unseen videos consistently with those references. Dataset & Code (under Apache 2.0) announced 💙

👉Review https://t.ly/w6aWJ
👉Paper arxiv.org/pdf/2410.18538
👉Project segment-me-in-time.github.io/
👉Repo github.com/alimohammadiamirhossein/smite
🫐 Blendify: #Python + Blender 🫐

👉Lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender. It simplifies data augmentation & synthesis. Source Code released💙

👉Review https://t.ly/l0crA
👉Paper https://arxiv.org/pdf/2410.17858
👉Code https://virtualhumans.mpi-inf.mpg.de/blendify/
🔥 D-FINE: new SOTA Detector 🔥

👉D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available 💙

👉Review https://t.ly/aw9fN
👉Paper https://arxiv.org/pdf/2410.13842
👉Code https://github.com/Peterande/D-FINE
🍜 REM: Segment What You Describe 🍜

👉REM is a framework for segmenting concepts in video that can be described via LLM. Suitable for rare & non-object dynamic concepts, such as waves, smoke, etc. Code & Data announced 💙

👉Review https://t.ly/OyVtV
👉Paper arxiv.org/pdf/2410.23287
👉Project https://miccooper9.github.io/projects/ReferEverything/
☀️ Universal Relightable Avatars ☀️

👉#Meta unveils URAvatar: photorealistic & relightable avatars from a phone scan with unknown illumination. Stunning results!

👉Review https://t.ly/U-ESX
👉Paper arxiv.org/pdf/2410.24223
👉Project junxuan-li.github.io/urgca-website
🏣 CityGaussianV2: Large-Scale City 🏣

👉A novel approach for large-scale scene reconstruction that addresses critical challenges related to geometric accuracy and efficiency: 10x compression, 25% faster & -50% memory! Source code released💙

👉Review https://t.ly/Xgn59
👉Paper arxiv.org/pdf/2411.00771
👉Project dekuliutesla.github.io/CityGaussianV2/
👉Code github.com/DekuLiuTesla/CityGaussian
💪 Muscles in Time Dataset 💪

👉Muscles in Time (MinT) is a large-scale synthetic muscle activation dataset. MinT contains 9+ hours of simulation data covering 227 subjects and 402 simulated muscle strands. Code & Dataset available soon 💙

👉Review https://t.ly/108g6
👉Paper arxiv.org/pdf/2411.00128
👉Project davidschneider.ai/mint
👉Code github.com/simplexsigil/MusclesInTime
🧠 Single Neuron Reconstruction 🧠

👉SIAT unveils NeuroFly, a framework for large-scale single neuron reconstruction. It formulates the neuron reconstruction task as a streamlined 3-stage workflow: automatic segmentation, connection, and manual proofreading. Bridging computer vision and neuroscience 💙

👉Review https://t.ly/Y5Xu0
👉Paper https://arxiv.org/pdf/2411.04715
👉Repo github.com/beanli161514/neurofly
🫠 X-Portrait 2: SOTA(?) Portrait Animation 🫠

👉ByteDance unveils a preview of X-Portrait2, the new SOTA expression encoder model that implicitly encodes every minuscule expression from the input thanks to training on large-scale datasets. Impressive results, but no paper or code announced.

👉Review https://t.ly/8Owh9 [UPDATE]
👉Paper ?
👉Project byteaigc.github.io/X-Portrait2/
👉Repo ?
❄️Don’t Look Twice: ViT by RLT❄️

👉CMU unveils RLT: a way to speed up video transformers, inspired by run-length encoding for data compression. It speeds up training and reduces the token count by up to 80%! Source Code announced 💙

👉Review https://t.ly/ccSwN
👉Paper https://lnkd.in/d6VXur_q
👉Project https://lnkd.in/d4tXwM5T
👉Repo TBA
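The run-length idea behind RLT can be illustrated with a toy sketch in plain Python. It assumes each patch is a single scalar value and that a patch counts as "repeated" when it stays within a threshold of the first patch in its run. The name `run_length_tokens` and the `threshold` parameter are hypothetical, and this is an illustration of run-length encoding over time, not the authors' implementation.

```python
def run_length_tokens(frames, threshold=0.0):
    """Collapse temporally repeated patches into (patch_index, value, run_length) tokens.

    frames: list of frames, each a list of scalar patch values of equal length.
    """
    tokens = []
    num_patches = len(frames[0])
    for p in range(num_patches):
        start_value = frames[0][p]
        run = 1
        for frame in frames[1:]:
            if abs(frame[p] - start_value) <= threshold:
                run += 1  # patch is (nearly) unchanged: extend the current run
            else:
                tokens.append((p, start_value, run))  # close the run, start a new one
                start_value, run = frame[p], 1
        tokens.append((p, start_value, run))
    return tokens


# 4 frames with 2 patches each: 8 patch tokens before compression.
frames = [
    [1.0, 5.0],
    [1.0, 5.0],
    [1.0, 7.0],
    [2.0, 7.0],
]
tokens = run_length_tokens(frames)
print(tokens)       # [(0, 1.0, 3), (0, 2.0, 1), (1, 5.0, 2), (1, 7.0, 2)]
print(len(tokens))  # 4
```

Static regions shrink to a single token that carries its duration, so the token count depends on how much the video actually changes rather than on its length.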
🐔SeedEdit: foundational T2I🐔

👉ByteDance unveils a novel T2I foundational model capable of delivering stable, high-aesthetic image edits which maintain image quality through unlimited rounds of editing instructions. No code announced but a Demo is online💙

👉Review https://t.ly/hPlnN
👉Paper https://arxiv.org/pdf/2411.06686
👉Project team.doubao.com/en/special/seededit
🤗Demo https://huggingface.co/spaces/ByteDance/SeedEdit-APP
🔥 4 NanoSeconds inference 🔥

👉LogicTreeNet: a convolutional differentiable logic gate network (LGN) with logic gate tree kernels, bringing differentiable LGNs to Computer Vision. Up to 6100% smaller than SOTA, inference in 4 NANOsecs!

👉Review https://t.ly/GflOW
👉Paper https://lnkd.in/dAZQr3dW
👉Full clip https://lnkd.in/dvDJ3j-u
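As a rough illustration of what a "differentiable logic gate" with a tree kernel could look like, here is a sketch in plain Python. It assumes the standard real-valued relaxations of Boolean gates (for example AND as a*b) and a softmax over gate choices. The reduced set of four gates, the depth-2 tree over 4 inputs, and the names `soft_gate` and `tree_kernel` are assumptions for illustration, not the paper's architecture or code.

```python
import math

# Real-valued relaxations of Boolean gates for inputs in [0, 1].
GATES = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: 1 - a * b,          # NAND
]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(a, b, logits):
    """Differentiable gate: softmax-weighted mixture of the relaxed gates."""
    probs = softmax(logits)
    return sum(p * g(a, b) for p, g in zip(probs, GATES))


def tree_kernel(inputs, logits):
    """Depth-2 binary tree over 4 inputs: two leaf gates feed one root gate."""
    left = soft_gate(inputs[0], inputs[1], logits[0])
    right = soft_gate(inputs[2], inputs[3], logits[1])
    return soft_gate(left, right, logits[2])


# Logits peaked on AND (left leaf), OR (right leaf) and XOR (root).
logits = [[20.0, 0.0, 0.0, 0.0],
          [0.0, 20.0, 0.0, 0.0],
          [0.0, 0.0, 20.0, 0.0]]
out = tree_kernel([1.0, 1.0, 0.0, 1.0], logits)
print(out)  # close to XOR(AND(1, 1), OR(0, 1)) = 0
```

During training the logits are learned by gradient descent. For inference, each mixture can be replaced by its most likely gate, which leaves a pure Boolean circuit and is one reason such networks can run extremely fast in hardware.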
🛥️ Global Tracklet Association MOT 🛥️

👉A novel universal, model-agnostic method designed to refine and enhance tracklet association for single-camera MOT. Suitable for datasets such as SportsMOT, SoccerNet & similar. Source code released💙

👉Review https://t.ly/gk-yh
👉Paper https://lnkd.in/dvXQVKFw
👉Repo https://lnkd.in/dEJqiyWs