RELI11D: Multimodal Humans
RELI11D is a high-quality multimodal human motion dataset captured with LiDAR, an IMU system, an RGB camera, and an event camera. Dataset & Source Code to be released soon
Review https://t.ly/5EG6X
Paper https://lnkd.in/ep6Utcik
Project https://lnkd.in/eDhNHYBb
ECoDepth: SOTA Diffusive Mono-Depth
New single-image depth estimation (SIDE) model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released
Review https://t.ly/s2pbB
Paper https://lnkd.in/eYt5yr_q
Code https://lnkd.in/eEcyPQcd
AI with Papers - Artificial Intelligence & Deep Learning
DINO-based Video Tracking
The Weizmann Institute announced the new SOTA in point-tracking via pre-trained DINO features. Source code announced (not yet released)
Review https://t.ly/_GIMT
Paper https://lnkd.in/dsGVDcar
Project dino-tracker.github.io/…
GitHub
GitHub - AssafSinger94/dino-tracker: Official Pytorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video” (ECCV 2024)
Gen-NeRF2NeRF Translation
GenN2N: unified NeRF-to-NeRF translation for editing tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.
Review https://t.ly/VMWAH
Paper arxiv.org/pdf/2404.02788.pdf
Project xiangyueliu.github.io/GenN2N/
Code github.com/Lxiangyue/GenN2N
iSeg: Interactive 3D Segmentation
iSeg: an interactive segmentation technique for 3D shapes that operates entirely in 3D. It accepts positive and negative clicks placed directly on the shape's surface, marking regions to include or exclude.
Review https://t.ly/tyFnD
Paper https://lnkd.in/dydAz8zp
Project https://lnkd.in/de-h6SRi
Code (coming)
Neural Bodies with Clothes
Neural-ABC is a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for ID, clothing, shape, and pose.
Review https://t.ly/Un1wc
Project https://lnkd.in/dhDG6FF5
Paper https://lnkd.in/dhcfK7jZ
Code https://lnkd.in/dQvXWysP
BodyMAP: human body & pressure
#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming
Review https://t.ly/8926S
Project bodymap3d.github.io/
Paper https://lnkd.in/gCxH4ev3
Code https://lnkd.in/gaifdy3q
XComposer2: 4K Vision-Language
InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released
Review https://t.ly/GCHsz
Paper arxiv.org/pdf/2404.06512.pdf
Code github.com/InternLM/InternLM-XComposer
Flying w/ Photons: Neural Render
Novel neural rendering technique that synthesizes videos of light propagating through a scene from novel, moving camera viewpoints. Picosecond-level time resolution!
Review https://t.ly/ZqL3a
Paper arxiv.org/pdf/2404.06493.pdf
Project anaghmalik.com/FlyingWithPhotons/
Code github.com/anaghmalik/FlyingWithPhotons
Tracking Any 2D Pixels in 3D
SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.
Review https://t.ly/B28Cj
Paper https://lnkd.in/d8ers_nm
Project https://lnkd.in/deHjtZuE
Code https://lnkd.in/dMe3TvFT
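For readers unfamiliar with the "lift 2D pixels to 3D" step mentioned in the SpatialTracker post, here is a minimal sketch of back-projecting pixels with a per-pixel depth value. It is not SpatialTracker's code: it assumes a plain pinhole camera with known intrinsics, and the function name `lift_pixels_to_3d` and the parameters `fx`, `fy`, `cx`, `cy` are hypothetical, chosen only for illustration. The triplane encoding and the transformer updates are not shown.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def lift_pixels_to_3d(
    pixels: Sequence[Tuple[float, float]],
    depths: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project (u, v) pixels to camera-space 3D points.

    Assumes a pinhole camera: focal lengths (fx, fy) and principal
    point (cx, cy) in pixels, and one depth value per pixel, e.g.
    taken from a monocular depth estimator.
    """
    points = []
    for (u, v), z in zip(pixels, depths):
        # Invert the pinhole projection u = fx * x / z + cx (same for v).
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return points
```

A pixel at the principal point maps onto the optical axis, so its 3D point has x = y = 0 and z equal to its depth.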
YOLO-CIANNA: Neural Astro
CIANNA is a general-purpose deep learning framework aimed at, but not limited to, astronomical data analysis. Source Code released
Review https://t.ly/441XS
Paper arxiv.org/pdf/2402.05925.pdf
Code github.com/Deyht/CIANNA
Wiki github.com/Deyht/CIANNA/wiki
Neuro MusculoSkeletal-MANO
SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced
Review https://t.ly/HOQrn
Paper arxiv.org/pdf/2404.10227.pdf
Project https://ms-mano.robotflow.ai/
Code announced (no repo yet)
SoccerNET: Athlete Tracking
SoccerNet Challenge is a novel high-level computer vision task specific to sports analytics. It aims at recognizing the state of a sports game, i.e., identifying and localizing all individuals (players, referees, etc.) on the field.
Review https://t.ly/Mdu9s
Paper arxiv.org/pdf/2404.11335.pdf
Code github.com/SoccerNet/sn-gamestate
Articulated Objs from MonoClips
REACTO is the new SOTA for reconstructing general articulated 3D objects from a single monocular video
Review https://t.ly/REuM8
Paper https://lnkd.in/d6PWagij
Project https://lnkd.in/dpg3x4tm
Repo https://lnkd.in/dRZWj6_N
All You Need is SAM (+Flow)
Oxford unveils the new SOTA for moving object segmentation via SAM + Optical Flow. Two novel models & Source Code announced
Review https://t.ly/ZRYtp
Paper https://lnkd.in/d4XqkEGF
Project https://lnkd.in/dHpmx3FF
Repo coming: https://github.com/Jyxarthur/
6Img-to-3D driving scenarios
EPFL (+ Continental) unveils 6Img-to-3D, a novel transformer-based encoder-renderer method that creates 3D unbounded outdoor driving scenarios from only six pics
Review https://shorturl.at/dZ018
Paper arxiv.org/pdf/2404.12378.pdf
Project 6img-to-3d.github.io/
Code github.com/continental/6Img-to-3D
Physics-Based 3D Video-Gen
PhysDreamer is a physics-based approach that leverages the object dynamics priors learned by video generation models. It enables realistic 3D interaction with objects
Review https://t.ly/zxXf9
Paper arxiv.org/pdf/2404.13026.pdf
Project physdreamer.github.io/
Code github.com/a1600012888/PhysDreamer
NER-Net: Seeing at Night-Time
Huazhong (+Beijing) unveils a novel event-based nighttime imaging solution for non-uniform illumination, plus a paired real-world dataset covering multiple illumination levels. Repo online, code coming
Review https://t.ly/Z9JMJ
Paper arxiv.org/pdf/2404.11884.pdf
Repo github.com/Liu-haoyue/NER-Net
Clip https://www.youtube.com/watch?v=zpfTLCF1Kw4
FlowMap: dense depth video
MIT (+CSAIL) unveils FlowMap, a novel E2E differentiable method that solves for precise camera poses, camera intrinsics, and per-frame dense depth of a video sequence. Source Code released
Review https://t.ly/CBH48
Paper arxiv.org/pdf/2404.15259.pdf
Project cameronosmith.github.io/flowmap
Code github.com/dcharatan/flowmap
TELA: Text to 3D Clothed Human
TELA is a novel approach for the new task of clothing-disentangled 3D human model generation from text. It opens up downstream applications such as virtual try-on.
Review https://t.ly/6N7JV
Paper https://arxiv.org/pdf/2404.16748
Project https://jtdong.com/tela_layer/
Code https://github.com/DongJT1996/TELA