ULTRA-Realistic Avatar
Novel 3D avatar with enhanced geometric fidelity and superior-quality physically based rendering (PBR) textures, free of unwanted lighting.
Review https://t.ly/B3BEu
Project https://lnkd.in/dkUQHFEV
Paper https://lnkd.in/dtEQxrBu
Code coming
Lumiere: SOTA video-gen
#Google unveils Lumiere, a space-time diffusion model for realistic video generation. It's the new SOTA across tasks: text-to-video, video stylization, cinemagraphs & video inpainting.
Review https://t.ly/nalJR
Paper https://lnkd.in/d-PvrGjT
Project https://t.ly/gK8hz
SUPIR: SOTA restoration
SUPIR is the new SOTA in image restoration: suitable for restoring blurry objects, defining the material texture of objects, and adjusting restoration based on high-level semantics.
Review https://t.ly/wgObH
Project https://supir.xpixel.group/
Paper https://lnkd.in/dZPYcUuq
Demo coming, but no code announced :(
SAM + Open Models
Grounded SAM uses Grounding DINO as an open-set detector combined with SAM for segmentation. It can seamlessly integrate with other open-world models to accomplish more intricate visual tasks.
Review https://t.ly/FwasQ
Paper arxiv.org/pdf/2401.14159.pdf
Code github.com/IDEA-Research/Grounded-Segment-Anything
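The detect-then-segment recipe behind Grounded SAM can be sketched with stubs; `detect_boxes` and `segment_box` below are hypothetical stand-ins for an open-set detector (Grounding DINO) and a promptable segmenter (SAM), not the repo's actual API:

```python
import numpy as np

def detect_boxes(image, text_prompt):
    """Stand-in for an open-set detector: returns (N, 4) boxes in
    xyxy pixel coords for the text prompt. Here: one dummy box
    covering the centre of the image, so the sketch is runnable."""
    h, w = image.shape[:2]
    return np.array([[w // 4, h // 4, 3 * w // 4, 3 * h // 4]], dtype=float)

def segment_box(image, box):
    """Stand-in for a box-promptable segmenter: turns a box prompt
    into a binary mask. Here: the box's filled rectangle."""
    mask = np.zeros(image.shape[:2], dtype=bool)
    x0, y0, x1, y1 = box.astype(int)
    mask[y0:y1, x0:x1] = True
    return mask

def grounded_segmentation(image, text_prompt):
    """Detect-then-segment: text -> boxes -> one mask per box."""
    boxes = detect_boxes(image, text_prompt)
    return [segment_box(image, b) for b in boxes]

masks = grounded_segmentation(np.zeros((64, 64, 3)), "a cat")
print(len(masks), masks[0].sum())  # 1 1024
```

Swapping the two stubs for the real models is the whole integration: the detector and segmenter only communicate through boxes.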
"Virtual Try-All" by #Amazon
#Amazon announces "Diffuse to Choose": diffusion-based, image-conditioned inpainting for virtual try-on (VTON). Virtually place any e-commerce item in any setting.
Review https://t.ly/at07Y
Paper https://lnkd.in/dxR7nGtd
Project diffuse2choose.github.io/
WildRGB-D: Objects in the Wild
#NVIDIA unveils a novel RGB-D object dataset captured in the wild: ~8,500 recorded objects and ~20,000 RGB-D videos across 46 categories, with corresponding masks and 3D point clouds.
Review https://t.ly/WCqVz
Data github.com/wildrgbd/wildrgbd
Paper arxiv.org/pdf/2401.12592.pdf
Project wildrgbd.github.io/
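Datasets like this pair RGB-D frames with point clouds; a minimal sketch of how a depth map backprojects to a point cloud under the standard pinhole model (the intrinsics here are made-up values, not from the dataset):

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Turn a depth map (H, W, metres) into an (H*W, 3) point cloud
    via the pinhole model: X = (u-cx)*Z/fx, Y = (v-cy)*Z/fy, Z = depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coords
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

depth = np.full((4, 4), 2.0)  # a flat wall 2 m away
pts = backproject(depth, fx=10, fy=10, cx=2, cy=2)
print(pts.shape)  # (16, 3)
```

With constant depth every point keeps Z = 2.0 and the cloud is a flat grid, which is a quick sanity check for the intrinsics.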
EasyVolcap: Accelerating Neural Volumetric Video
Novel #PyTorch library for accelerating neural volumetric video: capturing, reconstruction & rendering.
Review https://t.ly/8BISl
Paper arxiv.org/pdf/2312.06575.pdf
Code github.com/zju3dv/EasyVolcap
Rock-Track announced!
Rock-Track: the evolution of Poly-MOT, the previous SOTA tracking-by-detection framework for 3D MOT.
Review https://t.ly/hC0ak
Repo (coming): https://lnkd.in/dtDkPwCC
Paper coming
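Tracking-by-detection, the paradigm Poly-MOT and Rock-Track build on, boils down to associating fresh detections with existing tracks; a toy 2D greedy-IoU version of that association step (illustrative only, not the papers' actual 3D matching):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def associate(tracks, detections, thresh=0.3):
    """Greedy tracking-by-detection: match each track {id: box} to its
    best-IoU unclaimed detection; unmatched detections spawn new tracks."""
    matches, used = {}, set()
    for tid, tbox in tracks.items():
        best_score, best_j = thresh, None
        for j, dbox in enumerate(detections):
            score = iou(tbox, dbox)
            if j not in used and score > best_score:
                best_score, best_j = score, j
        if best_j is not None:
            matches[tid] = best_j
            used.add(best_j)
    new_tracks = [j for j in range(len(detections)) if j not in used]
    return matches, new_tracks

tracks = {7: (0, 0, 10, 10)}
dets = [(1, 1, 11, 11), (50, 50, 60, 60)]
print(associate(tracks, dets))  # ({7: 0}, [1])
```

Real 3D MOT frameworks replace IoU with category-specific 3D similarity metrics and add motion prediction, but the match-then-spawn loop is the same.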
350+ Free #AI Courses by #Google
350+ free courses from #Google to become a professional in #AI & #Cloud. The full catalog (900+) includes a variety of activities: videos, documents, labs, coding, and quizzes. 15+ supported languages. No excuse.
✅ Generative AI
✅ Intro to LLMs
✅ ML with TF
✅ BERT, TF, CV
✅ Responsible AI
Review: https://t.ly/517Dr
Full list: https://www.cloudskillsboost.google/catalog?page=1
Diffutoon: new SOTA video
Diffutoon is a cartoon-shading approach aiming to transform photorealistic videos into anime style. It can handle exceptionally high resolutions and rapid motion. Source code released!
Review https://t.ly/sim2O
Paper https://lnkd.in/dPcSnAUu
Code https://lnkd.in/d9B_dGrf
Project https://lnkd.in/dpcsJcX2
RANSAC -> PARSAC (neural)
Neural PARSAC: estimating multiple vanishing points (V), fundamental matrices (F) or homographies (H) at the speed of light! Source code released.
Review https://t.ly/r9ngg
Paper https://lnkd.in/dadQ4Qec
Code https://lnkd.in/dYp6gADd
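For contrast with PARSAC's learned, parallel sampling, the classic RANSAC loop it accelerates looks like this on a toy line-fitting problem (plain Python, illustrative only):

```python
import random

def ransac_line(points, iters=200, tol=0.1, seed=0):
    """Classic RANSAC: repeatedly fit a line to a minimal 2-point
    sample and keep the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_inliers = []
    pts = list(points)
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(pts, 2)
        if x1 == x2:          # skip degenerate vertical samples
            continue
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        inliers = [(x, y) for x, y in pts if abs(y - (m * x + c)) < tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# 20 points on y = 2x + 1, plus two gross outliers
pts = [(x, 2 * x + 1) for x in range(20)] + [(5, 40.0), (9, -3.0)]
print(len(ransac_line(pts)))  # 20
```

PARSAC's pitch is replacing this sequential hypothesise-and-verify loop with a network that proposes sample weights for many model instances at once, which is where the speed-up comes from.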
SEELE: "moving" the subjects
Subject repositioning: manipulating an input image to reposition one of its subjects to a desired location while preserving the image's fidelity. SEELE is a single diffusion model addressing this novel generative sub-task.
Review https://t.ly/4FS4H
Paper arxiv.org/pdf/2401.16861.pdf
Project yikai-wang.github.io/seele/
ADΔER: Event-Camera Suite
ADΔER: a novel, unified framework for event-based video. Encoder / transcoder / decoder for ADΔER (Address, Decimation, Δt Event Representation) video streams. Source code (Rust) released.
Review https://t.ly/w5_KC
Paper arxiv.org/pdf/2401.17151.pdf
Repo github.com/ac-freeman/adder-codec-rs
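Event cameras, the sensors ADΔER targets, emit asynchronous brightness events instead of frames; a toy single-pixel model of how events arise (illustrative background only, not the ADΔER codec itself):

```python
import math

def intensity_to_events(samples, threshold=0.2):
    """Toy event-camera model for one pixel: emit a (t, polarity)
    event each time log-intensity drifts more than `threshold`
    away from the level at the last event."""
    events = []
    ref = math.log(samples[0])
    for t, s in enumerate(samples[1:], start=1):
        while math.log(s) - ref > threshold:   # brightness rose
            ref += threshold
            events.append((t, +1))
        while ref - math.log(s) > threshold:   # brightness fell
            ref -= threshold
            events.append((t, -1))
    return events

# a step up, a plateau, then a drop: 2 ON events, then 3 OFF events
print(intensity_to_events([1.0, 1.5, 1.5, 0.8]))
```

The plateau produces no events at all, which is exactly the sparsity that makes event streams attractive to compress and transcode.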
(add) Anything in Any Video
XPeng Motors announces Anything in Any Scene: novel #AI for realistic video simulation that seamlessly inserts any object into an existing dynamic video. Strong emphasis on realism: the objects in the bounding boxes don't exist. Source code released.
Review https://t.ly/UYhl0
Code https://lnkd.in/gyi7Dhkn
Paper https://lnkd.in/gXyAJ6GZ
Project https://lnkd.in/gVA5vduD
ABS: SOTA collision-free locomotion
ABS (Agile But Safe): a learning-based control framework for agile, collision-free locomotion of quadrupedal robots. Source code announced (coming).
Review https://t.ly/AYu-Z
Paper arxiv.org/pdf/2401.17583.pdf
Project agile-but-safe.github.io/
Repo github.com/LeCAR-Lab/ABS
Bootstrapping TAP
#Deepmind shows how large-scale, unlabeled, uncurated real-world data can improve TAP (Tracking Any Point) with minimal architectural changes, via a self-supervised student-teacher setup. Source code released.
Review https://t.ly/-S_ZL
Paper arxiv.org/pdf/2402.00847.pdf
Code https://github.com/google-deepmind/tapnet
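One common way to run such a student-teacher setup (an assumption for illustration, not a detail from the post) is to keep the teacher's weights as an exponential moving average (EMA) of the student's, so the teacher provides slowly-changing, stable targets:

```python
import numpy as np

def ema_update(teacher, student, decay=0.99):
    """EMA teacher update used in many self-supervised student-teacher
    schemes: teacher <- decay * teacher + (1 - decay) * student,
    applied per parameter tensor."""
    return {k: decay * teacher[k] + (1 - decay) * student[k] for k in teacher}

teacher = {"w": np.zeros(3)}   # toy one-tensor "models"
student = {"w": np.ones(3)}
for _ in range(100):
    teacher = ema_update(teacher, student, decay=0.9)
print(teacher["w"])  # drifts toward the student's weights (~1.0)
```

The teacher never receives gradients directly; it only trails the student, which is what keeps its pseudo-labels from collapsing onto the student's current mistakes.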
Py4AI 2x Speakers, 2x Tickets
✅ Doubling the speakers (6 -> 12!)
✅ A new track (2 tracks in parallel)
✅ A new batch of 100 tickets!
More: https://t.ly/WmVrM
HASSOD Object Detection
HASSOD: fully self-supervised object detection and instance segmentation. The new SOTA, able to understand part-to-whole object composition like humans do.
Review https://t.ly/66qHF
Paper arxiv.org/pdf/2402.03311.pdf
Project hassod-neurips23.github.io/
Repo github.com/Shengcao-Cao/HASSOD
G-Splatting Portraits
From monocular, casually captured video, Rig3DGS rigs 3D Gaussian Splatting to enable the creation of re-animatable portrait videos with control over facial expressions, head pose and viewing direction.
Review https://t.ly/fq71w
Paper https://arxiv.org/pdf/2402.03723.pdf
Project shahrukhathar.github.io/2024/02/05/Rig3DGS.html
Up to 69x Faster SAM
EfficientViT-SAM is a new family of accelerated Segment Anything Models: it keeps SAM's lightweight prompt encoder and mask decoder while replacing the heavy image encoder with EfficientViT. Up to 69x faster; source code released. Authors: Tsinghua, MIT & #Nvidia.
Review https://t.ly/zGiE9
Paper arxiv.org/pdf/2402.05008.pdf
Code github.com/mit-han-lab/efficientvit