TimeLens++: Event-based Interpolation
Novel event-based interpolation with non-linear flow & multi-scale fusion
Highlights:
✅ Novel motion spline estimator
✅ Non-linear continuous event/frames flow
✅ Multi-feature fusion, gated compression
✅ Novel hybrid dataset with 100+ videos
More: https://bit.ly/3yJyY6g

🪰 NUWA-Infinity is out! 🪰
∞ generation by #Microsoft: arbitrarily-sized HD images and long videos 🤯
Highlights:
✅ Unconditional Image Gen.
✅ Text-to-Image/Text-to-Clip
✅ Animation / Out-painting
✅ Hi-res, arbitrarily long clips
✅ NCP for patch caching
More: https://bit.ly/3zmBf9f

#AIwithPapers: we are 3,500+!
Ready for YOLO 10, 11, π, ∞, Ψ, and more? The more we are, the faster we catch 'em all
Invite your friends -> https://t.me/AI_DeepLearning

OMNI3D: #3D Objects in the Wild
#3D detection: 234k images, 3M+ instances & 97 categories
Highlights:
✅ OMNI3D built from publicly released datasets
✅ 234k pics, 3M+ annotations with 3D boxes
✅ 97 categories such as sofa, table, car
✅ Fast (450x) and exact algorithm for IoU
✅ Cube R-CNN: novel 3D object detector
More: https://bit.ly/3cznjzG

Multiface Neural Rendering
New multi-view, hi-res dataset collected at #META Reality Labs for neural face rendering
Highlights:
✅ Mugsy, large-scale multi-cam apparatus
✅ High-res synced facial performance
✅ Closing the gap in accessing HQ data
✅ Suitable for #VR & #mixedreality
More: https://bit.ly/3b6XfeL

DEVIANT: SOTA in mono-3D detection
A novel Depth EquiVarIAnt NeTwork for 3D monocular detection in the wild
Highlights:
✅ Michigan + #Meta + Ford 🤯
✅ Depth-equi. + scale equiv. steerable
✅ New SOTA on KITTI & Waymo
✅ OK cross-dataset generalization
More: https://bit.ly/3OEFtgK

🧱 Assembling #LEGO with #AI 🧱
Translating step-by-step assembly manuals created by humans into machine-interpretable instructions
Highlights:
✅ Stanford + MIT + #Google 🤯
✅ MEPNet: Manual-to-Executable-Plan Net
✅ Manual to machine-executable plan
✅ 2D manual - 3D geometric shape
✅ Reasoning on 3D alignments of legos
More: https://bit.ly/3PCwn5C

New SOTA in UDA Semantic Seg.
HRDA: multi-res Unsupervised Domain Adaptive Semantic Seg. -> SOTA
Highlights:
✅ ETH + MPG + KU Leuven 🤯
✅ HRDA: multi-res approach for UDA
✅ Manageable GPU memory footprint
✅ Small objects & fine segmentation detail
✅ New SOTA on GTA and Synthia datasets
More: https://bit.ly/3cKtDEp

SemAbs: 3D Scene Understanding
Framework that equips 2D Vision-Language Models (VLMs) with new 3D spatial capabilities
Highlights:
✅ 2D VLMs with 3D reasoning skills
✅ ViTs Efficient MS Relevancy Extraction
✅ Novel Open-World understanding tasks
✅ Completing partially observed objects
✅ Finding hidden objects from language
More: https://bit.ly/3PYYk7d

TinyCD: Neural Change Detection
TinyCD: new SOTA in change detection with up to 150x fewer parameters.
Highlights:
✅ SOTA with up to 150x fewer params
✅ Mixing blocks for s.t. cross-correlation
✅ PW-MLP for pixel-wise classification
✅ MAMB: novel block for skip connection
More: https://bit.ly/3zFEngk

3D-Aware "StyleGANv2" version
Upgrading StyleGANv2 into a novel 3D-aware GAN with just a minimal set of changes 🤯
Highlights:
✅ MPI-like 3D-aware GAN w/ single-view
✅ GMPI: generative multiplane image
✅ Making a 2D GAN 3D-aware with minimal changes
✅ Encoding 3D-aware inductive biases
More: https://bit.ly/3OJ5gnS

NeRF-ing "The Big Bang Theory"
Berkeley unveils an approach for accurate estimation of an actor's 3D pose & location
Highlights:
✅ Input: images across the whole season
✅ 3D context (i.e. cams, structure, body)
✅ Integrating context in 3D estimation
✅ Re-ID, gaze, cinematography, pic editing
✅ Knock, Knock, Penny!
More: https://bit.ly/3OLuaUb

ShAPO: SOTA in object understanding
Joint multi-object detection, #3D texture, 6D object pose & size estimation.
Highlights:
✅ Disentangled shape & appearance
✅ Efficient octree-based differentiable
✅ Object-centric understanding pipeline
✅ Detection, reconstruction, 6D & size
✅ SOTA in reconstruction & pose est.
More: https://bit.ly/3oHN5EQ

CityNeRF: Neural Rendering of City Scenes
Progressive NeRF model and training set on city-scenes
Highlights:
✅ BungeeNeRF: novel progressive NeRF
✅ Details on drastically varied scales
✅ Growing with residual block structure
✅ Inclusive multi-level data supervision
More: https://bit.ly/3cS9vk7

Rewriting Geometry of GAN
Driving a GAN to synthesize many unseen objects with the desired shape
Highlights:
✅ User-friendly "warping" with geometry
✅ Low-rank update to layer for editing
✅ Latent augmentation based on style-mix
✅ Endless objects with defined changes
✅ Latent space interpolation, image editing
More: https://bit.ly/3zIfOj8

GAUDI: the Neural Architect
Novel generative model for immersive 3D scenes from a moving camera
Highlights:
✅ Hundreds of thousands of pics/scenes
✅ Novel denoising optimization objective
✅ New SOTA across multiple datasets
✅ Un/conditional on images/text
More: https://bit.ly/3Bt65ye

NeDDF: the NeRF evolution!
Novel 3D representation that reciprocally constrains distance & density fields
Highlights:
✅ NeRF provides no distance
✅ Extending for arbitrary density
✅ Density via dist-field & gradient
✅ Alleviating the instability
More: https://bit.ly/3Bte8LC

AND/OR: Composable Diffusion Models
Novel neural compositional generation via Composable Diffusion Models
Highlights:
✅ DM as energy-based models
✅ Connecting diffusion models
✅ Conjunction & negation, on top of DM
✅ Zero-shot combinatorial generalization
More: https://bit.ly/3PYv1Cs