☀️ 4D Neural Relightable Humans ☀️
👉Relighting4D: free-viewpoint relighting of humans under unknown illuminations
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Relight dynamic, free viewpoints
✅Disentangled reflectance/geometry
✅SOTA on synthetic/real datasets
✅Code/models under MIT License
More: https://bit.ly/3RF3yH9
🔥9👍2
🍰 Long-Term Object Segmentation 🍰
👉XMem: object segmentation for long clips with unified feature memory stores
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Inspired by Atkinson–Shiffrin model
✅Stores with different temporal scales
✅Memory consolidation algorithm
✅Compact/powerful long-term memory
✅Source code and models available
More: https://bit.ly/3PP0EOn
🤯16👍5👏3
AI with Papers - Artificial Intelligence & Deep Learning
🦔 CogVideo: insane text-to-clip 🦔 👉CogVideo: 9B-parameters world's first large scale open-source text-to-video 😵 𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬: ✅Largest open-source T2C transformer ✅Finetuning of text-to-image model ✅Multi-frame-rate hierarchical training ✅From pretrained…
🔥🔥 Update 🔥🔥
👉Code https://github.com/THUDM/CogVideo
👉Demo https://wudao.aminer.cn/cogvideo/
More: https://bit.ly/3yP86BQ
🔥5❤4👍1
🔥Grand Unification of Object Tracking🔥
👉UNICORN: unified method for SOT, MOT, VOS, & MOTS with a single neural net. 🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Great unification for 4 tracking tasks
✅Bridging methods / pixel-wise corresp.
✅SOTA on 8 challenging benchmarks
✅Source code under MIT License
More: https://bit.ly/3o74h6g
👍13🔥3🤯1😱1
🔥OmniBenchmark: CV beyond ImageNet🔥
👉 21 realms, 7,000+ concepts and 1M+ images. Far beyond ImageNet!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅vs. ImageNet: 2.5x realms, 9x concepts
✅Conciseness: no concept overlapping
✅ReCo: Relational Contrastive Learning
✅New supervised contrastive learning SOTA
More: https://bit.ly/3RJRKU0
🔥11🤩3
💣 HD Neural Avatar @130FPS 💣
👉Samsung unveils MegaPortraits: novel one-shot creation of HD neural human avatar
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅One-shot neural avatars, SOTA up to 512p
✅"Upgrading" to megapixel via more pics
✅First Neural Head Avatars in HD
✅Up to 130 FPS via #GPU
More: https://bit.ly/3oboWWT
🔥22👍1👏1
AI with Papers - Artificial Intelligence & Deep Learning
🧠 Bias in #AI, explained simple 🧠 👉Asking DallE-Mini to help me to show what the BIAS in #AI is 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐞𝐝 𝐒𝐚𝐦𝐩𝐥𝐞𝐬: ✅Best eng.->men/Caucasians ✅Best doctors->men/Caucasians ✅Top CEOs->men/Caucasians ✅Chef, kitchen->men/Caucasians ✅Rich People->only Caucasians…
🔥Important update from #OpenAI🔥
👉 https://openai.com/blog/reducing-bias-and-improving-safety-in-dall-e-2/
OpenAI
Reducing bias and improving safety in DALL·E 2
Today, we are implementing a new technique so that DALL·E generates images of people that more accurately reflect the diversity of the world’s population.
👍10❤2
🦚 TimeLens++: Event-based Interpolation 🦚
👉Novel event-based interpolation with non-linear flow & multi-scale fusion
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel motion spline estimator
✅Non-linear continuous event/frames flow
✅Multi-feature fusion, gated compression
✅Novel hybrid dataset with 100+ videos
More: https://bit.ly/3yJyY6g
🔥16👍4
🪰NUWA-Infinity is out!🪰
👉∞ generation by #Microsoft: arbitrarily-sized HD images and long videos 🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Unconditional Image Gen.
✅Text-to-Image/Text-to-Clip
✅Animation / Out-painting
✅Hi-res, arbitrarily long clips
✅NCP for patches caching
More: https://bit.ly/3zmBf9f
🔥7👍2❤1👏1🤯1
🔥 #AIwithPapers: we are 3,500+! 🔥
💙💛 Ready for YOLO 10, 11, π, ∞, Ψ, and more? The more we are, the faster we catch'em all 💙💛
😈 Invite your friends -> https://t.me/AI_DeepLearning
👍12❤10😁5🔥3
🎷🎷OMNI3D: #3D Objects in the Wild🎷🎷
👉#3D detection: 234k images, 3M+ instances & 97 categories
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅OMNI3D built from publicly released datasets
✅234k pics, 3M+ annotations with 3D boxes
✅97 categories such as sofa, table, cars
✅Fast (450x) and exact algorithm for IoU
✅Cube R-CNN: novel 3D object detector
More: https://bit.ly/3cznjzG
👍11
👹Multiface Neural Rendering 👹
👉A new multi-view, hi-res dataset collected at #META Reality Labs for neural face rendering
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mugsy, large scale multi-cam apparatus
✅High-Res sync facial performance
✅Closing the gap in accessing HQ data
✅Suitable for #VR & #mixedreality
More: https://bit.ly/3b6XfeL
🤯8👍3
💄DEVIANT: SOTA in mono-3D detection💄
👉A novel Depth EquiVarIAnt NeTwork for 3D monocular detection in the wild
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Michigan + #Meta + Ford 🤯
✅Depth-equi. + scale equiv. steerable
✅New SOTA on KITTI & Waymo
✅Ok cross-dataset -> generalization
More: https://bit.ly/3OEFtgK
🔥16👍2❤1
🧱 Assembling #LEGO with #AI 🧱
👉Translating step-by-step assembly manuals created by humans into machine-interpretable instructions
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Stanford + MIT + #Google 🤯
✅MEPNet: Manual-to-Executable-Plan Net
✅Manual to machine-executable plan
✅2D manual - 3D geometric shape
✅Reasoning on 3D alignments of legos
More: https://bit.ly/3PCwn5C
🔥9❤3
🎃New SOTA in UDA Semantic Seg.🎃
👉HRDA: multi-res Unsupervised Domain Adaptive Semantic Seg. -> SOTA
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ETH + MPG + KU Leuven 🤯
✅HRDA: multi-res approach for UDA
✅Manageable GPU memory footprint
✅Small objects & fine segmentation detail
✅New SOTA on GTA and Synthia datasets
More: https://bit.ly/3cKtDEp
🤯8👍1
⚗️ SemAbs: 3D Scene Understanding ⚗️
👉Framework that equips 2D Vision-Language Models (VLMs) with new 3D spatial capabilities
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅2D VLMs with 3D reasoning skills
✅ViTs Efficient MS Relevancy Extraction
✅Novel Open-World understanding tasks
✅Completing partially observed objects
✅Finding hidden objects from language
More: https://bit.ly/3PYYk7d
🔥7❤1👍1
🦚 TinyCD: Neural Change Detection 🦚
👉TinyCD: new SOTA in change detection with up to 150x fewer parameters.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA with up to 150X fewer params
✅Mixing blocks for s.t. cross-correlation
✅PW-MLP for pixel wise classification
✅MAMB: novel block for skip connection
More: https://bit.ly/3zFEngk
❤16👍2👏1
🦊 3D-Aware "StyleGANv2" version 🦊
👉Upgrading StyleGANv2 into a novel 3D-aware GAN with just a minimal set of changes🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅MPI-like 3D-aware GAN w/ single-view
✅GMPI: generative multiplane image
✅2D GAN made 3D-aware with minimal changes
✅Encoding 3D-aware inductive biases
More: https://bit.ly/3OJ5gnS
🤯6👍4❤1