✨Prithvi-Complimentary Adaptive Fusion Encoder (CAFE): unlocking full-potential for flood inundation mapping
📝 Summary:
Prithvi-CAFE improves flood mapping by integrating a pretrained Geo-Foundation Model encoder with a parallel CNN branch featuring attention modules. This hybrid approach effectively captures both global context and critical local details, achieving state-of-the-art results on Sen1Flood11 and Floo...
🔹 Publication Date: Published on Jan 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02315
• PDF: https://arxiv.org/pdf/2601.02315
• Github: https://github.com/Sk-2103/Prithvi-CAFE
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#FloodMapping #DeepLearning #GeoAI #RemoteSensing #ComputerVision
✨ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors
📝 Summary:
ExposeAnyone is a self-supervised diffusion model for deepfake detection that personalizes to subjects and uses reconstruction errors to measure identity distance. It significantly outperforms prior methods on unseen manipulations, including Sora2 videos, and is robust to real-world corruptions.
🔹 Publication Date: Published on Jan 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02359
• PDF: https://arxiv.org/pdf/2601.02359
• Github: https://mapooon.github.io/ExposeAnyonePage/
✨ Datasets citing this paper:
• https://huggingface.co/datasets/mapooon/S2CFP
#DeepfakeDetection #DiffusionModels #ComputerVision #AITechnology #ForgeryDetection
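The summary's idea of scoring a video by reconstruction error can be illustrated with a minimal sketch. It assumes a personalized model has already produced a reconstructed expression sequence for the subject; the function names, the mean-squared-error metric, and the threshold are all hypothetical illustrations, not the paper's actual formulation.

```python
def reconstruction_distance(observed, reconstructed):
    """Mean squared error between two expression sequences.

    Each sequence is a list of frames; each frame is a list of
    expression coefficients. (Hypothetical stand-in for the
    paper's identity distance.)
    """
    if len(observed) != len(reconstructed):
        raise ValueError("sequences must have the same number of frames")
    total, count = 0.0, 0
    for obs_frame, rec_frame in zip(observed, reconstructed):
        if len(obs_frame) != len(rec_frame):
            raise ValueError("frames must have the same number of coefficients")
        for o, r in zip(obs_frame, rec_frame):
            total += (o - r) ** 2
            count += 1
    if count == 0:
        raise ValueError("sequences must not be empty")
    return total / count


def looks_forged(distance, threshold):
    """Flag a clip when its distance exceeds an assumed, tuned threshold."""
    return distance > threshold
```

A clip whose expressions the subject-specific model reconstructs poorly gets a large distance and is flagged; where to set the threshold is a tuning choice that this sketch leaves open.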
✨RGS-SLAM: Robust Gaussian Splatting SLAM with One-Shot Dense Initialization
📝 Summary:
RGS-SLAM is a robust Gaussian-splatting SLAM framework that uses a one-shot, correspondence-to-Gaussian initialization with DINOv3 descriptors. This method improves stability, accelerates convergence, and yields higher rendering fidelity and accuracy compared to existing systems.
🔹 Publication Date: Published on Dec 28, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00705
• PDF: https://arxiv.org/pdf/2601.00705
#SLAM #GaussianSplatting #ComputerVision #Robotics #DeepLearning
✨Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction
📝 Summary:
Gen3R combines reconstruction and video diffusion models to generate 3D scenes. It produces RGB videos and 3D geometry by aligning geometric and appearance latents. This achieves state-of-the-art results and improves reconstruction robustness.
🔹 Publication Date: Published on Jan 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04090
• PDF: https://arxiv.org/pdf/2601.04090
• Project Page: https://xdimlab.github.io/Gen3R/
• Github: https://xdimlab.github.io/Gen3R/
#3DGeneration #DiffusionModels #ComputerVision #3DReconstruction #DeepLearning
✨RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes
📝 Summary:
RL-AWB is a novel framework combining statistical methods with deep reinforcement learning for improved nighttime auto white balance. It is the first RL approach for color constancy, mimicking expert tuning. This method shows superior generalization across various lighting conditions, and a new m...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05249
• PDF: https://arxiv.org/pdf/2601.05249
• Project Page: https://ntuneillee.github.io/research/rl-awb/
#ReinforcementLearning #ComputerVision #ImageProcessing #AutoWhiteBalance #LowLightImaging
✨Towards Open-Vocabulary Industrial Defect Understanding with a Large-Scale Multimodal Dataset
📝 Summary:
This paper introduces IMDD-1M, a large dataset of 1 million industrial defect image-text pairs. It enables training a vision-language foundation model tailored for industrial use. This model achieves comparable performance with less data for specialized tasks, promoting data-efficient quality ins...
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24160
• PDF: https://arxiv.org/pdf/2512.24160
#IndustrialAI #VisionLanguageModel #DefectDetection #MultimodalAI #ComputerVision
✨ProFuse: Efficient Cross-View Context Fusion for Open-Vocabulary 3D Gaussian Splatting
📝 Summary:
ProFuse enhances open-vocabulary 3DGS understanding via an efficient, context-aware framework. It uses a pre-registration phase to fuse semantic features onto Gaussians for cross-view coherence, completing semantic attachment twice as fast as SOTA.
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.04754
• PDF: https://arxiv.org/pdf/2601.04754
• Project Page: https://chiou1203.github.io/ProFuse/
• Github: https://chiou1203.github.io/ProFuse/
#3DGaussianSplatting #ComputerVision #OpenVocabulary #3DReconstruction #DeepLearning
✨RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes
📝 Summary:
RL-AWB is a novel framework for nighttime auto white balance. It combines statistical methods with deep reinforcement learning, mimicking expert tuning to improve color constancy in low-light scenes. The method shows superior generalization across various lighting conditions and includes a new mu...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05249
• PDF: https://arxiv.org/pdf/2601.05249
• Project Page: https://ntuneillee.github.io/research/rl-awb/
• Github: https://github.com/BrianChen1120/RL-AWB
#ReinforcementLearning #DeepLearning #ComputerVision #ImageProcessing #AWB
✨RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation
📝 Summary:
Collecting diverse robot manipulation data is challenging. This paper introduces visual identity prompting, using exemplar images to guide diffusion models for generating multi-view, temporally coherent data. This augmented data improves robot policy performance in both simulation and real-world ...
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05241
• PDF: https://arxiv.org/pdf/2601.05241
• Project Page: https://robovip.github.io/RoboVIP/
• Github: https://robovip.github.io/RoboVIP/
#Robotics #AI #GenerativeAI #ComputerVision #MachineLearning
✨Plenoptic Video Generation
📝 Summary:
PlenopticDreamer addresses multi-view video re-rendering inconsistency by synchronizing generative hallucinations. It uses an autoregressive model with camera-guided retrieval to ensure spatio-temporal coherence, achieving state-of-the-art results with high fidelity.
🔹 Publication Date: Published on Jan 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05239
• PDF: https://arxiv.org/pdf/2601.05239
• Project Page: https://research.nvidia.com/labs/dir/plenopticdreamer/
#PlenopticVideo #GenerativeAI #VideoGeneration #ComputerVision #DeepLearning
✨ViTNT-FIQA: Training-Free Face Image Quality Assessment with Vision Transformers
📝 Summary:
ViTNT-FIQA is a training-free method for face image quality assessment using Vision Transformers. It measures the stability of patch embeddings across intermediate blocks with a single forward pass. High-quality images show stable feature evolution, and the method achieves competitive results efficiently.
🔹 Publication Date: Published on Jan 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05741
• PDF: https://arxiv.org/pdf/2601.05741
• Github: https://github.com/gurayozgur/ViTNT-FIQA
#VisionTransformers #FaceQuality #ComputerVision #DeepLearning #AI
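One way to picture "stability of patch embeddings across intermediate blocks" is the sketch below. It assumes the per-block patch embeddings have already been collected from a single forward pass, and it scores stability as the mean cosine similarity of each patch between consecutive blocks. That metric and the function names are assumptions for illustration; the paper's actual score may be defined differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def stability_score(blocks):
    """Mean patch-wise cosine similarity between consecutive blocks.

    `blocks` is a list over transformer blocks; each entry is a list of
    patch embedding vectors in the same patch order. Higher values mean
    the embeddings change less from block to block.
    """
    if len(blocks) < 2:
        raise ValueError("need embeddings from at least two blocks")
    sims = []
    for prev, curr in zip(blocks, blocks[1:]):
        for p, c in zip(prev, curr):
            sims.append(cosine(p, c))
    if not sims:
        raise ValueError("blocks must contain at least one patch")
    return sum(sims) / len(sims)
```

Under this assumed metric, an image whose patch embeddings barely rotate between blocks scores near 1, while one whose embeddings change direction at every block scores lower.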
✨Forest Before Trees: Latent Superposition for Efficient Visual Reasoning
📝 Summary:
Laser introduces Dynamic Windowed Alignment Learning (DWAL) for visual reasoning. This method maintains global feature superposition, achieving state-of-the-art performance at significantly reduced computational cost.
🔹 Publication Date: Published on Jan 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06803
• PDF: https://arxiv.org/pdf/2601.06803
#VisualReasoning #MachineLearning #AIResearch #ComputerVision #EfficientAI
✨FlyPose: Towards Robust Human Pose Estimation From Aerial Views
📝 Summary:
FlyPose is a lightweight, real-time aerial human pose estimation system. It achieves significantly improved accuracy through multi-dataset training and performs efficiently on UAVs. A new challenging dataset, FlyPose-104, is also released.
🔹 Publication Date: Published on Jan 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.05747
• PDF: https://arxiv.org/pdf/2601.05747
• Github: https://github.com/farooqhassaan/FlyPose
#HumanPoseEstimation #UAV #ComputerVision #DeepLearning #AI