🔹 Title: MedSAMix: A Training-Free Model Merging Approach for Medical Image Segmentation
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.11032
• PDF: https://arxiv.org/pdf/2508.11032
• Github: https://github.com/podismine/MedSAMix
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Semantic IDs for Joint Generative Search and Recommendation
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10478
• PDF: https://arxiv.org/pdf/2508.10478
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
🔹 Publication Date: Published on Aug 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.09789
• PDF: https://arxiv.org/pdf/2508.09789
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/marcodena/video-recs-describe-what-you-see
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
🔹 Publication Date: Published on Aug 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.13998
• PDF: https://arxiv.org/pdf/2508.13998
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Beyond Human Judgment: A Bayesian Evaluation of LLMs' Moral Values Understanding
🔹 Publication Date: Published on Aug 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.13804
• PDF: https://arxiv.org/pdf/2508.13804
• Project Page: https://maciejskorski.github.io/moral-foundations-llm-eval
• Github: https://github.com/maciejskorski/moral-foundations-llm-eval
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation
🔹 Publication Date: Published on Aug 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.12040
• PDF: https://arxiv.org/pdf/2508.12040
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models
🔹 Publication Date: Published on Aug 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.12903
• PDF: https://arxiv.org/pdf/2508.12903
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.13186
• PDF: https://arxiv.org/pdf/2508.13186
• Github: https://github.com/MMBrowseComp/MM-BrowseComp
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: CAMAR: Continuous Actions Multi-Agent Routing
🔹 Publication Date: Published on Aug 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.12845
• PDF: https://arxiv.org/pdf/2508.12845
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
🔹 Publication Date: Published on Aug 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.12800
• PDF: https://arxiv.org/pdf/2508.12800
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
🔹 Publication Date: Published on Aug 1
🔹 AI-generated summary: Foundation-Sec-8B-Instruct is a cybersecurity-focused LLM designed for chat-style interactions and instruction-following, outperforming other models in cybersecurity tasks while matching their instruction-following capabilities.
🔹 Abstract: Large language models (LLMs) have shown remarkable success across many domains, yet their integration into cybersecurity applications remains limited due to a lack of general-purpose cybersecurity data, representational complexity, and safety and regulatory concerns. To address this gap, we previously introduced Foundation-Sec-8B, a cybersecurity-focused LLM suitable for fine-tuning on downstream tasks. That model, however, was not designed for chat-style interactions or instruction-following. In this report, we release Foundation-Sec-8B-Instruct: a model specifically trained for general-purpose cybersecurity dialogue. Built on Foundation-Sec-8B, it combines domain-specific knowledge with instruction-following, conversational capabilities, and alignment with human preferences to produce high-quality, relevant responses. Comprehensive evaluations show that Foundation-Sec-8B-Instruct outperforms Llama-3.1-8B-Instruct on a range of cybersecurity tasks while matching its instruction-following performance. It is also competitive with GPT-4o-mini on cyber threat intelligence and instruction-following tasks. We envision Foundation-Sec-8B-Instruct becoming an indispensable assistant in the daily workflows of cybersecurity professionals. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Instruct.
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.01059
• PDF: https://arxiv.org/pdf/2508.01059
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
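Since the abstract points to a public checkpoint on the Hugging Face Hub, here is a minimal chat sketch using the transformers library. It assumes the checkpoint ships a standard tokenizer chat template, as Llama-3.1-derived instruct models usually do; the prompt and generation settings are illustrative assumptions, not values from the technical report.

```python
# Minimal sketch: chatting with Foundation-Sec-8B-Instruct via transformers.
# Assumes the checkpoint provides a tokenizer chat template (standard for
# Llama-3.1-derived instruct models); the prompt and sampling settings are
# illustrative assumptions, not values taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fdtn-ai/Foundation-Sec-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize CVE-2021-44228 and list key mitigations."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True))
```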
🔹 Title: Rapidly Adapting to New Voice Spoofing: Few-Shot Detection of Synthesized Speech Under Distribution Shifts
🔹 Publication Date: Published on Aug 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.13320
• PDF: https://arxiv.org/pdf/2508.13320
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Retrieval-augmented reasoning with lean language models
🔹 Publication Date: Published on Aug 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.11386
• PDF: https://arxiv.org/pdf/2508.11386
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: StrandDesigner: Towards Practical Strand Generation with Sketch Guidance
🔹 Publication Date: Published on Aug 3
🔹 AI-generated summary: A sketch-based strand generation model using a learnable upsampling strategy and multi-scale adaptive conditioning mechanism outperforms existing methods in realism and precision for hair strand generation.
🔹 Abstract: Realistic hair strand generation is crucial for applications like computer graphics and virtual reality. While diffusion models can generate hairstyles from text or images, these inputs lack precision and user-friendliness. Instead, we propose the first sketch-based strand generation model, which offers finer control while remaining user-friendly. Our framework tackles key challenges, such as modeling complex strand interactions and diverse sketch patterns, through two main innovations: a learnable strand upsampling strategy that encodes 3D strands into multi-scale latent spaces, and a multi-scale adaptive conditioning mechanism using a transformer with diffusion heads to ensure consistency across granularity levels. Experiments on several benchmark datasets show our method outperforms existing approaches in realism and precision. Qualitative results further confirm its effectiveness. Code will be released at https://github.com/fighting-Zhang/StrandDesigner.
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.01650
• PDF: https://arxiv.org/pdf/2508.01650
• Github: https://github.com/fighting-Zhang/StrandDesigner
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
🔹 Publication Date: Published on Aug 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.11987
• PDF: https://arxiv.org/pdf/2508.11987
• Project Page: https://futurex-ai.github.io/
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/futurex-ai/Futurex-Online
• https://huggingface.co/datasets/futurex-ai/Futurex-Past
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
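The two benchmark datasets listed above are hosted on the Hugging Face Hub, so a first look at them with the datasets library might go as follows. Only the repo IDs come from the listing; the split names, record fields, and the past/live distinction are assumptions, so inspect the output before relying on any particular schema.

```python
# Minimal sketch: pulling the FutureX benchmark data from the Hugging Face Hub.
# Repo IDs come from the listing above; split names and record fields are
# assumptions, so inspect the objects before relying on a particular schema.
from datasets import load_dataset

past = load_dataset("futurex-ai/Futurex-Past")     # presumably the static, resolved questions
live = load_dataset("futurex-ai/Futurex-Online")   # presumably the live, updated questions

print(past)                       # available splits and column names
first_split = next(iter(past))    # e.g. "train", depending on the dataset card
print(past[first_split][0])       # one example record
```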
🔹 Title: From Scores to Skills: A Cognitive Diagnosis Framework for Evaluating Financial Large Language Models
🔹 Publication Date: Published on Aug 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.13491
• PDF: https://arxiv.org/pdf/2508.13491
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/NextGenWhu/FinCDM-FinEval-KQA
• https://huggingface.co/datasets/NextGenWhu/FinCDM-CPA-KQA
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Tinker: Diffusion's Gift to 3D--Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization
🔹 Publication Date: Published on Aug 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.14811
• PDF: https://arxiv.org/pdf/2508.14811
• Project Page: https://aim-uofa.github.io/Tinker/
• Github: https://github.com/aim-uofa/Tinker
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: RynnEC: Bringing MLLMs into Embodied World
🔹 Publication Date: Published on Aug 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.14160
• PDF: https://arxiv.org/pdf/2508.14160
• Github: https://github.com/alibaba-damo-academy/RynnEC
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Multimodal Referring Segmentation: A Survey
🔹 Publication Date: Published on Aug 1
🔹 AI-generated summary: A survey of multimodal referring segmentation techniques, covering advancements in convolutional neural networks, transformers, and large language models for segmenting objects in images, videos, and 3D scenes based on text or audio instructions.
🔹 Abstract: Multimodal referring segmentation aims to segment target objects in visual scenes, such as images, videos, and 3D scenes, based on referring expressions in text or audio format. This task plays a crucial role in practical applications requiring accurate object perception based on user instructions. Over the past decade, it has gained significant attention in the multimodal community, driven by advances in convolutional neural networks, transformers, and large language models, all of which have substantially improved multimodal perception capabilities. This paper provides a comprehensive survey of multimodal referring segmentation. We begin by introducing the field's background, including problem definitions and commonly used datasets. Next, we summarize a unified meta architecture for referring segmentation and review representative methods across three primary visual scenes, including images, videos, and 3D scenes. We further discuss Generalized Referring Expression (GREx) methods to address the challenges of real-world complexity, along with related tasks and practical applications. Extensive performance comparisons on standard benchmarks are also provided. We continually track related works at https://github.com/henghuiding/Awesome-Multimodal-Referring-Segmentation.
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.00265
• PDF: https://arxiv.org/pdf/2508.00265
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT
🔹 Title: Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
🔹 Publication Date: Published on Aug 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.14896
• PDF: https://arxiv.org/pdf/2508.14896
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
➡️ https://t.me/DataScienceT