⚠️ All of the summaries below are generated by a large language model; they may contain errors, are for reference only, and should be used with caution.
🔴 Note: never use these summaries in serious academic settings; they are intended only for initial screening before reading a paper!
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace
Updated 2025-09-11
GCRPNet: Graph-Enhanced Contextual and Regional Perception Network for Salient Object Detection in Optical Remote Sensing Images
Authors:Mengyu Ren, Yutong Li, Hua Li, Runmin Cong, Sam Kwong
Salient object detection (SOD) in optical remote sensing images (ORSIs) faces numerous challenges, including significant variations in target scales and low contrast between targets and the background. Existing methods based on vision transformer (ViT) and convolutional neural network (CNN) architectures aim to leverage both global and local features, but the difficulty of effectively integrating these heterogeneous features limits their overall performance. To overcome these limitations, we propose a graph-enhanced contextual and regional perception network (GCRPNet), which builds upon the Mamba architecture to simultaneously capture long-range dependencies and enhance regional feature representation. Specifically, we employ the visual state space (VSS) encoder to extract multi-scale features. To further achieve deep guidance and enhancement of these features, we first design a difference-similarity guided hierarchical graph attention module (DS-HGAM). This module strengthens cross-layer interaction capabilities between features of different scales while enhancing the model’s structural perception, allowing it to distinguish between foreground and background more effectively. Then, we design the LEVSS block as the decoder of GCRPNet. This module integrates our proposed adaptive scanning strategy and multi-granularity collaborative attention enhancement module (MCAEM). It performs adaptive patch scanning on feature maps processed via multi-scale convolutions, thereby capturing rich local region information and enhancing Mamba’s local modeling capability. Extensive experimental results demonstrate that the proposed model achieves state-of-the-art performance, validating its effectiveness and superiority.
Paper and project links
Summary
This paper proposes a graph-enhanced contextual and regional perception network (GCRPNet) for salient object detection in optical remote sensing images. Built on the Mamba architecture, the network captures long-range dependencies while enhancing regional feature representation. A visual state space (VSS) encoder extracts multi-scale features, and a difference-similarity guided hierarchical graph attention module (DS-HGAM) provides deep guidance and enhancement of those features, improving cross-layer interaction and structural perception. In addition, a LEVSS block serves as the decoder of GCRPNet: through an adaptive scanning strategy and a multi-granularity collaborative attention enhancement module (MCAEM), it captures rich local region information and strengthens the model's local modeling capability. Experimental results show that the model achieves state-of-the-art performance.
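The cross-layer graph attention idea behind DS-HGAM can be illustrated with a toy sketch: treat spatial positions from two feature scales as graph nodes and weight edges by both feature similarity and feature difference before fusing. This is a minimal illustrative example only, assuming simple dot-product similarity and negative-distance difference cues; it is not the paper's actual DS-HGAM implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_layer_graph_attention(f_low, f_high):
    """Toy cross-layer graph attention: nodes are spatial positions from
    two scales; edge weights combine a similarity cue (scaled dot product)
    and a difference cue (negative L2 distance). Illustrative only."""
    sim = f_low @ f_high.T / np.sqrt(f_low.shape[1])          # (N_low, N_high)
    diff = -np.linalg.norm(f_low[:, None, :] - f_high[None, :, :], axis=-1)
    attn = softmax(sim + diff, axis=-1)                        # rows sum to 1
    return f_low + attn @ f_high                               # residual fusion

rng = np.random.default_rng(0)
f_low = rng.normal(size=(16, 32))    # 16 nodes from a shallower layer
f_high = rng.normal(size=(16, 32))   # 16 nodes from a deeper layer
fused = cross_layer_graph_attention(f_low, f_high)
print(fused.shape)  # (16, 32)
```

The residual fusion keeps the lower-level features intact while mixing in deeper-layer context weighted by the combined similarity/difference affinity.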
Key Takeaways
- Salient object detection in optical remote sensing images faces challenges including large variations in target scale and low contrast between targets and the background.
- Existing ViT- and CNN-based methods aim to exploit both global and local features, but the difficulty of integrating these heterogeneous features limits their performance.
- The proposed GCRPNet builds on the Mamba architecture to capture long-range dependencies while enhancing regional feature representation.
- A VSS encoder extracts multi-scale features; the DS-HGAM module strengthens cross-layer interaction and the model's structural perception.
- The LEVSS block serves as the decoder, combining an adaptive scanning strategy with the MCAEM module to improve local modeling.
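The "adaptive patch scanning" point above can be made concrete with a sketch of the general idea: instead of a plain row-major raster scan, emit tokens region by region so that spatially adjacent positions stay adjacent in the sequence fed to a Mamba-style model. This is a hypothetical, fixed-grid simplification for illustration; the paper's actual adaptive strategy is not reproduced here.

```python
import numpy as np

def patchwise_scan(feat, patch=4):
    """Toy regional scan: split an HxW feature map into patch x patch
    regions and flatten tokens region by region (local-first ordering),
    rather than scanning whole rows. Illustrative sketch only."""
    h, w, c = feat.shape
    tokens = []
    for py in range(0, h, patch):
        for px in range(0, w, patch):
            block = feat[py:py + patch, px:px + patch].reshape(-1, c)
            tokens.append(block)
    return np.concatenate(tokens, axis=0)  # (H*W, C) token sequence

feat = np.arange(8 * 8).reshape(8, 8, 1).astype(float)
seq = patchwise_scan(feat, patch=4)
print(seq.shape)  # (64, 1)
```

With this ordering, the first 16 tokens all come from the top-left 4x4 region, which keeps local neighborhoods contiguous in the 1-D sequence and helps a sequence model capture regional structure.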
Click here to view paper screenshots



