Published: 2025-09-18

Last updated: 2025-10-07

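⚠️ All summaries below are generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use this for serious academic work. It is only suitable for initial screening before reading the papers.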

Updated 2025-09-18

Instance-Guided Class Activation Mapping for Weakly Supervised Semantic Segmentation

Authors: Ali Torabi, Sanjog Gaihre, MD Mahbubur Rahman, Yaqoob Majeed

Weakly Supervised Semantic Segmentation (WSSS) addresses the challenge of training segmentation models using only image-level annotations, eliminating the need for expensive pixel-level labeling. While existing methods struggle with precise object boundary localization and often focus only on the most discriminative regions, we propose IG-CAM (Instance-Guided Class Activation Mapping), a novel approach that leverages instance-level cues and influence functions to generate high-quality, boundary-aware localization maps. Our method introduces three key innovations: (1) Instance-Guided Refinement that uses ground truth segmentation masks to guide CAM generation, ensuring complete object coverage rather than just discriminative parts; (2) Influence Function Integration that captures the relationship between training samples and model predictions, leading to more robust feature representations; and (3) Multi-Scale Boundary Enhancement that employs progressive refinement strategies to achieve sharp, precise object boundaries. IG-CAM achieves state-of-the-art performance on the PASCAL VOC 2012 dataset with an mIoU of 82.3% before post-processing, which further improves to 86.6% after applying Conditional Random Field (CRF) refinement, significantly outperforming previous WSSS methods. Our approach demonstrates superior localization accuracy, with complete object coverage and precise boundary delineation, while maintaining computational efficiency. Extensive ablation studies validate the contribution of each component, and qualitative comparisons across 600 diverse images showcase the method’s robustness and generalization capability. The results establish IG-CAM as a new benchmark for weakly supervised semantic segmentation, offering a practical solution for scenarios where pixel-level annotations are unavailable or prohibitively expensive.
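The abstract builds on class activation maps (CAMs). As background, here is a minimal sketch of the classic CAM computation, which weights the last convolutional feature maps by the classifier weights of one class. It is not the IG-CAM method itself, whose implementation details are not given here. The function name, the array shapes, and the ReLU-plus-max normalization are assumptions chosen for illustration.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Classic CAM for one class (illustrative, not IG-CAM).

    features:   (K, H, W) activations from the last conv layer (assumed shape)
    fc_weights: (C, K) weights of the linear layer after global average pooling
    class_idx:  index of the class to localize
    """
    # Weighted sum over the K channels gives an (H, W) map.
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    # Keep only positive evidence for the class.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1]; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

A map like this tends to highlight only the most discriminative parts of an object, which is the limitation the abstract says IG-CAM targets.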

Paper and project links

PDF

Summary
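To address the challenge of training semantic segmentation models from image-level annotations alone, the paper proposes IG-CAM, which uses instance-level cues and influence functions to generate high-quality, boundary-aware localization maps. The method introduces three key innovations: instance-guided refinement, influence function integration, and multi-scale boundary enhancement. On PASCAL VOC 2012, IG-CAM reports state-of-the-art performance with an mIoU of 82.3%, rising to 86.6% after Conditional Random Field (CRF) refinement, significantly outperforming previous WSSS methods. The method also shows strong localization accuracy while remaining computationally efficient.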
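For readers unfamiliar with the reported metric, the following is a minimal sketch of mean intersection-over-union (mIoU) for a single pair of label maps. It is a simplification: benchmark scores such as those on PASCAL VOC are normally accumulated over the whole dataset rather than per image, and the function name and the choice to skip classes absent from both maps are assumptions made here for illustration.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in the prediction or the target."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    # NaN signals that no class was present at all.
    return float(np.mean(ious)) if ious else float("nan")
```

For example, with `pred = [[0, 1], [1, 1]]` and `target = [[0, 1], [0, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.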

Key Takeaways