Published: 2025-06-04

Updated: 2025-07-06

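⚠️ All summaries below were generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use this for serious academic work. It is suitable only for initial screening before reading a paper.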

Updated 2025-06-04

MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping

Authors: Amirreza Fateh, Mohammad Reza Mohammadi, Mohammad Reza Jahed Motlagh

Few-shot Semantic Segmentation addresses the challenge of segmenting objects in query images with only a handful of annotated examples. However, many previous state-of-the-art methods either have to discard intricate local semantic features or suffer from high computational complexity. To address these challenges, we propose a new Few-shot Semantic Segmentation framework based on the Transformer architecture. Our approach introduces the spatial transformer decoder and the contextual mask generation module to improve the relational understanding between support and query images. Moreover, we introduce a multi-scale decoder to refine the segmentation mask by incorporating features from different resolutions in a hierarchical manner. Additionally, our approach integrates global features from intermediate encoder stages to improve contextual understanding, while maintaining a lightweight structure to reduce complexity. This balance between performance and efficiency enables our method to achieve competitive results on benchmark datasets such as PASCAL-5^i and COCO-20^i in both 1-shot and 5-shot settings. Notably, our model with only 1.5 million parameters demonstrates competitive performance while overcoming limitations of existing methodologies.

Summary

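A few-shot semantic segmentation framework based on the Transformer architecture. A spatial transformer decoder and a contextual mask generation module improve the relational understanding between support and query images, and a multi-scale decoder refines the segmentation mask. The framework also integrates global features from intermediate encoder stages to improve contextual understanding, while keeping the structure lightweight to reduce complexity. It achieves competitive results on benchmark datasets such as PASCAL-5^i and COCO-20^i.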

Key Takeaways

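Proposes a Transformer-based few-shot semantic segmentation framework for segmenting images when only a few annotated examples are available.
Uses a spatial transformer decoder and a contextual mask generation module to improve the relational understanding between support and query images.
Introduces a multi-scale decoder that incorporates features from different resolutions hierarchically to refine the segmentation mask.
Integrates global features from intermediate encoder stages to strengthen contextual understanding.
Keeps the design lightweight (about 1.5 million parameters) to reduce computational complexity.
Achieves competitive results on benchmark datasets such as PASCAL-5^i and COCO-20^i in both 1-shot and 5-shot settings.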
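
For readers new to the support/query setup, the following is a minimal sketch of generic prototype matching, a common baseline idea in few-shot segmentation. It is not MSDNet's Transformer-guided module, which the abstract does not specify in detail. The function names, the toy 2x2 feature maps, and the 0.5 threshold are all assumptions made for illustration.

```python
import math

def masked_average_pool(features, mask):
    """Average the feature vectors at positions where the support mask is 1.

    features: H x W grid of C-dimensional vectors (nested lists).
    mask:     H x W grid of 0/1 foreground labels for the support image.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("support mask has no foreground pixels")
    return [t / count for t in total]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def segment_query(query_features, prototype, threshold=0.5):
    """Label a query position as foreground when it resembles the prototype.

    The threshold is a hypothetical value chosen for this toy example.
    """
    return [[1 if cosine(vec, prototype) > threshold else 0 for vec in row]
            for row in query_features]

# Toy 1-shot episode: one support image with its mask, one query image.
support_features = [[[1.0, 0.0], [0.9, 0.1]],
                    [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1],
                [0, 0]]
query_features = [[[0.0, 1.0], [1.0, 0.1]],
                  [[0.8, 0.0], [0.2, 1.0]]]

prototype = masked_average_pool(support_features, support_mask)
predicted_mask = segment_query(query_features, prototype)
print(predicted_mask)
```

In a 5-shot setting, one simple option is to average the prototypes computed from each of the five support images before matching. Whether MSDNet does this is not stated in the abstract.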

PADetBench: Towards Benchmarking Physical Attacks against Object Detection

Authors: Jiawei Lian, Jianhong Pan, Lefan Wang, Yi Wang, Lap-Pui Chau, Shaohui Mei

Physical attacks against object detection have gained increasing attention due to their significant practical implications. However, conducting physical experiments is extremely time-consuming and labor-intensive. Moreover, physical dynamics and cross-domain transformation are challenging to strictly regulate in the real world, leading to unaligned evaluation and comparison, severely hindering the development of physically robust models. To accommodate these challenges, we explore utilizing realistic simulation to thoroughly and rigorously benchmark physical attacks with fairness under controlled physical dynamics and cross-domain transformation. This resolves the problem of capturing identical adversarial images that cannot be achieved in the real world. Our benchmark includes 20 physical attack methods, 48 object detectors, comprehensive physical dynamics, and evaluation metrics. We also provide end-to-end pipelines for dataset generation, detection, evaluation, and further analysis. In addition, we perform 8064 groups of evaluation based on our benchmark, which includes both overall evaluation and further detailed ablation studies for controlled physical dynamics. Through these experiments, we provide in-depth analyses of physical attack performance and physical adversarial robustness, draw valuable observations, and discuss potential directions for future research. Codebase: https://github.com/JiaweiLian/Benchmarking_Physical_Attack

Summary

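This paper examines physical attacks against object detection. It notes that real-world physical experiments are costly in time and labor, and that physical dynamics and cross-domain transformation are hard to control strictly. To address this, the authors use realistic simulation to benchmark physical attacks rigorously and fairly under controlled physical dynamics and cross-domain transformation. The benchmark covers 20 physical attack methods and 48 object detectors, and provides end-to-end pipelines for dataset generation, detection, evaluation, and further analysis. Based on 8064 groups of evaluation, the paper analyzes physical attack performance and physical adversarial robustness in depth, and offers observations and potential directions for future research.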

Key Takeaways