Published: 2025-10-03

Updated: 2025-11-27

Word count: 1.9k

Reading time: 7 min

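⚠️ All summaries below were generated by a large language model. They may contain errors; treat them as reference only and use them with caution.
🔴 Note: do not use these summaries in serious academic settings. They are suitable only for a first-pass screening before reading the papers.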

Updated 2025-10-03

Source-Free Domain Adaptive Object Detection with Semantics Compensation

Authors: Song Tang, Jiuzheng Yang, Mao Ye, Boyu Wang, Yan Gan, Xiatian Zhu

Strong data augmentation is a fundamental component of state-of-the-art mean teacher-based Source-Free domain adaptive Object Detection (SFOD) methods, enabling consistency-based self-supervised optimization alongside weak augmentation. However, our theoretical analysis and empirical observations reveal a critical limitation: strong augmentation can inadvertently erase class-relevant components, leading to artificial inter-category confusion. To address this issue, we introduce Weak-to-strong Semantics Compensation (WSCo), a novel remedy that leverages weakly augmented images, which preserve full semantics, as anchors to enrich the feature space of their strongly augmented counterparts. Essentially, this compensates on the fly for the class-relevant semantics that may be lost during strong augmentation. Notably, WSCo can be implemented as a generic plug-in, easily integrable with any existing SFOD pipeline. Extensive experiments validate the negative impact of strong augmentation on detection performance, and the effectiveness of WSCo in enhancing the performance of previous detection models on standard benchmarks.

Summary
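The paper observes that strong data augmentation plays a central role in mean teacher-based source-free domain adaptive object detection (SFOD), where it enables consistency-based self-supervised optimization alongside weak augmentation. The authors' theoretical analysis and experiments, however, expose a key limitation: strong augmentation can inadvertently erase class-relevant components, causing artificial inter-category confusion. To address this, the paper introduces Weak-to-strong Semantics Compensation (WSCo), which uses weakly augmented images as anchors to enrich the feature space of their strongly augmented counterparts, compensating on the fly for class-relevant semantics that may be lost during strong augmentation. WSCo can be implemented as a generic plug-in that integrates easily into existing SFOD pipelines. Experiments confirm both the negative impact of strong augmentation and the effectiveness of WSCo in improving earlier detection models on standard benchmarks.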

Key Takeaways

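Strong data augmentation plays an important role in mean teacher-based source-free domain adaptive object detection, but it has a key limitation.
Strong augmentation can erase class-relevant components, causing inter-category confusion.
To address this, the paper introduces Weak-to-strong Semantics Compensation (WSCo).
WSCo uses weakly augmented images as anchors to enrich the feature space of their strongly augmented counterparts.
WSCo can be implemented as a generic plug-in that integrates easily into existing SFOD pipelines.
Experiments confirm the negative impact of strong augmentation and the gains WSCo brings to earlier detectors on standard benchmarks.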
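
To make the weak/strong setup above concrete, here is a minimal Python sketch. It is not the authors' implementation: the digest does not say how WSCo enriches the feature space, so `compensate` is a hypothetical stand-in (a simple convex blend), and `strong_augment` is an assumed cutout-style erase. Only the exponential-moving-average teacher update reflects the standard mean-teacher recipe; all function names are made up for illustration.

```python
def ema_update(teacher, student, momentum=0.9):
    """Mean-teacher step: teacher weights drift toward the student's."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def strong_augment(features, start, length):
    """Assumed cutout-style erase: zero a contiguous span of the vector."""
    erased = list(features)
    for i in range(start, min(start + length, len(erased))):
        erased[i] = 0.0
    return erased


def compensate(strong_feat, weak_feat, alpha=0.5):
    """Hypothetical stand-in for compensation: blend the weak-view anchor
    into the strong-view features. Not the paper's actual operator."""
    return [(1.0 - alpha) * s + alpha * w for s, w in zip(strong_feat, weak_feat)]


weak_feat = [1.0, 2.0, 3.0, 4.0]                             # weak view keeps all content
strong_feat = strong_augment(weak_feat, start=1, length=2)   # [1.0, 0.0, 0.0, 4.0]
restored = compensate(strong_feat, weak_feat, alpha=0.5)     # [1.0, 1.0, 1.5, 4.0]
teacher = ema_update([0.0, 0.0], [1.0, 1.0], momentum=0.9)   # each weight moves 10% toward the student
```

In this toy case the erased entries of `strong_feat` regain part of their original value from the weak-view anchor, which is the intuition behind compensating for lost class-relevant semantics.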

DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection

Authors: Junjie Guo, Chenqiang Gao, Fangcen Liu, Deyu Meng

Infrared-visible object detection aims to achieve robust object detection by leveraging the complementary information of infrared and visible image pairs. However, the common modality misalignment problem presents two challenges: fusing misaligned complementary features is difficult, and current methods cannot reliably locate objects in both modalities under misalignment conditions. In this paper, we propose a Decoupled Position Detection Transformer (DPDETR) to address these issues. Specifically, we explicitly define the object category, visible modality position, and infrared modality position to enable the network to learn the intrinsic relationships and reliably output the positions of objects in both modalities. To fuse misaligned object features reliably, we propose a Decoupled Position Multispectral Cross-attention module that adaptively samples and aggregates multispectral complementary features with the constraint of infrared and visible reference positions. Additionally, we design a query-decoupled Multispectral Decoder structure to address the conflict in feature focus among the three kinds of object information in our task and propose a Decoupled Position Contrastive DeNoising Training strategy to enhance DPDETR's ability to learn decoupled positions. Experiments on DroneVehicle and KAIST datasets demonstrate significant improvements compared to other state-of-the-art methods. The code will be released at https://github.com/gjj45/DPDETR

Summary

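Infrared-visible object detection aims for robust detection by combining the strengths of the two image types. To handle modality misalignment, the paper proposes the Decoupled Position Detection Transformer (DPDETR). By explicitly defining the object category, the visible-modality position, and the infrared-modality position, the network can learn their intrinsic relationships and reliably output object positions in both modalities. The paper also proposes a Decoupled Position Multispectral Cross-attention module that adaptively samples and aggregates complementary multispectral features. Experiments on the DroneVehicle and KAIST datasets show significant improvements over other state-of-the-art methods.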

Key Takeaways

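Combining infrared and visible images enables robust object detection.
The Decoupled Position Detection Transformer (DPDETR) targets the modality misalignment problem.
The object category, visible-modality position, and infrared-modality position are defined explicitly so the network can learn their relationships and output positions in both modalities.
The Decoupled Position Multispectral Cross-attention module adaptively samples and aggregates multispectral features, constrained by infrared and visible reference positions.
The query-decoupled Multispectral Decoder resolves the conflict in feature focus among the three kinds of object information.
The Decoupled Position Contrastive DeNoising Training strategy improves DPDETR's ability to learn decoupled positions.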
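
The decoupled output described above (one category plus a separate position per modality) can be pictured with a small Python sketch. This is an illustration under stated assumptions, not code from the DPDETR repository: the class `DecoupledDetection`, the helpers `modality_shift` and `iou`, and the axis-aligned `(x_min, y_min, x_max, y_max)` box layout are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class DecoupledDetection:
    """One object: a shared category plus one box per modality."""
    category: str
    visible_box: Box
    infrared_box: Box


def center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def modality_shift(det: DecoupledDetection) -> Tuple[float, float]:
    """Offset of the infrared box center relative to the visible one."""
    vx, vy = center(det.visible_box)
    ix, iy = center(det.infrared_box)
    return (ix - vx, iy - vy)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


det = DecoupledDetection("car", (10.0, 10.0, 30.0, 20.0), (14.0, 12.0, 34.0, 22.0))
shift = modality_shift(det)                        # (4.0, 2.0)
overlap = iou(det.visible_box, det.infrared_box)   # well below 1.0 when the pair is misaligned
```

A detector that predicts a single shared box would have to compromise between the two positions; keeping them separate lets each modality's features be sampled around its own reference position.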

Kedreamix

https://kedreamix.github.io/Talk2Paper/Paper/2025-10-03/%E6%A3%80%E6%B5%8B_%E5%88%86%E5%89%B2_%E8%B7%9F%E8%B8%AA/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit Kedreamix when reposting.

Detection/Segmentation/Tracking
