
Unsupervised / Semi-Supervised / Contrastive Learning


⚠️ All of the summaries below are generated by a large language model and may contain errors; they are for reference only, so use them with caution.
🔴 Please note: never use them in serious academic settings; they are only meant as a first-pass screen before reading the papers!
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace

Updated 2025-11-18

OpenUS: A Fully Open-Source Foundation Model for Ultrasound Image Analysis via Self-Adaptive Masked Contrastive Learning

Authors: Xiaoyu Zheng, Xu Chen, Awais Rauf, Qifan Fu, Benedetta Monosi, Felice Rivellese, Myles J. Lewis, Shaogang Gong, Gregory Slabaugh

Ultrasound (US) is one of the most widely used medical imaging modalities, thanks to its low cost, portability, real-time feedback, and absence of ionizing radiation. However, US image interpretation remains highly operator-dependent and varies significantly across anatomical regions, acquisition protocols, and device types. These variations, along with unique challenges such as speckle, low contrast, and limited standardized annotations, hinder the development of generalizable, label-efficient ultrasound AI models. In this paper, we propose OpenUS, the first reproducible, open-source ultrasound foundation model built on a large collection of public data. OpenUS employs a vision Mamba backbone, capturing both local and global long-range dependencies across the image. To extract rich features during pre-training, we introduce a novel self-adaptive masking framework that combines contrastive learning with masked image modeling. This strategy integrates the teacher’s attention map with student reconstruction loss, adaptively refining clinically-relevant masking to enhance pre-training effectiveness. OpenUS also applies a dynamic learning schedule to progressively adjust the difficulty of the pre-training process. To develop the foundation model, we compile the largest to-date public ultrasound dataset comprising over 308K images from 42 publicly available datasets, covering diverse anatomical regions, institutions, imaging devices, and disease types. Our pre-trained OpenUS model can be easily adapted to specific downstream tasks by serving as a backbone for label-efficient fine-tuning. Code is available at https://github.com/XZheng0427/OpenUS.


Paper and project links

PDF

Summary

This paper discusses ultrasound (US) as a widely used medical imaging modality and the challenges it poses. To address the operator variability and technical difficulties of ultrasound image interpretation, the authors propose OpenUS, a reproducible, open-source ultrasound foundation model built on public datasets. OpenUS uses a vision Mamba backbone to capture long-range dependencies in the image and is pre-trained with a self-adaptive masking framework that combines contrastive learning with masked image modeling, using the teacher's attention map together with the student's reconstruction loss for adaptive refinement. In addition, OpenUS applies a dynamic learning schedule to progressively adjust the difficulty of pre-training. The model is trained on public ultrasound data and can be fine-tuned for specific downstream tasks in a label-efficient way. The code has been released on GitHub.

Key Takeaways

  1. Ultrasound (US) is a widely used medical imaging modality, but image interpretation suffers from operator variability and technical challenges.
  2. OpenUS is an open-source ultrasound foundation model built on public datasets that aims to address these problems.
  3. OpenUS uses a vision Mamba backbone that captures long-range dependencies in the image.
  4. Pre-training uses a self-adaptive masking framework that combines contrastive learning with masked image modeling.
  5. The teacher's attention map is combined with the student's reconstruction loss for adaptive refinement, improving pre-training effectiveness (see the sketch after this list).
  6. OpenUS applies a dynamic learning schedule to progressively adjust the difficulty of pre-training.
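
The abstract does not spell out how the attention-guided masking is implemented; below is a minimal PyTorch sketch of the general idea, assuming the teacher's [CLS] attention over image patches is turned into per-patch masking probabilities so that patches the teacher attends to are masked more often. The function name and masking ratio are illustrative assumptions, not the authors' code (see the GitHub repository for the official implementation).

```python
import torch

def adaptive_mask(teacher_attn: torch.Tensor, mask_ratio: float = 0.6) -> torch.Tensor:
    """Sample a per-patch mask whose probabilities follow the teacher's
    attention map (illustrative sketch only, not the OpenUS implementation).

    teacher_attn: (B, N) attention of the teacher's [CLS] token over N patches.
    Returns a (B, N) boolean mask where True marks a patch to be masked.
    """
    B, N = teacher_attn.shape
    num_mask = int(mask_ratio * N)
    # Turn the attention map into a sampling distribution over patches.
    probs = teacher_attn / teacher_attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
    # Patches with higher teacher attention are sampled (and thus masked) more often.
    idx = torch.multinomial(probs, num_samples=num_mask, replacement=False)
    mask = torch.zeros(B, N, dtype=torch.bool)
    mask.scatter_(1, idx, torch.ones_like(idx, dtype=torch.bool))
    return mask

# Shape check with random attention: 2 images, 196 patches, ~60% masked.
attn = torch.rand(2, 196).softmax(dim=-1)
print(adaptive_mask(attn).sum(dim=-1))  # tensor([117, 117])
```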

Cool Papers

Click here to view paper screenshots

Contrastive Integrated Gradients: A Feature Attribution-Based Method for Explaining Whole Slide Image Classification

Authors: Anh Mai Vu, Tuan L. Vo, Ngoc Lam Quang Bui, Nam Nguyen Le Binh, Akash Awasthi, Huy Quoc Vo, Thanh-Huy Nguyen, Zhu Han, Chandra Mohan, Hien Van Nguyen

Interpretability is essential in Whole Slide Image (WSI) analysis for computational pathology, where understanding model predictions helps build trust in AI-assisted diagnostics. While Integrated Gradients (IG) and related attribution methods have shown promise, applying them directly to WSIs introduces challenges due to their high-resolution nature. These methods capture model decision patterns but may overlook class-discriminative signals that are crucial for distinguishing between tumor subtypes. In this work, we introduce Contrastive Integrated Gradients (CIG), a novel attribution method that enhances interpretability by computing contrastive gradients in logit space. First, CIG highlights class-discriminative regions by comparing feature importance relative to a reference class, offering sharper differentiation between tumor and non-tumor areas. Second, CIG satisfies the axioms of integrated attribution, ensuring consistency and theoretical soundness. Third, we propose two attribution quality metrics, MIL-AIC and MIL-SIC, which measure how predictive information and model confidence evolve with access to salient regions, particularly under weak supervision. We validate CIG across three datasets spanning distinct cancer types: CAMELYON16 (breast cancer metastasis in lymph nodes), TCGA-RCC (renal cell carcinoma), and TCGA-Lung (lung cancer). Experimental results demonstrate that CIG yields more informative attributions both quantitatively, using MIL-AIC and MIL-SIC, and qualitatively, through visualizations that align closely with ground truth tumor regions, underscoring its potential for interpretable and trustworthy WSI-based diagnostics


Paper and project links

PDF Accepted to WACV 2026

Summary

Interpretability is crucial in Whole Slide Image (WSI) analysis for computational pathology. To improve trust in model predictions, the authors propose a new attribution method, Contrastive Integrated Gradients (CIG), which computes contrastive gradients in logit space. CIG highlights class-discriminative regions relative to a reference class, and it satisfies the axioms of integrated attribution, ensuring consistency and theoretical soundness. The paper also introduces two attribution quality metrics, MIL-AIC and MIL-SIC, which measure how predictive information and model confidence change as salient regions are revealed. Experiments on datasets spanning three cancer types show that CIG yields more informative attributions, with visualizations that align closely with ground-truth tumor regions, underscoring its potential for interpretable and trustworthy WSI-based diagnostics.

Key Takeaways

  1. In Whole Slide Image (WSI) analysis, explaining model predictions is essential for building trust in AI-assisted diagnostics.
  2. Contrastive Integrated Gradients (CIG) is a new attribution method designed to improve interpretability in computational pathology.
  3. CIG computes contrastive gradients relative to a reference class to highlight class-discriminative regions, giving sharper explanations of model decisions (a minimal sketch follows this list).
  4. CIG satisfies the axioms of integrated attribution, ensuring consistency and theoretical soundness.
  5. Two attribution quality metrics (MIL-AIC and MIL-SIC) are proposed to measure how predictive information and model confidence evolve as salient regions are revealed, particularly under weak supervision.
  6. Experiments on datasets covering multiple cancer types show that CIG produces attribution maps that align closely with ground-truth tumor regions.
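
As a rough illustration of the contrastive-gradient idea, the sketch below applies Integrated Gradients to the difference between the target-class and reference-class logits, so that attributions emphasize class-discriminative regions. It is a minimal reading of the abstract, not the authors' released implementation; `model`, `baseline`, and the class indices are placeholders.

```python
import torch

def contrastive_ig(model, x, baseline, target_cls, ref_cls, steps: int = 50):
    """Integrated Gradients of the contrastive logit (target minus reference).

    Illustrative sketch only: attributions are computed for the difference
    between the target-class logit and a reference-class logit, so regions
    that discriminate the two classes receive high scores.
    x and baseline are single inputs of identical shape, e.g. (C, H, W).
    """
    # Straight-line path from the baseline to the input (the IG path).
    alphas = torch.linspace(0.0, 1.0, steps, device=x.device).view(-1, *([1] * x.dim()))
    path = baseline.unsqueeze(0) + alphas * (x - baseline).unsqueeze(0)
    path = path.detach().requires_grad_(True)

    logits = model(path)                                   # (steps, num_classes)
    contrast = logits[:, target_cls] - logits[:, ref_cls]  # contrastive logit
    grads = torch.autograd.grad(contrast.sum(), path)[0]   # (steps, *x.shape)

    # Riemann approximation of the path integral, scaled by (x - baseline).
    return (x - baseline) * grads.mean(dim=0)
```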

Cool Papers

Click here to view paper screenshots

Quantifying the Limits of Segmentation Foundation Models: Modeling Challenges in Segmenting Tree-Like and Low-Contrast Objects

Authors: Yixin Zhang, Nicholas Konz, Kevin Kramer, Maciej A. Mazurowski

Image segmentation foundation models (SFMs) like Segment Anything Model (SAM) have achieved impressive zero-shot and interactive segmentation across diverse domains. However, they struggle to segment objects with certain structures, particularly those with dense, tree-like morphology and low textural contrast from their surroundings. These failure modes are crucial for understanding the limitations of SFMs in real-world applications. To systematically study this issue, we introduce interpretable metrics quantifying object tree-likeness and textural separability. On carefully controlled synthetic experiments and real-world datasets, we show that SFM performance (e.g., SAM, SAM 2, HQ-SAM) noticeably correlates with these factors. We attribute these failures to SFMs misinterpreting local structure as global texture, resulting in over-segmentation or difficulty distinguishing objects from similar backgrounds. Notably, targeted fine-tuning fails to resolve this issue, indicating a fundamental limitation. Our study provides the first quantitative framework for modeling the behavior of SFMs on challenging structures, offering interpretable insights into their segmentation capabilities.


Paper and project links

PDF Accepted at WACV 2026. Code: https://github.com/mazurowski-lab/SAMFailureMetrics

Summary

Image segmentation foundation models (SFMs) such as the Segment Anything Model (SAM) have achieved impressive zero-shot and interactive segmentation across domains, but they have difficulty with objects of certain structures, in particular those with dense, tree-like morphology and low textural contrast with their surroundings. This work introduces interpretable metrics that quantify object tree-likeness and textural separability, and shows on carefully controlled synthetic experiments and real-world datasets that SFM performance correlates noticeably with these factors. The failures are attributed to SFMs misinterpreting local structure as global texture, which leads to over-segmentation or difficulty distinguishing objects from similar backgrounds.

Key Takeaways

  1. Image segmentation foundation models (SFMs) perform impressively at zero-shot and interactive segmentation across domains.
  2. SFMs struggle with objects of certain structures, especially those with dense, tree-like morphology and low textural contrast with their surroundings.
  3. Interpretable metrics are introduced to quantify object tree-likeness and textural separability and to probe these limitations (a toy illustration follows this list).
  4. SFM performance correlates noticeably with an object's tree-likeness and its textural contrast with the background.
  5. The failure modes stem from SFMs misinterpreting local structure as global texture, leading to over-segmentation or difficulty separating objects from similar backgrounds.
  6. Targeted fine-tuning fails to resolve the issue, indicating a fundamental limitation.
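
The exact metric definitions are given in the paper and the linked repository; purely as a toy illustration of the textural-separability idea, the sketch below compares the intensity distribution inside an object mask with that of a surrounding background ring, where a larger distance suggests the object should be easier to separate from its surroundings. This is an assumption-laden proxy, not the paper's metric.

```python
import numpy as np
from scipy.ndimage import binary_dilation
from scipy.stats import wasserstein_distance

def textural_separability(image: np.ndarray, mask: np.ndarray, ring: int = 10) -> float:
    """Toy proxy for object-vs-background textural separability.

    image: 2-D grayscale image; mask: binary object mask of the same shape.
    Compares intensities inside the object with those in a background ring
    obtained by dilating the mask, using the 1-D Wasserstein distance.
    """
    mask = mask.astype(bool)
    # Background ring: pixels within `ring` dilation steps of the object, minus the object.
    surround = binary_dilation(mask, iterations=ring) & ~mask
    inside = image[mask].ravel()
    outside = image[surround].ravel()
    return float(wasserstein_distance(inside, outside))
```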

Cool Papers

Click here to view paper screenshots

