Published: 2025-09-18

Updated: 2025-10-07

Word count: 3k

Reading time: 12 min

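⚠️ All summaries below were generated by a large language model and may contain errors; treat them as reference only and use them with caution.
🔴 Note: do not use them in serious academic work; they are meant only for initial screening before reading a paper.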

2025-09-18 Update

AREPAS: Anomaly Detection in Fine-Grained Anatomy with Reconstruction-Based Semantic Patch-Scoring

Authors:Branko Mitic, Philipp Seeböck, Helmut Prosch, Georg Langs

Early detection of newly emerging diseases, lesion severity assessment, differentiation of medical conditions and automated screening are examples for the wide applicability and importance of anomaly detection (AD) and unsupervised segmentation in medicine. Normal fine-grained tissue variability such as present in pulmonary anatomy is a major challenge for existing generative AD methods. Here, we propose a novel generative AD approach addressing this issue. It consists of an image-to-image translation for anomaly-free reconstruction and a subsequent patch similarity scoring between observed and generated image-pairs for precise anomaly localization. We validate the new method on chest computed tomography (CT) scans for the detection and segmentation of infectious disease lesions. To assess generalizability, we evaluate the method on an ischemic stroke lesion segmentation task in T1-weighted brain MRI. Results show improved pixel-level anomaly segmentation in both chest CTs and brain MRIs, with relative DICE score improvements of +1.9% and +4.4%, respectively, compared to other state-of-the-art reconstruction-based methods.

Summary

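This paper presents a new generative anomaly detection (AD) method that addresses the challenge posed by normal fine-grained tissue variability in medical images. The method combines image-to-image translation for anomaly-free reconstruction with patch similarity scoring between the observed and generated images to localize anomalies precisely. Validated on chest computed tomography (CT) and T1-weighted brain MRI, it improves pixel-level anomaly segmentation, with relative Dice score improvements of +1.9% and +4.4%, respectively, over other state-of-the-art reconstruction-based methods.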
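
The two-stage idea (reconstruct an anomaly-free image, then score patches of the observed image against it) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the AREPAS implementation: the reconstruction is passed in as a ready-made image, and the patch score is a plain mean absolute difference standing in for the paper's semantic patch similarity, which the summary does not specify. The names `patch_anomaly_map` and `patch_size` are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
from typing import List

Image = List[List[float]]


def patch_anomaly_map(observed: Image, reconstructed: Image, patch_size: int = 2) -> Image:
    """Score each non-overlapping patch by its mean absolute difference.

    Assumption: mean absolute difference stands in for the paper's
    semantic patch similarity, which is not described in the summary.
    """
    height, width = len(observed), len(observed[0])
    scores: Image = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            rows = range(top, min(top + patch_size, height))
            cols = range(left, min(left + patch_size, width))
            diffs = [abs(observed[r][c] - reconstructed[r][c]) for r in rows for c in cols]
            patch_score = sum(diffs) / len(diffs)
            # Every pixel in the patch inherits the patch-level score.
            for r in rows:
                for c in cols:
                    scores[r][c] = patch_score
    return scores


# Toy example: a 4x4 image whose lower-right patch differs from the reconstruction.
reconstructed = [[0.0] * 4 for _ in range(4)]
observed = [row[:] for row in reconstructed]
observed[2][2] = observed[2][3] = observed[3][2] = observed[3][3] = 1.0

anomaly_map = patch_anomaly_map(observed, reconstructed)
# Thresholding the map (0.5 is an arbitrary choice here) gives a binary segmentation.
segmentation = [[score > 0.5 for score in row] for row in anomaly_map]
```

Scoring whole patches rather than single pixels is what lets small, normal pixel-level differences average out while a region that differs consistently still stands out.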

Key Takeaways

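Anomaly detection (AD) and unsupervised segmentation are widely applicable in medicine, for example in early detection of newly emerging diseases, lesion severity assessment, differentiation of medical conditions, and automated screening.
Normal fine-grained tissue variability, such as that in pulmonary anatomy, is a major challenge for existing generative AD methods.
The paper proposes a new generative AD method that combines image-to-image translation for anomaly-free reconstruction with patch similarity scoring for precise anomaly localization.
The method is validated on chest CT scans for detecting and segmenting infectious disease lesions.
To assess generalizability, it is also evaluated on ischemic stroke lesion segmentation in T1-weighted brain MRI.
Compared with other state-of-the-art reconstruction-based methods, pixel-level anomaly segmentation improves, with relative Dice score gains of +1.9% on chest CT and +4.4% on brain MRI.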
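
For readers unfamiliar with the metric, the sketch below shows how a Dice score and a relative improvement are typically computed. It assumes "relative" is meant in the usual sense of a ratio against the baseline score rather than a difference in absolute points. The function names and the masks are hypothetical, and the numbers are illustrative only, not results from the paper.

```python
from typing import Sequence


def dice_score(prediction: Sequence[bool], target: Sequence[bool]) -> float:
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flattened binary masks."""
    intersection = sum(1 for p, t in zip(prediction, target) if p and t)
    total = sum(prediction) + sum(target)
    # Convention (an assumption here): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_improvement(new: float, baseline: float) -> float:
    """Relative change of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Illustrative masks only; these are not results from the paper.
target = [True, True, True, False, False, False]
prediction = [True, True, False, True, False, False]
score = dice_score(prediction, target)  # 2 * 2 / (3 + 3)
```

Under this reading, a method that raises Dice from 0.50 to 0.55 has a relative improvement of about 10%, even though the absolute gain is 0.05.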

PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models

Authors:Wanru Zhuang, Wenbo Li, Zhibin Lan, Xu Han, Peng Li, Jinsong Su

Text Image Machine Translation (TIMT) aims to translate texts embedded within an image into another language. Current TIMT studies primarily focus on providing translations for all the text within an image, while neglecting to provide bounding boxes and covering limited scenarios. In this work, we extend traditional TIMT into position-aware TIMT (PATIMT), aiming to support fine-grained and layout-preserving translation, which holds great practical value but remains largely unexplored. This task comprises two key sub-tasks: region-specific translation and full-image translation with grounding. To support existing models on PATIMT and conduct fair evaluation, we construct the PATIMT benchmark (PATIMT-Bench), which consists of 10 diverse real-world scenarios. Specifically, we introduce an Adaptive Image OCR Refinement Pipeline, which adaptively selects appropriate OCR tools based on scenario and refines the results of text-rich images. To ensure evaluation reliability, we further construct a test set, which contains 1,200 high-quality instances manually annotated and reviewed by human experts. After fine-tuning on our data, compact Large Vision-Language Models (LVLMs) achieve state-of-the-art performance on both sub-tasks. Experimental results also highlight the scalability and generalizability of our training data.

Summary

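This paper extends text image machine translation (TIMT) to position-aware TIMT (PATIMT), which aims to support fine-grained, layout-preserving translation, a task of high practical value that remains largely unexplored. To support existing models on PATIMT and evaluate them fairly, the authors build the PATIMT benchmark (PATIMT-Bench), covering 10 diverse real-world scenarios. They also introduce an Adaptive Image OCR Refinement Pipeline that selects suitable OCR tools per scenario and refines the results for text-rich images. The test set contains 1,200 high-quality instances that were manually annotated and reviewed by human experts. After fine-tuning on this data, compact Large Vision-Language Models (LVLMs) achieve state-of-the-art performance on both sub-tasks, and the experiments also highlight the scalability and generalizability of the training data.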

Key Takeaways

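TIMT translates text embedded in an image into another language, but current studies mainly translate all text in an image, do not provide bounding boxes, and cover limited scenarios.
The paper proposes position-aware TIMT (PATIMT) to support fine-grained, layout-preserving translation through two sub-tasks: region-specific translation and full-image translation with grounding.
To support models on PATIMT and evaluate them fairly, the authors build PATIMT-Bench, which covers 10 diverse real-world scenarios.
An Adaptive Image OCR Refinement Pipeline selects suitable OCR tools per scenario and refines the results for text-rich images.
A test set of 1,200 instances, manually annotated and reviewed by human experts, makes the evaluation reliable.
After fine-tuning on the authors' data, compact LVLMs achieve state-of-the-art performance on both sub-tasks.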
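
To make the two sub-tasks concrete, here is a small sketch of what position-aware output could look like as data. Everything in it is an assumption for illustration: the summary does not describe the benchmark's actual output format, so `TextRegion`, `BoundingBox`, `translate_full_image`, and `translate_region` are hypothetical names, and a lookup table stands in for a vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; a hypothetical box layout.
BoundingBox = Tuple[int, int, int, int]


@dataclass
class TextRegion:
    box: BoundingBox
    source_text: str


def overlaps(a: BoundingBox, b: BoundingBox) -> bool:
    """True when two axis-aligned boxes share a non-empty area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def translate_full_image(
    regions: List[TextRegion], translate: Callable[[str], str]
) -> List[Tuple[BoundingBox, str]]:
    """Full-image translation with grounding: every translation keeps its box."""
    return [(region.box, translate(region.source_text)) for region in regions]


def translate_region(
    regions: List[TextRegion], query: BoundingBox, translate: Callable[[str], str]
) -> List[str]:
    """Region-specific translation: only text overlapping the query box."""
    return [translate(r.source_text) for r in regions if overlaps(r.box, query)]


# Toy stand-in for a model: a lookup table (an assumption, not the paper's LVLM).
TOY_DICTIONARY = {"出口": "Exit", "入口": "Entrance"}


def toy_translate(text: str) -> str:
    return TOY_DICTIONARY.get(text, text)


regions = [TextRegion((0, 0, 50, 20), "出口"), TextRegion((0, 100, 50, 120), "入口")]

grounded = translate_full_image(regions, toy_translate)
selected = translate_region(regions, (0, 90, 60, 130), toy_translate)
```

Keeping the box attached to each translation is what makes layout-preserving rendering possible downstream, since the translated text can be placed back where the source text was.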

Sample-Aware Test-Time Adaptation for Medical Image-to-Image Translation

Authors:Irene Iele, Francesco Di Feola, Valerio Guarrasi, Paolo Soda

Image-to-image translation has emerged as a powerful technique in medical imaging, enabling tasks such as image denoising and cross-modality conversion. However, it suffers from limitations in handling out-of-distribution samples without causing performance degradation. To address this limitation, we propose a novel Test-Time Adaptation (TTA) framework that dynamically adjusts the translation process based on the characteristics of each test sample. Our method introduces a Reconstruction Module to quantify the domain shift and a Dynamic Adaptation Block that selectively modifies the internal features of a pretrained translation model to mitigate the shift without compromising the performance on in-distribution samples that do not require adaptation. We evaluate our approach on two medical image-to-image translation tasks: low-dose CT denoising and T1 to T2 MRI translation, showing consistent improvements over both the baseline translation model without TTA and prior TTA methods. Our analysis highlights the limitations of the state-of-the-art that uniformly apply the adaptation to both out-of-distribution and in-distribution samples, demonstrating that dynamic, sample-specific adjustment offers a promising path to improve model resilience in real-world scenarios. The code is available at: https://github.com/Sample-Aware-TTA/Code.

Summary
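The paper proposes a novel Test-Time Adaptation (TTA) framework for medical image-to-image translation. The framework dynamically adjusts the translation process to the characteristics of each test sample: a Reconstruction Module quantifies the domain shift, and a Dynamic Adaptation Block selectively modifies the internal features of a pretrained translation model to handle out-of-distribution samples without degrading performance on in-distribution samples that need no adaptation. Evaluated on low-dose CT denoising and T1-to-T2 MRI translation, it improves consistently over both the baseline model without TTA and prior TTA methods. The code is publicly available.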
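
The sample-aware idea (estimate the domain shift per sample, then adapt only when it is large) can be sketched as a simple gate. This is a toy illustration under stated assumptions, not the authors' Reconstruction Module or Dynamic Adaptation Block: the shift estimate is a mean squared reconstruction error, the gate is a fixed `threshold`, and all function names and the clipping "reconstruction" are hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

Sample = List[float]
Model = Callable[[Sample], Sample]


def reconstruction_error(sample: Sample, reconstruct: Model) -> float:
    """Mean squared error between a sample and its reconstruction."""
    rebuilt = reconstruct(sample)
    return sum((x - y) ** 2 for x, y in zip(sample, rebuilt)) / len(sample)


def translate_with_gating(
    sample: Sample,
    translate: Model,
    adapt_and_translate: Model,
    reconstruct: Model,
    threshold: float,
) -> Tuple[Sample, bool]:
    """Adapt only when the estimated shift exceeds `threshold`.

    Returns the translated sample and whether adaptation was applied.
    """
    shift = reconstruction_error(sample, reconstruct)
    if shift > threshold:
        return adapt_and_translate(sample), True
    return translate(sample), False


# Toy stand-ins (all hypothetical): the "reconstruction" clips values to an
# assumed training range [0, 1], so out-of-range samples give a large error.
def clip_to_training_range(sample: Sample) -> Sample:
    return [min(max(x, 0.0), 1.0) for x in sample]


def base_translate(sample: Sample) -> Sample:
    return [2.0 * x for x in sample]


def adapted_translate(sample: Sample) -> Sample:
    return base_translate(clip_to_training_range(sample))


in_dist_output, in_dist_adapted = translate_with_gating(
    [0.2, 0.8], base_translate, adapted_translate, clip_to_training_range, threshold=0.1
)
out_dist_output, out_dist_adapted = translate_with_gating(
    [0.2, 3.0], base_translate, adapted_translate, clip_to_training_range, threshold=0.1
)
```

The in-distribution sample passes through the unmodified model, which mirrors the stated goal of not degrading samples that need no adaptation, while the out-of-range sample triggers the adapted path.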

Key Takeaways