Published: 2025-07-17

Updated: 2025-08-09

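⚠️ All summaries below were generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use them in serious academic settings. They are only suitable for a first-pass screening before reading the papers.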

2025-07-17 Update

Focus on Texture: Rethinking Pre-training in Masked Autoencoders for Medical Image Classification

Authors: Chetan Madan, Aarjav Satia, Soumen Basu, Pankaj Gupta, Usha Dutta, Chetan Arora

Masked Autoencoders (MAEs) have emerged as a dominant strategy for self-supervised representation learning in natural images, where models are pre-trained to reconstruct masked patches with a pixel-wise mean squared error (MSE) between original and reconstructed RGB values as the loss. We observe that MSE encourages blurred image reconstruction, but still works for natural images as it preserves dominant edges. However, in medical imaging, where texture cues are more important for classifying a visual abnormality, the strategy fails. Taking inspiration from the Gray Level Co-occurrence Matrix (GLCM) feature in Radiomics studies, we propose a novel MAE-based pre-training framework, GLCM-MAE, using a reconstruction loss based on matching GLCM. GLCM captures intensity and spatial relationships in an image, hence the proposed loss helps preserve morphological features. Further, we propose a novel formulation to convert matching GLCM matrices into a differentiable loss function. We demonstrate that unsupervised pre-training on medical images with the proposed GLCM loss improves representations for downstream tasks. GLCM-MAE outperforms the current state-of-the-art across four tasks: gallbladder cancer detection from ultrasound images by 2.1%, breast cancer detection from ultrasound by 3.1%, pneumonia detection from x-rays by 0.5%, and COVID detection from CT by 0.6%. Source code and pre-trained models are available at: https://github.com/ChetanMadan/GLCM-MAE.
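To make the blur argument concrete, here is a minimal NumPy sketch of a pixel-wise MSE that is averaged over masked patches only. The function name `masked_mse_loss`, the array shapes, and the toy patches are illustrative assumptions, not code from the GLCM-MAE repository.

```python
import numpy as np

def masked_mse_loss(target, pred, mask):
    """Pixel-wise MSE averaged over masked patches only (illustrative helper).

    target, pred: (num_patches, patch_dim) arrays of pixel values.
    mask: (num_patches,) array, 1 for masked patches, 0 for visible ones.
    """
    target = np.asarray(target, dtype=np.float64)
    pred = np.asarray(pred, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    per_patch = ((pred - target) ** 2).mean(axis=1)  # MSE inside each patch
    return float((per_patch * mask).sum() / mask.sum())

# Toy data (assumed): two patches carrying an alternating texture.
target = np.array([[0.0, 1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0, 1.0]])
mask = np.array([1, 0])            # only the first patch is masked
flat = np.full_like(target, 0.5)   # blurred guess: the patch mean everywhere
shifted = 1.0 - target             # same texture, shifted by one pixel

print(masked_mse_loss(target, flat, mask))     # 0.25
print(masked_mse_loss(target, shifted, mask))  # 1.0
```

In this toy case the flat gray patch scores better than a reconstruction whose texture is intact but misaligned by one pixel, which is one way to see why MSE can favor blurred outputs.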


Paper and project links

PDF To appear at MICCAI 2025

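Summary
Drawing on Gray Level Co-occurrence Matrix (GLCM) features, the paper proposes GLCM-MAE, a new MAE pre-training framework whose reconstruction loss is based on matching GLCMs, with the aim of improving unsupervised pre-training on medical images. A GLCM captures intensity and spatial relationships in an image, so the proposed loss helps preserve morphological features. The authors also propose a new formulation that turns GLCM matching into a differentiable loss function. Experiments on four tasks show that pre-training with the GLCM loss outperforms the current state of the art: gallbladder cancer detection from ultrasound improves by 2.1%, breast cancer detection from ultrasound by 3.1%, pneumonia detection from X-rays by 0.5%, and COVID detection from CT by 0.6%.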
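The idea of matching co-occurrence statistics can be sketched as follows. This is not the authors' formulation, which the abstract does not spell out. It is one possible smooth relaxation, assuming Gaussian soft binning of gray levels and an L1 distance between normalized matrices. The names `soft_glcm` and `glcm_loss` and the parameters `levels`, `offset`, and `sigma` are all hypothetical.

```python
import numpy as np

def soft_glcm(image, levels=8, offset=(0, 1), sigma=0.5):
    """Smooth co-occurrence matrix of a grayscale image with values in [0, 1].

    Hard histogram binning is replaced by Gaussian soft assignment, so every
    operation is smooth in the pixel values (an assumed relaxation).
    """
    image = np.asarray(image, dtype=np.float64)
    dy, dx = offset  # this sketch supports non-negative offsets only
    h, w = image.shape
    ref = image[: h - dy, : w - dx].ravel()  # reference pixels
    nbr = image[dy:, dx:].ravel()            # neighbors at the given offset
    centers = np.arange(levels, dtype=np.float64)

    def soft_assign(values):
        # Distance of each pixel to each gray-level center, in units of levels.
        d = values[:, None] * (levels - 1) - centers[None, :]
        wts = np.exp(-0.5 * (d / sigma) ** 2)
        return wts / wts.sum(axis=1, keepdims=True)

    glcm = soft_assign(ref).T @ soft_assign(nbr)  # (levels, levels)
    return glcm / glcm.sum()

def glcm_loss(target, pred, **kwargs):
    """L1 distance between the soft GLCMs of two images (assumed choice)."""
    return float(np.abs(soft_glcm(target, **kwargs) - soft_glcm(pred, **kwargs)).sum())

# Toy data (assumed): a checkerboard, its one-pixel shift, and a flat gray image.
texture = (np.indices((4, 4)).sum(axis=0) % 2).astype(np.float64)
inverted = 1.0 - texture
flat = np.full((4, 4), 0.5)

loss_shifted = glcm_loss(texture, inverted)
loss_flat = glcm_loss(texture, flat)
print(loss_shifted, loss_flat)
```

On this toy input the shifted checkerboard gives a loss of essentially zero, because its co-occurrence statistics match the target, while the flat image gives a large loss. A pixel-wise MSE would rank the two the other way around. Because the sketch uses only smooth array operations, it could be ported to an autodiff framework, though that port is not shown here.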

Key insights