GAN

Published: 2025-03-11

Last updated: 2025-05-14

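⚠️ All summaries below were generated by a large language model and may contain errors; treat them as reference only and use them with caution.
🔴 Note: do not use them in serious academic settings. They are intended only for initial screening before reading a paper.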

2025-03-11 update

ZAugNet for Z-Slice Augmentation in Bio-Imaging

Authors: Alessandro Pasqui, Sajjad Mahdavi, Benoit Vianay, Alexandra Colin, Alex McDougall, Rémi Dumollard, Yekaterina A. Miroshnikova, Elsa Labrune, Hervé Turlier

Three-dimensional biological microscopy has significantly advanced our understanding of complex biological structures. However, limitations due to microscopy techniques, sample properties or phototoxicity often result in poor z-resolution, hindering accurate cellular measurements. Here, we introduce ZAugNet, a fast, accurate, and self-supervised deep learning method for enhancing z-resolution in biological images. By performing nonlinear interpolation between consecutive slices, ZAugNet effectively doubles resolution with each iteration. Compared on several microscopy modalities and biological objects, it outperforms competing methods on most metrics. Our method leverages a generative adversarial network (GAN) architecture combined with knowledge distillation to maximize prediction speed without compromising accuracy. We also developed ZAugNet+, an extended version enabling continuous interpolation at arbitrary distances, making it particularly useful for datasets with nonuniform slice spacing. Both ZAugNet and ZAugNet+ provide high-performance, scalable z-slice augmentation solutions for large-scale 3D imaging. They are available as open-source frameworks in PyTorch, with an intuitive Colab notebook interface for easy access by the scientific community.

Paper and project links

PDF 17 pages, 9 figures, 1 table

Summary

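Three-dimensional biological microscopy has greatly advanced the study of complex biological structures, but limits imposed by the microscopy technique, sample properties, or phototoxicity often leave z-resolution poor, which undermines accurate cellular measurements. This paper proposes ZAugNet, a fast, accurate, self-supervised deep learning method that enhances z-resolution by nonlinear interpolation between consecutive slices, doubling resolution with each iteration. Evaluated on several microscopy modalities and biological objects, it outperforms competing methods on most metrics. It combines a generative adversarial network (GAN) architecture with knowledge distillation to maximize prediction speed without compromising accuracy. An extended version, ZAugNet+, supports continuous interpolation at arbitrary distances and suits datasets with nonuniform slice spacing. Both methods offer high-performance, scalable z-slice augmentation for large-scale 3D imaging and are released as open-source PyTorch frameworks with a Colab notebook interface.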

Key Takeaways

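ZAugNet is a deep learning method that enhances the z-resolution of 3D biological images, where poor z-resolution otherwise hinders accurate cellular measurements.
By nonlinear interpolation between consecutive slices, ZAugNet effectively doubles z-resolution with each iteration.
Across several microscopy modalities and biological objects, ZAugNet outperforms competing methods on most metrics.
ZAugNet combines a GAN architecture with knowledge distillation to maximize prediction speed without compromising accuracy.
ZAugNet+ extends ZAugNet with continuous interpolation at arbitrary distances, which suits datasets with nonuniform slice spacing.
Both ZAugNet and ZAugNet+ provide high-performance, scalable z-slice augmentation for large-scale 3D imaging.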
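
The iterative doubling described for ZAugNet can be illustrated with a minimal sketch. It shows only the recursion: one predicted slice is inserted between every pair of consecutive slices, and the pass is repeated. The function names are hypothetical, and the per-pixel average in `interpolate_pair` is an assumption standing in for the learned nonlinear interpolator; it is not ZAugNet's network.

```python
from typing import Callable, List

Slice = List[List[float]]  # one 2D image stored as nested lists


def interpolate_pair(a: Slice, b: Slice) -> Slice:
    """Placeholder predictor (assumption): per-pixel average of two slices.

    A trained model would replace this function.
    """
    return [[(pa + pb) / 2.0 for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def double_z(stack: List[Slice],
             predict: Callable[[Slice, Slice], Slice] = interpolate_pair) -> List[Slice]:
    """Insert one predicted slice between each pair of consecutive slices."""
    if not stack:
        return []
    out = [stack[0]]
    for prev, nxt in zip(stack, stack[1:]):
        out.append(predict(prev, nxt))  # new intermediate slice
        out.append(nxt)                 # original slice is kept unchanged
    return out


def augment(stack: List[Slice], iterations: int,
            predict: Callable[[Slice, Slice], Slice] = interpolate_pair) -> List[Slice]:
    """Apply the doubling pass repeatedly; slice spacing halves on each pass."""
    for _ in range(iterations):
        stack = double_z(stack, predict)
    return stack
```

With this scheme a stack of n slices becomes 2n - 1 slices after one pass, so the spacing between slices halves each time. Calling `augment(stack, 2)` on a 3-slice stack yields 9 slices.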

Generating Novel Brain Morphology by Deforming Learned Templates

Authors: Alan Q. Wang, Fangrui Huang, Bailey Trang, Wei Peng, Mohammad Abbasi, Kilian Pohl, Mert Sabuncu, Ehsan Adeli

Designing generative models for 3D structural brain MRI that synthesize morphologically-plausible and attribute-specific (e.g., age, sex, disease state) samples is an active area of research. Existing approaches based on frameworks like GANs or diffusion models synthesize the image directly, which may limit their ability to capture intricate morphological details. In this work, we propose a 3D brain MRI generation method based on state-of-the-art latent diffusion models (LDMs), called MorphLDM, that generates novel images by applying synthesized deformation fields to a learned template. Instead of using a reconstruction-based autoencoder (as in a typical LDM), our encoder outputs a latent embedding derived from both an image and a learned template that is itself the output of a template decoder; this latent is passed to a deformation field decoder, whose output is applied to the learned template. A registration loss is minimized between the original image and the deformed template with respect to the encoder and both decoders. Empirically, our approach outperforms generative baselines on metrics spanning image diversity, adherence with respect to input conditions, and voxel-based morphometry. Our code is available at https://github.com/alanqrwang/morphldm.

Paper and project links

PDF

Summary
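Building on latent diffusion models (LDMs), the paper proposes MorphLDM, a 3D brain MRI generation method that produces novel images by applying synthesized deformation fields to a learned template. Whereas a typical LDM uses a reconstruction-based autoencoder, MorphLDM's encoder outputs a latent embedding derived from both the image and the learned template, and this latent is passed to a deformation field decoder. A registration loss between the original image and the deformed template is minimized with respect to the encoder and both decoders. Empirically, the method outperforms generative baselines on image diversity, adherence to input conditions, and voxel-based morphometry.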

Key Takeaways

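Designing generative models for 3D structural brain MRI is an active area of research.
Existing approaches based on GANs or diffusion models synthesize the image directly, which may limit their ability to capture intricate morphological details.
The paper proposes MorphLDM, a 3D brain MRI generation method based on state-of-the-art latent diffusion models (LDMs).
MorphLDM generates novel images by applying synthesized deformation fields to a learned template.
Unlike a typical LDM, MorphLDM's encoder outputs a latent embedding derived from both the image and the learned template.
A registration loss between the original image and the deformed template is minimized to train the encoder and both decoders.
Empirically, the method outperforms generative baselines on image diversity, adherence to input conditions, and voxel-based morphometry; the code is publicly available.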
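
The central operation, applying a deformation field to a template and comparing the result with the target image, can be sketched in 2D. This is a minimal illustration under stated assumptions, not MorphLDM's code: the field is a dense per-pixel displacement in pixel units, warping is backward with bilinear sampling and border clamping, and the registration loss is a plain mean squared error. The names `warp`, `bilinear`, and `registration_loss` are hypothetical; the paper works on 3D volumes and may use a different loss.

```python
import math
from typing import List, Tuple

Image = List[List[float]]
Field = List[List[Tuple[float, float]]]  # per-pixel (dy, dx) displacement


def bilinear(img: Image, y: float, x: float) -> float:
    """Sample img at a fractional position, clamping to the image border."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
    bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def warp(template: Image, field: Field) -> Image:
    """Backward warp: output pixel (i, j) samples the template at (i + dy, j + dx)."""
    return [[bilinear(template, i + field[i][j][0], j + field[i][j][1])
             for j in range(len(template[0]))]
            for i in range(len(template))]


def registration_loss(image: Image, deformed: Image) -> float:
    """Mean squared error between target image and deformed template (assumption)."""
    diffs = [(a - b) ** 2 for ra, rb in zip(image, deformed) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

A zero field returns the template unchanged, so `registration_loss(template, warp(template, zero_field))` is 0.0. In a trained model the field would come from the deformation field decoder rather than being written by hand.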

Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks

Authors: Yunfeng Diao, Naixin Zhai, Changtao Miao, Zitong Yu, Xingxing Wei, Xun Yang, Meng Wang

Recent advancements in image synthesis, particularly with the advent of GAN and Diffusion models, have amplified public concerns regarding the dissemination of disinformation. To address such concerns, numerous AI-generated Image (AIGI) Detectors have been proposed and achieved promising performance in identifying fake images. However, there still lacks a systematic understanding of the adversarial robustness of AIGI detectors. In this paper, we examine the vulnerability of state-of-the-art AIGI detectors against adversarial attack under white-box and black-box settings, which has been rarely investigated so far. To this end, we propose a new method to attack AIGI detectors. First, inspired by the obvious difference between real images and fake images in the frequency domain, we add perturbations under the frequency domain to push the image away from its original frequency distribution. Second, we explore the full posterior distribution of the surrogate model to further narrow this gap between heterogeneous AIGI detectors, e.g. transferring adversarial examples across CNNs and ViTs. This is achieved by introducing a novel post-train Bayesian strategy that turns a single surrogate into a Bayesian one, capable of simulating diverse victim models using one pre-trained surrogate, without the need for re-training. We name our method as Frequency-based Post-train Bayesian Attack, or FPBA. Through FPBA, we show that adversarial attack is truly a real threat to AIGI detectors, because FPBA can deliver successful black-box attacks across models, generators, defense methods, and even evade cross-generator detection, which is a crucial real-world detection scenario. The code will be shared upon acceptance.

Paper and project links

PDF

Summary
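This paper addresses the disinformation concerns raised by advances in image synthesis such as GANs and diffusion models. AI-generated image (AIGI) detectors have been proposed in response and perform well at identifying fake images, yet their adversarial robustness has not been systematically studied. The paper examines the vulnerability of state-of-the-art AIGI detectors to adversarial attacks under white-box and black-box settings and proposes a new attack, the Frequency-based Post-train Bayesian Attack (FPBA). FPBA delivers successful black-box attacks across models, generators, and defense methods, and can even evade cross-generator detection, showing that adversarial attacks are a real threat to AIGI detectors.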
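
The generic idea of perturbing a signal in the frequency domain, rather than directly in pixel space, can be shown on a 1D toy signal. This sketch is not FPBA: a plain discrete Fourier transform and a hand-picked perturbation are assumptions standing in for the paper's transform and its optimized perturbation, no detector or gradient is involved, and the function names are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum: Sequence[complex]) -> List[float]:
    """Inverse DFT; only the real part is returned."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def perturb_in_frequency(x: Sequence[float], delta: Sequence[complex]) -> List[float]:
    """Add delta[k] to frequency bin k, then transform back to the signal domain.

    For the result to stay real without losing information, delta should be
    conjugate-symmetric: delta[k] == conj(delta[n - k]).
    """
    spectrum = dft(x)
    return idft([s + d for s, d in zip(spectrum, delta)])
```

A zero `delta` reproduces the input up to floating-point error. A nonzero `delta` confined to chosen bins changes only those frequency components, which is what moves the signal away from its original frequency distribution.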

Key Takeaways