
GAN


⚠️ All of the summaries below are generated by a large language model and may contain errors; they are for reference only and should be used with caution.
🔴 Note: never use them in serious academic settings; they are only meant as a first-pass filter before reading the papers!
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace

Updated 2025-11-16

OpenSR-SRGAN: A Flexible Super-Resolution Framework for Multispectral Earth Observation Data

Authors:Simon Donike, Cesar Aybar, Julio Contreras, Luis Gómez-Chova

We present OpenSR-SRGAN, an open and modular framework for single-image super-resolution in Earth Observation. The software provides a unified implementation of SRGAN-style models that is easy to configure, extend, and apply to multispectral satellite data such as Sentinel-2. Instead of requiring users to modify model code, OpenSR-SRGAN exposes generators, discriminators, loss functions, and training schedules through concise configuration files, making it straightforward to switch between architectures, scale factors, and band setups. The framework is designed as a practical tool and benchmark implementation rather than a state-of-the-art model. It ships with ready-to-use configurations for common remote sensing scenarios, sensible default settings for adversarial training, and built-in hooks for logging, validation, and large-scene inference. By turning GAN-based super-resolution into a configuration-driven workflow, OpenSR-SRGAN lowers the entry barrier for researchers and practitioners who wish to experiment with SRGANs, compare models in a reproducible way, and deploy super-resolution pipelines across diverse Earth-observation datasets.

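Since the framework's selling point is that models are assembled from configuration rather than code edits, a small sketch helps make that concrete. The configuration keys, band names, and factory registry below are hypothetical illustrations of a config-driven SRGAN setup, not OpenSR-SRGAN's actual schema or API.

```python
# A minimal sketch of a configuration-driven SRGAN setup in the spirit of
# OpenSR-SRGAN. All keys, names, and factories are hypothetical
# illustrations, not the framework's actual schema or API.

config = {
    "generator": {"name": "rrdb", "num_blocks": 16, "scale": 4},
    "discriminator": {"name": "patchgan", "base_channels": 64},
    "data": {
        "bands": ["B02", "B03", "B04", "B08"],  # e.g. Sentinel-2 10 m bands
        "lr_size": 64,
        "hr_size": 256,
    },
    "losses": {"l1": 1.0, "perceptual": 0.1, "adversarial": 5e-3},
}

# Hypothetical registries: switching architectures means editing one string
# in the config file, not touching model code.
GENERATORS = {"rrdb": lambda **kw: f"RRDB generator {kw}"}
DISCRIMINATORS = {"patchgan": lambda **kw: f"PatchGAN discriminator {kw}"}

def build_from_config(cfg):
    """Build model components by dispatching on config entries."""
    g_cfg = dict(cfg["generator"])
    d_cfg = dict(cfg["discriminator"])
    g = GENERATORS[g_cfg.pop("name")](**g_cfg)
    d = DISCRIMINATORS[d_cfg.pop("name")](**d_cfg)
    return g, d

print(build_from_config(config))
```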

Paper and Project Links

PDF

Summary

OpenSR-SRGAN is an open, modular framework for single-image super-resolution in Earth Observation. It exposes generators, discriminators, loss functions, and training schedules through concise configuration files, letting users easily configure and extend models and apply them to multispectral satellite data. Designed as a practical tool and benchmark implementation, the framework ships with ready-to-use configurations for common remote sensing scenarios and built-in hooks for logging, validation, and large-scene inference. By turning GAN-based super-resolution into a configuration-driven workflow, OpenSR-SRGAN lowers the entry barrier for researchers and practitioners to experiment with SRGANs, compare models reproducibly, and deploy super-resolution pipelines across diverse Earth-observation datasets.

Key Takeaways

  1. OpenSR-SRGAN is a modular framework for single-image super-resolution in Earth Observation.
  2. Through concise configuration files, users can easily configure and extend models and apply them to multispectral satellite data.
  3. The framework provides a unified implementation of SRGAN-style models that is easy to configure, extend, and apply across a range of remote sensing scenarios.
  4. It includes sensible defaults for adversarial training and built-in hooks for logging, validation, and large-scene inference.
  5. OpenSR-SRGAN lowers the barrier for researchers and practitioners to use SRGANs, enabling experimentation and reproducible model comparison.
  6. The framework is intended as a practical tool and benchmark implementation rather than a state-of-the-art model.

Cool Papers

Click here to view paper screenshots

GPDM: Generation-Prior Diffusion Model for Accelerated Direct Attenuation and Scatter Correction of Whole-body 18F-FDG PET

Authors:Min Jeong Cho, Hyeong Seok Shim, Sungyu Kim, Jae Sung Lee

Accurate attenuation and scatter corrections are crucial in positron emission tomography (PET) imaging for accurate visual interpretation and quantitative analysis. Traditional methods relying on computed tomography (CT) or magnetic resonance imaging (MRI) have limitations in accuracy, radiation exposure, and applicability. Deep neural networks provide potential approaches to estimating attenuation and scatter-corrected (ASC) PET from non-attenuation and non-scatter-corrected (NASC) PET images based on VAE or CycleGAN. However, the limitations inherent to conventional GAN-based methods, such as unstable training and mode collapse, need further advancements. To address these limitations and achieve more accurate attenuation and scatter corrections, we propose a novel framework for generating high-quality ASC PET images from NASC PET images: Generation-Prior Diffusion Model (GPDM). Our GPDM framework is based on the Denoising Diffusion Probabilistic Model (DDPM), but instead of starting sampling from an entirely different image distribution, it begins from a distribution similar to the target images we aim to generate. This similar distribution is referred to as the Generation-Prior. By leveraging this Generation-Prior, the GPDM framework effectively reduces the number of sampling steps and generates more refined ASC PET images. Our experimental results demonstrate that GPDM outperforms existing methods in generating ASC PET images, achieving superior accuracy while significantly reducing sampling time. These findings highlight the potential of GPDM to address the limitations of conventional methods and establish a new standard for efficient and accurate attenuation and scatter correction in PET imaging.

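The key mechanism, starting DDPM sampling from a distribution close to the target rather than from pure Gaussian noise, can be sketched in a few lines of numpy. This is a simplified illustration under standard DDPM notation; the linear beta schedule, the toy denoiser, the choice of `t_start`, and the assumption that the Generation-Prior is obtained by forward-diffusing the NASC image are all illustrative, not the paper's exact formulation.

```python
import numpy as np

# Simplified DDPM machinery: linear beta schedule (illustrative values).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def toy_denoiser(x, t):
    """Stand-in for the trained noise-prediction network (hypothetical)."""
    return np.zeros_like(x)

def generation_prior_sample(nasc_img, t_start=250, seed=0):
    """Sketch of sampling from a generation-prior: instead of drawing
    x_T ~ N(0, I), diffuse the NASC image forward to an intermediate
    step t_start and run the reverse chain only from there, cutting the
    number of reverse steps from T down to t_start."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(nasc_img.shape)
    # Forward-diffuse the prior image to step t_start.
    x = np.sqrt(alpha_bar[t_start]) * nasc_img \
        + np.sqrt(1.0 - alpha_bar[t_start]) * eps
    # Standard DDPM reverse updates, starting mid-chain.
    for t in range(t_start, 0, -1):
        eps_hat = toy_denoiser(x, t)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 1:
            x = x + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

print(generation_prior_sample(np.zeros((8, 8))).shape)
```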

Paper and Project Links

PDF 25 pages, 10 figures

Summary

Building on deep neural networks, this paper proposes a novel framework, the Generation-Prior Diffusion Model (GPDM), for generating high-quality attenuation- and scatter-corrected (ASC) PET images from non-attenuation- and non-scatter-corrected (NASC) PET images. Based on the Denoising Diffusion Probabilistic Model (DDPM), GPDM starts sampling from a distribution similar to the target images, the Generation-Prior, which effectively reduces the number of sampling steps and yields more refined ASC PET images. Experimental results show that GPDM outperforms existing methods in generating ASC PET images, achieving superior accuracy while significantly reducing sampling time.

Key Takeaways

  • Accurate attenuation and scatter corrections are crucial in PET imaging, affecting visual interpretation and quantitative analysis.
  • Traditional CT- and MRI-based methods have limitations in accuracy, radiation exposure, and applicability.
  • Deep neural networks offer promising approaches to estimating attenuation- and scatter-corrected PET images.
  • The proposed GPDM framework is based on the denoising diffusion probabilistic model and starts sampling from a Generation-Prior distribution, effectively reducing the number of sampling steps.
  • GPDM produces more refined ASC PET images and is shown experimentally to outperform existing methods.
  • GPDM improves accuracy while significantly reducing sampling time.

Cool Papers

Click here to view paper screenshots

WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting

Authors:Kaitao Huang, Yan Yan, Jing-Hao Xue, Hanzi Wang

3D GAN inversion projects a single image into the latent space of a pre-trained 3D GAN to achieve single-shot novel view synthesis, which requires visible regions with high fidelity and occluded regions with realism and multi-view consistency. However, existing methods focus on the reconstruction of visible regions, while the generation of occluded regions relies only on the generative prior of 3D GAN. As a result, the generated occluded regions often exhibit poor quality due to the information loss caused by the low bit-rate latent code. To address this, we introduce the warping-and-inpainting strategy to incorporate image inpainting into 3D GAN inversion and propose a novel 3D GAN inversion method, WarpGAN. Specifically, we first employ a 3D GAN inversion encoder to project the single-view image into a latent code that serves as the input to 3D GAN. Then, we perform warping to a novel view using the depth map generated by 3D GAN. Finally, we develop a novel SVINet, which leverages the symmetry prior and multi-view image correspondence w.r.t. the same latent code to perform inpainting of occluded regions in the warped image. Quantitative and qualitative experiments demonstrate that our method consistently outperforms several state-of-the-art methods.

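The warping step, reprojecting the source view into a novel view through the depth map produced by the 3D GAN, follows standard pinhole geometry. Below is a compact numpy sketch assuming known intrinsics K and a relative pose (R, t); it illustrates depth-based warping in general rather than WarpGAN's actual code.

```python
import numpy as np

def warp_to_novel_view(img, depth, K, R, t):
    """Forward-warp a single-channel image into a novel view via its depth
    map and pinhole geometry. Pixels with no source (disocclusions) stay
    NaN -- the holes an inpainting network such as SVINet would then fill.
    A z-buffer for handling pixel collisions is omitted for brevity."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    # Back-project to 3D in the source camera, move to the target camera.
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    pts_new = R @ pts + t.reshape(3, 1)
    proj = K @ pts_new
    u2 = np.round(proj[0] / proj[2]).astype(int)
    v2 = np.round(proj[1] / proj[2]).astype(int)
    ok = (proj[2] > 0) & (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h)
    out = np.full((h, w), np.nan)
    out[v2[ok], u2[ok]] = img.reshape(-1)[ok]
    return out

# Toy usage: small sideways camera motion over a fronto-parallel plane.
K = np.array([[100.0, 0, 32], [0, 100.0, 32], [0, 0, 1]])
R = np.eye(3)
t = np.array([0.05, 0.0, 0.0])
img = np.random.default_rng(0).random((64, 64))
depth = np.full((64, 64), 2.0)
warped = warp_to_novel_view(img, depth, K, R, t)
```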

Paper and Project Links

PDF

Summary

This paper proposes WarpGAN, a novel 3D GAN inversion method built on a warping-and-inpainting strategy. It incorporates image inpainting into the 3D GAN inversion process to improve the quality of generated occluded regions. Experiments show that the method consistently outperforms several state-of-the-art methods.

Key Takeaways

  1. 3D GAN inversion projects a single image into the latent space of a pre-trained 3D GAN to achieve single-shot novel view synthesis.
  2. Existing methods focus on reconstructing visible regions, while the generation of occluded regions relies only on the generative prior of the 3D GAN.
  3. Because of the information loss caused by the low bit-rate latent code, the generated occluded regions are often of poor quality.
  4. WarpGAN introduces a warping-and-inpainting strategy that incorporates image inpainting into 3D GAN inversion.
  5. WarpGAN first uses a 3D GAN inversion encoder to project the single-view image into a latent code that serves as the input to the 3D GAN.
  6. The depth map generated by the 3D GAN is used to warp to a novel view, and the newly developed SVINet inpaints the occluded regions.

Cool Papers

Click here to view paper screenshots

KPLM-STA: Physically-Accurate Shadow Synthesis for Human Relighting via Keypoint-Based Light Modeling

Authors:Xinhui Yin, Qifei Li, Yilin Guo, Hongxia Xie, Xiaoli Zhang

Image composition aims to seamlessly integrate a foreground object into a background, where generating realistic and geometrically accurate shadows remains a persistent challenge. While recent diffusion-based methods have outperformed GAN-based approaches, existing techniques, such as the diffusion-based relighting framework IC-Light, still fall short in producing shadows with both high appearance realism and geometric precision, especially in composite images. To address these limitations, we propose a novel shadow generation framework based on a Keypoints Linear Model (KPLM) and a Shadow Triangle Algorithm (STA). KPLM models articulated human bodies using nine keypoints and one bounding block, enabling physically plausible shadow projection and dynamic shading across joints, thereby enhancing visual realism. STA further improves geometric accuracy by computing shadow angles, lengths, and spatial positions through explicit geometric formulations. Extensive experiments demonstrate that our method achieves state-of-the-art performance on shadow realism benchmarks, particularly under complex human poses, and generalizes effectively to multi-directional relighting scenarios such as those supported by IC-Light.

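Per the abstract, STA's geometric core computes shadow angles, lengths, and spatial positions through explicit formulations. The classic "shadow triangle" relation for one elevated keypoint under a distant light is sketched below; the function name and sign conventions are hypothetical illustrations, not the paper's API.

```python
import math

def shadow_of_keypoint(x, y, h, light_elev_deg, light_azim_deg):
    """Project a keypoint at ground position (x, y), elevated by h, onto
    the ground plane along a distant light direction.

    Shadow length follows the classic shadow triangle, L = h / tan(elev);
    the light azimuth fixes the direction the shadow is cast in."""
    elev = math.radians(light_elev_deg)
    azim = math.radians(light_azim_deg)
    length = h / math.tan(elev)          # shadow-triangle relation
    # The shadow points away from the light source.
    sx = x - length * math.cos(azim)
    sy = y - length * math.sin(azim)
    return sx, sy

# Example: a keypoint 1.0 m above ground with the light at 30 degrees
# elevation casts a shadow about 1.73 m long.
print(shadow_of_keypoint(0.0, 0.0, 1.0, 30.0, 90.0))
```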

Paper and Project Links

PDF

Summary
This paper proposes a novel shadow generation framework based on a Keypoints Linear Model (KPLM) and a Shadow Triangle Algorithm (STA) to address shadow generation in image composition. KPLM models the articulated human body with nine keypoints and one bounding block, enabling physically plausible shadow projection and dynamic shading across joints for greater visual realism. STA computes shadow angles, lengths, and spatial positions through explicit geometric formulations, further improving geometric accuracy. The method achieves state-of-the-art performance on shadow realism benchmarks, particularly under complex human poses, and generalizes to multi-directional relighting scenarios.

Key Takeaways

  1. The proposed framework is based on KPLM and STA and targets the challenge of shadow generation in image composition.
  2. KPLM models the human body with nine keypoints and one bounding block, enabling more realistic shadow projection.
  3. STA computes shadow angles, lengths, and spatial positions via explicit geometric formulations, improving geometric accuracy.
  4. The method achieves state-of-the-art shadow realism under complex human poses.
  5. The method generalizes effectively to multi-directional relighting scenarios such as those supported by the diffusion-based relighting framework IC-Light.
  6. The approach improves overall composite-image quality, particularly in visual realism and geometric precision.

Cool Papers

Click here to view paper screenshots

AvatarTex: High-Fidelity Facial Texture Reconstruction from Single-Image Stylized Avatars

Authors:Yuda Qiu, Zitong Xiao, Yiwei Zuo, Zisheng Ye, Weikai Chen, Xiaoguang Han

We present AvatarTex, a high-fidelity facial texture reconstruction framework capable of generating both stylized and photorealistic textures from a single image. Existing methods struggle with stylized avatars due to the lack of diverse multi-style datasets and challenges in maintaining geometric consistency in non-standard textures. To address these limitations, AvatarTex introduces a novel three-stage diffusion-to-GAN pipeline. Our key insight is that while diffusion models excel at generating diversified textures, they lack explicit UV constraints, whereas GANs provide a well-structured latent space that ensures style and topology consistency. By integrating these strengths, AvatarTex achieves high-quality topology-aligned texture synthesis with both artistic and geometric coherence. Specifically, our three-stage pipeline first completes missing texture regions via diffusion-based inpainting, refines style and structure consistency using GAN-based latent optimization, and enhances fine details through diffusion-based repainting. To address the need for a stylized texture dataset, we introduce TexHub, a high-resolution collection of 20,000 multi-style UV textures with precise UV-aligned layouts. By leveraging TexHub and our structured diffusion-to-GAN pipeline, AvatarTex establishes a new state-of-the-art in multi-style facial texture reconstruction. TexHub will be released upon publication to facilitate future research in this field.

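The three-stage control flow described in the abstract can be sketched as a simple composition. The stage functions below are trivial placeholders standing in for the actual diffusion and GAN models, so only the pipeline structure is illustrated, not AvatarTex's implementation.

```python
import numpy as np

# Placeholder stages -- each stands in for a heavyweight trained model.
def diffusion_inpaint(uv_texture, mask):
    """Stage 1 (sketch): fill masked/missing UV regions."""
    filled = uv_texture.copy()
    filled[mask] = uv_texture[~mask].mean()   # trivial stand-in
    return filled

def gan_latent_refine(uv_texture):
    """Stage 2 (sketch): project into a GAN latent space and decode,
    enforcing style and topology consistency."""
    return np.clip(uv_texture, 0.0, 1.0)      # trivial stand-in

def diffusion_repaint(uv_texture):
    """Stage 3 (sketch): diffusion-based repainting for fine detail."""
    return uv_texture                         # trivial stand-in

def avatartex_style_pipeline(uv_texture, missing_mask):
    x = diffusion_inpaint(uv_texture, missing_mask)  # complete missing regions
    x = gan_latent_refine(x)                         # style/structure consistency
    return diffusion_repaint(x)                      # enhance fine details

tex = np.random.default_rng(0).random((256, 256))
mask = np.zeros((256, 256), dtype=bool)
mask[:, 128:] = True                                 # pretend half the UV map is missing
out = avatartex_style_pipeline(tex, mask)
```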

Paper and Project Links

PDF 3DV 2026 Accepted

Summary

AvatarTex is a high-fidelity facial texture reconstruction framework that can generate both stylized and photorealistic textures from a single image. To address the limitations of existing methods, namely the lack of diverse multi-style datasets and the difficulty of maintaining geometric consistency in non-standard textures, AvatarTex introduces a novel three-stage diffusion-to-GAN pipeline. It combines the diverse texture generation of diffusion models with the well-structured latent space of GANs, which ensures style and topology consistency, achieving high-quality topology-aligned texture synthesis with both artistic and geometric coherence. Concretely, the pipeline first completes missing texture regions via diffusion-based inpainting, then refines style and structure consistency with GAN-based latent optimization, and finally enhances fine details through diffusion-based repainting. To meet the need for a stylized texture dataset, the authors introduce TexHub, a high-resolution collection of 20,000 multi-style UV textures with precise UV-aligned layouts. Leveraging TexHub and the structured diffusion-to-GAN pipeline, AvatarTex establishes a new state of the art in multi-style facial texture reconstruction. TexHub will be released upon publication to facilitate future research in this field.

Key Takeaways

  1. AvatarTex is a framework for high-fidelity facial texture reconstruction that generates stylized textures from a single image, addressing the limitations of existing methods through a distinctive three-stage diffusion-to-GAN pipeline.
  2. Diffusion models excel at generating diversified textures but lack explicit UV constraints, whereas GANs provide a well-structured latent space that ensures style and topology consistency; AvatarTex integrates the strengths of both.
  3. The three-stage pipeline consists of diffusion-based texture inpainting, GAN-based style and structure consistency optimization, and diffusion-based detail repainting, enabling high-quality, geometrically coherent textures.
  4. TexHub is a high-resolution multi-style UV texture dataset that addresses the lack of stylized texture data and supports research on facial texture reconstruction.
  5. Leveraging TexHub and the structured diffusion-to-GAN pipeline, AvatarTex establishes a new state of the art in multi-style facial texture reconstruction, which should help advance the field.

Cool Papers

Click here to view paper screenshots

Future of AI Models: A Computational perspective on Model collapse

Authors:Trivikram Satharasi, S Sitharama Iyengar

Artificial Intelligence, especially Large Language Models (LLMs), has transformed domains such as software engineering, journalism, creative writing, academia, and media (Naveed et al. 2025; arXiv:2307.06435). Diffusion models like Stable Diffusion generate high-quality images and videos from text. Evidence shows rapid expansion: 74.2% of newly published webpages now contain AI-generated material (Ryan Law 2025), 30-40% of the active web corpus is synthetic (Spennemann 2025; arXiv:2504.08755), 52% of U.S. adults use LLMs for writing, coding, or research (Staff 2025), and audits find AI involvement in 18% of financial complaints and 24% of press releases (Liang et al. 2025). The underlying neural architectures, including Transformers (Vaswani et al. 2023; arXiv:1706.03762), RNNs, LSTMs, GANs, and diffusion networks, depend on large, diverse, human-authored datasets (Shi & Iyengar 2019). As synthetic content dominates, recursive training risks eroding linguistic and semantic diversity, producing Model Collapse (Shumailov et al. 2024; arXiv:2307.15043; Dohmatob et al. 2024; arXiv:2402.07712). This study quantifies and forecasts collapse onset by examining year-wise semantic similarity in English-language Wikipedia (filtered Common Crawl) from 2013 to 2025 using Transformer embeddings and cosine similarity metrics. Results reveal a steady rise in similarity before public LLM adoption, likely driven by early RNN/LSTM translation and text-normalization pipelines, though modest due to a smaller scale. Observed fluctuations reflect irreducible linguistic diversity, variable corpus size across years, finite sampling error, and an exponential rise in similarity after the public adoption of LLM models. These findings provide a data-driven estimate of when recursive AI contamination may significantly threaten data richness and model generalization.

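The study's measurement, year-wise semantic similarity via Transformer embeddings and cosine similarity, can be outlined with an off-the-shelf sentence-embedding model. A sketch assuming the sentence-transformers package is installed and that `texts_by_year` maps each year to a sample of documents; the specific model is an assumption, not necessarily the one used in the paper.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def mean_pairwise_cosine(texts):
    """Mean pairwise cosine similarity over a year's document sample."""
    emb = model.encode(texts, normalize_embeddings=True)  # unit vectors
    sim = emb @ emb.T                                     # cosine matrix
    n = len(texts)
    # Average off-diagonal entries only (exclude self-similarity).
    return (sim.sum() - n) / (n * (n - 1))

texts_by_year = {
    2013: ["..."],  # fill with sampled Wikipedia / Common Crawl snippets
    2025: ["..."],
}
trend = {year: mean_pairwise_cosine(texts)
         for year, texts in texts_by_year.items() if len(texts) > 1}
```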

Paper and Project Links

PDF Submitted to Springer Nature. Code Available at https://github.com/t-satharasi/AI-Modal-Collapse-Code-for-Reproduction.git

Summary
Artificial intelligence, especially large language models (LLMs), has had a profound impact on software engineering, journalism, creative writing, academia, and media. Diffusion models such as Stable Diffusion generate high-quality images and videos. The share of AI-generated material on newly published webpages is rising rapidly, and LLMs are widely used for everyday writing, coding, and research. However, as synthetic content comes to dominate, recursive training risks eroding linguistic and semantic diversity and producing model collapse. This study forecasts the onset of collapse by examining year-wise semantic similarity in English-language Wikipedia, finding an exponential rise in similarity after the public adoption of LLMs.

Key Takeaways

  1. Large language models (LLMs) have had a profound impact across many domains.
  2. Diffusion models such as Stable Diffusion generate high-quality images and videos.
  3. The share of AI-generated material on newly published webpages is rising rapidly.
  4. LLMs are widely used for everyday writing, coding, and research.
  5. Recursive training can erode linguistic and semantic diversity, leading to model collapse.
  6. The study forecasts the onset of model collapse by examining year-wise semantic similarity in English-language Wikipedia.

Cool Papers

Click here to view paper screenshots

Pattern-Aware Diffusion Synthesis of fMRI/dMRI with Tissue and Microstructural Refinement

Authors:Xiongri Shen, Jiaqi Wang, Yi Zhong, Zhenxi Song, Leilei Zhao, Yichen Wei, Lingyan Liang, Shuqiang Wang, Baiying Lei, Demao Deng, Zhiguo Zhang

Magnetic resonance imaging (MRI), especially functional MRI (fMRI) and diffusion MRI (dMRI), is essential for studying neurodegenerative diseases. However, missing modalities pose a major barrier to their clinical use. Although GAN- and diffusion model-based approaches have shown some promise in modality completion, they remain limited in fMRI-dMRI synthesis due to (1) significant BOLD vs. diffusion-weighted signal differences between fMRI and dMRI along the time/gradient axis, and (2) inadequate integration of disease-related neuroanatomical patterns during generation. To address these challenges, we propose PDS, introducing two key innovations: (1) a pattern-aware dual-modal 3D diffusion framework for cross-modality learning, and (2) a tissue refinement network integrated with an efficient microstructure refinement to maintain structural fidelity and fine details. Evaluated on OASIS-3, ADNI, and in-house datasets, our method achieves state-of-the-art results, with PSNR/SSIM scores of 29.83 dB/90.84% for fMRI synthesis (+1.54 dB/+4.12% over baselines) and 30.00 dB/77.55% for dMRI synthesis (+1.02 dB/+2.2%). In clinical validation, the synthesized data show strong diagnostic performance, achieving 67.92%/66.02%/64.15% accuracy (NC vs. MCI vs. AD) in hybrid real-synthetic experiments. Code is available in the PDS GitHub Repository: https://github.com/SXR3015/PDS

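The PSNR/SSIM figures quoted above are standard reconstruction metrics. For reference, a generic scikit-image evaluation sketch is shown below; it is not the paper's evaluation code, and the `data_range` value depends on how the volumes are normalized.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_synthesis(real_vol, synth_vol):
    """PSNR/SSIM between a real and a synthesized volume (values in [0, 1])."""
    psnr = peak_signal_noise_ratio(real_vol, synth_vol, data_range=1.0)
    ssim = structural_similarity(real_vol, synth_vol, data_range=1.0)
    return psnr, ssim

# Toy check with a noisy copy standing in for a synthesized volume.
rng = np.random.default_rng(0)
real = rng.random((64, 64, 64))
synth = np.clip(real + 0.05 * rng.standard_normal(real.shape), 0.0, 1.0)
print(evaluate_synthesis(real, synth))
```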

Paper and Project Links

PDF

Summary

This paper highlights the importance of MRI, especially functional MRI (fMRI) and diffusion MRI (dMRI), for studying neurodegenerative diseases, where missing modalities are a major barrier to clinical use. Because existing GAN- and diffusion-based approaches remain limited for fMRI-dMRI synthesis, the authors propose the modality-completion method PDS with two key innovations: a pattern-aware dual-modal 3D diffusion framework for cross-modality learning, and a tissue refinement network combined with an efficient microstructure refinement to maintain structural fidelity and fine detail. Evaluated on multiple datasets, the method achieves state-of-the-art results, and the synthesized data show strong diagnostic performance in clinical validation.

Key Takeaways

  1. MRI, especially fMRI and dMRI, is essential for studying neurodegenerative diseases.
  2. Missing modalities are a major barrier to the clinical use of MRI.
  3. GAN- and diffusion-model-based approaches show promise for modality completion but still struggle with fMRI-dMRI synthesis.
  4. PDS introduces two key innovations: a pattern-aware dual-modal 3D diffusion framework for cross-modality learning and a tissue refinement network.
  5. PDS achieves state-of-the-art results on multiple datasets for both fMRI and dMRI synthesis.
  6. The synthesized data show strong diagnostic performance in clinical validation.

Cool Papers

Click here to view paper screenshots


Post author: Kedreamix
Copyright notice: Unless otherwise stated, all posts on this blog are licensed under CC BY 4.0. Please credit Kedreamix when reposting!