Published: 2025-11-16

Updated: 2025-11-27

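⚠️ All summaries below are generated by a large language model and may contain errors; treat them as reference only and use them with caution.
🔴 Note: do not use these summaries in serious academic settings; they are only suitable for preliminary screening before reading a paper.
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it free on HuggingFace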

2025-11-16 update

DiffSwap++: 3D Latent-Controlled Diffusion for Identity-Preserving Face Swapping

Authors: Weston Bondurant, Arkaprava Sinha, Hieu Le, Srijan Das, Stephanie Schuckers

Diffusion-based approaches have recently achieved strong results in face swapping, offering improved visual quality over traditional GAN-based methods. However, even state-of-the-art models often suffer from fine-grained artifacts and poor identity preservation, particularly under challenging poses and expressions. A key limitation of existing approaches is their failure to meaningfully leverage 3D facial structure, which is crucial for disentangling identity from pose and expression. In this work, we propose DiffSwap++, a novel diffusion-based face-swapping pipeline that incorporates 3D facial latent features during training. By guiding the generation process with 3D-aware representations, our method enhances geometric consistency and improves the disentanglement of facial identity from appearance attributes. We further design a diffusion architecture that conditions the denoising process on both identity embeddings and facial landmarks, enabling high-fidelity and identity-preserving face swaps. Extensive experiments on CelebA, FFHQ, and CelebV-Text demonstrate that DiffSwap++ outperforms prior methods in preserving source identity while maintaining target pose and expression. Additionally, we introduce a biometric-style evaluation and conduct a user study to further validate the realism and effectiveness of our approach. Code will be made publicly available at https://github.com/WestonBond/DiffSwapPP

Paper and project links

PDF

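Summary: Diffusion-based methods have recently achieved strong results in face swapping, with better visual quality than traditional GAN-based methods. Existing approaches nevertheless still suffer from fine-grained artifacts and weak identity preservation, especially under challenging poses and expressions. This paper proposes DiffSwap++, a diffusion-based face-swapping pipeline that incorporates 3D facial latent features during training. Guiding generation with 3D-aware representations improves geometric consistency and the disentanglement of facial identity from appearance attributes. Experiments on CelebA, FFHQ, and CelebV-Text show that DiffSwap++ outperforms prior methods at preserving the source identity while maintaining the target pose and expression. The authors also introduce a biometric-style evaluation and a user study to validate realism and effectiveness.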
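Identity preservation in face swapping is commonly quantified by comparing face-recognition embeddings of the source face and the swapped result. The sketch below illustrates that general idea only; it is not the paper's evaluation protocol. It assumes the embeddings have already been extracted as plain lists of floats, and the helper names `cosine_similarity` and `is_same_identity`, as well as the 0.5 threshold, are hypothetical choices made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_identity(
    source_emb: Sequence[float],
    swapped_emb: Sequence[float],
    threshold: float = 0.5,  # illustrative assumption, not a value from the paper
) -> bool:
    """Treat the swap as identity-preserving if similarity reaches the threshold."""
    return cosine_similarity(source_emb, swapped_emb) >= threshold
```

In practice the embeddings would come from a pretrained face-recognition model, and the threshold would be calibrated on held-out genuine and impostor pairs rather than fixed by hand.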

Key Takeaways: