Published: 2025-06-14

Updated: 2025-07-06

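⚠️ All summaries below were generated by a large language model and may contain errors; treat them as reference only and use them with caution.
🔴 Note: do not rely on this for serious academic work; it is only suitable for a first-pass screening before reading the paper.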

2025-06-14 Update

CapST: Leveraging Capsule Networks and Temporal Attention for Accurate Model Attribution in Deep-fake Videos

Authors: Wasim Ahmad, Yan-Tsung Peng, Yuan-Hao Chang, Gaddisa Olani Ganfure, Sarwar Khan

Deep-fake videos, generated through AI face-swapping techniques, have gained significant attention due to their potential for impactful impersonation attacks. While most research focuses on real vs. fake detection, attributing a deep-fake to its specific generation model or encoder is vital for forensic analysis, enabling source tracing and tailored countermeasures. This enhances detection by leveraging model-specific artifacts and supports proactive defenses. We investigate the model attribution problem for deep-fake videos using two datasets: Deepfakes from Different Models (DFDM) and GANGen-Detection, both comprising deep-fake videos and GAN-generated images. We use only fake images from GANGen-Detection to align with DFDM’s focus on attribution rather than binary classification. We formulate the task as a multiclass classification problem and introduce a novel Capsule-Spatial-Temporal (CapST) model that integrates a truncated VGG19 network for feature extraction, capsule networks for hierarchical encoding, and a spatio-temporal attention mechanism. Video-level fusion captures temporal dependencies across frames. Experiments on DFDM and GANGen-Detection show CapST outperforms baseline models in attribution accuracy while reducing computational cost.
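To make the "capsule networks for hierarchical encoding" component more concrete, here is a minimal sketch of the squash nonlinearity from the original capsule-network formulation, which rescales a capsule's output vector so its length lies in [0, 1) while keeping its direction. Whether CapST uses exactly this form is an assumption; the abstract does not say, and the helper names `squash` and `vector_length` are illustrative only.

```python
import math


def squash(s, eps=1e-9):
    """Rescale a raw capsule vector so its length falls in [0, 1).

    Standard capsule squash: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    Assumption: this is the generic formulation, not necessarily CapST's.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # eps guards against division by zero for an all-zero vector.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def vector_length(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


if __name__ == "__main__":
    weak = squash([0.1, 0.0])      # short input stays short
    strong = squash([30.0, 40.0])  # long input approaches length 1
    print(vector_length(weak), vector_length(strong))
```

In capsule networks the length of the squashed vector is commonly read as the probability that the entity the capsule represents is present, which is why the output is bounded below 1.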

Summary

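This paper addresses model attribution for deep-fake videos, which are generated with AI face-swapping techniques and can enable impersonation attacks. The study uses two datasets: Deepfakes from Different Models (DFDM) and GANGen-Detection, from which only the fake images are used. It formulates attribution as a multiclass classification problem and proposes the Capsule-Spatial-Temporal (CapST) model, which combines a truncated VGG19 network for feature extraction, capsule networks for hierarchical encoding, and a spatio-temporal attention mechanism. Experiments on DFDM and GANGen-Detection show that CapST achieves higher attribution accuracy than baseline models at a lower computational cost.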
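The multiclass framing can be illustrated with a small sketch: each candidate generation model is one class, the classifier emits one logit per class, and training minimizes softmax cross-entropy against the true source model. The class names below are hypothetical placeholders, not the actual DFDM or GANGen-Detection labels, and the loss shown is the usual choice for multiclass classification rather than one confirmed by the source.

```python
import math

# Hypothetical label set: placeholders, not the real dataset class names.
CLASSES = ["model_a", "model_b", "model_c"]


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the true source model."""
    return -math.log(softmax(logits)[target_index])


def predict(logits):
    """Return the name of the most probable generation model."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best]


if __name__ == "__main__":
    logits = [0.1, 2.0, -1.0]  # made-up scores for one video
    print(predict(logits), cross_entropy(logits, 1))
```

Unlike real-vs-fake detection, which needs a single output, this setup needs one output per generation model, so the loss is lowest when the probability mass sits on the correct source.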

Key Takeaways

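Deep-fake videos are generated with AI face-swapping techniques and can enable impersonation attacks.
The work focuses on model attribution for deep-fake videos, which matters for forensic analysis and tailored countermeasures.
Two datasets are used to study model attribution: Deepfakes from Different Models (DFDM) and GANGen-Detection.
The paper proposes the Capsule-Spatial-Temporal (CapST) model, which integrates a truncated VGG19 feature extractor, capsule networks for hierarchical encoding, and a spatio-temporal attention mechanism.
CapST uses video-level fusion to capture temporal dependencies across frames.
Experiments show that CapST outperforms baseline models in attribution accuracy.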
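The video-level fusion step mentioned above can be sketched as attention-weighted pooling: each frame gets a relevance score, the scores are normalized with a softmax, and the per-frame feature vectors are averaged with those weights. This is a generic illustration under stated assumptions, not the paper's exact mechanism; in a trained model the scores would come from a learned layer, whereas here they are passed in directly, and `attention_fuse` is a hypothetical name.

```python
import math


def softmax(scores):
    """Normalize frame scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(frame_features, frame_scores):
    """Fuse per-frame feature vectors into one video-level vector.

    frame_features: list of equal-length float lists, one per frame.
    frame_scores:   one relevance score per frame (assumed given here).
    Returns (fused_vector, attention_weights).
    """
    if len(frame_features) != len(frame_scores):
        raise ValueError("need exactly one score per frame")
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    fused = [0.0] * dim
    for w, feat in zip(weights, frame_features):
        for i in range(dim):
            fused[i] += w * feat[i]
    return fused, weights


if __name__ == "__main__":
    features = [[1.0, 0.0], [3.0, 2.0]]  # two frames, 2-D features
    fused, weights = attention_fuse(features, [0.0, 0.0])
    print(fused, weights)  # equal scores reduce to a plain average
```

With equal scores the fusion reduces to mean pooling; as one frame's score grows, the fused vector moves toward that frame's features, which is how attention lets informative frames dominate the video-level representation.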

Kedreamix

https://kedreamix.github.io/Talk2Paper/Paper/2025-06-14/Face%20Swapping/

Unless otherwise stated, all posts on this blog are licensed under CC BY 4.0. Please credit Kedreamix when reposting.

Face Swapping
