GAN

Published: 2025-03-28

Updated: 2025-05-14

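⚠️ All summaries below were generated by a large language model and may contain errors. They are for reference only; use them with caution.
🔴 Note: do not rely on them in serious academic settings. They are intended only as a first-pass screen before reading the papers.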

2025-03-28 update

RecTable: Fast Modeling Tabular Data with Rectified Flow

Authors:Masane Fuchi, Tomohiro Takagi

Score-based or diffusion models generate high-quality tabular data, surpassing GAN-based and VAE-based models. However, these methods require substantial training time. In this paper, we introduce RecTable, which uses rectified flow modeling, a technique applied in areas such as text-to-image and text-to-video generation. RecTable features a simple architecture consisting of a few stacked gated linear unit blocks. Our training strategies are also simple, incorporating a mixed-type noise distribution and a logit-normal timestep distribution. Our experiments demonstrate that RecTable achieves competitive performance compared with several state-of-the-art diffusion and score-based models while reducing the required training time. Our code is available at https://github.com/fmp453/rectable.

Paper and project links

PDF 19 pages, 7 figures, 10 tables

Summary

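This paper introduces RecTable, a model that uses rectified flow modeling to generate tabular data. Score-based and diffusion models already generate higher-quality tabular data than GAN-based and VAE-based models, but they are slow to train; RecTable targets that cost with a simple architecture and simple training strategies. Experiments show that RecTable is competitive with state-of-the-art diffusion and score-based models while requiring less training time.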
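
To make the "gated linear unit" building block concrete, here is a minimal sketch of a single GLU layer in plain Python. It only illustrates the general GLU idea (a linear projection multiplied elementwise by a sigmoid gate); the exact block layout used in RecTable is not given in the abstract, and the names `linear`, `glu`, and all weights below are hypothetical.

```python
import math


def linear(x, weight, bias):
    """Affine map: one output per row of `weight`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def glu(x, w_value, b_value, w_gate, b_gate):
    """Gated linear unit: value projection scaled by a sigmoid gate.

    Assumption: this is the generic GLU form, not necessarily the
    exact variant used in the paper.
    """
    value = linear(x, w_value, b_value)
    gate = linear(x, w_gate, b_gate)
    return [v * sigmoid(g) for v, g in zip(value, gate)]


# Toy example: identity value projection, zero gate weights.
# sigmoid(0) = 0.5, so every value is halved.
out = glu(
    [1.0, 2.0],
    w_value=[[1.0, 0.0], [0.0, 1.0]], b_value=[0.0, 0.0],
    w_gate=[[0.0, 0.0], [0.0, 0.0]], b_gate=[0.0, 0.0],
)
print(out)  # [0.5, 1.0]
```

In a real model these layers would be stacked and implemented with a tensor library; the list-based version is only meant to show the gating arithmetic.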

Key Takeaways

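RecTable uses rectified flow modeling to generate tabular data.
The architecture is simple: a few stacked gated linear unit blocks.
The training strategy is also simple, combining a mixed-type noise distribution with a logit-normal timestep distribution.
Experiments show performance competitive with several state-of-the-art diffusion and score-based models.
RecTable reduces the required training time.
The code is publicly available on GitHub.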
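
The two training ingredients named above can be sketched in a few lines of plain Python: a logit-normal timestep (a Gaussian sample passed through a sigmoid) and a rectified flow training pair (a point on the straight line between noise and data, plus the constant velocity along that line). This is a hedged illustration for numerical columns only. The time convention (t=0 is noise, t=1 is data), the function names, and the toy row are assumptions, and the paper's mixed-type noise distribution is not reproduced here.

```python
import math
import random


def sample_logit_normal(rng, mean=0.0, std=1.0):
    """Draw t in (0, 1) by applying a sigmoid to a Gaussian sample."""
    z = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-z))


def rectified_flow_pair(x_data, x_noise, t):
    """Return (x_t, velocity) for the straight path from noise to data.

    Assumed convention: t=0 is pure noise, t=1 is the data row.
    """
    x_t = [(1.0 - t) * n + t * d for n, d in zip(x_noise, x_data)]
    velocity = [d - n for n, d in zip(x_noise, x_data)]
    return x_t, velocity


rng = random.Random(0)
x_data = [0.5, -1.2, 3.0]                        # hypothetical numerical row
x_noise = [rng.gauss(0.0, 1.0) for _ in x_data]  # Gaussian noise sample
t = sample_logit_normal(rng)
x_t, v = rectified_flow_pair(x_data, x_noise, t)

# Following the velocity for the remaining time (1 - t) lands on the data row.
recovered = [xt + (1.0 - t) * vi for xt, vi in zip(x_t, v)]
```

A network trained under this setup would take `x_t` and `t` as input and regress `v`; the logit-normal draw concentrates training on intermediate timesteps rather than sampling them uniformly.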

AvatarArtist: Open-Domain 4D Avatarization

Authors:Hongyu Liu, Xuan Wang, Ziyu Wan, Yue Ma, Jingye Chen, Yanbo Fan, Yujun Shen, Yibing Song, Qifeng Chen

This work focuses on open-domain 4D avatarization, with the purpose of creating a 4D avatar from a portrait image in an arbitrary style. We select parametric triplanes as the intermediate 4D representation and propose a practical training paradigm that takes advantage of both generative adversarial networks (GANs) and diffusion models. Our design stems from the observation that 4D GANs excel at bridging images and triplanes without supervision yet usually face challenges in handling diverse data distributions. A robust 2D diffusion prior emerges as the solution, assisting the GAN in transferring its expertise across various domains. The synergy between these experts permits the construction of a multi-domain image-triplane dataset, which drives the development of a general 4D avatar creator. Extensive experiments suggest that our model, AvatarArtist, is capable of producing high-quality 4D avatars with strong robustness to various source image domains. The code, the data, and the models will be made publicly available to facilitate future studies.

Paper and project links

PDF Accepted to CVPR 2025. Project page: https://kumapowerliu.github.io/AvatarArtist

Summary

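This work addresses open-domain 4D avatarization: creating a 4D avatar from a portrait image in an arbitrary style. It uses parametric triplanes as the intermediate 4D representation and proposes a practical training paradigm that combines generative adversarial networks (GANs) and diffusion models. The design rests on the observation that 4D GANs excel at bridging images and triplanes without supervision but usually struggle with diverse data distributions. A robust 2D diffusion prior addresses this by helping the GAN transfer its expertise across domains. The synergy between the two makes it possible to build a multi-domain image-triplane dataset, which in turn drives the development of a general 4D avatar creator. Experiments suggest that the resulting model, AvatarArtist, produces high-quality 4D avatars and is robust to a wide range of source image domains.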

Key Takeaways

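The work targets open-domain 4D avatarization: creating a 4D avatar from a portrait image in an arbitrary style.
Parametric triplanes serve as the intermediate 4D representation.
The training paradigm combines generative adversarial networks (GANs) and diffusion models.
4D GANs excel at bridging images and triplanes without supervision but struggle with diverse data distributions.
A robust 2D diffusion prior helps the GAN transfer its expertise across domains.
The resulting multi-domain image-triplane dataset drives the development of a general 4D avatar creator.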

In the Blink of an Eye: Instant Game Map Editing using a Generative-AI Smart Brush

Authors:Vitaly Gnatyuk, Valeriia Koriukina, Ilya Levoshevich, Pavel Nurminskiy, Guenter Wallner

With video games steadily increasing in complexity, automated generation of game content has found widespread interest. However, the task of 3D gaming map art creation remains underexplored to date due to its unique complexity and domain-specific challenges. While recent works have addressed related topics such as retro-style level generation and procedural terrain creation, these works primarily focus on simpler data distributions. To the best of our knowledge, we are the first to demonstrate the application of modern AI techniques for high-resolution texture manipulation in complex, highly detailed AAA 3D game environments. We introduce a novel Smart Brush for map editing, designed to assist artists in seamlessly modifying selected areas of a game map with minimal effort. By leveraging generative adversarial networks and diffusion models, we propose two variants of the brush that enable efficient and context-aware generation. Our hybrid workflow aims to enhance both artistic flexibility and production efficiency, enabling the refinement of environments without manually reworking every detail, thus helping to bridge the gap between automation and creative control in game development. A comparative evaluation of our two methods with adapted versions of several state-of-the-art models shows that our GAN-based brush produces the sharpest and most detailed outputs while preserving image context, whereas the evaluated state-of-the-art models tend towards blurrier results and have difficulty maintaining contextual consistency.

Paper and project links

PDF

Summary

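As video games grow more complex, automated generation of game content has attracted broad interest. The creation of 3D game map art, however, remains underexplored because of its unique complexity and domain-specific challenges. Existing work on retro-style level generation and procedural terrain creation focuses mainly on simpler data distributions. The authors state that, to the best of their knowledge, they are the first to apply modern AI techniques to high-resolution texture manipulation in complex, highly detailed AAA 3D game environments. They introduce the Smart Brush, a map-editing tool designed to help artists modify selected areas of a game map with minimal effort. Using generative adversarial networks and diffusion models, they propose two variants of the brush that enable efficient, context-aware generation. The hybrid workflow aims to improve both artistic flexibility and production efficiency by refining environments without manually reworking every detail, helping to bridge the gap between automation and creative control in game development.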

Key Takeaways