Published: 2025-06-27

Updated: 2025-07-06

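⚠️ All summaries below were generated by a large language model. They may contain errors; treat them as a reference only and use them with caution.
🔴 Note: do not use these summaries in serious academic work. They are only suitable for initial screening before reading a paper.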

Updated 2025-06-27

TCDiff++: An End-to-end Trajectory-Controllable Diffusion Model for Harmonious Music-Driven Group Choreography

Authors: Yuqin Dai, Wanlu Zhu, Ronghui Li, Xiu Li, Zhenyu Zhang, Jun Li, Jian Yang

Music-driven dance generation has garnered significant attention due to its wide range of industrial applications, particularly in the creation of group choreography. During the group dance generation process, however, most existing methods still face three primary issues: multi-dancer collisions, single-dancer foot sliding and abrupt swapping in the generation of long group dance. In this paper, we propose TCDiff++, a music-driven end-to-end framework designed to generate harmonious group dance. Specifically, to mitigate multi-dancer collisions, we utilize a dancer positioning embedding to better maintain the relative positioning among dancers. Additionally, we incorporate a distance-consistency loss to ensure that inter-dancer distances remain within plausible ranges. To address the issue of single-dancer foot sliding, we introduce a swap mode embedding to indicate dancer swapping patterns and design a Footwork Adaptor to refine raw motion, thereby minimizing foot sliding. For long group dance generation, we present a long group diffusion sampling strategy that reduces abrupt position shifts by injecting positional information into the noisy input. Furthermore, we integrate a Sequence Decoder layer to enhance the model’s ability to selectively process long sequences. Extensive experiments demonstrate that our TCDiff++ achieves state-of-the-art performance, particularly in long-duration scenarios, ensuring high-quality and coherent group dance generation.

Paper and project links

PDF

Summary

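This paper proposes TCDiff++, a music-driven, end-to-end framework for group dance generation. A dancer positioning embedding and a distance-consistency loss reduce collisions between dancers, while a swap mode embedding and a Footwork Adaptor address single-dancer foot sliding. For long dance generation, a long group diffusion sampling strategy and a Sequence Decoder layer reduce abrupt position shifts and strengthen the model's ability to handle long sequences, yielding high-quality, coherent group dance.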
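To make the idea of a distance-consistency loss more concrete, here is a minimal sketch of one way such a penalty could be computed. The abstract only states that inter-dancer distances should stay within plausible ranges; it does not give the formula. The hinge form, the function name `distance_consistency_penalty`, and the thresholds `d_min` and `d_max` are all assumptions made for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def distance_consistency_penalty(frames, d_min=0.5, d_max=4.0):
    """Mean hinge penalty on pairwise dancer distances (illustrative only).

    frames: list of frames; each frame is a list of (x, z) root positions,
            one per dancer.
    d_min, d_max: hypothetical bounds on a plausible inter-dancer distance.
    """
    total = 0.0
    count = 0
    for frame in frames:
        for (xa, za), (xb, zb) in combinations(frame, 2):
            d = math.hypot(xa - xb, za - zb)
            # Only the part of the distance outside [d_min, d_max] is penalized.
            total += max(0.0, d_min - d) + max(0.0, d - d_max)
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    # Two dancers one unit apart: inside the assumed range, so no penalty.
    print(distance_consistency_penalty([[(0.0, 0.0), (1.0, 0.0)]]))
    # Two dancers too close together: penalized by the shortfall below d_min.
    print(distance_consistency_penalty([[(0.0, 0.0), (0.25, 0.0)]]))
```

In a training setup, a term like this would typically be computed on predicted positions with a differentiable tensor library and added to the main diffusion objective with a weight; that weighting is likewise not specified in the summary.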

Key Takeaways