Published: 2025-07-03

Updated: 2025-07-09

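⚠️ All summaries below were generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use them in serious academic settings; they are suitable only for initial screening before reading a paper.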

2025-07-03 Update

How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects

Authors: Wonkwang Lee, Jongwon Jeong, Taehong Moon, Hyeon-Jong Kim, Jaehyeon Kim, Gunhee Kim, Byeong-Uk Lee

Motion synthesis for diverse object categories holds great potential for 3D content creation but remains underexplored due to two key challenges: (1) the lack of comprehensive motion datasets that include a wide range of high-quality motions and annotations, and (2) the absence of methods capable of handling heterogeneous skeletal templates from diverse objects. To address these challenges, we contribute the following: First, we augment the Truebones Zoo dataset, a high-quality animal motion dataset covering over 70 species, by annotating it with detailed text descriptions, making it suitable for text-based motion synthesis. Second, we introduce rig augmentation techniques that generate diverse motion data while preserving consistent dynamics, enabling models to adapt to various skeletal configurations. Finally, we redesign existing motion diffusion models to dynamically adapt to arbitrary skeletal templates, enabling motion synthesis for a diverse range of objects with varying structures. Experiments show that our method learns to generate high-fidelity motions from textual descriptions for diverse and even unseen objects, setting a strong foundation for motion synthesis across diverse object categories and skeletal templates. Qualitative results are available at: https://t2m4lvo.github.io.

Paper and project links

PDF Accepted to ICML 2025

Summary

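This paper presents research on motion synthesis for diverse object categories. By augmenting the Truebones Zoo dataset and introducing rig augmentation techniques, the work addresses two gaps: the lack of comprehensive motion datasets and the lack of methods that can handle the skeletal templates of different objects. It also redesigns existing motion diffusion models so that they adapt to arbitrary skeletal templates, enabling motion synthesis for objects with varying structures. Experiments show that the method learns to generate high-quality motions from text descriptions, laying a solid foundation for motion synthesis across object categories and skeletal templates.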
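To make the idea of "heterogeneous skeletal templates" more concrete, the sketch below shows one generic way to batch motions whose skeletons have different joint counts: pad to a shared joint count and carry a mask so that padded joints are ignored. This is not the paper's architecture, which is not described here. The function names, the `(frames, joints, features)` layout, and the assumption that all clips have the same number of frames are illustrative choices only.

```python
import numpy as np

def pad_skeleton_batch(motions):
    """Pad motions of shape (frames, joints_i, feat) to a shared joint count.

    Hypothetical helper: assumes every clip has the same frame count and
    feature size. Returns (batch, mask) where batch has shape
    (B, frames, max_joints, feat) and mask has shape (B, max_joints),
    True for real joints and False for padding.
    """
    frames, feat = motions[0].shape[0], motions[0].shape[2]
    max_joints = max(m.shape[1] for m in motions)
    batch = np.zeros((len(motions), frames, max_joints, feat), dtype=np.float32)
    mask = np.zeros((len(motions), max_joints), dtype=bool)
    for i, m in enumerate(motions):
        j = m.shape[1]
        batch[i, :, :j] = m   # copy real joints; the rest stays zero
        mask[i, :j] = True
    return batch, mask

def masked_joint_mean(batch, mask):
    """Average features over real joints only, ignoring padded joints."""
    weights = mask[:, None, :, None].astype(batch.dtype)  # (B, 1, J, 1)
    total = (batch * weights).sum(axis=2)                  # (B, frames, feat)
    count = weights.sum(axis=2)                            # (B, 1, 1)
    return total / count

# Example: a 3-joint and a 5-joint skeleton, 4 frames, 6 features per joint.
small = np.ones((4, 3, 6), dtype=np.float32)
large = 2 * np.ones((4, 5, 6), dtype=np.float32)
batch, mask = pad_skeleton_batch([small, large])
pooled = masked_joint_mean(batch, mask)
```

The same mask could be passed to an attention layer so that padded joints never contribute, which is a common pattern for variable-length inputs. Whether the authors do this is not stated in the abstract.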

Key Takeaways