Published: 2025-10-19

Updated: 2025-11-27

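⚠️ All summaries below were generated by a large language model. They may contain errors; treat them as reference only and use them with caution.
🔴 Note: do not use this for serious academic purposes. It is suitable only for initial screening before reading a paper.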

2025-10-19 Update

LaMoGen: Laban Movement-Guided Diffusion for Text-to-Motion Generation

Authors: Heechang Kim, Gwanghyun Kim, Se Young Chun

Diverse human motion generation is an increasingly important task, having various applications in computer vision, human-computer interaction and animation. While text-to-motion synthesis using diffusion models has shown success in generating high-quality motions, achieving fine-grained expressive motion control remains a significant challenge. This is due to the lack of motion style diversity in datasets and the difficulty of expressing quantitative characteristics in natural language. Laban movement analysis has been widely used by dance experts to express the details of motion including motion quality as consistent as possible. Inspired by that, this work aims for interpretable and expressive control of human motion generation by seamlessly integrating the quantification methods of Laban Effort and Shape components into the text-guided motion generation models. Our proposed zero-shot, inference-time optimization method guides the motion generation model to have desired Laban Effort and Shape components without any additional motion data by updating the text embedding of pretrained diffusion models during the sampling step. We demonstrate that our approach yields diverse expressive motion qualities while preserving motion identity by successfully manipulating motion attributes according to target Laban tags.

Paper and project links

PDF

Summary

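This paper addresses the challenge of fine-grained motion control in text-to-motion generation and shows how to integrate Laban movement analysis into text-guided motion generation models. Its zero-shot, inference-time optimization method provides interpretable and expressive control over human motion generation. By updating the text embedding of a pretrained diffusion model during the sampling step, the method manipulates motion attributes according to target Laban tags while preserving motion identity and producing diverse expressive motion qualities.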
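
To make the idea of inference-time embedding optimization more concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation. Everything in it is an assumption made for illustration: `predict_clean_motion` is a hypothetical stand-in for a pretrained denoiser, `effort_proxy` (mean squared frame-to-frame displacement) is a crude substitute for the paper's Laban Effort and Shape quantification, and finite differences replace backpropagation. The only thing it mirrors from the abstract is the loop shape: at each sampling step the embedding is updated toward a target score, and no model weights or extra motion data are involved.

```python
import math
import random

T = 32  # frames in the toy motion (hypothetical)


def predict_clean_motion(embedding):
    """Hypothetical stand-in for a denoiser's clean-motion estimate.

    Maps a 2-D "text embedding" to a 1-D joint trajectory built from two sinusoids.
    """
    a, b = embedding
    return [
        a * math.sin(2.0 * math.pi * t / T) + b * math.sin(4.0 * math.pi * t / T)
        for t in range(T)
    ]


def effort_proxy(motion):
    """Mean squared frame-to-frame displacement: an assumed, crude Effort-like score."""
    steps = [(nxt - cur) ** 2 for cur, nxt in zip(motion, motion[1:])]
    return sum(steps) / len(steps)


def loss(embedding, target):
    """Squared relative error between the motion's score and the target score."""
    return (effort_proxy(predict_clean_motion(embedding)) / target - 1.0) ** 2


def numerical_grad(embedding, target, eps=1e-5):
    """Central finite differences, standing in for backpropagation."""
    grad = []
    for i in range(len(embedding)):
        hi = list(embedding)
        lo = list(embedding)
        hi[i] += eps
        lo[i] -= eps
        grad.append((loss(hi, target) - loss(lo, target)) / (2.0 * eps))
    return grad


def guided_sampling(embedding, target, steps=200, lr=0.1, blend=0.2, seed=0):
    """Toy sampling loop that nudges the embedding toward the target at every step."""
    rng = random.Random(seed)
    embedding = list(embedding)
    motion = [rng.gauss(0.0, 1.0) for _ in range(T)]  # start from noise
    for _ in range(steps):
        # 1) Gradient step on the embedding only; no model weights change.
        grad = numerical_grad(embedding, target)
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
        # 2) Simplified "denoising": move the sample toward the clean estimate.
        clean = predict_clean_motion(embedding)
        motion = [m + blend * (c - m) for m, c in zip(motion, clean)]
    return embedding, motion


if __name__ == "__main__":
    start = [1.0, 0.5]
    target = 4.0 * effort_proxy(predict_clean_motion(start))  # ask for 4x the score
    final_embedding, motion = guided_sampling(start, target)
    print("target score:", target)
    print("score of generated motion:", effort_proxy(motion))
```

In a real system the gradient would presumably flow through the diffusion model's clean-motion prediction via automatic differentiation, and the score would be a differentiable Laban descriptor computed on full-body joint trajectories. The toy keeps only the control flow.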

Key Takeaways