Published: 2025-02-19

Updated: 2025-05-14

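⚠️ All summaries below were generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use these summaries in serious academic settings; they are intended only for initial screening before reading a paper!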

Updated 2025-02-19

GaussianMotion: End-to-End Learning of Animatable Gaussian Avatars with Pose Guidance from Text

Authors: Gyumin Shim, Sangmin Lee, Jaegul Choo

In this paper, we introduce GaussianMotion, a novel human rendering model that generates fully animatable scenes aligned with textual descriptions using Gaussian Splatting. Although existing methods achieve reasonable text-to-3D generation of human bodies using various 3D representations, they often face limitations in fidelity and efficiency, or primarily focus on static models with limited pose control. In contrast, our method generates fully animatable 3D avatars by combining deformable 3D Gaussian Splatting with text-to-3D score distillation, achieving high fidelity and efficient rendering for arbitrary poses. By densely generating diverse random poses during optimization, our deformable 3D human model learns to capture a wide range of natural motions distilled from a pose-conditioned diffusion model in an end-to-end manner. Furthermore, we propose Adaptive Score Distillation that effectively balances realistic detail and smoothness to achieve optimal 3D results. Experimental results demonstrate that our approach outperforms existing baselines by producing high-quality textures in both static and animated results, and by generating diverse 3D human models from various textual inputs.

Paper and project links

PDF 8 pages

Summary
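Generating animatable human rendering models from text descriptions. The paper proposes GaussianMotion, which uses Gaussian Splatting to generate animatable 3D human scenes aligned with text descriptions. The method combines deformable 3D Gaussian Splatting with text-to-3D score distillation to achieve high-fidelity, efficient rendering and to produce fully animatable 3D avatars in arbitrary poses. By densely generating random poses during optimization, the model learns a wide range of natural motions distilled end-to-end from a pose-conditioned diffusion model. The paper also proposes Adaptive Score Distillation, which balances realistic detail against smoothness to achieve optimal 3D results. Experiments show that the method produces high-quality textures in both static and animated results and generates diverse 3D human models from varied text inputs.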

Key Takeaways

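The paper proposes GaussianMotion, a new method for generating animatable 3D human scenes from text descriptions.
It uses Gaussian Splatting to achieve high-fidelity and efficient rendering.
It combines deformable 3D Gaussian Splatting with text-to-3D score distillation to generate fully animatable 3D avatars in arbitrary poses.
By densely generating random poses, the model captures a wide range of natural motions distilled from a pose-conditioned diffusion model.
The proposed Adaptive Score Distillation balances realistic detail against smoothness.
Experiments show that the method produces high-quality textures in both static and animated results.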
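
For readers unfamiliar with score distillation, the following is the standard Score Distillation Sampling (SDS) gradient from the text-to-3D literature, shown only as background. It is not the paper's Adaptive Score Distillation, whose exact form the abstract does not give. The symbols are assumptions introduced for illustration: $\theta$ are the parameters of the 3D representation (here, the deformable Gaussians), $g$ is the differentiable renderer, $y$ is the text prompt, $p$ is a sampled pose condition, $\hat{\epsilon}_\phi$ is the frozen diffusion model's noise prediction, and $w(t)$ is a timestep weighting.

```latex
% Rendered image and its noised version at diffusion timestep t
x = g(\theta, p), \qquad
x_t = \alpha_t\, x + \sigma_t\, \epsilon, \qquad
\epsilon \sim \mathcal{N}(0, I)

% Standard SDS gradient, with the pose p added as an extra condition
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\,\epsilon,\,p}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y, p, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]
```

Sampling a new random pose $p$ at each optimization step is one plausible reading of "densely generating diverse random poses during optimization"; the paper itself should be consulted for the actual objective.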

Resource Allocation and Pricing for Blockchain-enabled Metaverse: A Stackelberg Game Approach

Authors: Zhanpeng Zhu, Feilong Lin, Changbing Tang, Zhongyu Chen

As the next-generation Internet paradigm, the metaverse can provide users with immersive physical-virtual experiences without spatial limitations. However, there are various concerns to be overcome, such as resource allocation, resource pricing, and transaction security issues. To address the above challenges, we integrate blockchain technology into the metaverse to manage and automate complex interactions effectively and securely utilizing the advantages of blockchain. With the objective of promoting the Quality of Experience (QoE), Metaverse Service Users (MSUs) purchase rendering and bandwidth resources from the Metaverse Service Provider (MSP) to access low-latency and high-quality immersive services. The MSP maximizes the profit by controlling the unit prices of resources. In this paper, we model the interaction between the MSP and MSUs as a Stackelberg game, in which the MSP acts as the leader and MSUs are followers. The existence of Stackelberg equilibrium is analyzed and proved mathematically. Besides, we propose an efficient greedy-and-search-based resource allocation and pricing algorithm (GSRAP) to solve the Stackelberg equilibrium (SE) point. Finally, we conduct extensive simulations to verify the effectiveness and efficiency of our designs. The experiment results show that our algorithm outperforms the baseline scheme in terms of improving the MSP’s profit and convergence speed.

Paper and project links

PDF 8 pages

Summary

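As the next-generation Internet paradigm, the metaverse can offer users immersive physical-virtual experiences without spatial limits. To address resource allocation, resource pricing, and transaction security, the paper integrates blockchain technology to manage and automate complex interactions in the metaverse, with the aim of improving users' Quality of Experience (QoE). It models the interaction between the Metaverse Service Provider and its users as a Stackelberg game and proposes a greedy-and-search-based resource allocation and pricing algorithm. In simulations, the algorithm outperforms the baseline scheme in provider profit and convergence speed.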

Key Takeaways

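As the next-generation Internet paradigm, the metaverse offers immersive physical-virtual experiences without spatial limits.
It faces challenges in resource allocation, resource pricing, and transaction security.
Blockchain technology is integrated into the metaverse to manage and automate complex interactions effectively.
The interaction between the Metaverse Service Provider and its users is modeled as a Stackelberg game.
The paper proposes a greedy-and-search-based resource allocation and pricing algorithm (GSRAP).
Simulations show that the proposed algorithm outperforms the baseline in provider profit and convergence speed.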
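
To make the leader-follower structure concrete, here is a minimal toy sketch of a Stackelberg pricing game solved by backward induction. It is not GSRAP and does not use the paper's utility model: the logarithmic follower utility, the single resource, the `valuations` and `unit_cost` values, and all function names are assumptions chosen only for illustration.

```python
def follower_demand(price, valuation):
    """Follower best response for the assumed utility
    valuation * ln(1 + x) - price * x, maximized over x >= 0.
    Setting the derivative to zero gives x = valuation / price - 1."""
    return max(0.0, valuation / price - 1.0)


def leader_profit(price, valuations, unit_cost):
    """Leader profit when every follower plays its best response."""
    total_demand = sum(follower_demand(price, a) for a in valuations)
    return (price - unit_cost) * total_demand


def solve_stackelberg(valuations, unit_cost, step=0.01, n_steps=1000):
    """Backward induction: the leader anticipates the followers' best
    responses and picks the profit-maximizing price on a simple grid."""
    candidates = [unit_cost + step * k for k in range(1, n_steps)]
    best_price = max(
        candidates, key=lambda p: leader_profit(p, valuations, unit_cost)
    )
    demands = [follower_demand(best_price, a) for a in valuations]
    return best_price, demands


# Hypothetical instance: two users with different valuations.
price, demands = solve_stackelberg([4.0, 9.0], unit_cost=1.0)
print(f"leader price: {price:.2f}")
print("follower demands:", [round(d, 3) for d in demands])
```

In this toy instance the leader's profit while both users buy is 15 - 2p - 13/p, which peaks at p = sqrt(6.5), so the grid search should land within one grid step of that value. A real solver for the paper's setting would need its actual utility functions and resource constraints.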

Na’vi or Knave: Jailbreaking Language Models via Metaphorical Avatars

Authors: Yu Yan, Sheng Sun, Junqi Tong, Min Liu, Qi Li

Metaphor serves as an implicit approach to convey information, while enabling the generalized comprehension of complex subjects. However, metaphor can potentially be exploited to bypass the safety alignment mechanisms of Large Language Models (LLMs), leading to the theft of harmful knowledge. In our study, we introduce a novel attack framework that exploits the imaginative capacity of LLMs to achieve jailbreaking, the JAilbreak Via Adversarial MeTA-phoR (AVATAR). Specifically, to elicit the harmful response, AVATAR extracts harmful entities from a given harmful target and maps them to innocuous adversarial entities based on LLM’s imagination. Then, according to these metaphors, the harmful target is nested within human-like interaction for jailbreaking adaptively. Experimental results demonstrate that AVATAR can effectively and transferably jailbreak LLMs and achieve a state-of-the-art attack success rate across multiple advanced LLMs. Our study exposes a security risk in LLMs from their endogenous imaginative capabilities. Furthermore, the analytical study reveals the vulnerability of LLM to adversarial metaphors and the necessity of developing defense methods against jailbreaking caused by the adversarial metaphor. Warning: This paper contains potentially harmful content from LLMs.

Paper and project links

PDF We still need to polish our paper

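Summary: This study shows that metaphor can be used to bypass the safety alignment mechanisms of large language models. It proposes a new attack framework, AVATAR, which maps harmful entities to innocuous adversarial entities and uses metaphor to embed the attack in the conversation, extracting harmful information from the model. The study finds that the inherent imaginative capacity of large language models creates a security risk and that defenses are needed. Experiments show that the framework is effective and transferable across multiple advanced large language models.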

Key Takeaways: