

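⚠️ All summaries below are generated by a large language model and may contain errors. They are for reference only; use them with caution.
🔴 Note: do not rely on them in serious academic settings. They are intended only as a first-pass screen before reading a paper!
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Free trial on HuggingFace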

Updated 2025-02-13

PrismAvatar: Real-time animated 3D neural head avatars on edge devices

Authors: Prashant Raina, Felix Taubner, Mathieu Tuli, Eu Wern Teh, Kevin Ferreira

We present PrismAvatar: a 3D head avatar model which is designed specifically to enable real-time animation and rendering on resource-constrained edge devices, while still enjoying the benefits of neural volumetric rendering at training time. By integrating a rigged prism lattice with a 3D morphable head model, we use a hybrid rendering model to simultaneously reconstruct a mesh-based head and a deformable NeRF model for regions not represented by the 3DMM. We then distill the deformable NeRF into a rigged mesh and neural textures, which can be animated and rendered efficiently within the constraints of the traditional triangle rendering pipeline. In addition to running at 60 fps with low memory usage on mobile devices, we find that our trained models have comparable quality to state-of-the-art 3D avatar models on desktop devices.



PDF 8 pages, 5 figures



Key Takeaways

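  1. PrismAvatar is a 3D head avatar model designed for real-time animation and rendering on resource-constrained edge devices.
  2. It integrates a rigged prism lattice with a 3D morphable head model (3DMM), so regions the 3DMM does not represent can still be modeled.
  3. A hybrid rendering model simultaneously reconstructs a mesh-based head and a deformable NeRF for the regions not covered by the 3DMM.
  4. The deformable NeRF is distilled into a rigged mesh and neural textures, which can be animated and rendered efficiently within the traditional triangle rendering pipeline.
  5. The model runs at 60 fps with low memory usage on mobile devices.
  6. The trained models are comparable in quality to state-of-the-art 3D avatar models on desktop devices.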
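For context on the training-time cost that the distillation step avoids at runtime, here is a minimal sketch of the standard NeRF volume-rendering quadrature along one ray. It is the generic textbook formulation, not code from PrismAvatar; the function name `composite_ray` and its inputs are hypothetical and chosen for illustration.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: volume density (sigma) at each sample
    colors:    RGB triple at each sample
    deltas:    distance between consecutive samples
    Returns the composited RGB and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution of this sample
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel, transmittance


# Hypothetical ray: empty space first, then a nearly opaque green sample.
pixel, remaining = composite_ray(
    densities=[0.0, 1e9],
    colors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    deltas=[0.1, 0.1],
)
```

Every pixel needs many such samples, each requiring a network query, which is why a rigged mesh with neural textures is much cheaper to render on a mobile GPU.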



GAS: Generative Avatar Synthesis from a Single Image

Authors: Yixing Lu, Junting Dong, Youngjoong Kwon, Qin Zhao, Bo Dai, Fernando De la Torre

We introduce a generalizable and unified framework to synthesize view-consistent and temporally coherent avatars from a single image, addressing the challenging problem of single-image avatar generation. While recent methods employ diffusion models conditioned on human templates like depth or normal maps, they often struggle to preserve appearance information due to the discrepancy between sparse driving signals and the actual human subject, resulting in multi-view and temporal inconsistencies. Our approach bridges this gap by combining the reconstruction power of regression-based 3D human reconstruction with the generative capabilities of a diffusion model. The dense driving signal from the initial reconstructed human provides comprehensive conditioning, ensuring high-quality synthesis faithful to the reference appearance and structure. Additionally, we propose a unified framework that enables the generalization learned from novel pose synthesis on in-the-wild videos to naturally transfer to novel view synthesis. Our video-based diffusion model enhances disentangled synthesis with high-quality view-consistent renderings for novel views and realistic non-rigid deformations in novel pose animation. Results demonstrate the superior generalization ability of our method across in-domain and out-of-domain in-the-wild datasets. Project page: https://humansensinglab.github.io/GAS/






Key Takeaways

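  1. Introduces a generalizable, unified framework that synthesizes view-consistent and temporally coherent avatars from a single image.
  2. Combines regression-based 3D human reconstruction with the generative capabilities of a diffusion model.
  3. The dense driving signal from the initially reconstructed human provides comprehensive conditioning, keeping the synthesis high-quality and faithful to the reference appearance and structure.
  4. The generalization learned from novel pose synthesis on in-the-wild videos transfers to novel view synthesis.
  5. The video-based diffusion model yields high-quality, view-consistent renderings for novel views and realistic non-rigid deformations in novel pose animation.
  6. The method shows superior generalization across in-domain and out-of-domain in-the-wild datasets.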



Drivable 3D Gaussian Avatars

Authors: Wojciech Zielonka, Timur Bagautdinov, Shunsuke Saito, Michael Zollhöfer, Justus Thies, Javier Romero

We present Drivable 3D Gaussian Avatars (D3GA), a multi-layered 3D controllable model for human bodies that utilizes 3D Gaussian primitives embedded into tetrahedral cages. The advantage of using cages compared to commonly employed linear blend skinning (LBS) is that primitives like 3D Gaussians are naturally re-oriented and their kernels are stretched via the deformation gradients of the encapsulating tetrahedron. Additional offsets are modeled for the tetrahedron vertices, effectively decoupling the low-dimensional driving poses from the extensive set of primitives to be rendered. This separation is achieved through the localized influence of each tetrahedron on 3D Gaussians, resulting in improved optimization. Using the cage-based deformation model, we introduce a compositional pipeline that decomposes an avatar into layers, such as garments, hands, or faces, improving the modeling of phenomena like garment sliding. These parts can be conditioned on different driving signals, such as keypoints for facial expressions or joint-angle vectors for garments and the body. Our experiments on two multi-view datasets with varied body shapes, clothes, and motions show higher-quality results. They surpass PSNR and SSIM metrics of other SOTA methods using the same data while offering greater flexibility and compactness.



PDF Accepted to 3DV25 Website: https://zielon.github.io/d3ga/


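This paper introduces Drivable 3D Gaussian Avatars (D3GA), a model built on 3D Gaussian primitives. The primitives are embedded in tetrahedral cages, whose deformation gradients naturally re-orient the Gaussians and stretch their kernels. The model decouples the low-dimensional driving poses from the large set of primitives to be rendered, which improves optimization. Experiments on two multi-view datasets show that it surpasses other state-of-the-art methods on PSNR and SSIM while offering greater flexibility and compactness.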
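The cage mechanism described above can be sketched in a few lines. This is an illustration under stated assumptions, not the D3GA implementation: it uses the standard rule for pushing a Gaussian through a linear map (covariance becomes F Σ Fᵀ), where F is the deformation gradient of the enclosing tetrahedron. All function names are hypothetical, and the learned vertex offsets and per-layer conditioning from the paper are omitted.

```python
def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def _matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def _transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def _inverse3(A):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = A
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("degenerate tetrahedron")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def edge_matrix(verts):
    """3x3 matrix whose columns are the edges v1-v0, v2-v0, v3-v0."""
    edges = [_sub(verts[k], verts[0]) for k in (1, 2, 3)]
    return _transpose(edges)


def deformation_gradient(rest, posed):
    """Linear map F that takes rest-pose edges to posed edges."""
    return _matmul(edge_matrix(posed), _inverse3(edge_matrix(rest)))


def deform_gaussian(mean, cov, rest, posed):
    """Move a Gaussian with its tetrahedron: re-orient and stretch the kernel."""
    F = deformation_gradient(rest, posed)
    offset = _matvec(F, _sub(mean, rest[0]))
    new_mean = [p + o for p, o in zip(posed[0], offset)]
    new_cov = _matmul(_matmul(F, cov), _transpose(F))
    return new_mean, new_cov


# Hypothetical example: stretch a unit tetrahedron by 2x along the x axis.
rest = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
posed = [[0, 0, 0], [2, 0, 0], [0, 1, 0], [0, 0, 1]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
new_mean, new_cov = deform_gaussian([0.25, 0.25, 0.25], identity, rest, posed)
```

Because F depends only on the four vertices of one tetrahedron, each Gaussian is affected only by its local cage cell, which matches the localized influence the abstract credits for improved optimization.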

Key Takeaways

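  1. Drivable 3D Gaussian Avatars (D3GA) is a multi-layered, controllable 3D model for human bodies that embeds 3D Gaussian primitives in tetrahedral cages.
  2. Compared with the commonly used linear blend skinning (LBS), cages naturally re-orient the Gaussian primitives and stretch their kernels via the deformation gradients of the enclosing tetrahedron.
  3. Additional offsets on the tetrahedron vertices decouple the low-dimensional driving poses from the large set of rendered primitives; because each tetrahedron influences its 3D Gaussians only locally, optimization improves.
  4. A compositional pipeline decomposes the avatar into layers (such as garments, hands, or faces), which improves the modeling of phenomena like garment sliding.
  5. These parts can be conditioned on different driving signals, such as keypoints for facial expressions or joint-angle vectors for garments and the body.
  6. Experiments on two multi-view datasets show that the model surpasses other state-of-the-art methods on PSNR and SSIM when trained on the same data.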
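As a reminder of what the PSNR comparison measures, here is a minimal sketch of the standard definition, 10·log10(peak² / MSE). It is not the paper's evaluation code; the function name `psnr`, the flat-list image representation, and the default peak value of 1.0 are assumptions made for illustration.

```python
import math


def psnr(reference, rendered, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat lists of pixel intensities in [0, peak].
    Higher is better; identical images give infinity.
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - x) ** 2 for r, x in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical two-pixel images that differ by 0.1 everywhere.
score = psnr([0.0, 0.0], [0.1, 0.1])
```

PSNR is a purely per-pixel measure, which is why it is usually reported together with a structural metric such as SSIM.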



Author: Kedreamix
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit Kedreamix when reposting!