⚠️ All summaries below are generated by a large language model and may contain errors; they are for reference only, so use them with caution.
🔴 Note: do NOT use these summaries for serious academic purposes; they are only meant as a first-pass screening before reading the papers!
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace
Updated 2025-11-19
PFAvatar: Pose-Fusion 3D Personalized Avatar Reconstruction from Real-World Outfit-of-the-Day Photos
Authors:Dianbing Xi, Guoyuan An, Jingsen Zhu, Zhijian Liu, Yuan Liu, Ruiyuan Zhang, Jiayuan Lu, Rui Wang, Yuchi Huo
We propose PFAvatar (Pose-Fusion Avatar), a new method that reconstructs high-quality 3D avatars from "Outfit of the Day" (OOTD) photos, which exhibit diverse poses, occlusions, and complex backgrounds. Our method consists of two stages: (1) fine-tuning a pose-aware diffusion model from few-shot OOTD examples and (2) distilling a 3D avatar represented by a neural radiance field (NeRF). In the first stage, unlike previous methods that segment images into assets (e.g., garments, accessories) for 3D assembly, which is prone to inconsistency, we avoid decomposition and directly model the full-body appearance. By integrating a pre-trained ControlNet for pose estimation and a novel Condition Prior Preservation Loss (CPPL), our method enables end-to-end learning of fine details while mitigating language drift in few-shot training. Our method completes personalization in just 5 minutes, achieving a 48× speed-up compared to previous approaches. In the second stage, we introduce a NeRF-based avatar representation optimized by canonical SMPL-X space sampling and Multi-Resolution 3D-SDS. Compared to mesh-based representations that suffer from resolution-dependent discretization and erroneous occluded geometry, our continuous radiance field can preserve high-frequency textures (e.g., hair) and handle occlusions correctly through transmittance. Experiments demonstrate that PFAvatar outperforms state-of-the-art methods in terms of reconstruction fidelity, detail preservation, and robustness to occlusions/truncations, advancing practical 3D avatar generation from real-world OOTD albums. In addition, the reconstructed 3D avatar supports downstream applications such as virtual try-on, animation, and human video reenactment, further demonstrating the versatility and practical value of our approach.
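The second stage distills the personalized diffusion model into a NeRF via Multi-Resolution 3D-SDS. The paper's exact 3D-SDS formulation is not reproduced in this digest; as orientation, here is a minimal sketch of a standard Score Distillation Sampling step, assuming a diffusers-style noise-prediction UNet. All names and the pixel/latent-space choice are illustrative, not PFAvatar's actual pipeline.

```python
import torch

def sds_step(render, unet, cond_embed, alphas_cumprod):
    """One Score Distillation Sampling step (DreamFusion-style).
    `render` is a differentiable image/latent from the NeRF; `unet` is a
    frozen diffusers-style noise-prediction model. Returns a surrogate
    loss whose gradient w.r.t. `render` is w(t) * (eps_hat - eps)."""
    b = render.shape[0]
    t = torch.randint(20, 980, (b,), device=render.device)  # mid-range timesteps
    noise = torch.randn_like(render)
    a_t = alphas_cumprod[t].view(b, 1, 1, 1)
    noisy = a_t.sqrt() * render + (1 - a_t).sqrt() * noise  # forward diffusion q(x_t | x_0)
    with torch.no_grad():
        eps_hat = unet(noisy, t, cond_embed).sample          # frozen score estimate
    grad = (1 - a_t) * (eps_hat - noise)                     # common weighting w(t) = 1 - alpha_bar_t
    return (grad.detach() * render).sum() / b                # d(loss)/d(render) == grad / b
```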
Paper & Project Links
PDF Accepted by AAAI 2026
Summary
PFAvatar (Pose-Fusion Avatar) is a new method that reconstructs high-quality 3D avatars from everyday outfit (OOTD) photos. It works in two stages: the first fine-tunes a pose-aware diffusion model on a few OOTD images; the second distills a 3D avatar represented as a neural radiance field (NeRF). By avoiding asset decomposition and its inconsistencies, the method learns fine details end-to-end while reducing language drift. The second stage introduces a NeRF-based avatar representation optimized with canonical SMPL-X space sampling and Multi-Resolution 3D-SDS. Compared with mesh-based representations, the NeRF preserves high-frequency textures and handles occlusions correctly. Experiments show that PFAvatar outperforms existing methods in reconstruction fidelity, detail preservation, and occlusion robustness, and that it supports applications such as virtual try-on, animation, and human video reenactment.
Key Takeaways
- PFAvatar reconstructs high-quality 3D avatars from everyday outfit (OOTD) photos.
- The method has two stages: fine-tuning a pose-aware diffusion model and distilling a NeRF-based 3D avatar.
- It avoids the inconsistency of asset decomposition by directly modeling full-body appearance.
- Fine details are learned end-to-end while language drift is mitigated via CPPL (a prior-preservation sketch follows this list).
- The NeRF representation preserves high-frequency textures and handles occlusions correctly.
- PFAvatar outperforms existing methods in reconstruction fidelity, detail preservation, and occlusion robustness.
- It supports downstream applications such as virtual try-on, animation, and human video reenactment.
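The abstract credits the Condition Prior Preservation Loss (CPPL) with mitigating language drift during few-shot fine-tuning, but does not spell out its formulation. For orientation only, here is a minimal DreamBooth-style prior-preservation objective, which CPPL presumably extends with pose (ControlNet) conditions; this is a hedged sketch with illustrative names, not the paper's loss.

```python
import torch
import torch.nn.functional as F

def prior_preservation_loss(unet, noisy_subject, noisy_prior, timesteps,
                            subject_cond, prior_cond,
                            subject_noise, prior_noise, prior_weight=1.0):
    """DreamBooth-style prior preservation: a reconstruction term on the
    few-shot subject images plus a weighted term on prior images sampled
    from the frozen base model, to counteract language drift. CPPL
    presumably adds pose conditions to both terms (assumption)."""
    # Noise-prediction loss on the personalization (subject) batch.
    pred = unet(noisy_subject, timesteps, subject_cond).sample
    loss_subject = F.mse_loss(pred, subject_noise)
    # The same loss on the prior batch keeps the base model's knowledge intact.
    pred_prior = unet(noisy_prior, timesteps, prior_cond).sample
    loss_prior = F.mse_loss(pred_prior, prior_noise)
    return loss_subject + prior_weight * loss_prior
```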
Click here to view paper screenshots
LiDAR-GS++: Improving LiDAR Gaussian Reconstruction via Diffusion Priors
Authors:Qifeng Chen, Jiarun Liu, Rengan Xie, Tao Tang, Sicong Du, Yiru Zhao, Yuchi Huo, Sheng Yang
Recent GS-based rendering has made significant progress for LiDAR, surpassing Neural Radiance Fields (NeRF) in both quality and speed. However, these methods exhibit artifacts in extrapolated novel view synthesis due to the incomplete reconstruction from single traversal scans. To address this limitation, we present LiDAR-GS++, a LiDAR Gaussian Splatting reconstruction method enhanced by diffusion priors for real-time and high-fidelity re-simulation on public urban roads. Specifically, we introduce a controllable LiDAR generation model conditioned on coarsely extrapolated rendering to produce extra geometry-consistent scans and employ an effective distillation mechanism for expansive reconstruction. By extending reconstruction to under-fitted regions, our approach ensures global geometric consistency for extrapolative novel views while preserving detailed scene surfaces captured by sensors. Experiments on multiple public datasets demonstrate that LiDAR-GS++ achieves state-of-the-art performance for both interpolated and extrapolated viewpoints, surpassing existing GS and NeRF-based methods.
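A rough sketch of the training loop the abstract describes: condition a LiDAR generation model on coarsely extrapolated renderings, then distill the generated geometry-consistent scans back into the Gaussian reconstruction. Every identifier here (`gaussians.render_lidar`, `lidar_diffusion.sample`, `range_loss`, `recon_loss`, ...) is hypothetical; this is not the authors' code.

```python
import torch

# Hypothetical sketch of the diffusion-prior distillation loop.
for view in extrapolated_views:
    with torch.no_grad():
        coarse = gaussians.render_lidar(view)                # coarse, artifact-prone extrapolation
        pseudo_scan = lidar_diffusion.sample(cond=coarse)    # geometry-consistent generated scan
    loss = range_loss(gaussians.render_lidar(view), pseudo_scan)  # distill into under-fitted regions
    loss = loss + recon_loss(gaussians, captured_scans)      # stay faithful to the real sensor data
    loss.backward()
    opt.step()
    opt.zero_grad()
```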
Paper & Project Links
PDF Accepted by AAAI-26
Summary
GS-based rendering methods have made significant progress for LiDAR, surpassing NeRF in both quality and speed, but incomplete reconstruction from single-traversal scans causes artifacts in extrapolated novel view synthesis. LiDAR-GS++ addresses this with a LiDAR Gaussian Splatting reconstruction enhanced by diffusion priors for real-time, high-fidelity re-simulation on public urban roads. A controllable LiDAR generation model, conditioned on coarsely extrapolated renderings, produces extra geometry-consistent scans, and an effective distillation mechanism extends reconstruction to under-fitted regions. This ensures global geometric consistency for extrapolated views while preserving the detailed scene surfaces captured by the sensor. Experiments show state-of-the-art performance for both interpolated and extrapolated viewpoints, surpassing existing GS- and NeRF-based methods.
Key Takeaways
- GS-based rendering for LiDAR has progressed significantly, surpassing NeRF in quality and speed.
- Incomplete reconstruction from single-traversal scans causes artifacts in extrapolated novel view synthesis.
- LiDAR-GS++ enhances LiDAR Gaussian Splatting reconstruction with diffusion priors for real-time, high-fidelity re-simulation on public urban roads.
- A controllable LiDAR generation model, conditioned on coarsely extrapolated renderings, produces extra geometry-consistent scans.
- An effective distillation mechanism extends reconstruction to under-fitted regions, ensuring global geometric consistency.
- LiDAR-GS++ achieves state-of-the-art performance for both interpolated and extrapolated viewpoints.
- It surpasses existing GS- and NeRF-based methods.
Click here to view paper screenshots
DehazeGS: Seeing Through Fog with 3D Gaussian Splatting
Authors:Jinze Yu, Yiqun Wang, Aiheng Jiang, Zhengda Lu, Jianwei Guo, Yong Li, Hongxing Qin, Xiaopeng Zhang
Current novel view synthesis methods are typically designed for high-quality and clean input images. However, in foggy scenes, scattering and attenuation can significantly degrade the quality of rendering. Although NeRF-based dehazing approaches have been developed, their reliance on deep fully connected neural networks and per-ray sampling strategies leads to high computational costs. Furthermore, NeRF's implicit representation limits its ability to recover fine-grained details from hazy scenes. To overcome these limitations, we propose learning an explicit Gaussian representation to explain the formation mechanism of foggy images through a physically-based forward rendering process. Our method, DehazeGS, reconstructs and renders fog-free scenes using only multi-view foggy images as input. Specifically, based on the atmospheric scattering model, we simulate the formation of fog by establishing the transmission function directly onto Gaussian primitives via depth-to-transmission mapping. During training, we jointly learn the atmospheric light and scattering coefficients while optimizing the Gaussian representation of foggy scenes. At inference time, we remove the effects of scattering and attenuation in Gaussian distributions and directly render the scene to obtain dehazed views. Experiments on both real-world and synthetic foggy datasets demonstrate that DehazeGS achieves state-of-the-art performance. Visualizations are available at https://dehazegs.github.io/
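The depth-to-transmission mapping in the abstract follows the standard atmospheric scattering model, I(x) = J(x)·t(x) + A·(1 − t(x)) with t(x) = exp(−β·d(x)). Below is a minimal sketch of that compositing step, assuming per-pixel depth from the Gaussian renderer; the function and argument names are illustrative, not the authors' API.

```python
import torch

def foggy_composite(clear_rgb, depth, beta, airlight):
    """Atmospheric scattering model used to simulate fog during training:
        I(x) = J(x) * t(x) + A * (1 - t(x)),   t(x) = exp(-beta * d(x))
    `clear_rgb` is the fog-free rendering J, `depth` the rendered depth d;
    `beta` (scattering coefficient) and `airlight` A are learned jointly."""
    t = torch.exp(-beta * depth)                 # depth-to-transmission mapping
    return clear_rgb * t + airlight * (1.0 - t)
```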
Paper & Project Links
PDF 9 pages, 5 figures. Accepted by AAAI 2026. Visualizations are available at https://dehazegs.github.io/
Summary
This paper proposes DehazeGS, a Gaussian-representation-based dehazing method that models the formation of foggy images through a physically-based forward rendering process. Based on the atmospheric scattering model, it simulates fog by establishing the transmission function directly on Gaussian primitives via a depth-to-transmission mapping. During training, the atmospheric light and scattering coefficients are learned jointly while the Gaussian representation of the foggy scene is optimized; at inference, the scattering and attenuation effects are removed and the scene is rendered directly to obtain dehazed views. The method achieves state-of-the-art performance on both real-world and synthetic foggy datasets.
Key Takeaways
- Current novel view synthesis methods target high-quality, clean inputs, but scattering and attenuation in foggy scenes degrade rendering quality.
- NeRF-based dehazing approaches exist, but their deep fully connected networks and per-ray sampling strategies incur high computational cost.
- NeRF's implicit representation struggles to recover fine-grained details from hazy scenes.
- DehazeGS learns an explicit Gaussian representation that models foggy image formation through a physically-based forward rendering process.
- Fog formation is simulated with the atmospheric scattering model via a depth-to-transmission mapping (a training/inference sketch follows this list).
- Atmospheric light and scattering coefficients are learned jointly while the Gaussian representation of the foggy scene is optimized.
- State-of-the-art performance is achieved on both real-world and synthetic foggy datasets.
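To make the train/inference asymmetry in these takeaways concrete: during training the fog model is applied on top of the clear rendering and matched against the captured foggy images; at inference the fog model is simply dropped. A sketch reusing `foggy_composite` from above, where `render_clear` and `render_depth` are hypothetical helpers standing in for the Gaussian renderer:

```python
import torch.nn.functional as F

# Training: simulate fog on top of the clear rendering, then match the
# captured foggy image (beta and airlight are optimized jointly).
clear = render_clear(gaussians, cam)
pred_foggy = foggy_composite(clear, render_depth(gaussians, cam), beta, airlight)
loss = F.l1_loss(pred_foggy, foggy_image)

# Inference: drop the fog model and render the optimized Gaussians directly.
dehazed = render_clear(gaussians, cam)
```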