NeRF

Published: 2025-02-15

Updated: 2025-05-14

Word count: 5.2k

Reading time: 21 min

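⚠️ All summaries below were generated by a large language model. They may contain errors, are provided for reference only, and should be used with caution.
🔴 Note: do not use these summaries in serious academic settings; they are intended only for screening papers before reading them.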

Updated 2025-02-15

DenseSplat: Densifying Gaussian Splatting SLAM with Neural Radiance Prior

Authors: Mingrui Li, Shuhong Liu, Tianchen Deng, Hongyu Wang

Gaussian SLAM systems excel in real-time rendering and fine-grained reconstruction compared to NeRF-based systems. However, their reliance on extensive keyframes is impractical for deployment in real-world robotic systems, which typically operate under sparse-view conditions that can result in substantial holes in the map. To address these challenges, we introduce DenseSplat, the first SLAM system that effectively combines the advantages of NeRF and 3DGS. DenseSplat utilizes sparse keyframes and NeRF priors for initializing primitives that densely populate maps and seamlessly fill gaps. It also implements geometry-aware primitive sampling and pruning strategies to manage granularity and enhance rendering efficiency. Moreover, DenseSplat integrates loop closure and bundle adjustment, significantly enhancing frame-to-frame tracking accuracy. Extensive experiments on multiple large-scale datasets demonstrate that DenseSplat achieves superior performance in tracking and mapping compared to current state-of-the-art methods.

Paper and project links

PDF

Summary

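This paper introduces DenseSplat, which the authors describe as the first SLAM system to combine the advantages of NeRF and 3DGS. It uses sparse keyframes and NeRF priors to initialize primitives that densely populate the map and fill gaps. DenseSplat also applies geometry-aware primitive sampling and pruning strategies to improve rendering efficiency, and it integrates loop closure and bundle adjustment to improve frame-to-frame tracking accuracy. Experiments show that DenseSplat outperforms current state-of-the-art methods in tracking and mapping.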

Key Takeaways

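DenseSplat is described as the first SLAM system to combine the advantages of NeRF and 3DGS.
It uses sparse keyframes and NeRF priors to initialize the primitives that densely populate the map.
DenseSplat applies geometry-aware primitive sampling and pruning strategies to improve rendering efficiency.
DenseSplat integrates loop closure and bundle adjustment to improve frame-to-frame tracking accuracy.
DenseSplat fills the holes that sparse-view conditions leave in the map.
Experiments on multiple large-scale datasets show that DenseSplat outperforms state-of-the-art methods in tracking and mapping.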

Hyperparameter Optimization and Force Error Correction of Neuroevolution Potential for Predicting Thermal Conductivity of Wurtzite GaN

Authors: Zhuo Chen, Yuejin Yuan, Wenyang Ding, Shouhang Li, Meng An, Gang Zhang

As a representative of wide-bandgap semiconductors, wurtzite gallium nitride (GaN) has been widely utilized in high-power devices due to its high breakdown voltage and low specific on-resistance. Accurate prediction of wurtzite GaN thermal conductivity is a prerequisite for designing effective thermal management systems for electronic applications. Machine-learning-driven molecular dynamics simulation offers a promising approach to predicting the thermal conductivity of large-scale systems without requiring predefined parameters. However, these methods often underestimate the thermal conductivity of materials with inherently high thermal conductivity because of the large predicted force error relative to first-principles calculations, posing a critical challenge for their broader application. In this study, we successfully developed a neuroevolution potential for wurtzite GaN and accurately predicted its thermal conductivity, 259 W/m-K at room temperature, achieving excellent agreement with reported experimental measurements. The hyperparameters of the neuroevolution potential (NEP) were optimized based on a systematic analysis of reproduced energy and force, structural features, and computational efficiency. Furthermore, a force prediction error correction method was implemented, effectively reducing the error caused by the additional force noise in the Langevin thermostat by extrapolating to the zero-force-error limit. This study provides valuable insights and holds significant implications for advancing efficient thermal management technologies in wide-bandgap semiconductor devices.

Paper and project links

PDF 15 pages, 5 figures

Summary

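The wide-bandgap semiconductor gallium nitride (GaN) is widely used in high-power devices thanks to its high breakdown voltage and low specific on-resistance. This study develops a neuroevolution potential (NEP) for wurtzite GaN and accurately predicts its thermal conductivity, which supports the design of thermal management systems for electronic applications. It also implements a force prediction error correction method that reduces the error caused by the additional force noise in the Langevin thermostat.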

Key Takeaways

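The wide-bandgap semiconductor gallium nitride (GaN) is widely used in high-power devices.
A neuroevolution potential (NEP) is successfully applied to predict the thermal conductivity of wurtzite GaN.
The predicted thermal conductivity at room temperature is 259 W/m-K, in good agreement with experimental measurements.
The NEP hyperparameters are optimized through a systematic analysis of reproduced energy and force, structural features, and computational efficiency.
A force prediction error correction method extrapolates to the zero-force-error limit, reducing the error caused by the additional force noise in the Langevin thermostat.
The study offers insights for efficient thermal management in wide-bandgap semiconductor devices.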
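
The extrapolation to the zero-force-error limit can be illustrated with a small sketch. It assumes, purely for illustration, that the predicted thermal conductivity falls linearly with the total force error, so a straight-line fit evaluated at zero error gives the corrected value. The abstract does not state the functional form the authors use, the function names are hypothetical, and the sample numbers are made up in arbitrary units; none of them are results from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate_to_zero_error(force_errors, kappas):
    """Return the fitted conductivity at zero force error.

    Assumption (not from the paper): kappa depends linearly on the force error,
    so the intercept of the fitted line is the zero-error estimate.
    """
    _, intercept = fit_line(force_errors, kappas)
    return intercept


# Made-up illustrative values in arbitrary units, not data from the paper.
force_errors = [10.0, 20.0, 30.0, 40.0]
kappas = [80.0, 60.0, 40.0, 20.0]
kappa_zero = extrapolate_to_zero_error(force_errors, kappas)
print(kappa_zero)
```

In practice each point would come from a separate simulation run with a different amount of added thermostat noise, and the choice of fitting function would need to be checked against the data.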

Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection

Authors: Hongru Yan, Yu Zheng, Yueqi Duan

Skins wrapping around our bodies, leathers covering over the sofa, sheet metal coating the car - it suggests that objects are enclosed by a series of continuous surfaces, which provides us with informative geometry prior for objectness deduction. In this paper, we propose Gaussian-Det which leverages Gaussian Splatting as surface representation for multi-view based 3D object detection. Unlike existing monocular or NeRF-based methods which depict the objects via discrete positional data, Gaussian-Det models the objects in a continuous manner by formulating the input Gaussians as feature descriptors on a mass of partial surfaces. Furthermore, to address the numerous outliers inherently introduced by Gaussian splatting, we accordingly devise a Closure Inferring Module (CIM) for the comprehensive surface-based objectness deduction. CIM firstly estimates the probabilistic feature residuals for partial surfaces given the underdetermined nature of Gaussian Splatting, which are then coalesced into a holistic representation on the overall surface closure of the object proposal. In this way, the surface information Gaussian-Det exploits serves as the prior on the quality and reliability of objectness and the information basis of proposal refinement. Experiments on both synthetic and real-world datasets demonstrate that Gaussian-Det outperforms various existing approaches, in terms of both average precision and recall.

Paper and project links

PDF Accepted to ICLR 2025

Summary

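This paper proposes Gaussian-Det, a detector that uses Gaussian Splatting as the surface representation for multi-view 3D object detection. Unlike existing monocular or NeRF-based methods, it models objects in a continuous manner by treating the input Gaussians as feature descriptors on a large number of partial surfaces. To handle the many outliers that Gaussian Splatting introduces, the paper designs a Closure Inferring Module (CIM) for comprehensive surface-based objectness deduction. Experiments show that Gaussian-Det outperforms various existing approaches in both average precision and recall on synthetic and real-world datasets.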

Key Takeaways

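Gaussian-Det uses Gaussian Splatting as the surface representation for multi-view 3D object detection.
Unlike methods that depict objects via discrete positional data, Gaussian-Det models objects in a continuous manner.
Gaussian-Det treats the input Gaussians as feature descriptors on partial surfaces.
A Closure Inferring Module (CIM) is designed to handle the outliers introduced by Gaussian Splatting.
CIM estimates probabilistic feature residuals for partial surfaces and coalesces them into a holistic representation of the surface closure of the object proposal.
The surface information serves as a prior on the quality and reliability of objectness and as the information basis for proposal refinement.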

RenderWorld: World Model with Self-Supervised 3D Label

Authors: Ziyang Yan, Wenzhen Dong, Yihua Shao, Yuhang Lu, Liu Haiyang, Jingwen Liu, Haozhe Wang, Zhe Wang, Yan Wang, Fabio Remondino, Yuexin Ma

End-to-end autonomous driving with vision only is not only more cost-effective than LiDAR-vision fusion but also more reliable than traditional methods. To achieve an economical and robust purely visual autonomous driving system, we propose RenderWorld, a vision-only end-to-end autonomous driving framework, which generates 3D occupancy labels using a self-supervised Gaussian-based Img2Occ Module, then encodes the labels with AM-VAE, and uses a world model for forecasting and planning. RenderWorld employs Gaussian Splatting to represent 3D scenes and render 2D images, which greatly improves segmentation accuracy and reduces GPU memory consumption compared with NeRF-based methods. By applying AM-VAE to encode air and non-air separately, RenderWorld achieves more fine-grained scene element representation, leading to state-of-the-art performance in both 4D occupancy forecasting and motion planning from the autoregressive world model.

Paper and project links

PDF Accepted in 2025 IEEE International Conference on Robotics and Automation (ICRA)

Summary

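RenderWorld is a vision-only end-to-end autonomous driving framework. It generates 3D occupancy labels with a self-supervised Img2Occ module, encodes the labels with AM-VAE, and uses a world model for forecasting and planning. The framework uses Gaussian Splatting to represent 3D scenes and render 2D images, which improves segmentation accuracy and reduces GPU memory consumption compared with NeRF-based methods. By encoding air and non-air separately, RenderWorld achieves a more fine-grained scene element representation and reaches state-of-the-art performance in 4D occupancy forecasting and in motion planning from the autoregressive world model.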

Key Takeaways

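RenderWorld is a vision-only end-to-end autonomous driving framework; the authors argue that vision-only driving is more cost-effective than LiDAR-vision fusion and more reliable than traditional methods.
The framework generates 3D occupancy labels with a self-supervised Img2Occ module and encodes them with AM-VAE.
RenderWorld uses a world model for forecasting and planning.
The framework uses Gaussian Splatting to represent 3D scenes and render 2D images, which improves segmentation accuracy and reduces GPU memory consumption compared with NeRF-based methods.
By encoding air and non-air separately, RenderWorld achieves a more fine-grained scene element representation.
RenderWorld reaches state-of-the-art performance in 4D occupancy forecasting and motion planning.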

Dream-in-Style: Text-to-3D Generation Using Stylized Score Distillation

Authors: Hubert Kompanowski, Binh-Son Hua

We present a method to generate 3D objects in styles. Our method takes a text prompt and a style reference image as input and reconstructs a neural radiance field to synthesize a 3D model with the content aligning with the text prompt and the style following the reference image. To simultaneously generate the 3D object and perform style transfer in one go, we propose a stylized score distillation loss to guide a text-to-3D optimization process to output visually plausible geometry and appearance. Our stylized score distillation is based on a combination of an original pretrained text-to-image model and its modified sibling with the key and value features of self-attention layers manipulated to inject styles from the reference image. Comparisons with state-of-the-art methods demonstrated the strong visual performance of our method, further supported by the quantitative results from our user study.

Paper and project links

PDF

Summary

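This paper presents a method for generating 3D objects in a given style. The method takes a text prompt and a style reference image as input and reconstructs a neural radiance field to synthesize a 3D model whose content aligns with the text prompt and whose style follows the reference image. To generate the 3D object and perform style transfer in one pass, the authors propose a stylized score distillation loss that guides the text-to-3D optimization toward visually plausible geometry and appearance. The loss combines an original pretrained text-to-image model with a modified sibling whose self-attention key and value features are manipulated to inject the style of the reference image. Comparisons with state-of-the-art methods demonstrate strong visual performance, which is further supported by quantitative results from a user study.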

Key Takeaways

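The method generates 3D objects in a given style from a text prompt and a style reference image.
A stylized score distillation loss guides the text-to-3D optimization process.
The loss combines an original pretrained text-to-image model with a modified sibling of that model.
The style of the reference image is injected by manipulating the key and value features of the self-attention layers.
Comparisons with state-of-the-art methods show strong visual performance.
Quantitative results from a user study support the method's performance.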
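
The key and value manipulation can be pictured with a toy attention sketch. It assumes one simple form of manipulation: queries come from the image being generated, while keys and values are taken from the style reference. The abstract only says that the key and value features are "manipulated", so this swap, the function names, and the toy vectors are all illustrative assumptions rather than the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def style_injected_attention(content_queries, style_keys, style_values):
    """Hypothetical style injection: content queries attend to style keys/values."""
    return attention(content_queries, style_keys, style_values)


# Toy features: two content tokens, three style tokens, feature size 2.
content_q = [[1.0, 0.0], [0.0, 1.0]]
style_k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
style_v = [[0.2, 0.8], [0.6, 0.4], [0.9, 0.1]]
stylized = style_injected_attention(content_q, style_k, style_v)
print(stylized)
```

Each output row is a weighted average of the style values, so the layout still follows the content queries while the features being mixed come from the style reference.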

Learning Naturally Aggregated Appearance for Efficient 3D Editing

Authors: Ka Leong Cheng, Qiuyu Wang, Zifan Shi, Kecheng Zheng, Yinghao Xu, Hao Ouyang, Qifeng Chen, Yujun Shen

Neural radiance fields, which represent a 3D scene as a color field and a density field, have demonstrated great progress in novel view synthesis yet are unfavorable for editing due to the implicitness. This work studies the task of efficient 3D editing, where we focus on editing speed and user interactivity. To this end, we propose to learn the color field as an explicit 2D appearance aggregation, also called canonical image, with which users can easily customize their 3D editing via 2D image processing. We complement the canonical image with a projection field that maps 3D points onto 2D pixels for texture query. This field is initialized with a pseudo canonical camera model and optimized with offset regularity to ensure the naturalness of the canonical image. Extensive experiments on different datasets suggest that our representation, dubbed AGAP, well supports various ways of 3D editing (e.g., stylization, instance segmentation, and interactive drawing). Our approach demonstrates remarkable efficiency by being at least 20 times faster per edit compared to existing NeRF-based editing methods. Project page is available at https://felixcheng97.github.io/AGAP/.

Paper and project links

PDF Project page: https://felixcheng97.github.io/AGAP/; accepted to 3DV 2025

Summary

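This work studies efficient 3D editing, focusing on editing speed and user interactivity. It proposes learning the color field as an explicit 2D appearance aggregation, called the canonical image, so that users can customize 3D edits through 2D image processing. The canonical image is complemented by a projection field that maps 3D points onto 2D pixels for texture query. This field is initialized with a pseudo canonical camera model and optimized with offset regularity to keep the canonical image natural. Experiments on several datasets suggest that the representation, dubbed AGAP, supports various kinds of 3D editing (e.g., stylization, instance segmentation, and interactive drawing). The approach is at least 20 times faster per edit than existing NeRF-based editing methods. The project page is available at https://felixcheng97.github.io/AGAP/.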
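
The texture query described above can be sketched as a projection followed by an image lookup. The sketch assumes a pinhole camera as the stand-in for the pseudo canonical camera model and a fixed per-point offset in place of the learned projection field; the function names, the focal length parameter, and the tiny single-channel image are hypothetical and not taken from the AGAP code.

```python
def project_point(point, focal, width, height):
    """Pinhole projection of a camera-space 3D point (z > 0) to pixel coordinates."""
    x, y, z = point
    u = focal * x / z + width / 2.0
    v = focal * y / z + height / 2.0
    return u, v


def bilinear_sample(image, u, v):
    """Bilinearly interpolate a row-major image[row][col] at column u, row v."""
    height, width = len(image), len(image[0])
    u = min(max(u, 0.0), width - 1.0)   # clamp to the image bounds
    v = min(max(v, 0.0), height - 1.0)
    c0, r0 = int(u), int(v)
    c1, r1 = min(c0 + 1, width - 1), min(r0 + 1, height - 1)
    fu, fv = u - c0, v - r0
    top = image[r0][c0] * (1 - fu) + image[r0][c1] * fu
    bottom = image[r1][c0] * (1 - fu) + image[r1][c1] * fu
    return top * (1 - fv) + bottom * fv


def query_color(point, offset, canonical_image, focal):
    """Map a 3D point to a pixel (projection plus offset) and read the canonical image.

    The offset is a plain argument here; in the paper it would come from the
    learned projection field.
    """
    height, width = len(canonical_image), len(canonical_image[0])
    u, v = project_point(point, focal, width, height)
    return bilinear_sample(canonical_image, u + offset[0], v + offset[1])


# A 2x2 single-channel canonical image; a 2D edit changes one pixel.
canonical = [[0.0, 1.0], [2.0, 3.0]]
before = query_color((0.0, 0.0, 1.0), (-0.5, -0.5), canonical, focal=1.0)
canonical[0][0] = 4.0
after = query_color((0.0, 0.0, 1.0), (-0.5, -0.5), canonical, focal=1.0)
print(before, after)
```

Because every 3D point reads its color from the shared 2D image, a change to that image propagates to all points that map onto the edited pixels, which is what makes 2D image processing usable as a 3D editing tool.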

Key Takeaways