⚠️ All of the content summaries below are generated by a large language model and may contain errors; they are for reference only, so use them with caution.
🔴 Note: never rely on these summaries in serious academic settings. Use them only for an initial screening before reading a paper!
💗 If you find our project, ChatPaperFree, helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace.
2025-09-13 Update
SiLVR: Scalable Lidar-Visual Radiance Field Reconstruction with Uncertainty Quantification
Authors:Yifu Tao, Maurice Fallon
We present a neural radiance field (NeRF) based large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions that are geometrically accurate and capture photorealistic texture. Our system adopts the state-of-the-art NeRF representation to incorporate lidar. Adding lidar data adds strong geometric constraints on the depth and surface normals, which is particularly useful when modelling uniform texture surfaces which contain ambiguous visual reconstruction cues. A key contribution of this work is a novel method to quantify the epistemic uncertainty of the lidar-visual NeRF reconstruction by estimating the spatial variance of each point location in the radiance field given the sensor observations from the cameras and lidar. This provides a principled approach to evaluate the contribution of each sensor modality to the final reconstruction. In this way, reconstructions that are uncertain (due to e.g. uniform visual texture, limited observation viewpoints, or little lidar coverage) can be identified and removed. Our system is integrated with a real-time lidar SLAM system which is used to bootstrap a Structure-from-Motion (SfM) reconstruction procedure. It also helps to properly constrain the overall metric scale which is essential for the lidar depth loss. The refined SLAM trajectory can then be divided into submaps using Spectral Clustering to group sets of co-visible images together. This submapping approach is more suitable for visual reconstruction than distance-based partitioning. Our uncertainty estimation is particularly effective when merging submaps as their boundaries often contain artefacts due to limited observations. We demonstrate the reconstruction system using a multi-camera, lidar sensor suite in experiments involving both robot-mounted and handheld scanning. Our test datasets cover a total area of more than 20,000 square metres.
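To make the lidar depth constraint concrete, here is a minimal sketch (not the paper's implementation) of how a lidar return can supervise the expected depth rendered along a NeRF ray. The L1 form of the penalty, the variable names, and the toy data are all assumptions for illustration.

```python
import numpy as np

def expected_depth(t_vals, weights):
    """Expected ray-termination depth from volume-rendering weights."""
    return np.sum(weights * t_vals, axis=-1)

def lidar_depth_loss(t_vals, weights, lidar_depth, has_return):
    """L1 penalty between rendered depth and lidar-measured depth,
    applied only on rays that actually have a lidar return."""
    d_hat = expected_depth(t_vals, weights)
    return np.mean(np.abs(d_hat - lidar_depth)[has_return])

# Toy usage: 4 rays with 8 depth samples each.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.1, 10.0, size=(4, 8)), axis=-1)
w = rng.dirichlet(np.ones(8), size=4)        # normalised per-ray weights
d_lidar = rng.uniform(1.0, 9.0, size=4)
mask = np.array([True, True, False, True])   # ray 2 has no lidar return
print(lidar_depth_loss(t, w, d_lidar, mask))
```

In a real training loop this term would be weighted and added to the photometric loss; lidar-derived surface normals could be penalised analogously.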
Paper and project links
PDF Accepted by T-RO. Webpage: https://dynamic.robots.ox.ac.uk/projects/silvr/
Summary
This paper presents a neural radiance field (NeRF) based large-scale reconstruction system that fuses lidar and visual data to produce reconstructions that are geometrically accurate with photorealistic texture. The system adopts a state-of-the-art NeRF representation to incorporate lidar data, adding strong geometric constraints on depth and surface normals, which is especially useful when modelling uniformly textured surfaces with ambiguous visual reconstruction cues. The key contribution is a novel method for quantifying the epistemic uncertainty of the lidar-visual NeRF reconstruction, achieved by estimating the spatial variance of each point location in the radiance field given the camera and lidar observations. This provides a principled way to evaluate each sensor modality's contribution to the final reconstruction, so that uncertain reconstructions (due to, e.g., uniform visual texture, limited observation viewpoints, or sparse lidar coverage) can be identified and removed. The system is integrated with a real-time lidar SLAM system that bootstraps a Structure-from-Motion (SfM) reconstruction procedure and also helps constrain the overall metric scale, which is essential for the lidar depth loss. The refined SLAM trajectory can then be divided into submaps using Spectral Clustering to group sets of co-visible images; this submapping approach is better suited to visual reconstruction than distance-based partitioning. The uncertainty estimate is particularly effective when merging submaps, as their boundaries often contain artefacts due to limited observations. The reconstruction system is demonstrated with a multi-camera, lidar sensor suite in both robot-mounted and handheld scanning experiments, over test datasets covering a total area of more than 20,000 square metres.
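As an illustration of the uncertainty idea, the sketch below estimates a per-point spatial variance from an ensemble of perturbed reconstructions and drops high-variance points. The ensemble construction and the 0.01 m² threshold are assumptions for illustration, not the paper's actual estimator.

```python
import numpy as np

def spatial_variance(samples):
    """samples: (K, N, 3) positions of N reconstructed points under K
    perturbed models. Returns the per-point trace of the sample
    covariance (a scalar spatial variance in square metres)."""
    mean = samples.mean(axis=0, keepdims=True)
    return ((samples - mean) ** 2).sum(axis=-1).mean(axis=0)

def filter_uncertain(points, var, tau=0.01):
    """Keep only points whose estimated spatial variance is below tau."""
    return points[var <= tau]

# Toy usage: 5 perturbed samples of 1000 points.
rng = np.random.default_rng(0)
base = rng.uniform(-1.0, 1.0, size=(1000, 3))
samples = base[None] + rng.normal(scale=0.02, size=(5, 1000, 3))
points = samples.mean(axis=0)
print(filter_uncertain(points, spatial_variance(samples)).shape)
```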
Key Takeaways
- A NeRF-based large-scale reconstruction system fuses lidar and visual data.
- The system adopts a NeRF representation that incorporates lidar data, strengthening the geometric constraints on depth and surface normals.
- A novel method quantifies the epistemic uncertainty of the lidar-visual NeRF reconstruction.
- Each sensor's contribution to the final reconstruction is evaluated by estimating the spatial variance of each point location in the radiance field.
- The system is integrated with a real-time lidar SLAM system that bootstraps the SfM reconstruction procedure and helps constrain the overall metric scale.
- The SLAM trajectory is divided into submaps using Spectral Clustering, which suits visual reconstruction better than distance-based partitioning (see the sketch after this list).
- Experiments demonstrate the reconstruction system in both robot-mounted and handheld scanning, covering an area of more than 20,000 square metres.
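The submapping step can be sketched as spectral clustering over a covisibility matrix, assuming covis[i, j] counts the 3D points seen by both image i and image j; scikit-learn's SpectralClustering stands in here for whatever clustering implementation the paper uses.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def covisibility_submaps(covis, n_submaps):
    """Group images into submaps by spectral clustering of a
    covisibility affinity matrix."""
    W = 0.5 * (covis + covis.T)   # symmetrise
    W = W / (W.max() + 1e-9)      # scale into [0, 1] as an affinity
    return SpectralClustering(
        n_clusters=n_submaps, affinity="precomputed", random_state=0
    ).fit_predict(W)

# Toy usage: two groups of images that mostly see each other.
covis = np.zeros((6, 6))
covis[:3, :3] = 50.0
covis[3:, 3:] = 50.0
covis[2, 3] = covis[3, 2] = 5.0   # weak overlap at the boundary
print(covisibility_submaps(covis, n_submaps=2))
```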


The Oxford Spires Dataset: Benchmarking Large-Scale LiDAR-Visual Localisation, Reconstruction and Radiance Field Methods
Authors:Yifu Tao, Miguel Ángel Muñoz-Bañón, Lintong Zhang, Jiahao Wang, Lanke Frank Tarimo Fu, Maurice Fallon
This paper introduces a large-scale multi-modal dataset captured in and around well-known landmarks in Oxford using a custom-built multi-sensor perception unit as well as a millimetre-accurate map from a Terrestrial LiDAR Scanner (TLS). The perception unit includes three synchronised global shutter colour cameras, an automotive 3D LiDAR scanner, and an inertial sensor - all precisely calibrated. We also establish benchmarks for tasks involving localisation, reconstruction, and novel-view synthesis, which enable the evaluation of Simultaneous Localisation and Mapping (SLAM) methods, Structure-from-Motion (SfM) and Multi-view Stereo (MVS) methods as well as radiance field methods such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting. To evaluate 3D reconstruction the TLS 3D models are used as ground truth. Localisation ground truth is computed by registering the mobile LiDAR scans to the TLS 3D models. Radiance field methods are evaluated not only with poses sampled from the input trajectory, but also from viewpoints that are from trajectories which are distant from the training poses. Our evaluation demonstrates a key limitation of state-of-the-art radiance field methods: we show that they tend to overfit to the training poses/images and do not generalise well to out-of-sequence poses. They also underperform in 3D reconstruction compared to MVS systems using the same visual inputs. Our dataset and benchmarks are intended to facilitate better integration of radiance field methods and SLAM systems. The raw and processed data, along with software for parsing and evaluation, can be accessed at https://dynamic.robots.ox.ac.uk/datasets/oxford-spires/.
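To illustrate how the TLS models serve as reconstruction ground truth, here is a sketch of the standard accuracy/completeness point-cloud metrics; the 10 cm inlier threshold and the metric names are assumptions, not necessarily the benchmark's exact settings.

```python
import numpy as np
from scipy.spatial import cKDTree

def accuracy_completeness(recon, gt, thresh=0.1):
    """Accuracy: distance from each reconstructed point to its nearest
    TLS ground-truth point. Completeness: distance from each ground-
    truth point to its nearest reconstructed point."""
    d_acc = cKDTree(gt).query(recon)[0]
    d_comp = cKDTree(recon).query(gt)[0]
    return {
        "accuracy_mean_m": float(d_acc.mean()),
        "completeness_mean_m": float(d_comp.mean()),
        "precision": float((d_acc < thresh).mean()),
        "recall": float((d_comp < thresh).mean()),
    }

# Toy usage with synthetic clouds.
rng = np.random.default_rng(0)
gt = rng.uniform(0.0, 10.0, size=(5000, 3))
recon = gt[:4000] + rng.normal(scale=0.02, size=(4000, 3))
print(accuracy_completeness(recon, gt))
```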
Paper and project links
PDF Accepted by IJRR. Website: https://dynamic.robots.ox.ac.uk/datasets/oxford-spires/
Summary
This paper introduces a large-scale multi-modal dataset captured in and around well-known Oxford landmarks using a custom-built multi-sensor perception unit, together with a millimetre-accurate map from a Terrestrial LiDAR Scanner (TLS). Benchmarks are established for localisation, reconstruction, and novel-view synthesis, enabling the evaluation of SLAM methods, SfM and MVS methods, and radiance field methods such as NeRF and 3D Gaussian Splatting. The TLS 3D models serve as ground truth for evaluating 3D reconstruction, and localisation ground truth is computed by registering the mobile lidar scans to the TLS 3D models. The evaluation reveals a key limitation of state-of-the-art radiance field methods: they tend to overfit to the training poses/images, generalise poorly to out-of-sequence poses, and underperform MVS systems that use the same visual inputs for 3D reconstruction. The dataset and benchmarks are intended to facilitate better integration of radiance field methods and SLAM systems. The data and the parsing/evaluation software are available at https://dynamic.robots.ox.ac.uk/datasets/oxford-spires/.
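The localisation ground truth is obtained by registering mobile lidar scans to the TLS model; below is a sketch of one such registration step, assuming Open3D's ICP API. The initial pose and correspondence distance are placeholder assumptions.

```python
import numpy as np
import open3d as o3d

def register_scan_to_tls(scan_xyz, tls_xyz, init=np.eye(4), max_dist=0.5):
    """Refine one mobile-lidar scan's pose against the TLS reference
    model with point-to-point ICP; returns the 4x4 transform of the
    scan in the TLS frame."""
    src = o3d.geometry.PointCloud()
    src.points = o3d.utility.Vector3dVector(scan_xyz)
    tgt = o3d.geometry.PointCloud()
    tgt.points = o3d.utility.Vector3dVector(tls_xyz)
    result = o3d.pipelines.registration.registration_icp(
        src, tgt, max_dist, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation
```

In practice ICP needs a reasonable initial pose (e.g. from the SLAM trajectory) to converge to the correct registration.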
Key Takeaways
- The paper introduces a large-scale multi-modal dataset covering perception data captured around well-known Oxford landmarks.
- The dataset was collected with a custom-built multi-sensor perception unit, alongside a millimetre-accurate map from a Terrestrial LiDAR Scanner (TLS).
- Benchmarks are established for localisation, reconstruction, and novel-view synthesis tasks, designed to evaluate a variety of methods.
- The TLS 3D models serve as ground truth for evaluating reconstruction quality.
- State-of-the-art radiance field methods show a key limitation: they overfit to the training data, generalise poorly to out-of-sequence poses, and underperform MVS systems in reconstruction (see the PSNR sketch after this list).
- The dataset and software can be used to study and evaluate the integration of radiance field methods and SLAM systems.
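The overfitting finding suggests comparing average PSNR on in-sequence versus out-of-sequence test views; the sketch below computes that comparison, assuming images as float arrays in [0, 1] (the split itself comes from the benchmark).

```python
import numpy as np

def psnr(rendered, gt, max_val=1.0):
    """Peak signal-to-noise ratio between a rendered view and its
    held-out ground-truth image."""
    mse = np.mean((rendered - gt) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))

def mean_psnr(pairs):
    """Average PSNR over (rendered, ground-truth) image pairs; run it
    separately on in-sequence and out-of-sequence views so a large gap
    exposes overfitting to the training trajectory."""
    return float(np.mean([psnr(r, g) for r, g in pairs]))
```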






