
3DGS


⚠️ All of the summaries below are generated by a large language model and may contain errors; they are for reference only, so use them with caution.
🔴 Note: do not use these summaries in serious academic settings; they are only suitable for an initial screening before reading the papers!
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace

Updated 2025-11-20

SparseSurf: Sparse-View 3D Gaussian Splatting for Surface Reconstruction

Authors:Meiying Gu, Jiawei Zhang, Jiahe Li, Xiaohan Yu, Haonan Luo, Jin Zheng, Xiao Bai

Recent advances in optimizing Gaussian Splatting for scene geometry have enabled efficient reconstruction of detailed surfaces from images. However, when input views are sparse, such optimization is prone to overfitting, leading to suboptimal reconstruction quality. Existing approaches address this challenge by employing flattened Gaussian primitives to better fit surface geometry, combined with depth regularization to alleviate geometric ambiguities under limited viewpoints. Nevertheless, the increased anisotropy inherent in flattened Gaussians exacerbates overfitting in sparse-view scenarios, hindering accurate surface fitting and degrading novel view synthesis performance. In this paper, we propose \net{}, a method that reconstructs more accurate and detailed surfaces while preserving high-quality novel view rendering. Our key insight is to introduce Stereo Geometry-Texture Alignment, which bridges rendering quality and geometry estimation, thereby jointly enhancing both surface reconstruction and view synthesis. In addition, we present a Pseudo-Feature Enhanced Geometry Consistency that enforces multi-view geometric consistency by incorporating both training and unseen views, effectively mitigating overfitting caused by sparse supervision. Extensive experiments on the DTU, BlendedMVS, and Mip-NeRF360 datasets demonstrate that our method achieves the state-of-the-art performance.


Paper & Project Links

PDF Accepted at AAAI 2026. Project page: https://miya-oi.github.io/SparseSurf-project

Summary

This paper proposes \net{}, a method for optimizing scene geometry reconstruction under sparse views that improves the accuracy and detail of surface reconstruction while preserving high-quality novel view rendering. The method introduces Stereo Geometry-Texture Alignment to bridge rendering quality and geometry estimation, jointly enhancing surface reconstruction and view synthesis. It further proposes a Pseudo-Feature Enhanced Geometry Consistency that combines training views and unseen views to effectively mitigate overfitting caused by sparse supervision.

Key Takeaways

  1. \net{} improves surface-reconstruction accuracy under sparse views.
  2. Stereo Geometry-Texture Alignment is introduced to jointly optimize rendering quality and geometry estimation.
  3. Pseudo-Feature Enhanced Geometry Consistency combines training and unseen views to enforce multi-view geometric consistency.
  4. The method effectively mitigates overfitting caused by sparse supervision.
  5. State-of-the-art performance is achieved on the DTU, BlendedMVS, and Mip-NeRF360 datasets.
  6. The geometry-optimized Gaussian Splatting reconstructs detailed surfaces from images.

Cool Papers


Interaction-Aware 4D Gaussian Splatting for Dynamic Hand-Object Interaction Reconstruction

Authors:Hao Tian, Chenyangguang Zhang, Rui Liu, Wen Shen, Xiaolin Qin

This paper focuses on a challenging setting of simultaneously modeling geometry and appearance of hand-object interaction scenes without any object priors. We follow the trend of dynamic 3D Gaussian Splatting based methods, and address several significant challenges. To model complex hand-object interaction with mutual occlusion and edge blur, we present interaction-aware hand-object Gaussians with newly introduced optimizable parameters aiming to adopt piecewise linear hypothesis for clearer structural representation. Moreover, considering the complementarity and tightness of hand shape and object shape during interaction dynamics, we incorporate hand information into object deformation field, constructing interaction-aware dynamic fields to model flexible motions. To further address difficulties in the optimization process, we propose a progressive strategy that handles dynamic regions and static background step by step. Correspondingly, explicit regularizations are designed to stabilize the hand-object representations for smooth motion transition, physical interaction reality, and coherent lighting. Experiments show that our approach surpasses existing dynamic 3D-GS-based methods and achieves state-of-the-art performance in reconstructing dynamic hand-object interaction.


Paper & Project Links

PDF 11 pages, 6 figures

Summary

This paper studies joint geometry and appearance modeling of hand-object interaction scenes without any object priors. Building on dynamic 3D Gaussian Splatting, it tackles challenges such as mutual occlusion and edge blur in complex hand-object interaction. Interaction-aware hand-object Gaussians with newly introduced optimizable parameters adopt a piecewise linear hypothesis for a clearer structural representation. Hand information is incorporated into the object deformation field, building interaction-aware dynamic fields that model flexible motion. To ease optimization, a progressive strategy handles dynamic regions and the static background step by step, while explicit regularizations stabilize the hand-object representation for smooth motion transitions, physically plausible interaction, and coherent lighting. Experiments show the approach surpasses existing dynamic 3D-GS-based methods and reaches state-of-the-art performance in reconstructing dynamic hand-object interaction.

Key Takeaways

  1. Background: the paper addresses the challenging problem of modeling the geometry and appearance of hand-object interaction scenes without object priors.
  2. Foundation: the method builds on dynamic 3D Gaussian Splatting.
  3. Complex interaction: interaction-aware hand-object Gaussians with newly introduced optimizable parameters handle mutual occlusion and edge blur.
  4. Piecewise linear hypothesis: adopted for a clearer structural representation.
  5. Hand information fusion: hand information is incorporated into the object deformation field, yielding interaction-aware dynamic fields that better model flexible motion during interaction.
  6. Optimization strategy: dynamic regions and the static background are handled progressively, with explicit regularizations stabilizing the hand-object representation.

Cool Papers


2D Gaussians Spatial Transport for Point-supervised Density Regression

Authors:Miao Shang, Xiaopeng Hong

This paper introduces Gaussian Spatial Transport (GST), a novel framework that leverages Gaussian splatting to facilitate transport from the probability measure in the image coordinate space to the annotation map. We propose a Gaussian splatting-based method to estimate pixel-annotation correspondence, which is then used to compute a transport plan derived from Bayesian probability. To integrate the resulting transport plan into standard network optimization in typical computer vision tasks, we derive a loss function that measures discrepancy after transport. Extensive experiments on representative computer vision tasks, including crowd counting and landmark detection, validate the effectiveness of our approach. Compared to conventional optimal transport schemes, GST eliminates iterative transport plan computation during training, significantly improving efficiency. Code is available at https://github.com/infinite0522/GST.


Paper & Project Links

PDF 9 pages, 5 figures, accepted by AAAI, 2026

Summary
This paper proposes Gaussian Spatial Transport (GST), a framework that uses Gaussian splatting to transport the probability measure in image coordinate space to the annotation map. A Gaussian splatting-based method estimates pixel-annotation correspondence, from which a transport plan is derived via Bayesian probability. To integrate the transport plan into standard network optimization for typical computer vision tasks, the authors derive a loss function that measures the discrepancy after transport. Extensive experiments on representative tasks, including crowd counting and landmark detection, validate the method's effectiveness. Compared with conventional optimal transport schemes, GST eliminates iterative transport-plan computation during training, significantly improving efficiency.

Key Takeaways

  1. Gaussian Spatial Transport (GST) uses Gaussian splatting to transport probability from image coordinate space to the annotation map.
  2. A Gaussian splatting-based method estimates the pixel-annotation correspondence.
  3. The transport plan is derived from Bayesian probability.
  4. A loss function measuring the post-transport discrepancy integrates the plan into standard network optimization for computer vision tasks.
  5. Experiments cover representative computer vision tasks, including crowd counting and landmark detection.
  6. Unlike conventional optimal transport schemes, GST eliminates iterative transport-plan computation during training, improving efficiency.
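The Bayesian correspondence and post-transport loss described above can be caricatured as follows. This is a minimal numpy sketch under our own assumptions (isotropic splats, a fixed precomputed plan, squared-distance cost); all names are ours, not the paper's:

```python
import numpy as np

def gaussian_responsibilities(pixels, anns, sigma=4.0):
    # Splat each annotation point as an isotropic 2D Gaussian, then apply
    # Bayes' rule to get the pixel-to-annotation correspondence p(j | pixel).
    d2 = ((pixels[:, None, :] - anns[None, :, :]) ** 2).sum(-1)  # (P, J)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return w / w.sum(axis=1, keepdims=True)

def transport_loss(pred_density, pixels, anns, sigma=4.0):
    # Fixed (non-iterative) transport plan: each pixel's predicted mass is
    # routed to annotations by the precomputed responsibilities, and the
    # loss penalizes the total squared transport cost.
    resp = gaussian_responsibilities(pixels, anns, sigma)
    plan = pred_density[:, None] * resp                          # (P, J)
    cost = ((pixels[:, None, :] - anns[None, :, :]) ** 2).sum(-1)
    return float((plan * cost).sum())
```

Because the responsibilities are computed once from the splats rather than re-solved each step, no iterative optimal-transport solve is needed inside the training loop, which is the efficiency point the abstract makes.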

Cool Papers


IBGS: Image-Based Gaussian Splatting

Authors:Hoang Chuong Nguyen, Wei Mao, Jose M. Alvarez, Miaomiao Liu

3D Gaussian Splatting (3DGS) has recently emerged as a fast, high-quality method for novel view synthesis (NVS). However, its use of low-degree spherical harmonics limits its ability to capture spatially varying color and view-dependent effects such as specular highlights. Existing works augment Gaussians with either a global texture map, which struggles with complex scenes, or per-Gaussian texture maps, which introduces high storage overhead. We propose Image-Based Gaussian Splatting, an efficient alternative that leverages high-resolution source images for fine details and view-specific color modeling. Specifically, we model each pixel color as a combination of a base color from standard 3DGS rendering and a learned residual inferred from neighboring training images. This promotes accurate surface alignment and enables rendering images of high-frequency details and accurate view-dependent effects. Experiments on standard NVS benchmarks show that our method significantly outperforms prior Gaussian Splatting approaches in rendering quality, without increasing the storage footprint.


Paper & Project Links

PDF Accepted to NeurIPS 2025

Summary

3D Gaussian Splatting (3DGS) is a fast, high-quality method for novel view synthesis (NVS), but its use of low-degree spherical harmonics limits its ability to capture spatially varying color and view-dependent effects such as specular highlights. Existing works augment Gaussians with either a global texture map, which struggles in complex scenes, or per-Gaussian texture maps, which incur high storage overhead. This paper proposes Image-Based Gaussian Splatting, which leverages high-resolution source images for fine details and view-specific color modeling. On standard NVS benchmarks, the method significantly outperforms prior Gaussian Splatting approaches in rendering quality without increasing the storage footprint.

Key Takeaways

  1. 3D Gaussian Splatting (3DGS) is a fast, high-quality novel view synthesis method.
  2. Existing augmentations are limited: a global texture map struggles in complex scenes, while per-Gaussian texture maps incur high storage overhead.
  3. Image-Based Gaussian Splatting leverages high-resolution source images for fine details and view-specific color modeling.
  4. Each pixel color is modeled as a base color from standard 3DGS rendering plus a learned residual inferred from neighboring training images.
  5. This promotes accurate surface alignment and enables rendering of high-frequency details and accurate view-dependent effects.
  6. On standard NVS benchmarks, the method significantly outperforms prior Gaussian Splatting approaches in rendering quality.
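The per-pixel decomposition in takeaway 4 can be illustrated with a toy numpy sketch. In the paper the residual is inferred by a learned model; here the blend weights are plain inputs, and the function name is ours:

```python
import numpy as np

def ibgs_pixel_color(base_color, neighbor_colors, weights):
    # Final color = 3DGS base color + a residual aggregated from colors
    # sampled at corresponding pixels of neighboring training images.
    # `weights` stands in for the learned per-view blending weights.
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    residual = (weights[:, None] * (neighbor_colors - base_color)).sum(axis=0)
    return np.clip(base_color + residual, 0.0, 1.0)
```

The point of the decomposition is that view-dependent, high-frequency detail lives in the residual drawn from real photographs, so the Gaussians themselves need only a coarse base color.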

Cool Papers


Dental3R: Geometry-Aware Pairing for Intraoral 3D Reconstruction from Sparse-View Photographs

Authors:Yiyi Miao, Taoyu Wu, Tong Chen, Ji Jiang, Zhe Tang, Zhengyong Jiang, Angelos Stefanidis, Limin Yu, Jionglong Su

Intraoral 3D reconstruction is fundamental to digital orthodontics, yet conventional methods like intraoral scanning are inaccessible for remote tele-orthodontics, which typically relies on sparse smartphone imagery. While 3D Gaussian Splatting (3DGS) shows promise for novel view synthesis, its application to the standard clinical triad of unposed anterior and bilateral buccal photographs is challenging. The large view baselines, inconsistent illumination, and specular surfaces common in intraoral settings can destabilize simultaneous pose and geometry estimation. Furthermore, sparse-view photometric supervision often induces a frequency bias, leading to over-smoothed reconstructions that lose critical diagnostic details. To address these limitations, we propose \textbf{Dental3R}, a pose-free, graph-guided pipeline for robust, high-fidelity reconstruction from sparse intraoral photographs. Our method first constructs a Geometry-Aware Pairing Strategy (GAPS) to intelligently select a compact subgraph of high-value image pairs. The GAPS focuses on correspondence matching, thereby improving the stability of the geometry initialization and reducing memory usage. Building on the recovered poses and point cloud, we train the 3DGS model with a wavelet-regularized objective. By enforcing band-limited fidelity using a discrete wavelet transform, our approach preserves fine enamel boundaries and interproximal edges while suppressing high-frequency artifacts. We validate our approach on a large-scale dataset of 950 clinical cases and an additional video-based test set of 195 cases. Experimental results demonstrate that Dental3R effectively handles sparse, unposed inputs and achieves superior novel view synthesis quality for dental occlusion visualization, outperforming state-of-the-art methods.


Paper & Project Links

PDF

Summary

For tele-orthodontics that must rely on sparse smartphone photos, this paper proposes Dental3R, a pose-free, graph-guided pipeline for robust, high-fidelity intraoral 3D reconstruction. It builds a Geometry-Aware Pairing Strategy (GAPS) to select a compact subgraph of high-value image pairs, then trains a 3D Gaussian Splatting (3DGS) model with a wavelet-regularized objective. Validated on a large-scale dataset of 950 clinical cases plus a video-based test set of 195 cases, the method handles sparse, unposed inputs and achieves superior novel view synthesis quality for dental occlusion visualization, outperforming prior methods.

Key Takeaways

  • Conventional intraoral 3D reconstruction methods such as intraoral scanning are impractical for remote tele-orthodontics, motivating a new solution.
  • Dental3R reconstructs robust, high-fidelity models from sparse intraoral photographs. Its Geometry-Aware Pairing Strategy (GAPS) improves the stability of geometry initialization and reduces memory usage.
  • The 3D Gaussian Splatting (3DGS) model is trained with a wavelet-regularized objective, preserving fine enamel boundaries and interproximal edges while suppressing high-frequency artifacts.
  • On large-scale validation, Dental3R outperforms state-of-the-art methods in novel view synthesis quality for dental occlusion visualization.
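The band-limited, wavelet-regularized objective can be illustrated with a single-level Haar DWT. This is a minimal numpy sketch under our own assumptions; the band weights are invented for illustration and are not the paper's:

```python
import numpy as np

def haar_dwt2(img):
    # One level of a 2D Haar DWT: returns (LL, LH, HL, HH) subbands.
    # Rows first, then columns; img must have even height and width.
    a = (img[0::2, :] + img[1::2, :]) / 2.0
    d = (img[0::2, :] - img[1::2, :]) / 2.0
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def wavelet_loss(render, target, band_weights=(1.0, 0.5, 0.5, 0.25)):
    # Enforce fidelity band by band: low frequencies weighted most, the
    # highest band damped, mimicking a band-limited objective.
    bands_r, bands_t = haar_dwt2(render), haar_dwt2(target)
    return sum(w * np.abs(br - bt).mean()
               for w, br, bt in zip(band_weights, bands_r, bands_t))
```

Down-weighting (rather than ignoring) the HH band is one plausible way to keep sharp enamel edges while suppressing high-frequency noise, which matches the behavior the abstract describes.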

Cool Papers


Gaussian Splatting-based Low-Rank Tensor Representation for Multi-Dimensional Image Recovery

Authors:Yiming Zeng, Xi-Le Zhao, Wei-Hao Wu, Teng-Yu Ji, Chao Wang

Tensor singular value decomposition (t-SVD) is a promising tool for multi-dimensional image representation, which decomposes a multi-dimensional image into a latent tensor and an accompanying transform matrix. However, two critical limitations of t-SVD methods persist: (1) the approximation of the latent tensor (e.g., tensor factorizations) is coarse and fails to accurately capture spatial local high-frequency information; (2) The transform matrix is composed of fixed basis atoms (e.g., complex exponential atoms in DFT and cosine atoms in DCT) and cannot precisely capture local high-frequency information along the mode-3 fibers. To address these two limitations, we propose a Gaussian Splatting-based Low-rank tensor Representation (GSLR) framework, which compactly and continuously represents multi-dimensional images. Specifically, we leverage tailored 2D Gaussian splatting and 1D Gaussian splatting to generate the latent tensor and transform matrix, respectively. The 2D and 1D Gaussian splatting are indispensable and complementary under this representation framework, which enjoys a powerful representation capability, especially for local high-frequency information. To evaluate the representation ability of the proposed GSLR, we develop an unsupervised GSLR-based multi-dimensional image recovery model. Extensive experiments on multi-dimensional image recovery demonstrate that GSLR consistently outperforms state-of-the-art methods, particularly in capturing local high-frequency information.


Paper & Project Links

PDF

Summary

Tensor singular value decomposition (t-SVD) decomposes a multi-dimensional image into a latent tensor and a transform matrix, but suffers from two key limitations: the latent-tensor approximation is coarse, and the fixed transform-basis atoms cannot precisely capture local high-frequency information. To address both, the paper proposes a Gaussian Splatting-based Low-rank tensor Representation (GSLR) that uses tailored 2D and 1D Gaussian splatting to generate the latent tensor and the transform matrix, respectively, yielding a compact, continuous representation that is especially strong on local high-frequency information. Experiments with an unsupervised GSLR-based multi-dimensional image recovery model show that GSLR consistently outperforms state-of-the-art methods.

Key Takeaways

  1. t-SVD is a tool for multi-dimensional image representation, but has two key limitations: coarse approximation of the latent tensor, and a transform matrix that cannot precisely capture local high-frequency information.
  2. The GSLR framework addresses both with tailored 2D and 1D Gaussian splatting, giving a more compact and continuous representation of multi-dimensional images.
  3. The 2D and 1D Gaussian splatting components are indispensable and complementary, and particularly good at representing local high-frequency information.
  4. The framework has strong overall representation capability.
  5. Experiments with an unsupervised GSLR-based multi-dimensional image recovery model show it outperforms existing methods.
  6. GSLR excels at capturing local high-frequency information.
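Generating the transform matrix from 1D Gaussians and a latent-tensor slice from 2D Gaussians can be sketched as below. This is a hypothetical numpy illustration; the function names and parameterization are ours, not the paper's:

```python
import numpy as np

def splat_1d(length, centers, scales, amps):
    # A transform-matrix column as a mixture of 1D Gaussians, so sharp
    # mode-3 variation can be expressed with a few localized atoms
    # instead of fixed DFT/DCT basis atoms.
    t = np.arange(length, dtype=float)[:, None]
    return (amps * np.exp(-0.5 * ((t - centers) / scales) ** 2)).sum(axis=1)

def splat_2d(h, w, mus, precisions, amps):
    # A latent-tensor slice as a mixture of anisotropic 2D Gaussians;
    # each `precisions[i]` is the 2x2 inverse covariance of one splat.
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([ys, xs], axis=-1).reshape(-1, 2).astype(float)
    out = np.zeros(h * w)
    for mu, P, a in zip(mus, precisions, amps):
        d = grid - mu
        out += a * np.exp(-0.5 * np.einsum('ni,ij,nj->n', d, P, d))
    return out.reshape(h, w)
```

Because every atom is a continuous, locally supported bump, both factors can place capacity exactly where local high-frequency content lives, which is the property the takeaways emphasize.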

Cool Papers


iGaussian: Real-Time Camera Pose Estimation via Feed-Forward 3D Gaussian Splatting Inversion

Authors:Hao Wang, Linqing Zhao, Xiuwei Xu, Jiwen Lu, Haibin Yan

Recent trends in SLAM and visual navigation have embraced 3D Gaussians as the preferred scene representation, highlighting the importance of estimating camera poses from a single image using a pre-built Gaussian model. However, existing approaches typically rely on an iterative \textit{render-compare-refine} loop, where candidate views are first rendered using NeRF or Gaussian Splatting, then compared against the target image, and finally, discrepancies are used to update the pose. This multi-round process incurs significant computational overhead, hindering real-time performance in robotics. In this paper, we propose iGaussian, a two-stage feed-forward framework that achieves real-time camera pose estimation through direct 3D Gaussian inversion. Our method first regresses a coarse 6DoF pose using a Gaussian Scene Prior-based Pose Regression Network with spatial uniform sampling and guided attention mechanisms, then refines it through feature matching and multi-model fusion. The key contribution lies in our cross-correlation module that aligns image embeddings with 3D Gaussian attributes without differentiable rendering, coupled with a Weighted Multiview Predictor that fuses features from Multiple strategically sampled viewpoints. Experimental results on the NeRF Synthetic, Mip-NeRF 360, and T&T+DB datasets demonstrate a significant performance improvement over previous methods, reducing median rotation errors to 0.2° while achieving 2.87 FPS tracking on mobile robots, which is an impressive 10 times speedup compared to optimization-based approaches. Code: https://github.com/pythongod-exe/iGaussian


Paper & Project Links

PDF IROS 2025

Summary
Building on 3D Gaussian scene representations, this paper proposes iGaussian, which achieves real-time camera pose estimation via direct feed-forward 3D Gaussian inversion. A Gaussian Scene Prior-based pose regression network produces a coarse 6DoF pose, which is then refined through feature matching and multi-model fusion. By avoiding the costly iterative render-compare-refine loop, the method significantly improves computational efficiency and real-time performance.

Key Takeaways

  1. Recent SLAM and visual navigation work has adopted 3D Gaussians as the preferred scene representation.
  2. Existing methods rely on an iterative render-compare-refine loop, which hurts real-time performance.
  3. iGaussian achieves real-time camera pose estimation through direct 3D Gaussian inversion.
  4. iGaussian is a two-stage feed-forward framework: a Gaussian Scene Prior-based pose regression network, followed by feature matching and multi-model fusion.
  5. A key contribution is the cross-correlation module, which aligns image embeddings with 3D Gaussian attributes without differentiable rendering.
  6. A Weighted Multiview Predictor fuses features from multiple strategically sampled viewpoints.
  7. Experiments show significant improvements over prior methods: median rotation error drops to 0.2° at 2.87 FPS on mobile robots, about a 10x speedup over optimization-based approaches.

Cool Papers


Splat Regression Models

Authors:Mara Daniels, Philippe Rigollet

We introduce a highly expressive class of function approximators called Splat Regression Models. Model outputs are mixtures of heterogeneous and anisotropic bump functions, termed splats, each weighted by an output vector. The power of splat modeling lies in its ability to locally adjust the scale and direction of each splat, achieving both high interpretability and accuracy. Fitting splat models reduces to optimization over the space of mixing measures, which can be implemented using Wasserstein-Fisher-Rao gradient flows. As a byproduct, we recover the popular Gaussian Splatting methodology as a special case, providing a unified theoretical framework for this state-of-the-art technique that clearly disambiguates the inverse problem, the model, and the optimization algorithm. Through numerical experiments, we demonstrate that the resulting models and algorithms constitute a flexible and promising approach for solving diverse approximation, estimation, and inverse problems involving low-dimensional data.


Paper & Project Links

PDF

Summary

This paper introduces Splat Regression Models, a highly expressive class of function approximators. Model outputs are mixtures of heterogeneous, anisotropic bump functions called splats, each weighted by an output vector. The power of splat modeling lies in locally adjusting each splat's scale and direction, achieving both high interpretability and accuracy. Fitting reduces to optimization over the space of mixing measures, implementable with Wasserstein-Fisher-Rao gradient flows. As a byproduct, the popular Gaussian Splatting methodology is recovered as a special case, giving a unified theoretical framework that clearly disambiguates the inverse problem, the model, and the optimization algorithm. Numerical experiments show the resulting models and algorithms form a flexible, promising approach to diverse approximation, estimation, and inverse problems involving low-dimensional data.

Key Takeaways

  1. Splat Regression Models are introduced as a highly expressive class of function approximators.
  2. Model outputs are mixtures of heterogeneous, anisotropic bump functions (splats), each weighted by an output vector.
  3. Splat modeling locally adjusts each splat's scale and direction, achieving high interpretability and accuracy.
  4. Fitting a splat model reduces to optimization over mixing measures, implementable with Wasserstein-Fisher-Rao gradient flows.
  5. The popular Gaussian Splatting methodology is recovered as a special case, within a unified theoretical framework.
  6. The framework clearly disambiguates the inverse problem, the model, and the optimization algorithm.
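With Gaussian bumps as the splats, a model output f(x) = Σ_i φ_i(x) w_i with φ_i(x) = exp(-½ (x−μ_i)ᵀ P_i (x−μ_i)) can be evaluated directly. A minimal numpy sketch (our own naming; fitting via Wasserstein-Fisher-Rao gradient flow is not shown):

```python
import numpy as np

def splat_regression(x, mus, precisions, weights):
    # Evaluate a splat mixture at query points x (N, d):
    # anisotropic Gaussian bumps, each carrying an output vector w_i,
    # so the model output dimension equals weights.shape[1].
    out = np.zeros((x.shape[0], weights.shape[1]))
    for mu, P, w in zip(mus, precisions, weights):
        d = x - mu
        phi = np.exp(-0.5 * np.einsum('ni,ij,nj->n', d, P, d))
        out += phi[:, None] * w
    return out
```

Each precision matrix P_i independently controls that splat's scale and direction, which is exactly the local adjustability the takeaways highlight; with pixel-space queries and color-valued weights this reduces to a Gaussian Splatting-style renderer.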

Cool Papers


Rapid Design and Fabrication of Body Conformable Surfaces with Kirigami Cutting and Machine Learning

Authors:Jyotshna Bali, Jinyang Li, Jie Chen, Suyi Li

By integrating the principles of kirigami cutting and data-driven modeling, this study aims to develop a personalized, rapid, and low-cost design and fabrication pipeline for creating body-conformable surfaces around the knee joint. The process begins with 3D scanning of the anterior knee surface of human subjects, followed by extracting the corresponding skin deformation between two joint angles in terms of longitudinal strain and Poisson’s ratio. In parallel, a machine learning model is constructed using extensive simulation data from experimentally calibrated finite element analysis. This model employs Gaussian Process (GP) regression to relate kirigami cut lengths to the resulting longitudinal strain and Poisson’s ratio. With an R2 score of 0.996, GP regression outperforms other models in predicting kirigami’s large deformations. Finally, an inverse design approach based on the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is used to generate kirigami patch designs that replicate the in-plane skin deformation observed from the knee scans. This pipeline was applied to three human subjects, and the resulting kirigami knee patches were fabricated using rapid laser cutting, requiring only a business day from knee scanning to kirigami patch delivery. The low-cost, personalized kirigami patches successfully conformed to over 75 percent of the skin area across all subjects, establishing a foundation for a wide range of wearable devices. The study demonstrates this potential through an impact-resistant kirigami foam patch, which not only conforms to dynamic knee motion but also provides joint protection against impact. Finally, the proposed design and fabrication framework is generalizable and can be extended to other deforming body surfaces, enabling the creation of personalized wearables such as protective gear, breathable adhesives, and body-conformable electronics.


Paper & Project Links

PDF

Summary
Combining kirigami cutting principles with data-driven modeling, this study develops a personalized, rapid, low-cost pipeline for designing and fabricating body-conformable surfaces around the knee joint. The anterior knee surface is 3D-scanned and the skin deformation between two joint angles is extracted; a machine learning model based on Gaussian Process regression predicts kirigami's large deformations from cut lengths; and a CMA-ES-based inverse design generates kirigami patches that replicate the observed skin deformation. Applied to three subjects, the laser-cut patches were delivered within one business day of scanning and conformed to over 75 percent of the skin area, laying a foundation for wearable devices. The framework generalizes to other deforming body surfaces, enabling personalized protective gear, breathable adhesives, and body-conformable electronics.

Key Takeaways

  1. Combining kirigami principles with data-driven modeling provides a new route to personalized design and fabrication for the knee joint.
  2. 3D scanning extracts knee skin-deformation data as the basis for body-conformable products.
  3. A machine learning model predicts kirigami's large deformations; Gaussian Process regression performs best, with an R2 score of 0.996.
  4. A CMA-ES-based inverse design generates kirigami patch designs matching the knee's skin deformation.
  5. The pipeline goes from knee scan to finished patch within one business day, demonstrating its speed and low cost.
  6. The kirigami patches conform to over 75 percent of the skin area across subjects, showing their application potential.
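The GP surrogate in takeaway 3 maps kirigami cut lengths to longitudinal strain and Poisson's ratio. A bare-bones numpy sketch of standard GP regression with an RBF kernel (hyperparameters and names are ours, not the authors' calibrated model):

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    # Squared-exponential kernel between point sets a (n, d) and b (m, d).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(X, y, Xs, length_scale=1.0, noise=1e-6):
    # GP posterior mean and variance at test inputs Xs, given training
    # pairs (X, y), e.g. cut lengths -> measured longitudinal strain.
    K = rbf(X, X, length_scale) + noise * np.eye(len(X))
    Ks = rbf(Xs, X, length_scale)
    mean = Ks @ np.linalg.solve(K, y)
    var = (rbf(Xs, Xs, length_scale).diagonal()
           - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T)))
    return mean, var
```

The posterior variance is what makes a GP attractive here: an inverse-design loop such as CMA-ES can be steered away from cut-length regions where the surrogate is uncertain.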

Cool Papers


Towards Understanding 3D Vision: the Role of Gaussian Curvature

Authors:Sherlon Almeida da Silva, Davi Geiger, Luiz Velho, Moacir Antonelli Ponti

Recent advances in computer vision have predominantly relied on data-driven approaches that leverage deep learning and large-scale datasets. Deep neural networks have achieved remarkable success in tasks such as stereo matching and monocular depth reconstruction. However, these methods lack explicit models of 3D geometry that can be directly analyzed, transferred across modalities, or systematically modified for controlled experimentation. We investigate the role of Gaussian curvature in 3D surface modeling. Besides Gaussian curvature being an invariant quantity under change of observers or coordinate systems, we demonstrate using the Middlebury stereo dataset that it offers a sparse and compact description of 3D surfaces. Furthermore, we show a strong correlation between the performance rank of top state-of-the-art stereo and monocular methods and the low total absolute Gaussian curvature. We propose that this property can serve as a geometric prior to improve future 3D reconstruction algorithms.


Paper & Project Links

PDF

Summary
Recent computer vision advances rely mainly on data-driven deep learning with large-scale datasets. Despite strong results in stereo matching and monocular depth reconstruction, these methods lack explicit 3D geometric models that can be directly analyzed, transferred across modalities, or systematically modified for controlled experiments. This paper investigates the role of Gaussian curvature in 3D surface modeling. Besides being invariant under changes of observer or coordinate system, Gaussian curvature is shown on the Middlebury stereo dataset to offer a sparse, compact description of 3D surfaces. The authors also find a strong correlation between the performance rank of top stereo and monocular methods and low total absolute Gaussian curvature, and propose this property as a geometric prior for future 3D reconstruction algorithms.

Key Takeaways

  1. Recent computer vision advances rely mainly on data-driven methods and deep learning.
  2. Despite notable success, existing methods lack explicit 3D geometric models.
  3. Gaussian curvature plays an important role in 3D surface modeling.
  4. Gaussian curvature is invariant under changes of observer or coordinate system.
  5. Results on the Middlebury stereo dataset show that Gaussian curvature offers a sparse, compact description of 3D surfaces.
  6. The performance rank of top stereo and monocular methods correlates strongly with low total absolute Gaussian curvature.
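The total absolute Gaussian curvature used in takeaway 6 can be computed from a depth map via the Monge-patch formula K = (z_xx z_yy − z_xy²) / (1 + z_x² + z_y²)². A small numpy sketch with finite differences (our own implementation, not the authors' code):

```python
import numpy as np

def total_abs_gaussian_curvature(z):
    # z: depth map z(y, x). First and second partials via central
    # differences; np.gradient returns derivatives along axis 0 then 1.
    zy, zx = np.gradient(z)
    zyy, zyx = np.gradient(zy)
    zxy, zxx = np.gradient(zx)
    K = (zxx * zyy - zxy * zyx) / (1.0 + zx ** 2 + zy ** 2) ** 2
    return float(np.abs(K).sum())
```

A planar surface has K = 0 everywhere, so piecewise-planar scenes score near zero; the paper's observation is that the best-reconstructing methods tend to produce depth maps with low totals of |K|.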

Cool Papers



Author: Kedreamix
Copyright: Unless otherwise stated, all posts on this blog are licensed under CC BY 4.0. Please credit Kedreamix when reposting!