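⚠️ All summaries below were generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use these summaries in serious academic settings. They are intended only for initial screening before reading the papers.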
Updated 2025-11-06
Object-Centric 3D Gaussian Splatting for Strawberry Plant Reconstruction and Phenotyping
Authors:Jiajia Li, Keyi Zhu, Qianwen Zhang, Dong Chen, Qi Sun, Zhaojian Li
Strawberries are among the most economically significant fruits in the United States, generating over $2 billion in annual farm-gate sales and accounting for approximately 13% of the total fruit production value. Plant phenotyping plays a vital role in selecting superior cultivars by characterizing plant traits such as morphology, canopy structure, and growth dynamics. However, traditional plant phenotyping methods are time-consuming, labor-intensive, and often destructive. Recently, neural rendering techniques, notably Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS), have emerged as powerful frameworks for high-fidelity 3D reconstruction. By capturing a sequence of multi-view images or videos around a target plant, these methods enable non-destructive reconstruction of complex plant architectures. Despite their promise, most current applications of 3DGS in agricultural domains reconstruct the entire scene, including background elements, which introduces noise, increases computational costs, and complicates downstream trait analysis. To address this limitation, we propose a novel object-centric 3D reconstruction framework incorporating a preprocessing pipeline that leverages the Segment Anything Model v2 (SAM-2) and alpha channel background masking to achieve clean strawberry plant reconstructions. This approach produces more accurate geometric representations while substantially reducing computational time. With a background-free reconstruction, our algorithm can automatically estimate important plant traits, such as plant height and canopy width, using DBSCAN clustering and Principal Component Analysis (PCA). Experimental results show that our method outperforms conventional pipelines in both accuracy and efficiency, offering a scalable and non-destructive solution for strawberry plant phenotyping.
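The trait-estimation step named in the abstract (DBSCAN clustering followed by PCA) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `dbscan` and `estimate_traits`, the `eps` and `min_samples` defaults, the assumption that the z axis points up, and the choice to read canopy width as the extent along the first principal axis of the horizontal footprint are all made up here for illustration. The small DBSCAN is a plain O(n²) NumPy version standing in for whichever library the paper actually uses.

```python
import numpy as np


def dbscan(points, eps, min_samples):
    """Minimal O(n^2) DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    # Pairwise Euclidean distances; acceptable for a small illustrative cloud.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    # A core point has at least min_samples points (itself included) within eps.
    is_core = np.array([len(nb) >= min_samples for nb in neighbors])
    labels = np.full(n, -1)
    cluster = 0
    for seed in range(n):
        if labels[seed] != -1 or not is_core[seed]:
            continue
        labels[seed] = cluster
        stack = [seed]
        while stack:
            i = stack.pop()
            if not is_core[i]:
                continue  # border points join a cluster but do not expand it
            for j in neighbors[i]:
                if labels[j] == -1:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1
    return labels


def estimate_traits(points, eps=0.05, min_samples=5):
    """Estimate (height, canopy_width) from an (N, 3) background-free point cloud."""
    labels = dbscan(points, eps, min_samples)
    kept = labels[labels >= 0]
    if kept.size == 0:
        raise ValueError("no cluster found; try a larger eps")
    # Assumption: the largest cluster is the plant, the rest is residual noise.
    plant = points[labels == np.bincount(kept).argmax()]
    # Assumption: z is the vertical axis, so height is the extent along z.
    height = plant[:, 2].max() - plant[:, 2].min()
    # PCA on the horizontal (x, y) footprint; eigh returns ascending eigenvalues.
    xy = plant[:, :2] - plant[:, :2].mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(xy, rowvar=False))
    projection = xy @ eigvecs[:, -1]
    width = projection.max() - projection.min()
    return height, width
```

Calling `estimate_traits(cloud)` on an `(N, 3)` array returns the two measurements in the units of the point cloud, so converting them to physical units would still require a scale reference that this sketch does not model.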
Paper and project links
PDF 11 pages, 4 figures, 3 tables
Summary
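This paper notes the economic importance of strawberries in the United States and the key role of plant phenotyping in selecting superior cultivars. Traditional phenotyping methods are time-consuming and often destructive. Neural rendering techniques such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have recently emerged as powerful frameworks for high-fidelity 3D reconstruction. Most existing agricultural applications of 3DGS reconstruct the entire scene, background included, which introduces noise and raises computational cost. To address this, the authors propose an object-centric 3D reconstruction framework whose preprocessing pipeline uses the Segment Anything Model v2 (SAM-2) and alpha channel background masking to obtain clean strawberry plant reconstructions. The method yields more accurate geometry while substantially reducing computation time. With a background-free reconstruction, the algorithm can automatically estimate important plant traits such as plant height and canopy width.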
Key Takeaways
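- Strawberries are an economically important fruit in the United States, with over $2 billion in annual farm-gate sales and approximately 13% of total fruit production value.
- Plant phenotyping is essential for selecting superior cultivars, but traditional methods are time-consuming, labor-intensive, and often destructive.
- Neural rendering techniques such as NeRF and 3DGS enable high-fidelity 3D reconstruction.
- Current agricultural applications of 3DGS usually reconstruct the entire scene, including background elements, which adds noise, increases computational cost, and complicates downstream trait analysis.
- To address this, the authors propose an object-centric 3D reconstruction framework that uses SAM-2 and alpha channel background masking to produce clean strawberry plant reconstructions.
- The method yields more accurate geometric representations while reducing computation time.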
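The alpha channel background masking mentioned above can be illustrated with a short sketch. It assumes a per-image boolean foreground mask is already available (in the paper this would come from SAM-2, which is not called here) and simply writes that mask into the alpha channel. The function name `apply_alpha_mask` and the 0/255 alpha convention are assumptions for illustration, and how the paper feeds the masked images into 3DGS training is not specified in the abstract.

```python
import numpy as np


def apply_alpha_mask(rgb, mask):
    """Return an RGBA image whose alpha is 255 on the plant and 0 elsewhere.

    rgb:  (H, W, 3) uint8 image.
    mask: (H, W) boolean array, True where the plant is (hypothetical input,
          e.g. produced by a segmentation model).
    """
    if rgb.shape[:2] != mask.shape:
        raise ValueError("mask must match the image height and width")
    alpha = np.where(mask, 255, 0).astype(np.uint8)
    # dstack promotes the (H, W) alpha to (H, W, 1) and appends it as channel 4.
    return np.dstack([rgb.astype(np.uint8), alpha])
```

The RGB values are left untouched, so background pixels are hidden only through their zero alpha. A consumer that ignores alpha would still see the background.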
Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey
Authors:Jiahui Zhang, Yuelei Li, Anpei Chen, Muyu Xu, Kunhao Liu, Jianyuan Wang, Xiao-Xiao Long, Hanxue Liang, Zexiang Xu, Hao Su, Christian Theobalt, Christian Rupprecht, Andrea Vedaldi, Kaichen Zhou, Paul Pu Liang, Shijian Lu, Fangneng Zhan
3D reconstruction and view synthesis are foundational problems in computer vision, graphics, and immersive technologies such as augmented reality (AR), virtual reality (VR), and digital twins. Traditional methods rely on computationally intensive iterative optimization in a complex chain, limiting their applicability in real-world scenarios. Recent advances in feed-forward approaches, driven by deep learning, have revolutionized this field by enabling fast and generalizable 3D reconstruction and view synthesis. This survey offers a comprehensive review of feed-forward techniques for 3D reconstruction and view synthesis, with a taxonomy according to the underlying representation architectures including point cloud, 3D Gaussian Splatting (3DGS), Neural Radiance Fields (NeRF), etc. We examine key tasks such as pose-free reconstruction, dynamic 3D reconstruction, and 3D-aware image and video synthesis, highlighting their applications in digital humans, SLAM, robotics, and beyond. In addition, we review commonly used datasets with detailed statistics, along with evaluation protocols for various downstream tasks. We conclude by discussing open research challenges and promising directions for future work, emphasizing the potential of feed-forward approaches to advance the state of the art in 3D vision.
Paper and project links
PDF A project page associated with this survey is available at https://fnzhan.com/projects/Feed-Forward-3D
Summary
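Feed-forward approaches driven by deep learning have transformed 3D reconstruction and view synthesis by making them fast and generalizable. This survey reviews feed-forward techniques for both problems, organized by underlying representation: point clouds, 3D Gaussian Splatting (3DGS), Neural Radiance Fields (NeRF), and others. It also covers key tasks such as pose-free reconstruction, dynamic 3D reconstruction, and 3D-aware image and video synthesis, along with their applications in digital humans, SLAM, and robotics. Commonly used datasets and evaluation protocols are reviewed, and open research challenges and future directions are discussed.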
Key Takeaways
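Key insights from the survey:
- Feed-forward approaches, driven by deep learning, offer significant advantages for 3D reconstruction and view synthesis.
- Traditional methods are limited by computationally intensive iterative optimization, whereas feed-forward methods enable fast and generalizable 3D reconstruction and view synthesis.
- The survey gives a comprehensive review of feed-forward techniques, with a taxonomy by underlying representation that includes point clouds, 3D Gaussian Splatting, and Neural Radiance Fields.
- Key tasks covered include pose-free reconstruction, dynamic 3D reconstruction, and 3D-aware image and video synthesis.
- These techniques have applications in digital humans, SLAM, robotics, and beyond.
- The survey reviews commonly used datasets with detailed statistics and discusses evaluation protocols for various downstream tasks.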
MediQ-GAN: Quantum-Inspired GAN for High Resolution Medical Image Generation
Authors:Qingyue Jiao, Yongcan Tang, Jun Zhuang, Jason Cong, Yiyu Shi
Machine learning-assisted diagnosis shows promise, yet medical imaging datasets are often scarce, imbalanced, and constrained by privacy, making data augmentation essential. Classical generative models typically demand extensive computational and sample resources. Quantum computing offers a promising alternative, but existing quantum-based image generation methods remain limited in scale and often face barren plateaus. We present MediQ-GAN, a quantum-inspired GAN with prototype-guided skip connections and a dual-stream generator that fuses classical and quantum-inspired branches. Its variational quantum circuits inherently preserve full-rank mappings, avoid rank collapse, and are theory-guided to balance expressivity with trainability. Beyond generation quality, we provide the first latent-geometry and rank-based analysis of quantum-inspired GANs, offering theoretical insight into their performance. Across three medical imaging datasets, MediQ-GAN outperforms state-of-the-art GANs and diffusion models. While validated on IBM hardware for robustness, our contribution is hardware-agnostic, offering a scalable and data-efficient framework for medical image generation and augmentation.
Paper and project links
Summary
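This paper presents MediQ-GAN, a quantum-inspired generative adversarial network (GAN) for medical image generation and augmentation. It uses prototype-guided skip connections and a dual-stream generator that fuses classical and quantum-inspired branches, offering a scalable and data-efficient framework. The authors also provide the first latent-geometry and rank-based analysis of quantum-inspired GANs, giving theoretical insight into their performance. Across three medical imaging datasets, MediQ-GAN outperforms state-of-the-art GANs and diffusion models.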
Key Takeaways
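- Medical imaging datasets are often scarce, imbalanced, and constrained by privacy, which makes data augmentation essential.
- Classical generative models typically demand extensive computational and sample resources. Quantum computing offers a promising alternative, though existing quantum-based image generation methods remain limited in scale and often face barren plateaus.
- MediQ-GAN is a quantum-inspired GAN that combines prototype-guided skip connections with a dual-stream generator fusing classical and quantum-inspired branches.
- The paper provides the first latent-geometry and rank-based analysis of quantum-inspired GANs, offering theoretical insight into their performance.
- Across three medical imaging datasets, MediQ-GAN outperforms state-of-the-art GANs and diffusion models.
- The method was validated on IBM hardware for robustness, but the contribution is hardware-agnostic.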