NeRF

Published: 2025-08-05

Updated: 2025-08-20

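⚠️ All summaries below are generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use them in serious academic settings. They are only suitable for initial screening before reading a paper!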

2025-08-05 Update

A Conditional GAN for Tabular Data Generation with Probabilistic Sampling of Latent Subspaces

Authors: Leonidas Akritidis, Panayiotis Bozanis

The tabular form constitutes the standard way of representing data in relational database systems and spreadsheets. But, similarly to other forms, tabular data suffers from class imbalance, a problem that causes serious performance degradation in a wide variety of machine learning tasks. One of the most effective solutions dictates the usage of Generative Adversarial Networks (GANs) in order to synthesize artificial data instances for the under-represented classes. Despite their good performance, none of the proposed GAN models takes into account the vector subspaces of the input samples in the real data space, leading to data generation in arbitrary locations. Moreover, the class labels are treated in the same manner as the other categorical variables during training, so conditional sampling by class is rendered less effective. To overcome these problems, this study presents ctdGAN, a conditional GAN for alleviating class imbalance in tabular datasets. Initially, ctdGAN executes a space partitioning step to assign cluster labels to the input samples. Subsequently, it utilizes these labels to synthesize samples via a novel probabilistic sampling strategy and a new loss function that penalizes both cluster and class mis-predictions. In this way, ctdGAN is trained to generate samples in subspaces that resemble those of the original data distribution. We also introduce several other improvements, including a simple, yet effective cluster-wise scaling technique that captures multiple feature modes without affecting data dimensionality. The exhaustive evaluation of ctdGAN with 14 imbalanced datasets demonstrated its superiority in generating high fidelity samples and improving classification accuracy.

Paper and project links

PDF

Summary

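This paper presents ctdGAN, a method for addressing class imbalance in tabular data. It runs a space partitioning step to assign cluster labels to the input samples, then uses these labels to synthesize samples through a new probabilistic sampling strategy and a new loss function. ctdGAN generates samples in subspaces that resemble those of the original data distribution, and it introduces several further improvements, including a cluster-wise scaling technique that captures multiple feature modes without affecting data dimensionality. An evaluation on 14 imbalanced datasets showed its advantage in generating high-fidelity samples and improving classification accuracy.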

Key Takeaways

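Tabular data also suffers from class imbalance, which degrades the performance of machine learning tasks.
Existing GAN models for tabular class imbalance have shortcomings: samples are generated in arbitrary locations, and class labels are treated like any other categorical variable.
ctdGAN is a conditional GAN designed to alleviate class imbalance in tabular datasets.
ctdGAN runs a space partitioning step to assign cluster labels to the input samples and uses these labels to synthesize samples.
ctdGAN uses a new probabilistic sampling strategy and a loss function that penalizes both cluster and class mis-predictions.
ctdGAN generates samples in subspaces that resemble those of the original data distribution.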
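
The cluster-wise scaling idea from the abstract can be illustrated with a minimal sketch. It assumes the cluster labels have already been produced by some partitioning step, and it uses per-cluster min-max scaling as a stand-in. The abstract does not specify the scaler ctdGAN actually uses, so the function names and the choice of min-max scaling are assumptions made purely for illustration.

```python
import numpy as np

def fit_cluster_scalers(X, cluster_ids):
    """Compute per-feature (min, range) separately for each cluster (hypothetical helper)."""
    scalers = {}
    for c in np.unique(cluster_ids):
        rows = X[cluster_ids == c]
        lo = rows.min(axis=0)
        span = rows.max(axis=0) - lo
        span[span == 0] = 1.0  # avoid division by zero for constant features
        scalers[int(c)] = (lo, span)
    return scalers

def transform(X, cluster_ids, scalers):
    """Scale each row to [0, 1] using the statistics of its own cluster."""
    out = np.empty_like(X, dtype=float)
    for c, (lo, span) in scalers.items():
        mask = cluster_ids == c
        out[mask] = (X[mask] - lo) / span
    return out

def inverse_transform(Z, cluster_ids, scalers):
    """Map scaled rows back to the original feature space."""
    out = np.empty_like(Z, dtype=float)
    for c, (lo, span) in scalers.items():
        mask = cluster_ids == c
        out[mask] = Z[mask] * span + lo
    return out

# Toy data with two well-separated modes; the labels are assumed to be given.
X = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0],
              [100.0, 0.5], [110.0, 0.7], [120.0, 0.9]])
cluster_ids = np.array([0, 0, 0, 1, 1, 1])

scalers = fit_cluster_scalers(X, cluster_ids)
Z = transform(X, cluster_ids, scalers)
X_back = inverse_transform(Z, cluster_ids, scalers)
```

Because each cluster is scaled with its own statistics, a feature with several modes is normalized per mode, while the scaled matrix keeps the same number of columns as the input.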

Authors: Yan Miao, Will Shen, Sayan Mitra

We present a novel framework demonstrating zero-shot sim-to-real transfer of visual control policies learned in a Neural Radiance Field (NeRF) environment for quadrotors to fly through racing gates. Robust transfer from simulation to real flight poses a major challenge, as standard simulators often lack sufficient visual fidelity. To address this, we construct a photorealistic simulation environment of quadrotor racing tracks, called FalconGym, which provides effectively unlimited synthetic images for training. Within FalconGym, we develop a pipelined approach for crossing gates that combines (i) a Neural Pose Estimator (NPE) coupled with a Kalman filter to reliably infer quadrotor poses from single-frame RGB images and IMU data, and (ii) a self-attention-based multi-modal controller that adaptively integrates visual features and pose estimation. This multi-modal design compensates for perception noise and intermittent gate visibility. We train this controller purely in FalconGym with imitation learning and deploy the resulting policy to real hardware with no additional fine-tuning. Simulation experiments on three distinct tracks (circle, U-turn and figure-8) demonstrate that our controller outperforms a vision-only state-of-the-art baseline in both success rate and gate-crossing accuracy. In 30 live hardware flights spanning three tracks and 120 gates, our controller achieves a 95.8% success rate and an average error of just 10 cm when flying through 38 cm-radius gates.
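
To make the idea of a self-attention-based multi-modal controller more concrete, here is a minimal sketch of scaled dot-product self-attention over two tokens, one standing for a visual feature vector and one for a pose estimate. The abstract does not describe the controller's architecture, so the token layout, the dimension `d`, the random projection matrices, and the name `self_attention_fuse` are all assumptions for illustration.

```python
import numpy as np

def softmax(a, axis=-1):
    """Numerically stable softmax."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fuse(tokens, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a (n_tokens, d) array."""
    Q = tokens @ Wq
    K = tokens @ Wk
    V = tokens @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
d = 8  # hypothetical embedding size
visual_token = rng.normal(size=d)  # stand-in for an image feature embedding
pose_token = rng.normal(size=d)    # stand-in for an embedded pose estimate
tokens = np.stack([visual_token, pose_token])  # shape (2, d)

# Random projections stand in for learned weights.
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
fused, weights = self_attention_fuse(tokens, Wq, Wk, Wv)
```

Each row of `weights` says how much one modality attends to itself and to the other. In a trained model these weights depend on the input, which is one way a controller could rely more on pose when the visual signal is unreliable.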

Paper and project links

PDF Accepted in IROS 2025

Summary

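This work presents a new framework demonstrating zero-shot sim-to-real transfer of visual control policies learned in a Neural Radiance Field (NeRF) environment, applied to quadrotors flying through racing gates. Robust transfer from simulation to real flight is a major challenge because standard simulators often lack sufficient visual fidelity. To address this, we build a photorealistic simulation environment called FalconGym, which provides effectively unlimited synthetic images for training. Within FalconGym, we develop a gate-crossing approach that combines a Neural Pose Estimator (NPE) and a Kalman filter, which reliably infer the quadrotor pose from single-frame RGB images and IMU data, with a self-attention-based multi-modal controller that adaptively integrates visual features and pose estimates. The multi-modal design compensates for perception noise and intermittent gate visibility. We train the controller purely in FalconGym with imitation learning and deploy the resulting policy to real hardware with no additional fine-tuning. Simulation experiments on three distinct tracks show that our controller outperforms a vision-only baseline in both success rate and gate-crossing accuracy. In real hardware flights covering three tracks and 120 gates, our controller achieves a 95.8% success rate with an average error of only 10 cm.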
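
The pairing of a pose estimator with a Kalman filter can be sketched in one dimension. The example below assumes a constant-velocity state `[position, velocity]`, treats an IMU acceleration reading as the control input in the predict step, and treats a noisy position from a pose estimator as the measurement in the update step. The state model, the noise values `q` and `r`, and the simulated missed frames are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def kalman_step(x, P, accel, z, dt, q=1e-3, r=0.05**2):
    """One predict/update cycle for the state x = [position, velocity]."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity transition
    B = np.array([0.5 * dt**2, dt])        # effect of the acceleration input
    H = np.array([[1.0, 0.0]])             # only position is measured
    Q = q * np.eye(2)                      # assumed process noise
    R = np.array([[r]])                    # assumed measurement noise

    # Predict using the acceleration reading as a control input.
    x = F @ x + B * accel
    P = F @ P @ F.T + Q

    # Update with the position measurement; skipped when none is available.
    if z is not None:
        y = z - H @ x                      # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
    return x, P

# Simulate motion at 1 m/s with noisy position measurements (std 5 cm).
rng = np.random.default_rng(0)
dt, true_pos, true_vel = 0.1, 0.0, 1.0
x, P = np.zeros(2), np.eye(2)
for k in range(100):
    true_pos += true_vel * dt
    # Every fifth frame has no measurement, mimicking a missed detection.
    z = None if k % 5 == 2 else true_pos + rng.normal(0.0, 0.05)
    x, P = kalman_step(x, P, accel=0.0, z=z, dt=dt)

final_error = abs(x[0] - true_pos)
```

On frames without a measurement the filter only predicts, so the estimate keeps moving with the last inferred velocity instead of freezing.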

Key Takeaways