
Few-Shot


⚠️ All summaries below are generated by a large language model and may contain errors; they are for reference only, so use them with caution.
🔴 Note: never use them for serious academic purposes — they are only a first-pass filter before reading the papers!
💗 If you find our project ChatPaperFree helpful, please give us some encouragement! ⭐️ Try it for free on HuggingFace

Updated 2025-10-23

Beyond the Explicit: A Bilingual Dataset for Dehumanization Detection in Social Media

Authors:Dennis Assenmacher, Paloma Piot, Katarina Laken, David Jurgens, Claudia Wagner

Digital dehumanization, although a critical issue, remains largely overlooked within the field of computational linguistics and Natural Language Processing. The prevailing approach in current research concentrates primarily on a single aspect of dehumanization, identifying overtly negative statements as its core marker. This focus, while crucial for understanding harmful online communications, inadequately addresses the broader spectrum of dehumanization. Specifically, it overlooks the subtler forms of dehumanization that, despite not being overtly offensive, still perpetuate harmful biases against marginalized groups in online interactions. These subtler forms can insidiously reinforce negative stereotypes and biases without explicit offensiveness, making them harder to detect yet equally damaging. Recognizing this gap, we use different sampling methods to collect a theory-informed bilingual dataset from Twitter and Reddit. Using crowdworkers and experts to annotate 16,000 instances on a document- and span-level, we show that our dataset covers the different dimensions of dehumanization. This dataset serves as both a training resource for machine learning models and a benchmark for evaluating future dehumanization detection techniques. To demonstrate its effectiveness, we fine-tune ML models on this dataset, achieving performance that surpasses state-of-the-art models in zero and few-shot in-context settings.


Paper and project links

PDF

Summary

This paper addresses dehumanization in the digital age, noting that current research focuses mainly on overtly negative statements while overlooking broader and subtler forms of dehumanization. To close this gap, the authors use different sampling methods to collect a bilingual dataset from Twitter and Reddit, with crowdworkers and experts annotating 16,000 instances at the document and span level. The dataset serves both as a training resource for machine learning models and as a benchmark for evaluating future dehumanization detection techniques. Models fine-tuned on it outperform existing state-of-the-art models in zero- and few-shot in-context settings.

Key Takeaways

  1. Digital dehumanization is a critical yet overlooked issue in computational linguistics and Natural Language Processing.
  2. Current research focuses mainly on overtly negative statements as the core marker of dehumanization, which fails to cover its broader spectrum.
  3. Subtler forms of dehumanization, while not overtly offensive, still reinforce harmful biases against marginalized groups in online interactions.
  4. The authors use different sampling methods to collect a bilingual dataset from Twitter and Reddit covering multiple dimensions of dehumanization.
  5. The dataset contains 16,000 instances annotated at the document and span level by crowdworkers and experts.
  6. The dataset can be used both to train machine learning models and as a benchmark for evaluating dehumanization detection techniques.
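The document- and span-level annotation scheme above can be pictured with a small sketch. The field names and the "animalistic" dimension label here are illustrative assumptions, not the dataset's actual schema:

```python
# Hypothetical record for one annotated instance; fields are illustrative,
# not the dataset's actual schema.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class DehumanizationAnnotation:
    text: str    # the social-media post
    label: str   # document-level label
    # span-level annotations: (start, end, dehumanization dimension)
    spans: List[Tuple[int, int, str]] = field(default_factory=list)

    def span_texts(self) -> List[str]:
        """Surface strings of all annotated spans."""
        return [self.text[s:e] for s, e, _ in self.spans]

example = DehumanizationAnnotation(
    text="They swarm across the border like insects.",
    label="dehumanizing",
    spans=[(5, 10, "animalistic"), (34, 41, "animalistic")],
)
print(example.span_texts())  # ['swarm', 'insects']
```

Span offsets index into the raw text, so both document-level labels and the exact dehumanizing phrases can be recovered from one record.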

Cool Papers

Click here to view paper screenshots

DART: A Structured Dataset of Regulatory Drug Documents in Italian for Clinical NLP

Authors:Mariano Barone, Antonio Laudante, Giuseppe Riccio, Antonio Romano, Marco Postiglione, Vincenzo Moscato

The extraction of pharmacological knowledge from regulatory documents has become a key focus in biomedical natural language processing, with applications ranging from adverse event monitoring to AI-assisted clinical decision support. However, research in this field has predominantly relied on English-language corpora such as DrugBank, leaving a significant gap in resources tailored to other healthcare systems. To address this limitation, we introduce DART (Drug Annotation from Regulatory Texts), the first structured corpus of Italian Summaries of Product Characteristics derived from the official repository of the Italian Medicines Agency (AIFA). The dataset was built through a reproducible pipeline encompassing web-scale document retrieval, semantic segmentation of regulatory sections, and clinical summarization using a few-shot-tuned large language model with low-temperature decoding. DART provides structured information on key pharmacological domains such as indications, adverse drug reactions, and drug-drug interactions. To validate its utility, we implemented an LLM-based drug interaction checker that leverages the dataset to infer clinically meaningful interactions. Experimental results show that instruction-tuned LLMs can accurately infer potential interactions and their clinical implications when grounded in the structured textual fields of DART. We publicly release our code on GitHub: https://github.com/PRAISELab-PicusLab/DART.


Paper and project links

PDF

Summary
This paper introduces DART (Drug Annotation from Regulatory Texts), a structured corpus built from the official repository of the Italian Medicines Agency (AIFA) to address the lack of non-English resources for pharmacological information extraction. The corpus is constructed through a reproducible pipeline of document retrieval, semantic segmentation, and clinical summarization, and covers key pharmacological domains such as indications, adverse reactions, and drug-drug interactions. Experiments show that an LLM-based drug interaction checker grounded in DART can accurately infer potential interactions and their clinical implications.

Key Takeaways

  • Extraction of pharmacological knowledge from regulatory documents is a key focus in biomedical natural language processing.
  • Existing research relies mainly on English corpora, leaving non-English healthcare systems without tailored resources.
  • DART is the first structured corpus of Italian Summaries of Product Characteristics, derived from the official AIFA repository.
  • DART provides structured information on key pharmacological domains such as indications, adverse reactions, and drug-drug interactions.
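As a rough illustration of grounding an interaction query in structured fields like DART's, the sketch below composes a prompt from two drugs' interaction sections. The field names, example texts, and prompt wording are hypothetical, not DART's actual schema or the authors' prompt:

```python
# Hypothetical prompt builder; field names and drug texts are illustrative.
def build_interaction_prompt(drug_a: dict, drug_b: dict) -> str:
    """Ground the LLM in the structured interaction sections of two drugs."""
    return (
        "Assess potential interactions between the two drugs below, "
        "using only the provided regulatory text.\n\n"
        f"Drug A: {drug_a['name']}\n"
        f"Interactions: {drug_a['interactions']}\n\n"
        f"Drug B: {drug_b['name']}\n"
        f"Interactions: {drug_b['interactions']}\n"
    )

warfarin = {"name": "Warfarin",
            "interactions": "Increased bleeding risk with NSAIDs."}
ibuprofen = {"name": "Ibuprofen",
             "interactions": "May potentiate anticoagulants."}
prompt = build_interaction_prompt(warfarin, ibuprofen)
# The resulting string would then be sent to an instruction-tuned LLM.
print("NSAIDs" in prompt)  # True
```

Grounding the query in the retrieved regulatory text, rather than the model's parametric knowledge alone, is what lets the checker cite clinically meaningful evidence.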

Cool Papers

Click here to view paper screenshots

Efficient Few-shot Identity Preserving Attribute Editing for 3D-aware Deep Generative Models

Authors:Vishal Vinod

Identity preserving editing of faces is a generative task that enables modifying the illumination, adding/removing eyeglasses, face aging, editing hairstyles, modifying expression etc., while preserving the identity of the face. Recent progress in 2D generative models has enabled photorealistic editing of faces using simple techniques leveraging the compositionality in GANs. However, identity preserving editing for 3D faces with a given set of attributes is a challenging task as the generative model must reason about view consistency from multiple poses and render a realistic 3D face. Further, 3D portrait editing requires large-scale attribute labelled datasets and presents a trade-off between editability at low resolution and inflexibility to editing at high resolution. In this work, we aim to alleviate some of the constraints in editing 3D faces by identifying latent space directions that correspond to photorealistic edits. To address this, we present a method that builds on recent advancements in 3D-aware deep generative models and 2D portrait editing techniques to perform efficient few-shot identity preserving attribute editing for 3D-aware generative models. We aim to show from experimental results that using just ten or fewer labelled images of an attribute is sufficient to estimate edit directions in the latent space that correspond to 3D-aware attribute editing. In this work, we leverage an existing face dataset with masks to obtain the synthetic images for few attribute examples required for estimating the edit directions. Further, to demonstrate the linearity of edits, we investigate one-shot stylization by performing sequential editing and use the (2D) Attribute Style Manipulation (ASM) technique to investigate a continuous style manifold for 3D consistent identity preserving face aging. Code and results are available at: https://vishal-vinod.github.io/gmpi-edit/


Paper and project links

PDF 14 pages, 7 figures

Summary

This paper tackles identity-preserving editing of 3D faces in a few-shot setting. Building on 3D-aware deep generative models and 2D portrait editing techniques, it performs efficient few-shot identity-preserving attribute editing. Experiments show that ten or fewer labelled images of an attribute suffice to estimate latent-space directions corresponding to 3D-aware attribute edits. A face dataset with masks is used to obtain synthetic images for the few attribute examples needed to estimate these directions. Sequential editing and the (2D) Attribute Style Manipulation (ASM) technique further demonstrate the linearity of the edits and a continuous style manifold for 3D-consistent, identity-preserving face aging. Code and results are publicly available.

Key Takeaways

  1. Identity-preserving editing modifies facial attributes such as illumination, eyeglasses, age, hairstyle, and expression while keeping the face's identity intact.
  2. Attribute editing for 3D faces is challenging: the model must reason about view consistency across multiple poses and render a realistic 3D face.
  3. The proposed method builds on 3D-aware generative models and 2D portrait editing techniques to perform efficient few-shot identity-preserving attribute editing.
  4. Ten or fewer labelled images of an attribute suffice to estimate latent-space directions corresponding to 3D-aware attribute edits.
  5. A face dataset with masks is leveraged to obtain synthetic images for the few attribute examples needed to estimate edit directions.
  6. Sequential editing demonstrates one-shot stylization along a continuous style manifold while maintaining 3D consistency and identity.
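Few-shot estimation of an edit direction can be pictured with the generic difference-of-means recipe over latent codes. The sketch below uses synthetic vectors and is not the authors' exact procedure:

```python
# Generic difference-of-means edit-direction estimator; the synthetic
# "attribute" axis below is an illustrative stand-in for real latents.
import numpy as np

def estimate_edit_direction(with_attr: np.ndarray,
                            without_attr: np.ndarray) -> np.ndarray:
    """Unit-norm latent direction from 'attribute absent' to 'attribute present'."""
    direction = with_attr.mean(axis=0) - without_attr.mean(axis=0)
    return direction / np.linalg.norm(direction)

rng = np.random.default_rng(0)
base = rng.normal(size=(10, 512))   # ten latents, matching the paper's budget
attr_offset = np.zeros(512)
attr_offset[0] = 3.0                # synthetic "attribute" axis
d = estimate_edit_direction(base + attr_offset, base)
edited_latent = rng.normal(size=512) + 2.0 * d   # apply the edit with strength 2
print(round(float(d[0]), 3))  # 1.0
```

Because edits along such directions compose linearly, sequential application of several directions is what the one-shot stylization experiment above probes.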

Cool Papers

Click here to view paper screenshots

BlendCLIP: Bridging Synthetic and Real Domains for Zero-Shot 3D Object Classification with Multimodal Pretraining

Authors:Ajinkya Khoche, Gergő László Nagy, Maciej Wozniak, Thomas Gustafsson, Patric Jensfelt

Zero-shot 3D object classification is crucial for real-world applications like autonomous driving; however, it is often hindered by a significant domain gap between the synthetic data used for training and the sparse, noisy LiDAR scans encountered in the real world. Current methods trained solely on synthetic data fail to generalize to outdoor scenes, while those trained only on real data lack the semantic diversity to recognize rare or unseen objects. We introduce BlendCLIP, a multimodal pretraining framework that bridges this synthetic-to-real gap by strategically combining the strengths of both domains. We first propose a pipeline to generate a large-scale dataset of object-level triplets – consisting of a point cloud, image, and text description – mined directly from real-world driving data and human annotated 3D boxes. Our core contribution is a curriculum-based data mixing strategy that first grounds the model in the semantically rich synthetic CAD data before progressively adapting it to the specific characteristics of real-world scans. Our experiments show that our approach is highly label-efficient: introducing as few as 1.5% real-world samples per batch into training boosts zero-shot accuracy on the nuScenes benchmark by 27%. Consequently, our final model achieves state-of-the-art performance on challenging outdoor datasets like nuScenes and TruckScenes, improving over the best prior method by 19.3% on nuScenes, while maintaining strong generalization on diverse synthetic benchmarks. Our findings demonstrate that effective domain adaptation, not full-scale real-world annotation, is the key to unlocking robust open-vocabulary 3D perception. Our code and dataset will be released upon acceptance on https://github.com/kesu1/BlendCLIP.


Paper and project links

PDF Under Review

Summary

This paper presents BlendCLIP, a framework that bridges the gap between synthetic and real data in zero-shot 3D object classification by combining the strengths of both domains. Using a curriculum-based data mixing strategy, the model is first grounded in semantically rich synthetic CAD data and then progressively adapted to the specific characteristics of real-world scans. Experiments show the approach is highly label-efficient: introducing as few as 1.5% real-world samples per training batch substantially boosts zero-shot accuracy on the nuScenes benchmark, and the final model achieves state-of-the-art performance on challenging outdoor datasets.

Key Takeaways

  1. BlendCLIP is a multimodal pretraining framework designed to bridge the gap between synthetic and real data and improve zero-shot 3D object classification.
  2. A pipeline generates a large-scale dataset of object-level triplets (point cloud, image, and text description) mined from real-world driving data and human-annotated 3D boxes.
  3. A curriculum-based data mixing strategy first trains the model on synthetic CAD data, then progressively adapts it to the characteristics of real-world scans.
  4. Introducing as few as 1.5% real-world samples per batch boosts zero-shot accuracy on the nuScenes benchmark by 27%.
  5. The final model achieves state-of-the-art performance on challenging outdoor datasets, improving over the best prior method by 19.3% on nuScenes.
  6. The findings show that effective domain adaptation, not full-scale real-world annotation, is the key.
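The curriculum-based mixing can be sketched as a batch sampler that anneals the real-scan fraction during training. The linear schedule and the 1.5% end point here are illustrative assumptions, not the paper's exact curriculum:

```python
# Illustrative curriculum batch mixer; schedule shape and end point are
# assumptions, not the paper's exact recipe.
def real_fraction(step: int, total_steps: int,
                  start: float = 0.0, end: float = 0.015) -> float:
    """Linearly anneal the per-batch fraction of real-world samples."""
    t = min(step / total_steps, 1.0)
    return start + t * (end - start)

def mix_batch(synthetic: list, real: list, step: int, total_steps: int,
              batch_size: int = 64) -> list:
    """Fill a batch with mostly synthetic samples plus the scheduled real share."""
    n_real = round(real_fraction(step, total_steps) * batch_size)
    return real[:n_real] + synthetic[:batch_size - n_real]

cad = [("cad", i) for i in range(100)]      # synthetic CAD samples
lidar = [("lidar", i) for i in range(100)]  # real LiDAR-derived samples
batch = mix_batch(cad, lidar, step=1000, total_steps=1000)
print(sum(1 for src, _ in batch if src == "lidar"))  # 1 real sample out of 64
```

Early batches are purely synthetic, grounding the model in CAD semantics; by the end of the schedule roughly 1.5% of each batch is real, matching the label-efficiency result above.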

Cool Papers

Click here to view paper screenshots

Preference-driven Knowledge Distillation for Few-shot Node Classification

Authors:Xing Wei, Chunchun Chen, Rui Fan, Xiaofeng Cao, Sourav Medya, Wei Ye

Graph neural networks (GNNs) can efficiently process text-attributed graphs (TAGs) due to their message-passing mechanisms, but their training heavily relies on the human-annotated labels. Moreover, the complex and diverse local topologies of nodes of real-world TAGs make it challenging for a single mechanism to handle. Large language models (LLMs) perform well in zero-/few-shot learning on TAGs but suffer from a scalability challenge. Therefore, we propose a preference-driven knowledge distillation (PKD) framework to synergize the complementary strengths of LLMs and various GNNs for few-shot node classification. Specifically, we develop a GNN-preference-driven node selector that effectively promotes prediction distillation from LLMs to teacher GNNs. To further tackle nodes’ intricate local topologies, we develop a node-preference-driven GNN selector that identifies the most suitable teacher GNN for each node, thereby facilitating tailored knowledge distillation from teacher GNNs to the student GNN. Extensive experiments validate the efficacy of our proposed framework in few-shot node classification on real-world TAGs. Our code is available.


Paper and project links

PDF Accepted by NeurIPS 2025

Summary
This paper proposes a preference-driven knowledge distillation (PKD) framework that synergizes the complementary strengths of large language models (LLMs) and various graph neural networks (GNNs) for few-shot node classification. The framework includes a GNN-preference-driven node selector that promotes prediction distillation from LLMs to teacher GNNs, and a node-preference-driven GNN selector that identifies the most suitable teacher GNN for each node's intricate local topology, enabling tailored distillation to the student GNN. Extensive experiments validate the framework's efficacy on few-shot node classification over real-world text-attributed graphs (TAGs).

Key Takeaways

  1. Knowledge distillation is applied to node classification on text-attributed graphs (TAGs).
  2. The complementary strengths of large language models (LLMs) and graph neural networks (GNNs) are combined through distillation.
  3. The preference-driven knowledge distillation (PKD) framework has two key components: a GNN-preference-driven node selector that promotes prediction distillation, and a node-preference-driven GNN selector that handles intricate local topologies.
  4. The framework addresses the difficulty a single mechanism has in handling the complex and diverse local topologies of real-world TAGs.
  5. Extensive experiments validate the framework's effectiveness on few-shot node classification.
  6. The study offers a way around the limitations of a single model on complex graph data with heterogeneous properties.
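One way to picture the node-preference-driven GNN selector is to choose, per node, the teacher whose soft prediction puts the most confidence on the LLM-provided label. This is an illustrative reconstruction of the idea, not the authors' implementation:

```python
# Toy per-node teacher selection; an illustrative reconstruction only.
import numpy as np

def select_teacher_per_node(teacher_probs: np.ndarray,
                            llm_labels: np.ndarray) -> np.ndarray:
    """teacher_probs: (num_teachers, num_nodes, num_classes) soft predictions;
    llm_labels: (num_nodes,) pseudo-labels distilled from the LLM.
    Returns the index of the preferred teacher for each node."""
    # Confidence each teacher assigns to the LLM-provided label per node
    conf = np.take_along_axis(
        teacher_probs, llm_labels[None, :, None], axis=2
    ).squeeze(-1)                          # (num_teachers, num_nodes)
    return conf.argmax(axis=0)

probs = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # teacher 0: strong on node 0
    [[0.6, 0.4], [0.1, 0.9]],   # teacher 1: strong on node 1
])
labels = np.array([0, 1])
print(select_teacher_per_node(probs, labels))  # [0 1]
```

Each node then distills from its own best-matched teacher, which is what makes the distillation "tailored" to intricate local topologies.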

Cool Papers

Click here to view paper screenshots

Increasing the Utility of Synthetic Images through Chamfer Guidance

Authors:Nicola Dall’Asen, Xiaofeng Zhang, Reyhane Askari Hemmat, Melissa Hall, Jakob Verbeek, Adriana Romero-Soriano, Michal Drozdzal

Conditional image generative models hold considerable promise to produce infinite amounts of synthetic training data. Yet, recent progress in generation quality has come at the expense of generation diversity, limiting the utility of these models as a source of synthetic training data. Although guidance-based approaches have been introduced to improve the utility of generated data by focusing on quality or diversity, the (implicit or explicit) utility functions oftentimes disregard the potential distribution shift between synthetic and real data. In this work, we introduce Chamfer Guidance: a training-free guidance approach which leverages a handful of real exemplar images to characterize the quality and diversity of synthetic data. We show that by leveraging the proposed Chamfer Guidance, we can boost the diversity of the generations w.r.t. a dataset of real images while maintaining or improving the generation quality on ImageNet-1k and standard geo-diversity benchmarks. Our approach achieves state-of-the-art few-shot performance with as little as 2 exemplar real images, obtaining 96.4% in terms of precision, and 86.4% in terms of distributional coverage, which increase to 97.5% and 92.7%, respectively, when using 32 real images. We showcase the benefits of the Chamfer Guidance generation by training downstream image classifiers on synthetic data, achieving accuracy boost of up to 15% for in-distribution over the baselines, and up to 16% in out-of-distribution. Furthermore, our approach does not require using the unconditional model, and thus obtains a 31% reduction in FLOPs w.r.t. classifier-free-guidance-based approaches at sampling time.


Paper and project links

PDF Accepted to NeurIPS 2025

Summary

This paper introduces Chamfer Guidance, a training-free guidance approach for conditional image generative models that uses a handful of real exemplar images to characterize the quality and diversity of synthetic data. The method boosts generation diversity with respect to a dataset of real images while maintaining or improving quality on ImageNet-1k and standard geo-diversity benchmarks. With as few as 2 exemplar images it achieves state-of-the-art few-shot performance, and precision and distributional coverage improve further with 32 images. Downstream classifiers trained on the resulting synthetic data gain up to 15% in-distribution and 16% out-of-distribution accuracy over the baselines, and since the approach does not require the unconditional model, it reduces sampling FLOPs by 31% compared with classifier-free-guidance-based approaches.

Key Takeaways

  1. Chamfer Guidance improves conditional image generative models as a source of synthetic training data by raising generation diversity and quality.
  2. It is training-free, using a handful of real exemplar images to characterize the quality and diversity of synthetic data.
  3. It boosts diversity while maintaining or improving generation quality on ImageNet-1k and geo-diversity benchmarks.
  4. As few as 2 exemplar real images yield state-of-the-art few-shot performance, with precision and distributional coverage improving further as more real images are used.
  5. Downstream classifiers trained on the synthetic data gain up to 15% in-distribution and 16% out-of-distribution accuracy.
  6. Because no unconditional model is needed, sampling requires 31% fewer FLOPs than classifier-free-guidance-based approaches.
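The quantity at the heart of the approach is the Chamfer distance between synthetic-sample embeddings and the handful of real exemplar embeddings. A minimal sketch of that distance on 2-D toy points (leaving out the diffusion-sampling guidance machinery entirely):

```python
# Symmetric Chamfer distance between two point sets; toy 2-D points stand
# in for image embeddings here.
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # pairwise squared dists
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

real_exemplars = np.array([[0.0, 0.0], [1.0, 1.0]])
synthetic = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]])
# Only the third synthetic point is far from every exemplar, so it alone
# contributes: 1/3 from the synthetic-to-real term, 0 from the reverse.
print(chamfer_distance(synthetic, real_exemplars))
```

The synthetic-to-real term penalizes off-distribution generations (quality) while the real-to-synthetic term penalizes exemplars with no nearby generation (coverage), which is why a single scalar can trade the two off.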

Cool Papers

Click here to view paper screenshots


Author: Kedreamix
Copyright notice: Unless otherwise stated, all posts on this blog are licensed under CC BY 4.0. Please credit Kedreamix when reposting!