Published: 2025-06-28

Updated: 2025-07-06

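⚠️ All summaries below are generated by a large language model. They may contain errors, are for reference only, and should be used with caution.
🔴 Note: do not use them in serious academic settings. They are intended only for preliminary screening before reading the papers.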

2025-06-28 update

Variational Supervised Contrastive Learning

Authors: Ziwen Wang, Jiajun Fan, Thao Nguyen, Heng Ji, Ge Liu

Contrastive learning has proven to be highly efficient and adaptable in shaping representation spaces across diverse modalities by pulling similar samples together and pushing dissimilar ones apart. However, two key limitations persist: (1) Without explicit regulation of the embedding distribution, semantically related instances can inadvertently be pushed apart unless complementary signals guide pair selection, and (2) excessive reliance on large in-batch negatives and tailored augmentations hinders generalization. To address these limitations, we propose Variational Supervised Contrastive Learning (VarCon), which reformulates supervised contrastive learning as variational inference over latent class variables and maximizes a posterior-weighted evidence lower bound (ELBO) that replaces exhaustive pair-wise comparisons for efficient class-aware matching and grants fine-grained control over intra-class dispersion in the embedding space. Trained exclusively on image data, our experiments on CIFAR-10, CIFAR-100, ImageNet-100, and ImageNet-1K show that VarCon (1) achieves state-of-the-art performance for contrastive learning frameworks, reaching 79.36% Top-1 accuracy on ImageNet-1K and 78.29% on CIFAR-100 with a ResNet-50 encoder while converging in just 200 epochs; (2) yields substantially clearer decision boundaries and semantic organization in the embedding space, as evidenced by KNN classification, hierarchical clustering results, and transfer-learning assessments; and (3) demonstrates superior performance in few-shot learning than supervised baseline and superior robustness across various augmentation strategies.
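For context, the pairwise objective that the abstract says VarCon reformulates can be sketched as follows. This is a minimal NumPy sketch of a conventional SupCon-style supervised contrastive loss, not the VarCon objective; the function name `supcon_loss`, the default `temperature`, and the toy inputs are assumptions made for illustration.

```python
import numpy as np


def supcon_loss(embeddings, labels, temperature=0.1):
    """SupCon-style loss (illustrative baseline, not VarCon).

    Averages over anchors that have at least one same-class positive
    in the batch. `temperature` is an assumed hyperparameter value.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)
    # L2-normalize so the dot product is cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = z.shape[0]
    self_mask = np.eye(n, dtype=bool)
    # Positives: same label, excluding the anchor itself.
    pos_mask = (y[:, None] == y[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        raise ValueError("no anchor has a same-class positive in the batch")

    # Log-sum-exp over all other samples in the batch (in-batch negatives
    # and positives), computed stably; the anchor itself is excluded.
    sim_others = np.where(self_mask, -np.inf, sim)
    row_max = sim_others.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(
        np.exp(sim_others - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = sim - log_denom

    # Mean log-probability of positives per anchor, then mean over anchors.
    mean_log_prob_pos = (log_prob * pos_mask).sum(axis=1)[valid] / pos_count[valid]
    return float(-mean_log_prob_pos.mean())
```

Because every anchor is compared against every other sample in the batch, the cost grows quadratically with batch size, which is the kind of exhaustive pairwise comparison the abstract describes replacing.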

Paper and project links

PDF

Summary

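Contrastive learning is efficient and adaptable at shaping representation spaces across modalities by pulling similar samples together and pushing dissimilar ones apart. Two limitations remain: first, without explicit regulation of the embedding distribution, semantically related instances can inadvertently be pushed apart; second, heavy reliance on large in-batch negatives and tailored augmentations hinders generalization. To address these, we propose Variational Supervised Contrastive Learning (VarCon), which reformulates supervised contrastive learning as variational inference over latent class variables and maximizes a posterior-weighted evidence lower bound (ELBO), replacing exhaustive pairwise comparisons with efficient class-aware matching and giving fine-grained control over intra-class dispersion in the embedding space. Trained exclusively on image data, our experiments show that VarCon achieves state-of-the-art performance among contrastive learning frameworks, with clearer decision boundaries and semantic organization and strong few-shot learning performance.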
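As background for the ELBO mentioned above, the textbook evidence lower bound for a discrete latent class variable can be written as below. This is the generic bound only, not the posterior-weighted variant the paper proposes; the symbols $z$ (an embedding), $c$ (a latent class), $q(c \mid z)$ (a variational posterior), and $p(c)$ (a class prior) are notation assumed here for illustration.

```latex
\log p(z) \;=\; \log \sum_{c} p(z \mid c)\, p(c)
\;\ge\; \mathbb{E}_{q(c \mid z)}\big[\log p(z \mid c)\big]
\;-\; \mathrm{KL}\big(q(c \mid z)\,\|\,p(c)\big)
```

The inequality follows from Jensen's inequality and is tight when $q(c \mid z)$ equals the true posterior $p(c \mid z)$.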

Key Takeaways

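Contrastive learning is efficient and adaptable at shaping representation spaces across modalities by pulling similar samples together and pushing dissimilar ones apart.
Two main limitations remain: the embedding distribution is not explicitly regulated, and the approach relies heavily on large in-batch negatives and tailored augmentations.
Variational Supervised Contrastive Learning (VarCon) addresses these issues through variational inference, enabling more efficient and flexible class-aware matching.
VarCon achieves state-of-the-art performance among contrastive learning frameworks on multiple datasets and shows clearer decision boundaries and semantic organization in the embedding space.
In few-shot learning VarCon outperforms the supervised baseline, and it is more robust across various augmentation strategies.
Trained on image data with a ResNet-50 encoder, VarCon reaches 79.36% Top-1 accuracy on ImageNet-1K and 78.29% on CIFAR-100 while converging in just 200 epochs.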
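
The abstract cites KNN classification as evidence of embedding quality. A minimal sketch of such an evaluation is shown below, assuming cosine similarity, majority voting, and non-negative integer class labels; the function name `knn_top1_accuracy` and the default `k` are hypothetical, and the paper's actual evaluation protocol may differ.

```python
import numpy as np


def knn_top1_accuracy(train_emb, train_labels, test_emb, test_labels, k=5):
    """Top-1 accuracy of a cosine-similarity kNN classifier on embeddings.

    Illustrative only: labels are assumed to be non-negative integers,
    and ties in the vote go to the smallest class id.
    """
    train = np.asarray(train_emb, dtype=np.float64)
    test = np.asarray(test_emb, dtype=np.float64)
    train_y = np.asarray(train_labels)
    test_y = np.asarray(test_labels)

    # L2-normalize so the dot product is cosine similarity.
    train = train / np.linalg.norm(train, axis=1, keepdims=True)
    test = test / np.linalg.norm(test, axis=1, keepdims=True)
    sim = test @ train.T

    # Indices of the k most similar training embeddings per test sample.
    k = min(k, train.shape[0])
    nn_idx = np.argsort(-sim, axis=1)[:, :k]
    nn_labels = train_y[nn_idx]

    # Majority vote among the neighbours' labels.
    preds = np.array([np.bincount(row).argmax() for row in nn_labels])
    return float((preds == test_y).mean())
```

A well-organized embedding space keeps same-class samples close together, so a higher kNN accuracy on frozen embeddings is commonly read as a sign of cleaner class structure.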
