发布日期: 2025-06-13

更新日期: 2025-07-06

文章字数: 744

阅读时长: 2 分

阅读次数:

⚠️ 以下所有内容总结都来自于大语言模型的能力，如有错误，仅供参考，谨慎使用
🔴 请注意：千万不要用于严肃的学术场景，只能用于论文阅读前的初筛！
💗 如果您觉得我们的项目对您有帮助 ChatPaperFree ，还请您给我们一些鼓励！⭐️ HuggingFace免费体验

2025-06-13 更新

SECOND: Mitigating Perceptual Hallucination in Vision-Language Models via Selective and Contrastive Decoding

Authors:Woohyeon Park, Woojin Kim, Jaeik Kim, Jaeyoung Do

Despite significant advancements in Vision-Language Models (VLMs), the performance of existing VLMs remains hindered by object hallucination, a critical challenge to achieving accurate visual understanding. To address this issue, we propose SECOND: Selective and Contrastive Decoding, a novel approach that enables VLMs to effectively leverage multi-scale visual information with an object-centric manner, closely aligning with human visual perception. SECOND progressively selects and integrates multi-scale visual information, facilitating a more precise interpretation of images. By contrasting these visual information iteratively, SECOND significantly reduces perceptual hallucinations and outperforms a wide range of benchmarks. Our theoretical analysis and experiments highlight the largely unexplored potential of multi-scale application in VLMs, showing that prioritizing and contrasting across scales outperforms existing methods.

尽管视觉语言模型（VLMs）取得了重大进展，但其性能仍然受到物体幻觉（实现准确视觉理解的关键挑战）的阻碍。为了解决这个问题，我们提出了SECOND：选择性对比解码，这是一种新方法，使VLM能够有效地利用多尺度视觉信息，以面向对象的方式紧密配合人类视觉感知。SECOND逐步选择和整合多尺度视觉信息，促进图像更精确的解读。通过迭代对比这些视觉信息，SECOND显著减少了感知幻觉，并在各种基准测试中表现出卓越的性能。我们的理论分析和实验突出了多尺度应用在VLM中的尚未广泛探索的潜力，表明跨尺度的优先排序和对比优于现有方法。

论文及项目相关链接

PDF

Summary

本文提出了一个名为SECOND的新方法，旨在解决视觉语言模型中的对象幻视问题。通过选择性和对比解码的方式，SECOND能更有效地利用多尺度视觉信息，实现与人的视觉感知相一致的对象中心化处理。通过渐进式选择和整合多尺度视觉信息，以及对比迭代这些视觉信息，SECOND能显著减少感知幻视，并在一系列基准测试中表现出优异性能。

Key Takeaways

SECOND方法旨在解决视觉语言模型中的对象幻视问题。
SECOND通过选择性和对比解码，有效利用多尺度视觉信息。
SECOND方法实现了与人的视觉感知相一致的对象中心化处理。
通过渐进式选择和整合多尺度视觉信息，SECOND能更精确地解释图像。
对比迭代视觉信息有助于减少感知幻视。
SECOND在多个基准测试中表现出优异性能。

Cool Papers

点此查看论文截图

Kedreamix

https://kedreamix.github.io/Talk2Paper/Paper/2025-06-13/%E6%97%A0%E7%9B%91%E7%9D%A3_%E5%8D%8A%E7%9B%91%E7%9D%A3_%E5%AF%B9%E6%AF%94%E5%AD%A6%E4%B9%A0/

本博客所有文章除特別声明外，均采用 CC BY 4.0 许可协议。转载请注明来源 Kedreamix !

无监督/半监督/对比学习

医学影像/Breast Ultrasound

医学影像/Breast Ultrasound 方向最新论文已更新，请持续关注 Update in 2025-06-13 DIsoN Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging

2025-06-13 医学影像/Breast Ultrasound

医学影像/Breast Ultrasound

人脸相关

人脸相关方向最新论文已更新，请持续关注 Update in 2025-06-13 Inverting Black-Box Face Recognition Systems via Zero-Order Optimization in Eigenface Space

2025-06-13 人脸相关

人脸相关