📜 papers

2026-05-14 · 1 topic

Attention cannot measure the reliability of vision-language models: the geometry of hidden states predicts correctness with AUROC 0.95

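An analysis of three VLMs shows that the correlation between attention sharpness and reliability is close to zero. The authors propose predicting whether an answer is correct, with high accuracy, using a linear probe on hidden states. (Original title: Where Reliability Lives in Vision-Language Models: A Mechanistic Study of Attention, Hidden States, and Causal Circuits)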
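To make "linear probe on hidden states" concrete, here is a minimal sketch, not the paper's implementation. It assumes hidden states are fixed-length vectors with a binary correct/incorrect label, and it uses synthetic Gaussian vectors as a hypothetical stand-in for real VLM activations. The function names, dimensions, and hyperparameters are all illustrative assumptions. The probe is plain logistic regression, scored with AUROC (the probability that a correct example gets a higher score than an incorrect one).

```python
import math
import random


def make_synthetic_states(n, dim, rng):
    """Hypothetical stand-in for hidden states (NOT real VLM activations).

    Label 1 ("correct") and label 0 ("incorrect") differ only by a shift
    along the first coordinate, so a linear probe can separate them.
    """
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] += 1.5 if label == 1 else -1.5
        data.append((vec, label))
    return data


def probe_score(w, b, x):
    """Linear score w.x + b; larger means 'more likely correct'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_probe(data, dim, epochs=50, lr=0.1):
    """Fit logistic regression with plain SGD (illustrative settings)."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            # Clamp the logit so math.exp cannot overflow.
            z = max(-30.0, min(30.0, probe_score(w, b, x)))
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Quadratic in sample count, fine for a demo.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
DIM = 8  # assumed toy dimensionality; real hidden states are far larger
train = make_synthetic_states(400, DIM, rng)
test = make_synthetic_states(200, DIM, rng)

w, b = train_linear_probe(train, DIM)
test_scores = [probe_score(w, b, x) for x, _ in test]
test_labels = [y for _, y in test]
test_auroc = auroc(test_scores, test_labels)
print(f"held-out AUROC on synthetic data: {test_auroc:.3f}")
```

The printed number reflects only how separable the synthetic data was made, and says nothing about the paper's reported 0.95. With real models, the vectors would come from a chosen layer's hidden state for each question, and the labels from whether the model's answer was right.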