Are VLM Identity Judgments Logically Consistent? Evaluating Symmetry, Chain-of-Thought, and Transitivity in Person Re-Identification
Track: long paper (up to 10 pages)
Keywords: Logical Reasoning, LLM, Person Re-Identification, Identity, VLM
TL;DR: Are VLM Identity Judgments Logically Consistent? Evaluating Symmetry, Transitivity, and Chain-of-Thought in Person Re-Identification
Abstract: Vision-language models (VLMs) are increasingly used for visual reasoning tasks, yet their logical consistency remains poorly understood. We investigate whether VLMs make logically consistent identity judgments in person re-identification (re-ID), a task requiring fine-grained visual comparison. We propose three tests grounded in basic logical properties: (1) symmetry: whether the judgment ``A is the same person as B'' is invariant to presentation order; (2) transitivity: whether ``A = B'' and ``B = C'' imply ``A = C''; and (3) chain-of-thought consistency: whether explicit reasoning improves logical coherence. We evaluate four open-source VLMs (Qwen2-VL-7B, MiniCPM-V, Llama-3.2-Vision, LLaVA-NeXT-7B) alongside a CLIP embedding baseline on Market-1501. Our results reveal that two of the four VLMs exhibit degenerate behavior (always predicting DIFFERENT), while the non-degenerate models show 14--26\% symmetry violations and up to 38.5\% transitivity violations. Strikingly, we find an accuracy--consistency trade-off: the most accurate model (MiniCPM-V, 81.5\%) has the lowest symmetry rate (74\%), while the perfectly symmetric CLIP baseline achieves only 52.6\% accuracy. These findings highlight a fundamental gap between VLM accuracy and logical coherence.
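The symmetry and transitivity properties defined in the abstract can be made concrete with a short sketch. This is an illustrative implementation, not the authors' evaluation code: `judge` is a hypothetical stand-in for a VLM query that returns True when the model answers ``same person'' for an ordered image pair.

```python
# Illustrative sketch: counting symmetry and transitivity violations
# from pairwise same/different identity judgments.
# `judge(a, b)` is a hypothetical callable standing in for a VLM query;
# it returns True iff the model judges a and b to show the same person.

from itertools import combinations

def symmetry_violation_rate(judge, images):
    """Fraction of unordered pairs where swapping presentation order
    flips the judgment, i.e. judge(a, b) != judge(b, a)."""
    pairs = list(combinations(images, 2))
    violations = sum(judge(a, b) != judge(b, a) for a, b in pairs)
    return violations / len(pairs)

def transitivity_violation_rate(judge, images):
    """Among triples where A = B and B = C both hold, the fraction
    where A = C fails."""
    applicable = violations = 0
    for a, b, c in combinations(images, 3):
        if judge(a, b) and judge(b, c):
            applicable += 1
            if not judge(a, c):
                violations += 1
    return violations / applicable if applicable else 0.0
```

For example, a toy judge that links images 1--2 and 2--3 (in both orders) but not 1--3 is perfectly symmetric yet violates transitivity on the one applicable triple.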
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 62