Enforcing Logical Invariance in Large Language Models via Symmetry Pair Training
Track: tiny / short paper (up to 4 pages)
Keywords: Contrastive Consistency Tuning, Symmetry Engine
Abstract: Despite their scale, Large Language Models (LLMs) frequently exhibit
\emph{logical fragility}---a phenomenon wherein minor linguistic permutations
of the same logical premise yield contradictory outputs. We introduce
\textbf{Contrastive Consistency Tuning (CCT)}, a training framework that
enforces logical invariance in a model's latent space by leveraging
semantically equivalent but structurally distinct \emph{Symmetry Pairs}.
CCT augments a standard cross-entropy objective with a contrastive
consistency penalty that minimises representational divergence between
logically equivalent prompts. To generate training data at scale, we
present the \textbf{Symmetry Engine}, an automated pipeline that applies
five logical transformation rules to the FOLIO and ProofWriter benchmarks.
Evaluated on Llama-3 (8B) and Mistral-7B, CCT reduces the
\emph{Contradiction Rate} (CR) by 19--20 percentage points over vanilla
fine-tuning baselines while preserving overall accuracy. Crucially, we
demonstrate that frontier models such as GPT-4o and Claude~3.5~Sonnet
exhibit non-trivial contradiction rates (${\sim}30\%$), suggesting that
logical fragility is not resolved by scale alone.
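The abstract does not give the exact form of the CCT objective, but the description (a cross-entropy term plus a consistency penalty over the representations of a Symmetry Pair) suggests something like the following minimal sketch. Here cosine distance is assumed as the divergence measure, and the weight `lam` is a hypothetical hyperparameter; both are illustrative choices, not the paper's confirmed formulation.

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity between two hidden-state vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def cct_loss(ce_loss, h_original, h_transformed, lam=0.5):
    """Combined CCT objective (sketch): standard cross-entropy plus a
    contrastive consistency penalty that shrinks the representational
    divergence between the two members of a Symmetry Pair."""
    return ce_loss + lam * cosine_distance(h_original, h_transformed)
```

When the two prompts' representations coincide, the penalty vanishes and the objective reduces to plain cross-entropy; as the representations diverge, the penalty grows, pushing logically equivalent prompts toward the same region of latent space.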
Presenter: ~Prasanth1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 130