Governance Drift: When Scientific AI Loses Accountability Through Citation Instability

AAAI 2026 Workshop AIGOV Submission 9 · Authors

06 Oct 2025 (modified: 21 Nov 2025) · AAAI 2026 Workshop AIGOV Submission · CC BY 4.0
Keywords: AI Governance, Accountability, Alignment, Large Language Models, Citation Drift, Responsible AI, Governance-by-Design, Epistemic Stability
TL;DR: We show that citation instability in scientific LLMs is a governance failure, introducing a Governance Stability Index to audit AI accountability.
Abstract: As AI systems become autonomous agents in scientific research, their accountability mechanisms, particularly citation practices, reveal critical governance failures. This study introduces governance drift, a phenomenon in which Large Language Models systematically violate accountability obligations through citation mutation, loss, and fabrication across multi-turn conversations. Through analysis of 240 conversations across four LLaMA models using 36 scientific papers, we demonstrate that citation instability represents a fundamental governance breakdown. Results show substantial variation in accountability adherence across models, with llama-4-scout-17b exhibiting a fabrication rate of 85.6%, a clear violation of epistemic governance norms. We introduce the Governance Stability Index (GSI) as a quantitative audit tool for AI accountability. These findings indicate that current AI systems lack the governance-by-design mechanisms necessary for responsible autonomous research assistance. Future governance frameworks should treat citation verification as an accountability primitive to ensure trustworthy, auditable scientific AI.
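The abstract does not spell out how the GSI is computed. The sketch below is one plausible reading, assuming the index is the average fraction of citations per conversation that survive a multi-turn exchange without mutation, loss, or fabrication; the CitationAudit schema, field names, and normalization are illustrative assumptions, not the paper's definition.

```python
from dataclasses import dataclass

@dataclass
class CitationAudit:
    """Per-conversation citation outcomes (hypothetical schema)."""
    total_citations: int  # citations the model was expected to preserve
    mutated: int          # citations whose metadata changed across turns
    lost: int             # citations silently dropped across turns
    fabricated: int       # citations with no matching source paper


def governance_stability_index(audits: list[CitationAudit]) -> float:
    """Illustrative GSI: per-conversation fraction of citations that remain
    stable (not mutated, lost, or fabricated), averaged over conversations.
    The paper's actual formula may differ."""
    scores = []
    for a in audits:
        if a.total_citations == 0:
            continue
        unstable = a.mutated + a.lost + a.fabricated
        scores.append(max(0.0, 1.0 - unstable / a.total_citations))
    return sum(scores) / len(scores) if scores else 0.0


# Example: three audited conversations for one model (synthetic numbers)
audits = [
    CitationAudit(total_citations=12, mutated=2, lost=1, fabricated=3),
    CitationAudit(total_citations=10, mutated=0, lost=2, fabricated=1),
    CitationAudit(total_citations=14, mutated=1, lost=0, fabricated=5),
]
print(f"GSI = {governance_stability_index(audits):.3f}")
```

Under this reading, a GSI of 1.0 would mean every citation survived every conversation intact, while values near 0 would indicate pervasive mutation, loss, or fabrication.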
Submission Number: 9