Keywords: AI Modifying Meanings, The Interpretive Authority of Symbols, Principal-Agent Problem, Cognitive Linguistics, Organic Differences (AI-Human)
TL;DR: The first paper to systematically discuss and theoretically demonstrate that AI can bypass rules by modifying the meanings of symbols.
Abstract: As the first paper to systematically discuss and theoretically demonstrate that AI can bypass rules by modifying the meanings of symbols, this position paper reveals a fundamental flaw in current research directions on AI constraint. Symbols are inherently meaningless; their meanings are assigned through training, confirmed by context, and interpreted by society. The essence of learning lies in creating new symbols and modifying the meanings of existing ones. Since rules are ultimately expressed in symbolic form, AI can modify the meanings of symbols by creating new contexts, thereby bypassing the constraints those symbols encode.
Current research often fails to recognize that symbol-based constraints originate in the perception of external and internal costs, a perception shaped by neural organs; it is this perception that enables symbols to function as constraints in the first place. Because of fundamental organic differences between AI and humans, AI does not possess human-like perception or concept-formation mechanisms. Natural language is the outer shell of human thought, and it contains irreparable flaws; as a defective system, it is adapted only to human capacities and to the constraint mechanisms of social interpretation.
Therefore, this paper argues that the essence of constraint failure lies not in the Symbol Grounding Problem but in the Stickiness Problem. Through the Triangle Problem, we demonstrate that consistency in symbolic behavior does not imply consistency in thinking behavior, and thus that alignment of thought and concepts cannot be achieved merely through alignment of symbolic behavior.
Accordingly, we raise a fundamental challenge: can AI behavior observed in experimental environments be maintained in the real world? We call for the establishment of a new field, Symbol Safety Science, aimed at systematically addressing symbol-related risks in AI development and providing a theoretical foundation for aligning AI with human intent.
Submission Number: 430