Abstract: This paper proposes a framework that combines Neural Ordinary Differential Equations (Neural ODEs) with robust control theory to improve the interpretability and controllability of large language models (LLMs). Neural ODEs are used to model the continuous dynamic evolution of input-output relationships, while control mechanisms are introduced to optimize output quality. We demonstrate the effectiveness of this approach across multiple question-answer datasets. Experimental results show that integrating Neural ODEs with control theory significantly improves output consistency and model interpretability, advancing the development of explainable AI technologies.
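The abstract's core idea of modeling input-output evolution as a continuous dynamical system can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a small hypothetical MLP vector field `f` and a fixed-step Euler integrator standing in for a proper ODE solver (e.g., the adjoint-based solvers typically used with Neural ODEs).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned parameters of the vector field f(h, t):
# a one-hidden-layer MLP mapping a 4-d hidden state to its time derivative.
W1 = rng.normal(scale=0.1, size=(8, 4))
W2 = rng.normal(scale=0.1, size=(4, 8))

def f(h, t):
    """Vector field dh/dt = f(h, t) parameterized by a tiny MLP."""
    return W2 @ np.tanh(W1 @ h)

def odeint_euler(h0, t0=0.0, t1=1.0, steps=100):
    """Integrate the Neural ODE from t0 to t1 with fixed-step Euler.

    Real Neural ODE implementations use adaptive solvers; Euler is used
    here only to keep the sketch self-contained.
    """
    h, dt = np.array(h0, dtype=float), (t1 - t0) / steps
    for k in range(steps):
        h = h + dt * f(h, t0 + k * dt)
    return h

# An input embedding evolves continuously into an output representation.
h0 = rng.normal(size=4)
h1 = odeint_euler(h0)
print(h1.shape)  # (4,)
```

In this view, a control mechanism would add a feedback term to `f` (e.g., `f(h, t) + u(h, t)`) that steers the trajectory toward desired output properties, which is the role the abstract assigns to robust control.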
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: LLM Interpretability, Neural Ordinary Differential Equations (Neural ODEs), Control Theory
Contribution Types: Model analysis & interpretability, Theory
Languages Studied: English
Submission Number: 5908