Crack in the Armor: Universal Stability Measurement for Large Language Models

25 Sept 2024 (modified: 26 Nov 2024) | ICLR 2025 Conference Withdrawn Submission | CC BY 4.0
Keywords: Large Language Models, sensitivity analysis, local influence measure
Abstract: Large Language Models (LLMs) and Vision Language Models (VLMs) have become essential to general artificial intelligence, demonstrating impressive capabilities in task understanding and problem-solving. The real-world functionality of these large models critically depends on their stability. However, there is still a lack of rigorous studies examining the stability of LLMs when subjected to various perturbations. In this paper, we aim to address this gap by proposing a novel influence measure for LLMs. This measure is inspired by statistical methods grounded in information geometry, offering desirable invariance properties. Using this framework, we analyze the sensitivity of LLMs in response to parameter or input perturbations. To evaluate the effectiveness of our approach, we conduct extensive experiments on models of varying sizes, from 1.5B to 13B parameters. The results clearly demonstrate the efficacy of our measure in identifying salient parameters and pinpointing vulnerable areas of input images that dominate model outcomes. Our research not only enhances the understanding of LLM sensitivity but also highlights the broad potential of our influence measure in optimizing models for tasks such as model quantization and model merging.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 4269