Composing Knowledge and Compression Interventions for Language Models

Published: 05 Mar 2024, Last Modified: 08 May 2024 · ICLR 2024 R2-FM Workshop Poster · CC BY 4.0
Keywords: Model editing, Compression, Interventions, Language models
TL;DR: This paper introduces a framework for composable interventions on language models and, using knowledge editing and model compression as a case study, shows that compression can undo knowledge edits and makes models harder to edit.
Abstract: Test-time interventions for language models aim to enhance factual accuracy, reduce harmful outputs, and improve model efficiency while avoiding excessive training costs. However, existing interventions are developed independently, while in practice multiple interventions must often be applied to the same model sequentially. We introduce composable interventions, a framework for studying the effects of repeatedly intervening on the same language model. To showcase our framework, we compose two burgeoning classes of interventions: knowledge editing and model compression. We find that compression undoes knowledge edits faster than it degrades general model performance. We also find that compressed models are harder to edit, and we show that composing interventions alters the model's predicted logits.
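As a rough illustration of the framework's core idea (not the paper's actual implementation), each intervention can be viewed as a function from a model to a modified model, and a composition applies them in sequence. All names below (`edit_knowledge`, `compress`, the toy dict-based "model") are hypothetical stand-ins for real techniques such as ROME-style editing and post-training quantization:

```python
# Minimal sketch: interventions as model -> model functions, composed in order.
# Everything here is a toy stand-in, not the paper's implementation.
from functools import reduce
from typing import Callable, List

Intervention = Callable[[dict], dict]  # a toy "model" represented as a dict

def edit_knowledge(model: dict) -> dict:
    # Hypothetical knowledge edit: write one fact into the model.
    model = dict(model)
    model["fact:capital_of_france"] = "Paris"
    return model

def compress(model: dict) -> dict:
    # Hypothetical compression: round weights (crude stand-in for quantization),
    # which may destroy information written by earlier edits.
    model = dict(model)
    for k, v in model.items():
        if isinstance(v, float):
            model[k] = round(v, 1)
    return model

def compose(interventions: List[Intervention]) -> Intervention:
    # Apply interventions left to right, as they would be in deployment.
    return lambda model: reduce(lambda m, f: f(m), interventions, model)

model = {"w0": 0.1234, "w1": -0.5678}
print(compose([edit_knowledge, compress])(model))  # edit, then compress
print(compose([compress, edit_knowledge])(model))  # compress, then edit
```

The point of the sketch is that composition order matters: per the paper's findings, compressing after editing can negate the edit, while editing an already-compressed model is harder.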
Submission Number: 85