No Training Data, No Cry: Model Editing without Training Data or Fine-tuning

ICLR 2025 Conference Submission 14182 Authors

28 Sept 2024 (modified: 25 Nov 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: pruning, model editing, classwise unlearning
TL;DR: We perform model editing without training data or a loss function by analyzing correlations between intermediate representations, and recover accuracy by adjusting BatchNorm statistics.
Abstract: Model Editing (ME) -- such as classwise unlearning and structured pruning -- is a nascent field that deals with identifying editable components which, when modified, significantly change the model's behaviour, typically requiring fine-tuning to regain performance. The challenge of model editing increases when dealing with multi-branch networks (e.g., ResNets) in the data-free regime, where neither the training data nor the loss function is available. Identifying editable components is more difficult in multi-branch networks because skip connections couple individual components across layers. This paper addresses these issues through the following contributions. First, we hypothesize that in a well-trained model there exists a small set of channels, which we call HiFi channels, whose input contributions strongly correlate with the output feature map of that layer. Finding such subsets can be naturally posed as an expected reconstruction error problem, for which we provide an efficient heuristic called RowSum. Second, to understand how to regain accuracy after editing, we prove, for the first time, an upper bound on the post-editing loss in terms of the change in the stored BatchNorm (BN) statistics. With this result, we derive BNFix, a simple algorithm that restores accuracy by updating the BN statistics using distributional access to the data distribution. Building on these insights, we propose retraining-free algorithms for structured pruning and classwise unlearning, CoBRA-P and CoBRA-U, that identify HiFi components and retain (structured pruning) or discard (classwise unlearning) them. CoBRA-P achieves at least a 50% larger reduction in FLOPs and at least a 10% larger reduction in parameters for a similar drop in accuracy in the training-free regime; in the training regime, on ImageNet, it achieves a 60% larger parameter reduction. CoBRA-U achieves, on average, a 94% reduction in forget-class accuracy with a minimal drop in remaining-class accuracy.
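To make the BN-statistics idea concrete, below is a minimal PyTorch sketch of BN recalibration in the spirit of BNFix, assuming that "distributional access" means unlabeled samples resembling the training distribution can be drawn; the function name `recalibrate_bn` and the `sample_loader` argument are illustrative, and the paper's exact algorithm may differ.

```python
# Minimal sketch (assumption: PyTorch; not the authors' reference implementation).
import torch
import torch.nn as nn

@torch.no_grad()
def recalibrate_bn(model: nn.Module, sample_loader, num_batches: int = 100):
    """Refresh the running mean/var stored in every BatchNorm layer of an
    already-edited model by forwarding unlabeled samples; no labels, no loss
    function, and no gradient updates are involved."""
    # Reset stored statistics so they are re-estimated from scratch.
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
            m.momentum = None  # cumulative moving average over all seen batches

    model.train()  # BN layers update running statistics only in train mode
    for i, x in enumerate(sample_loader):
        if i >= num_batches:
            break
        model(x)  # forward pass only; outputs are discarded
    model.eval()
    return model
```

The key design point suggested by the abstract is that only the stored BN statistics change; weights are left untouched, which is what makes the procedure usable without training data or fine-tuning.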
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 14182