Beyond Masking and Avoidance: Toward True Unlearning

Arman Hatami; Ilya E Monosov

20 Sept 2025 (modified: 12 Feb 2026) | ICLR 2026 Conference Desk Rejected Submission | CC BY 4.0

Keywords: machine unlearning, representational erasure, linear probes, membership inference, nearest-neighbor purity, Grad-CAM, CKA, CNNs, Vision Transformers, CNR

TL;DR: Beyond output masking: CNR erases forgotten-class features so linear/nonlinear decoders and MI attacks fall to near-chance, with minimal retain-set loss.

Abstract: Current machine unlearning methods reduce predictions for forgotten classes but often leave their internal representations intact, achieving avoidance rather than erasure. We define true unlearning as the elimination of class-specific information from hidden states such that no simple or robust decoder can recover it. We introduce CNR (Class-Specific Neuronal Reset), an architecture-agnostic procedure with three steps: (1) identify class-selective units via mean activation screening, (2) apply targeted resets by fine-tuning on GAN-generated synthetic samples derived from the forget classes to suppress activation of forget-specific pathways, and (3) perform retain-only fine-tuning with regularization to restore global function. Across MNIST, CIFAR-10/100, LFW, and CUB-200-2011 on CNNs and ViTs, prior approaches (gradient ascent, KD-based unlearning, logit masking, retain-only fine-tuning) suppress forget-class accuracy yet still permit decoding above chance from hidden states. CNR drives linear probes, k-NN and SVM decoders, and membership inference attacks to chance performance, while reducing nearest-neighbor label purity to the class prior. It achieves this with minimal retained-class degradation ($\leq 5\%$ drop) and preserved CKA similarity. Grad-CAM and layer-wise analyses confirm targeted class-selective erasure rather than global damage.
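As a rough illustration of step (1), the sketch below screens hidden units by comparing their mean activation on forget-class samples against their mean activation on all other samples. The abstract only says "mean activation screening", so the selectivity score, the `threshold` value, and the function name `select_class_units` are assumptions made for this example, not the authors' procedure.

```python
from statistics import fmean


def select_class_units(activations, labels, forget_class, threshold=0.5, eps=1e-8):
    """Return indices of units whose mean activation favors forget_class.

    activations: list of per-sample activation vectors (lists of floats).
    labels: list of integer class labels, one per sample.
    The score below is a hypothetical contrast index, assumed for illustration.
    """
    forget = [a for a, y in zip(activations, labels) if y == forget_class]
    retain = [a for a, y in zip(activations, labels) if y != forget_class]
    if not forget or not retain:
        raise ValueError("need at least one forget and one retain sample")

    selected = []
    for u in range(len(activations[0])):
        mu_f = fmean(a[u] for a in forget)  # mean activation on the forget class
        mu_r = fmean(a[u] for a in retain)  # mean activation on everything else
        # Normalized contrast, close to 1 when the unit fires mostly for the forget class.
        score = (mu_f - mu_r) / (abs(mu_f) + abs(mu_r) + eps)
        if score > threshold:
            selected.append(u)
    return selected


# Toy activations: 4 samples, 3 units; unit 0 responds mainly to class 3.
acts = [
    [2.0, 0.1, 1.0],
    [1.8, 0.0, 1.1],
    [0.1, 1.5, 0.9],
    [0.0, 1.7, 1.0],
]
labels = [3, 3, 0, 1]
print(select_class_units(acts, labels, forget_class=3))  # [0]
```

In a real network the activation vectors would come from a chosen hidden layer, and the selected indices would mark the units targeted in step (2); how the paper maps screening output to resets is not specified in the abstract.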

Primary Area: alignment, fairness, safety, privacy, and societal considerations

Submission Number: 23273
