Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion

Sonia Laguna; Jorge da Silva Gonçalves; Moritz Vandenhirtz; Alain Ryser; Irene Cannistraci; Julia E Vogt

Published: 01 Mar 2026, Last Modified: 05 Apr 2026 | TTU at ICLR 2026 (Main) Oral | CC BY 4.0

Abstract: Machine unlearning is rapidly becoming a practical requirement, driven by privacy regulations, data errors, and the need to remove harmful or corrupted training samples. Despite this, most existing methods tackle the problem purely from a post-hoc perspective. They attempt to erase the influence of targeted training samples through parameter updates that typically require access to the full training data. This creates a mismatch with real deployment scenarios where unlearning requests can be anticipated, revealing a fundamental limitation of post-hoc approaches. We propose *unlearning by design*, a novel paradigm in which models are directly trained to support forgetting as an inherent capability. We instantiate this idea with Machine UNlearning via KEY deletion (MUNKEY), a memory-augmented transformer that decouples instance-specific memorization from model weights. Here, unlearning corresponds to removing the instance-identifying key, enabling direct zero-shot forgetting without weight updates or access to the original samples or labels. Across natural image benchmarks, fine-grained recognition, and medical datasets, MUNKEY outperforms all post-hoc baselines. Our results establish that unlearning by design enables fast, deployment-oriented unlearning while preserving predictive performance.
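
The key-deletion idea in the abstract can be illustrated with a toy example. The sketch below is not the MUNKEY architecture: the class `KeyedMemory`, its methods, and the dot-product softmax read are all hypothetical choices made for illustration. It only shows the general pattern of storing instance-specific information in an external key-value memory, so that forgetting an instance amounts to deleting its key rather than updating weights.

```python
import math


class KeyedMemory:
    """Toy external memory with one (key, value) slot per training instance.

    Hypothetical illustration only; not the authors' implementation.
    """

    def __init__(self):
        self._slots = {}  # instance_id -> (key vector, value vector)

    def write(self, instance_id, key, value):
        self._slots[instance_id] = (list(key), list(value))

    def delete_key(self, instance_id):
        # Forgetting: drop the slot. No weights change, and the original
        # sample or label is not needed, only the instance identifier.
        return self._slots.pop(instance_id, None) is not None

    def read(self, query):
        # Softmax attention over the keys that are still stored.
        if not self._slots:
            return None
        entries = list(self._slots.values())
        scores = [sum(q * k for q, k in zip(query, key)) for key, _ in entries]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dim = len(entries[0][1])
        return [
            sum(w * value[d] for w, (_, value) in zip(weights, entries)) / total
            for d in range(dim)
        ]


memory = KeyedMemory()
memory.write("sample_a", key=[1.0, 0.0], value=[1.0, 0.0])
memory.write("sample_b", key=[0.0, 1.0], value=[0.0, 1.0])

before = memory.read([5.0, 0.0])  # dominated by sample_a's value
memory.delete_key("sample_a")     # unlearning request for sample_a
after = memory.read([5.0, 0.0])   # sample_a no longer contributes
```

In this toy setting the read after deletion depends only on the remaining slots, which is the sense in which forgetting needs no retraining. How the actual model builds keys, queries, and values is not specified in the abstract.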

Submission Number: 7
