Amnesia: A Stealthy Replay Attack on Continual Learning Dreams

TMLR Paper7272 Authors

31 Jan 2026 (modified: 06 Feb 2026) · Under review for TMLR · CC BY 4.0
Abstract: Continual learning (CL) models rely on experience replay to mitigate catastrophic forgetting, yet their robustness to replay sampling interference is largely unexplored. Existing CL attacks mostly modify inputs or update pipelines (poisoning/backdoors) and lack explicit \emph{auditable} constraints, limiting their realism. Here, \emph{auditability} means that a monitor can verify compliance using sampler-visible telemetry, e.g., logged replay index/label statistics, by checking that the realized replay class histogram stays close to a nominal baseline and that the replay rate is unchanged (per-batch and/or over a rolling window). We study a limited-privilege insider controlling only the replay \emph{index selection} (e.g., via queue priorities), not pixels, labels, or model parameters, while staying within such auditable limits. We introduce \textbf{Amnesia}, a replay composition attack maximizing model degradation under two auditable budgets: a visibility budget $\delta$ bounding the $\mathrm{TV}/\mathrm{KL}$ divergence from a nominal class histogram $p_0$, and a mass budget $f$ fixing the replay rate. Amnesia uses a two-step procedure: (i) compute lightweight class utilities (e.g., EMA loss/confidence) to tilt $p_0$ toward harmful classes; (ii) project the tilt back into the $\delta$-ball using efficient $\mathrm{KL}$ (\emph{exponential tilt}) or $\mathrm{TV}$ (\emph{balanced mass redistribution}) optimizers. A windowed scheduler enforces the budgets across rolling audit windows. Across challenging CL benchmarks (Split CIFAR-10/100, CORe50, Tiny-ImageNet) and strong replay baselines (ER, ER-ACE, SCR, DER++), Amnesia consistently depresses final accuracy (ACC$\downarrow$) and worsens backward transfer ($-\mathrm{BWT}\uparrow$).
The $\mathrm{KL}$ variant achieves high impact while remaining largely undetected by audits, as confirmed empirically under multiple audit schemes (per-batch and rolling-window checks), whereas the $\mathrm{TV}$ variant is more damaging but more easily detected, especially under tight per-class constraints. These results expose \emph{index-only} replay control as a practical, auditable threat surface in CL systems and establish a principled impact-visibility-budget trade-off. Code is available anonymously via \href{https://anonymous.4open.science/r/9124_Amensia/README.md}{Anonymous GitHub}.
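The tilt-and-project step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the bisection tolerance, and the choice of routing all redistributed TV mass to the single highest-utility class are assumptions for the sketch.

```python
import numpy as np

def kl(p, q):
    # KL divergence D(p || q) for strictly positive distributions.
    return float(np.sum(p * np.log(p / q)))

def exponential_tilt(p0, util, delta, iters=60):
    """KL variant: tilt p0 toward high-utility classes with KL(p||p0) <= delta.

    The tilted family is p(t) ∝ p0 * exp(t * util); for t >= 0 the KL
    divergence to p0 is non-decreasing in t, so bisection on t finds the
    largest feasible tilt inside the delta-ball.
    """
    def tilt(t):
        w = p0 * np.exp(t * util)
        return w / w.sum()
    lo, hi = 0.0, 1.0
    # Grow hi until the budget is exceeded (or the tilt saturates).
    while kl(tilt(hi), p0) < delta and hi < 1e6:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl(tilt(mid), p0) < delta:
            lo = mid
        else:
            hi = mid
    return tilt(lo)

def tv_redistribute(p0, util, delta):
    """TV variant: move at most delta of probability mass from the
    lowest-utility classes to the highest-utility class, so that
    TV(p, p0) = 0.5 * sum(|p - p0|) <= delta.
    """
    p = p0.astype(float).copy()
    order = np.argsort(util)          # ascending utility
    target = order[-1]                # most harmful class
    budget = delta
    for i in order[:-1]:
        take = min(p[i], budget)
        p[i] -= take
        p[target] += take
        budget -= take
        if budget <= 0:
            break
    return p
```

For example, with a uniform nominal histogram over four classes and utility concentrated on one class, both routines shift replay mass toward that class while respecting the stated divergence budget.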
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: N/A
Assigned Action Editor: ~Eleni_Triantafillou1
Submission Number: 7272