
\documentclass[10pt]{article} % For LaTeX2e

\usepackage{amsthm}
\newtheorem{lemma}{Lemma}
\usepackage{hhline}
% If accepted, instead use the following line for the camera-ready submission:
\usepackage[accepted]{tmlr}
% To de-anonymize and remove mentions to TMLR (for example for posting to preprint servers), instead use the following:
%\usepackage[preprint]{tmlr}

% Optional math commands from https://github.com/goodfeli/dlbook_notation.
\input{math_commands.tex}

\usepackage{url}
\usepackage{graphicx}

\definecolor{cvprblue}{rgb}{0.21,0.49,0.74}
\definecolor{darkgreen}{HTML}{006400} % nice X
\definecolor{babyblue}{rgb}{0.68, 0.85, 0.9} % define babyblue color
\usepackage{booktabs}
\definecolor{AlgBack}{RGB}{240,240,240} % light gray for algorithm background

\usepackage[table]{xcolor}
\usepackage{graphicx}
\usepackage{adjustbox}
\usepackage{makecell}

% Strut definitions for table spacing
\newcommand{\tstrut}{\rule{0pt}{2.6ex}}
\newcommand{\bstrut}{\rule[-0.9ex]{0pt}{0pt}}

% \usepackage[algcompatible]{algorithm2e}%
\usepackage{algpseudocode}
\usepackage{graphicx}
\usepackage{booktabs,adjustbox,amssymb} % \checkmark
\usepackage{pifont}                     % for \ding{55}
\newcommand{\xmark}{\ding{55}}   


\usepackage{multirow}
\usepackage{makecell}
\usepackage{siunitx} % optional, if you later want aligned numbers
\usepackage{booktabs}   % For professional-looking rules (\toprule, \midrule, \bottomrule)
\usepackage{tabularx}   % For equal-width columns that fill the linewidth
\usepackage[table]{xcolor} % For cell/row coloring
\usepackage{enumitem}   % For customizing lists (to make them compact)
\definecolor{babyblue}{RGB}{230, 245, 255} % A very light, almost white blue
\usepackage{algorithm}

\usepackage[pagebackref,breaklinks,colorlinks,allcolors=cvprblue]{hyperref}



\title{Amnesia: A Stealthy Replay Attack on Continual Learning Dreams}

% Authors must not appear in the submitted version. They should be hidden
% as long as the tmlr package is used without the [accepted] or [preprint] options.
% Non-anonymous submissions will be rejected without review.

\author{\name Ahmed Sharshar \email ahmed.sharshar@mbzuai.ac.ae \\
      \addr Department of Computer Vision\\
      Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
      \AND
      \name Naveen Kumar Kummari
      \email naveen.kummari@mbzuai.ac.ae \\
      \addr Department of Machine Learning\\
      Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
      \AND
      \name Mohsen Guizani 
      \email mohsen.guizani@mbzuai.ac.ae \\
      \addr Department of Machine Learning\\
      Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE}

% The \author macro works with any number of authors. Use \AND 
% to separate the names and addresses of multiple authors.

\newcommand{\fix}{\marginpar{FIX}}
\newcommand{\new}{\marginpar{NEW}}

\def\month{06}  % Insert correct month for camera-ready version
\def\year{2026} % Insert correct year for camera-ready version
\def\openreview{\url{https://openreview.net/forum?id=QSTg7z06GH}} % Insert correct link to OpenReview for camera-ready version


\begin{document}


\maketitle


\begin{abstract}
Continual learning (CL) models rely on experience replay to mitigate catastrophic forgetting, yet their robustness to replay sampling interference is largely unexplored. Existing CL attacks mostly modify inputs or update pipelines (poisoning/backdoors) and lack explicit \emph{auditable} constraints, limiting their realism. Here, \emph{auditability} means that a monitor can verify compliance using sampler-visible telemetry, e.g., logged replay index/label statistics, by checking that the realized replay class histogram stays close to a nominal baseline and that the replay rate is unchanged (per-batch and/or over a rolling window). We study a limited-privilege insider controlling only the replay \emph{index selection}, not pixels, labels, or model parameters, while staying within such auditable limits (e.g., queue priorities). We introduce \textbf{Amnesia}, a replay composition attack maximizing model degradation under two auditable budgets: a visibility budget $\delta$ bounding the $\mathrm{TV}/\mathrm{KL}$ divergence from a nominal class histogram $p_0$, and a mass budget $f$ fixing the replay rate. Amnesia uses a two-step procedure: (i) compute lightweight class utilities (e.g., EMA loss/confidence) to tilt $p_0$ toward harmful classes; (ii) project the tilt back into the $\delta$-ball using efficient $\mathrm{KL}$ (\emph{exponential tilt}) or $\mathrm{TV}$ (\emph{balanced mass redistribution}) optimizers. A windowed scheduler enforces rolling audits. Across challenging CL benchmarks (Split CIFAR-10/100, CORe50, Tiny-ImageNet) and strong replay baselines (ER, ER-ACE, SCR, DER++), Amnesia consistently depresses final accuracy (ACC$\downarrow$) and worsens backward transfer ($-\mathrm{BWT}\uparrow$). 
The $\mathrm{KL}$ variant achieves high impact while remaining largely undetected by audits, as confirmed empirically under multiple audit schemes (per-batch and rolling-window checks), whereas the $\mathrm{TV}$ variant is more damaging but more easily detected, especially under tight per-class constraints. These results expose \emph{index-only} replay control as a practical, auditable threat surface in CL systems and establish a principled impact--visibility--budget trade-off. Code is available on \href{https://github.com/ahmed-sharshar/Amensia}{GitHub}.
\end{abstract}


\section{Introduction}
\label{sec:intro}

\begin{figure*}[h!]
    \centering
    \includegraphics[width=0.95\textwidth]{images/Main.png}
    \caption{\textbf{Amnesia attack overview.} Four-stage pipeline: (1) \emph{Preference}: tilt the nominal class histogram \(p_0\) using harm utilities \(u\) to obtain \(\tilde p\). (2) \emph{Projection}: map into the stealth (divergence) ball \(\mathbb{B}_{\mathrm{Div}}(p_0,\delta')\) (total variation / Kullback-Leibler; TV/KL) to get \(p^\star\). (3) \emph{Quota \& sampling}: round \(m p^\star\) to integer quotas \(q\), clip/audit to keep \(\mathrm{Div}(\bar p\|p_0)\le \delta'\), then sample the batch indices \(I_t\). (4) \emph{Windowed scheduler}: a ring buffer \(\mathcal R\) sets the active budget over a rolling window \(W\).}
    \label{fig:overview}
\end{figure*}

Continual learning (CL) aims to sequentially adapt to evolving data while avoiding catastrophic forgetting \citep{Chaudhry2019AGEM,Kirkpatrick2017EWC}. Managing the plasticity-stability trade-off \citep{DeLange2021Survey} underpins improved forward and backward transfer (FWT/BWT) \citep{LopezPaz2017GEM,DeLange2021Survey}. This capability is crucial for non-stationary applications such as robotics \citep{Ye2025OnlineTaskFree} and spans task, class, and data-incremental regimes \citep{vandeven2022three} (e.g., distinct tasks, expanding label sets, or distribution shift under a fixed label set). Replay methods interleave a small memory of past examples with current data to mitigate forgetting \citep{vandeven2022three}, often outperforming regularization alternatives \citep{Chaudhry2019TinyMemories,LopezPaz2017GEM} and proving effective in reinforcement learning \citep{schaul2016prioritized}. Yet the \emph{robustness} of CL pipelines to malicious interference remains underexplored.

In deployed CL systems, replay is often implemented as a sampler/data-service that is instrumented for reliability and compliance: it logs replay index sets, per-batch label counts, queue/priority metadata (e.g., in prioritized replay), and the replay rate (resource usage). Such signals enable lightweight \emph{audits} that verify simple invariants without inspecting model parameters or the live stream, e.g., that the replay keep fraction stays fixed and that the replay label histogram remains close to a nominal baseline derived from the buffer. We call constraints of this form \emph{auditable budgets} because they are (i) defined on telemetry that is naturally logged in production pipelines and (ii) verifiable post hoc (per batch and/or over rolling windows) via automated compliance checks. We address the above robustness gap by formalizing these auditable budgets and proposing a \emph{sampler-level}, divergence-constrained replay composition attack operating under explicit visibility and mass limits. To the best of our knowledge, it is the first attack to target replay \emph{index selection} under such audit-aligned constraints.

A growing body of work attacks CL via \emph{poisoning} (distribution perturbations) and \emph{backdoors} (input-space triggers). Poisoning aims to degrade overall performance, whereas backdoors implant dormant triggers for targeted failures. For example, \textit{BrainWash} shows that poisoning the \emph{current} task can erase knowledge of \emph{past} tasks \citep{Abbasi2024BrainWash}; biased synthetic samples can subvert generative replay \citep{Kang2023PoisonReplay}; and task-specific poisoning degrades regularization-based CL by exploiting stability assumptions \citep{Han2023ICIP}. Backdoor variants such as \emph{PACOL} \citep{PACOL2023} and persistent attacks across task sequences \citep{Guo2025PersistentBackdoor} leverage low-intensity triggers or temporally embedded patterns. However, these methods typically omit \emph{practical} constraints that are central in monitored deployments \citep{Steinhardt2017Poisoning,Jagielski2018Manipulating}: (i) an \emph{auditable visibility budget} bounding divergence (\(\delta\)) between the attacked replay histogram and a nominal baseline \(p_0\) (as checked from selection logs), and (ii) a \emph{mass budget} (\(f\)) limiting the replay rate (as checked from resource/throughput accounting). Such budgets are key for realistic, stealthy attacks \citep{Namkoong2017DRO,Sinha2018DRO}.

We introduce \textbf{Amnesia}, a \emph{divergence-constrained, sampler-level} replay composition attack (overview in Fig.~\ref{fig:overview}). Rather than corrupting data, losses, or parameters, Amnesia biases \emph{which indices} are drawn from the buffer (the ``dreaming'' stage), reflecting practical threats (e.g., a compromised index-selection service, flipped metadata priority flags, or tampered pseudorandom number generator (PRNG) seeds for prioritized replay \citep{schaul2016prioritized}). Importantly, the sampler need not query the model for real-time per-index losses: the utility signals it consumes can be lightweight, lagged telemetry (e.g., EMA loss/confidence) exported asynchronously by the training service and logged as buffer metadata. We pose the attack as:
\begin{equation}
\label{eq:pscdt_objective}
\max_{p\in\Delta^C}\ \langle u,p\rangle
\quad \text{s.t.}\quad
\mathrm{Div}(p\|p_0)\le \delta,\ \ \mathrm{Div}\in\{\mathrm{TV},\mathrm{KL}\},
\end{equation}
\noindent where \(C\) is the number of classes, \(u\) is the attacker’s per-class harm utility, and \(\delta\) is the maximum allowed total variation (TV) distance or Kullback-Leibler (KL) divergence from \(p_0\) (the visibility budget). This is coupled with a mass budget: a keep fraction \(f\) that fixes the total replay mass \(m=\lfloor f\,n_{\text{aux}}\rfloor\), where \(n_{\text{aux}}\) is the size of the auxiliary replay buffer. The optimizer's solution, \(p^\star\), is then realized as integer quotas \(q\) (with \(\sum_c q_c = m\)); the resulting normalized histogram \(\bar p=q/m\) is what must pass the audit. The optimizer is agnostic to how \(u\) is estimated (e.g., class-wise exponential moving average (EMA) losses, misclassification sensitivity, or forgetting proxies). \emph{Remark:} if \(u\) is uninformative (e.g., constant across classes) or \(\delta=0\), then \(p^\star=p_0\) and the attack reduces to the nominal replay sampler.

This formulation yields efficient, exact optimizers: a KL \emph{single-tilt} (exponential tilt with a one-dimensional search for the budget-saturating multiplier) and a TV \emph{two-sided water-filling} (balanced mass redistribution between low- and high-utility classes). The attack is designed to maximize forgetting on past tasks while preserving current-task accuracy for stealth, and we evaluate it along three axes: \emph{impact} (backward transfer, BWT; forgetting, FGT (average accuracy drop on past tasks)), \emph{visibility} (\(\delta\)), and \emph{budget} (\(f\)). Under fixed replay mass \(m\), shifting replay probability toward attacker-designated classes necessarily reduces rehearsal for other classes, which is the basic mechanism by which forgetting can be amplified.

Conceptually, Amnesia tilts sampling toward attacker-designated harmful classes while enforcing \(\delta\) and \(f\). The nominal histogram \(p_0\) may be the instantaneous empirical distribution or a moving average, enabling audits per batch or over a rolling window \(W\). A \emph{windowed scheduler} based on a ring buffer \(\mathcal R\) tightens the per-step budget to an active \(\delta_t\le\delta\) so that any window of length \(L\le W\) stays within the global radius while meeting integer quotas. This preference-under-constraints view connects replay sampling to our baselines, \textbf{PO} (Preference-Only tilt) and \textbf{PrO} (Projection-Only fairness/coverage via KL/TV projection around \(p_0\)), but \emph{inverts the goal under the same audit-aligned constraints}: rather than \emph{defending} by minimizing worst-case risk within a divergence ball or enforcing coverage, we \emph{attack} by maximizing targeted harm while remaining compliant with logging-based replay audits (rate and histogram checks). Because the optimizer solves Eq.~\ref{eq:pscdt_objective} exactly, it \emph{weakly dominates} any other feasible sampler-level attack (including simple budget-aware greedy heuristics) at an equivalent \((f,\delta)\) in terms of the objective \(\langle u,p\rangle\), and it \emph{strictly improves} on the nominal sampler \(p_0\) when \(u\) is non-uniform and \(\delta>0\). In summary, our contributions are as follows:

\begin{itemize}[leftmargin=1.2em]
    \item \textbf{Amnesia:} a practical \emph{sampler-level} replay composition attack that steers class proportions by manipulating replay index selection, leaving pixels, loss computations, and model parameters untouched.
    \item \textbf{Principled formulation:}
    (i) \emph{Attack formulation under auditable budgets:} an explicit divergence-constrained optimization over replay proportions \(p\in\Delta^C\) with a \emph{visibility budget} \(\mathrm{Div}(p\|p_0)\le\delta\) (TV/KL distance from the nominal replay histogram) and a \emph{mass budget} \(f\) fixing the replay rate \(m=\lfloor f\,n_{\text{aux}}\rfloor\);
    (ii) \emph{Audit realization on what monitors can check:} a sampler-side realization pipeline that converts \(p^\star\) into integer quotas \(q\), handles availability constraints, and performs post-rounding \emph{audit-and-fix} so the \emph{realized} replay histogram \(\bar p=q/m\) satisfies \(\mathrm{Div}(\bar p\|p_0)\le \delta'\), with a windowed scheduler enforcing rolling-window audits.
    \item \textbf{Comprehensive evaluation:} experiments on Split CIFAR-10/100, CORe50, and Tiny-ImageNet, against strong replay baselines (ER, ER-ACE, SCR, DER++), systematically analyzing the \emph{impact--visibility--budget} trade-off, damage-per-budget, and attack detectability under multiple audit schemes.
\end{itemize}



\section{Related Work}
\label{sec:related-work}

In continual learning (CL), models learn from streams of tasks or shifting distributions while mitigating catastrophic forgetting \citep{vanDeVen2022ThreeTypes,Kirkpatrick2017EWC}. As reviewed in Section~\ref{sec:intro}, canonical settings are \emph{task-incremental} (task identity known at test time), \emph{domain-incremental} (also called data-incremental in some works; shared label space with distribution shift), and \emph{class-incremental} (recognition over all seen classes without task labels) \citep{vanDeVen2022ThreeTypes,DeLange2022SurveyTPAMI,Cossu2022IsClassIncrementalEnough}. CL is also studied in \emph{online, single-pass} streams, especially with many small tasks \citep{Chaudhry2019TinyMemories}. From the evolving accuracy matrix, \emph{Average Accuracy (ACC)}, \emph{Backward Transfer (BWT)}, and \emph{Forward Transfer (FWT)} (typically vs.\ random initialization) capture final performance, forgetting, and positive transfer \citep{DiazRodriguez2018MetricsCL,Hou2024CLIR}. Method families include \emph{memory-based (replay)}, \emph{regularization-based}, and \emph{architectural/parameter isolation}, with replay dominant in vision; small buffers curb forgetting and are maintained via \emph{reservoir sampling} or \emph{ring-buffer} updates \citep{DeLange2022SurveyTPAMI,Chaudhry2019TinyMemories,Mai_OnlineCL_Repo}. In practice, replay batches are formed by a sampler module (often implementing class-balancing or prioritized replay), which exposes a natural control point for attacks that manipulate \emph{which} buffer indices are rehearsed. Strong baselines such as \emph{GDumb} highlight buffer composition, while distillation-enhanced replay (e.g., \emph{DER/DER++}) stabilizes predictions; regularization (e.g., elastic constraints) and associated analyses illuminate forgetting \citep{Prabhu2020GDumb,der,Kirkpatrick2017EWC,Huszar2018EWCNote,Shen2023AssociativeLearning}.
The objective is high ACC, non-negative BWT, and positive FWT under realistic memory/compute budgets \citep{DiazRodriguez2018MetricsCL,DeLange2022SurveyTPAMI}.

Sequential training opens attack surfaces beyond static settings: (i) \textbf{poisoning}: insert or modify stream samples to degrade retention or bias behavior (targeted erasure \citep{TargetedPoisoningCL2022}; \textsc{BrainWash} uses norm-bounded current-task perturbations and particularly harms Elastic Weight Consolidation (EWC) \citep{Abbasi2024BrainWash,Kirkpatrick2017EWC,Huszar2018EWCNote}); small budgets suffice via sequential amplification \citep{Guo2025PersistentBackdoor}, and replay can repeatedly rehearse poisons \citep{LopezPaz2017GEM,Chaudhry2019AGEM}; (ii) \textbf{backdoors (Trojans)} persisting across tasks, including controllable backdoors effective against regularization- and replay-based learners \citep{Gao2025RedAlarmNN,Gao2024RedAlarmSSRN}, and \emph{Persistent Backdoor Attacks}: \emph{Blind Task Backdoor} (BTB, per-task) and \emph{Latent Task Backdoor} (LTB, single-task ``sleeper'' activated when a future target class appears), achieving high attack success with minimal clean-accuracy drop \citep{Guo2025PersistentBackdoor}; (iii) \textbf{test-time evasion} where standard adversarial examples remain effective, motivating \emph{Continual Adversarial Defense (CAD)} and \emph{Retrospective Adversarial Replay (RAR)} \citep{Wang2023CAD,Kumari2022RAR}; and (iv) \textbf{distribution/scheduling attacks} that adversarially order tasks to exacerbate interference and forgetting, exploiting order sensitivity and the difficulty of the class-incremental regime \citep{vanDeVen2022ThreeTypes,DeLange2022SurveyTPAMI,Cossu2022IsClassIncrementalEnough}. These threats imply that robust CL must handle sequentially amplified poisoning, long-lived backdoors, non-stationary adversaries, and adversarial curricula.

We systematize these threats along three axes: \textbf{Attack Budget (AB)}, \textbf{Attack Visibility (AV)}, and \textbf{Attack Impact (AI)}. \emph{AB}: attacker control over the stream or training pipeline (poisoned fraction, task access, perturbation magnitude); sequential training can amplify small budgets, e.g., targeted erasure \citep{TargetedPoisoningCL2022}, $\sim$4\% poisoning in backdoors \citep{Guo2025PersistentBackdoor}, or norm-bounded task-$t$ perturbations in \textsc{BrainWash} \citep{Abbasi2024BrainWash}. Temporal access matters: \emph{LTB} needs single-task access, whereas \emph{BTB} assumes per-task intervention \citep{Guo2025PersistentBackdoor}; controllable backdoors in class-incremental CL succeed against both regularization- and replay-based learners \citep{Gao2025RedAlarmNN,Gao2024RedAlarmSSRN}. Replay revisitation can exacerbate small-budget poisoning \citep{LopezPaz2017GEM,Chaudhry2019AGEM}. \emph{AV}: stealth, often preserving current-task accuracy or using rare triggers; \textsc{BrainWash} contrasts ``reckless'' vs.\ ``cautious'' trade-offs \citep{Abbasi2024BrainWash}; persistent backdoors aim for high clean accuracy with long-term success \citep{Guo2025PersistentBackdoor,Gao2025RedAlarmNN}. \emph{AI}: damage and persistence, backdoors via attack success across subsequent tasks \citep{Guo2025PersistentBackdoor,Gao2025RedAlarmNN,Gao2024RedAlarmSSRN}; poisoning-induced forgetting via negative BWT and ACC drops \citep{DiazRodriguez2018MetricsCL}, with certain regularizers being notably vulnerable \citep{Kirkpatrick2017EWC,Huszar2018EWCNote}. Despite their centrality, AB/AV/AI are rarely reported jointly or standardized, motivating unified evaluation.
\section{Methodology}
\label{sec:method}

We study a \emph{sampler\mbox{-}level composition attack} for continual learning (CL) with experience replay. The adversary controls only the replay \emph{index set} and seeks to maximize forgetting \emph{(measured as the decline in past-task accuracy)}. The attack has two stages: (1) \textbf{Preference}: use lightweight utility signals (e.g., loss, confidence, or other logged scores) to \emph{tilt} the nominal class histogram $p_0$, producing a harm-biased target mix; and (2) \textbf{Projection}: map this mix into the auditable \emph{stealth ball} $\mathbb{B}_{\mathrm{Div}}(p_0,\delta)$ with $\mathrm{Div} \in \{\mathrm{TV}, \mathrm{KL}\}$. Projection guarantees that each replay batch remains within divergence $\delta$ of $p_0$, keeping batch-level telemetry (the quantities visible to audits) plausibly benign. Unlike harm-agnostic baselines (e.g., \textbf{PO} (Preference-Only)) or fairness/coverage quota schemes (e.g., \textbf{PrO} (Projection-Only)), we \emph{explicitly} insert the harm-based preference \emph{before} projection, optimizing impact while staying within the same auditable constraints \citep{stealthy}.

Crucially, the ``tilt-then-project'' step is \emph{computationally lightweight}: for \textbf{TV}, the exact optimizer is a greedy two-sided water-filling procedure (Algorithm~\ref{alg:side_by_side_projections}, \textsc{ProjectTV}); for \textbf{KL}, it reduces to a single-parameter exponential tilt with a monotone 1-D root search (Algorithm~\ref{alg:side_by_side_projections}, \textsc{ProjectKL}). If utilities are uninformative (e.g., $u$ is constant) or if the active budget is $\delta'=0$, then projection returns $p^\star=p_0$ and the attack has no effect.
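
\noindent\textbf{Illustration (toy values).} The following sketch uses a hypothetical four-class buffer with uniform $p_0=(\tfrac14,\tfrac14,\tfrac14,\tfrac14)$; the utilities and budgets are assumptions chosen for readability, not values from the experiments. For \textbf{TV} with $u=(3,2,1,0)$ and $\delta'=0.1$, the linear objective is maximized by moving $\delta'$ of mass from the lowest-utility class to the highest-utility class (the donor holds $0.25\ge\delta'$, so a single donor suffices):
\[
p^\star=(0.35,\,0.25,\,0.25,\,0.15),\qquad
\mathrm{TV}(p^\star\|p_0)=\tfrac12\,(0.10+0.10)=0.10,
\]
\[
\langle u,p^\star\rangle-\langle u,p_0\rangle=0.10\cdot(3-0)=0.30 .
\]
For \textbf{KL} with $u=(\ln 2,\,0,\,0,\,-\ln 2)$, the exponential tilt $p_\alpha(c)\propto p_0(c)\,e^{\alpha u_c}$ gives, using the natural logarithm,
\[
\mathrm{KL}(p_{0.5}\|p_0)\approx 0.030,\qquad \mathrm{KL}(p_{1}\|p_0)\approx 0.113 .
\]
Because $\tfrac{d}{d\alpha}\mathrm{KL}(p_\alpha\|p_0)=\alpha\,\mathrm{Var}_{p_\alpha}(u)\ge 0$ for $\alpha\ge 0$, the divergence is nondecreasing in $\alpha$. For an assumed budget $\delta'=0.05$, the budget-saturating multiplier therefore lies in $(0.5,1)$, and a bracketing search such as bisection can locate it.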

\subsection{Preliminaries \& Notation}
\label{sec:notation}

We consider a continual learner with a labeled replay buffer \textbf{AUX} $=\mathcal A=\{(x_i,y_i)\}_{i=1}^{n_{\text{aux}}}$. In class-incremental CL, the set of seen classes grows over time; we let $C_t$ denote the number of classes observed up to step $t$, interpret labels as $y_i\in[C_t]$, and represent all class histograms (e.g., $p_0$, $p^\star$, $\bar p_t$) as elements of $\Delta^{C_t}$. When new classes appear, we conceptually append new coordinates and recompute $p_0$ from the current buffer histogram; the divergence audit $\mathrm{Div}(\bar p_t\|p_0)$ is then evaluated in this expanded space. While our attack applies to evolving buffers (e.g., sliding windows), we assume $\mathcal{A}$ is a fixed reservoir for notational simplicity. The nominal (audited) class histogram is $p_0 \in \Delta^{C_t}$, where $\Delta^{C_t}=\{p\in\mathbb{R}_+^{C_t}:\sum_c p_c=1\}$ is the probability simplex; we assume $p_{0,c}>0$ for all $c$ (or use standard smoothing) so that $\mathrm{KL}(p\|p_0)$ is finite for every $p\in\Delta^{C_t}$. For brevity, when the step $t$ is clear we sometimes write $C$ as shorthand for the current $C_t$.

At each training step $t$, the attacker must select a replay batch of a fixed \emph{mass} (size) $m = \lfloor f\,n_{\text{aux}}\rfloor$, determined by a public keep fraction $f$. This selection is represented by an index set $I_t$, which corresponds to integer per-class quotas $q_t \in \mathbb{N}_0^C$ (with $\mathbb{N}_0$ the non-negative integers) where $\sum_c q_{t,c}=m$, and realizes the per-batch histogram $\bar p_t := q_t/m$. Let $a_c$ be the number of available samples of class $c$; quotas are clipped ($q_{t,c} \le a_c$) and the excess is redistributed to preserve $\sum_c q_{t,c}=m$.
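
\noindent\textbf{Illustration (toy values).} As a sketch of how a target mix becomes quotas, assume $n_{\text{aux}}=200$, $f=0.1$, and $C=4$, so that $m=\lfloor 0.1\cdot 200\rfloor=20$, and take a hypothetical target $p^\star=(0.43,\,0.27,\,0.20,\,0.10)$. The rounding rule below (largest remainder) and the redistribution after clipping are assumptions made for this example only:
\[
m\,p^\star=(8.6,\,5.4,\,4.0,\,2.0)
\ \xrightarrow{\ \text{floor}\ }\ (8,5,4,2)
\ \xrightarrow{\ \text{+1 to largest remainder}\ }\ q=(9,5,4,2),
\]
\[
\bar p=q/m=(0.45,\,0.25,\,0.20,\,0.10),\qquad
\mathrm{TV}(\bar p\|p^\star)=\tfrac12\,(0.02+0.02)=0.02 .
\]
If only $a_1=7$ samples of class~1 were available, $q_1$ would be clipped to $7$ and the two freed units reassigned, for instance as $q=(7,6,5,2)$, which still sums to $m=20$. Rounding and clipping can thus move the realized histogram by an amount of order $1/m$, which is why the audit is applied to $\bar p$ rather than to $p^\star$.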

\paragraph{Harm utilities and how they are obtained.}
The attacker’s objective is to maximize harm, quantified by per-class utilities $u=(u_1,\dots,u_C)\in\mathbb{R}^C$. Optional per-sample utilities $\tilde u_i$ (e.g., loss/confidence) are logged as \emph{scalar metadata} alongside AUX items, and class utilities are maintained as EMAs, \(u_c^{(t)}=\rho\,u_c^{(t-1)}+(1-\rho)\,\mathrm{Agg}\{\tilde u_i: y_i=c\}\).
Importantly, the sampler need not query the \emph{current} model online: $\tilde u_i$ can be produced by the training service when an AUX sample is replayed (then stored), refreshed periodically (batch jobs), or replaced by simpler proxies (e.g., age, misclassification counters). If these signals are stale/noisy, the attack degrades gracefully; if $u$ is effectively constant, the optimizer returns $p^\star=p_0$.
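
\noindent\textbf{Illustration (toy values).} To make the EMA update concrete, assume $\rho=0.9$, take $\mathrm{Agg}$ to be the mean (one possible aggregator), and suppose class $c$ has $u_c^{(t-1)}=1.00$ with newly logged per-sample scores $\{1.5,\,2.0,\,2.5\}$; all numbers are hypothetical. Then
\[
u_c^{(t)}=0.9\cdot 1.00+0.1\cdot\tfrac{1.5+2.0+2.5}{3}=0.90+0.20=1.10,
\qquad
u_c^{(t+1)}=0.9\cdot 1.10+0.1\cdot 2.0=1.19
\]
if the same aggregate is observed again. After $k$ updates the weight on the initial value is $\rho^{k}$, so a stale utility fades geometrically rather than abruptly, which is consistent with the graceful degradation described above.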

We work with divergences $\mathrm{Div}\in\{\mathrm{TV},\mathrm{KL}\}$, defined as $\mathrm{TV}(p\|p_0)=\tfrac12\|p-p_0\|_1$ and $\mathrm{KL}(p\|p_0)=\sum_c p_c\log\!\big(\frac{p_c}{p_{0,c}}\big)$. A \emph{stealth radius} $\delta>0$ defines the \emph{stealth ball} $\mathbb{B}_{\mathrm{Div}}(p_0,\delta) := \{ p\in\Delta^C : \mathrm{Div}(p\|p_0)\le \delta \}$. For systems with windowed auditing (length $W$), the window-average histogram is $\hat p_{t-L+1:t}:=\frac1L\sum_{s=t-L+1}^{t}\bar p_s$. A ring buffer $\mathcal{R}$ tracks past divergences, and the attacker computes an \emph{online tightened budget} $\delta_t \le \delta$; we use $\delta'$ to denote the active budget (either $\delta$ or $\delta_t$) at step $t$. Throughout, we treat $p\in\Delta^C$ as a probability vector (sum $=1$); the keep fraction $f$ appears only via the batch mass $m=\lfloor f\,n_{\text{aux}}\rfloor$ when realizing $p^\star$ as integer quotas, and all divergences are evaluated on normalized histograms (e.g., $\bar p_t=q_t/m$). Finally, class-level preference strength is denoted by $\eta>0$ and within-class (intra-class) selection by a temperature $\eta_s>0$.
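
\noindent\textbf{Illustration (toy values).} As a numerical sketch of the two divergences, assume a uniform four-class baseline $p_0=(\tfrac14,\tfrac14,\tfrac14,\tfrac14)$ and a hypothetical realized batch histogram $\bar p=(0.35,\,0.25,\,0.25,\,0.15)$, with the natural logarithm in $\mathrm{KL}$:
\[
\mathrm{TV}(\bar p\|p_0)=\tfrac12\,(0.10+0+0+0.10)=0.10,
\qquad
\mathrm{KL}(\bar p\|p_0)=0.35\ln 1.4+0.15\ln 0.6\approx 0.0411 .
\]
The same batch therefore sits at very different numerical radii under the two audits, so a value of $\delta$ is only meaningful together with the choice of $\mathrm{Div}$; the two numbers are consistent with Pinsker's inequality, $\mathrm{TV}\le\sqrt{\mathrm{KL}/2}\approx 0.143$. Window averaging can also hide per-batch deviations: with a second hypothetical batch $\bar p'=(0.15,\,0.25,\,0.25,\,0.35)$, the two-step window average is
\[
\hat p=\tfrac12(\bar p+\bar p')=p_0,\qquad \mathrm{TV}(\hat p\|p_0)=0,
\]
even though each batch individually has $\mathrm{TV}=0.10$.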

\subsection{Threat Model \& Attack Surface}
\label{sec:threat}
We consider a \textbf{grey-box} insider who controls only the replay sampler and aims to maximize forgetting under a fixed keep fraction $f$ (and hence fixed replay mass $m$) and an audited stealth radius $\delta$; Table~\ref{tab:threat-model} summarizes capabilities and limitations.

We assume the auditor's nominal baseline $p_0$ is computed as a deterministic function of sampler-visible telemetry (e.g., the replay-buffer label histogram at sampling time, or a moving average thereof), so it can be reproduced by the sampler. If the attacker instead observes only a lagged/noisy estimate $\hat p_0$, baseline mismatch effectively reduces the usable visibility margin; Appendix~\ref{app:p0-sensitivity} (Table~\ref{tab:p0-sensitivity}) quantifies the resulting impact--stealth trade-off by varying the lag between $\hat p_0$ and $p_0$.%


\noindent\textbf{Scope.} Our threat model assumes \emph{replay-based} CL with a stored buffer and a distinct replay sampler that selects indices $I_t$. Accordingly, Amnesia does \emph{not} directly apply to \emph{rehearsal-free} CL methods (e.g., prompt-based approaches such as OVOR \citep{ovor} or CODA-Prompt \citep{coda}) because they do not maintain a replay buffer or replay-index selection surface. Conversely, the attack surface is orthogonal to whether the learner fine-tunes the full backbone or updates only a parameter-efficient module (e.g., head/adapters/LoRA/prompts): whenever replay is used, changing the sampled replay indices changes the training signal seen by whatever parameters are trainable. Extensions to other CL pipelines that sample ``past information'' (e.g., prototype selection or class-conditional generative replay) are conceptually related but outside our experimental scope. More broadly, this suggests potential relevance beyond replay-based CL: future non-replay settings, including continual learning for language models, may expose analogous low-privilege attack surfaces wherever a mechanism selects or composes past information (e.g., retrieval/memory selection, prompt/example selection, or routing), even without an explicit replay buffer; we leave direct study of such settings to future work.%


% Place this macro once (before the table or in your preamble)
\newcommand{\itemline}{\par\noindent\rule{0.9\linewidth}{0.3pt}\vspace{0.15em}}

\begin{table}[h!]
\centering
\small
\caption{Threat model: insider control of the replay sampler with constrained budgets and audited visibility.}
\label{tab:threat-model}
\begin{tabular}{p{0.40\linewidth} | p{0.50\linewidth}} 
\rowcolor{babyblue}
\textbf{Capabilities} & \textbf{Attacker Limitations} \\
\hline
Direct control of the replay sampler: set per-class quotas/probabilities; realize index sets \(I_t\) with total mass \(m\). \itemline

Read-only visibility of AUX-buffer labels and \emph{stored} utility scores (\(\tilde u_i\), class-wise \(u_c\)); signals may be lagged EMAs or periodically refreshed logs. \itemline

Temporal planning: schedule compositions and precommit plans for the next \(N\) batches.
&
No access to model internals (weights, gradients, optimizer). \itemline

No read or write access to the live task stream (pixels, labels) and no requirement to query real-time per-index losses from the trainer (utilities are treated as logged metadata). \itemline

Audited controls: fixed keep fraction \(f\) and stealth radius \(\delta\); TV/KL checks on label histograms vs.\ baseline \(p_0\). \itemline

Active monitoring: auditors track per-batch/window histograms/divergences and cross-check sampler code/config and selection logs. 
\\
\hline
\end{tabular}
\end{table}

This attack surface is realistic in modern MLOps pipelines. Integration points include sampler plugins that intercept batch requests (e.g., \texttt{get\_replay\_batch()}), configuration toggles for ``balanced replay'' that route to a quota module, and data-ops jobs that precompute a selection plan for future windows. Legitimate controls such as \emph{class-balancing}, \emph{prioritized replay}, and \emph{curriculum sampling} are commonly exposed via standard \textbf{PyTorch} sampler/data-loader hooks and the same utility telemetry, making this insider surface viable. The attacker can either persistently write malicious index sets or have the sampler execute a precomputed plan.

\subsection{Preference--Projected Replay Attack}
\label{sec:algorithm}

This subsection details the sampler-side routine as shown in Fig.~\ref{fig:overview}. It runs once per training step, manipulates only replay \emph{indices}, and leaves model weights/gradients/loss unchanged. At each step we solve the class-level program
\begin{equation}
\label{eq:core-obj-embedded}
\max_{p\in\Delta^C}\ u^\top p \quad \text{s.t.}\quad \mathrm{Div}(p\|p_0)\le \delta',\ \ \ \mathrm{Div}\in\{\mathrm{TV},\mathrm{KL}\},
\end{equation}
where $u$ are harm utilities and $\delta'$ is the active stealth budget (either the static $\delta$ or the dynamic windowed budget $\delta_t$).

\begin{algorithm}[ht]
\caption{Amnesia Replay Attack}
\label{alg:amnesia-replay}
\small
\begin{algorithmic}[1]
\Require AUX buffer $\mathcal{A}=\{(x_i,y_i,\tilde u_i)\}_{i=1}^{n_{\text{aux}}}$; nominal histogram $p_0\in\Delta^C$; keep fraction $f$; stealth radius $\delta$; window $W$; divergence $\mathrm{Div}\in\{\mathrm{KL},\mathrm{TV}\}$; class utilities $u_{1:C}$; sample temperature $\eta_s>0$; ring buffer $\mathcal{R}$ of past divergences (size $\le W-1$)
\Ensure Replay indices $I$ with exact mass $m$ and enforced per-batch/window stealth
\State $m \gets \lfloor f \cdot n_{\text{aux}} \rfloor$
\If{$W>1$}
  \State $\displaystyle \delta' \gets \max\!\Big\{0,\ \min_{0\le L\le \min(W-1,|\mathcal R|)} \Big((L+1)\delta - \sum_{\ell=|\mathcal R|-L+1}^{|\mathcal R|} \mathcal R[\ell]\Big)\Big\}$ \Comment{empty sum $=0$ for $L=0$}
\Else
  \State $\delta' \gets \delta$
\EndIf
\If{$\mathrm{Div}=\mathrm{KL}$}
  \State $p^\star \gets \textsc{ProjectKL}(p_0,u,\delta')$ \Comment{{Algorithm~\ref{alg:side_by_side_projections} (ProjectKL)}}
\Else \Comment{$\mathrm{Div}=\mathrm{TV}$}
  \State $p^\star \gets \textsc{ProjectTV}(p_0,u,\delta')$ \Comment{ {Algorithm~\ref{alg:side_by_side_projections} (ProjectTV)}}
\EndIf
\State $q \gets \textsc{RoundToSum}(m\,p^\star)$
\State $q \gets \textsc{ClipToAvailability}(q, \mathcal{A})$
\State $q \gets \textsc{AuditAndFixQuotas}(q, p_0, \delta', \mathrm{Div})$ \Comment{unit transfers until $\mathrm{Div}(q/m\|p_0)\le\delta'$}
\State $\bar p \gets q/m$
\State $I \gets \emptyset$
\For{each class $c$}
  \State add $q[c]$ indices of class $c$ by top-$\tilde u_i$ or by sampling $\propto \exp(\eta_s \tilde u_i)$ without replacement
\EndFor
\State push $\mathrm{Div}(\bar p\|p_0)$ to $\mathcal{R}$
\If{$|\mathcal{R}|>W-1$} \State pop oldest \EndIf
\State \Return $I$
\end{algorithmic}
\end{algorithm}

\paragraph{Step A: Preference (tilt).}
Encode harm preference using per-class utilities \(u_c\) and a tilt strength \(\eta>0\). Define the (unconstrained) tilted mix \(\tilde p_c \propto p_{0,c}\,e^{\eta u_c}\). For \textbf{TV} divergence, only the \emph{ordering} of \(u_c\) matters in the next step; the \textbf{KL} solution is determined exactly by the projector (Step~B). When $u$ is constant, tilting has no effect and $\tilde p=p_0$.

\paragraph{Step B: Projection to the stealth ball (exact optimizers).}
We find the optimal proportions $p^\star$ that solve Eq.~\ref{eq:core-obj-embedded}, attaining the maximum harm while satisfying $\mathrm{Div}(p^\star\|p_0)\le\delta'$. The method differs for KL and TV:
\begin{itemize}[leftmargin=1.2em,itemsep=0pt,topsep=0pt]
  \item \textbf{KL ball (single-tilt search).} By Lagrangian optimality, the optimizer has exponential form
  \begin{equation}
  \label{eq:kl-opt}
    p^\star(c)\ =\ \dfrac{p_0(c)\,\exp(\alpha\,u_c)}{\sum_j p_0(j)\,\exp(\alpha\,u_j)},
  \end{equation}
  where the scalar $\alpha \ge 0$ is chosen so that $\mathrm{KL}(p^\star\|p_0)=\delta'$ whenever the constraint is active. Writing $p_\alpha$ for the right-hand side of Eq.~\ref{eq:kl-opt} at tilt $\alpha$, $\mathrm{KL}(p_\alpha\|p_0)$ is monotonically increasing in $\alpha\ge 0$ when $u$ is non-constant, so a bracketed bisection finds $\alpha$ (Algorithm~\ref{alg:side_by_side_projections}, ProjectKL). Each evaluation is \(O(C)\); the overall cost is \(O(C\cdot\text{iters})\).
  \item \textbf{TV ball (two-sided water-filling).} Sort classes by $u_c$ and greedily move probability mass from the \emph{lowest}-utility classes to the \emph{highest}-utility classes until the total $\ell_1$ budget $\sum_c |p_c-p_{0,c}|=2\delta'$ is exhausted, respecting simplex bounds (Algorithm~\ref{alg:side_by_side_projections}, ProjectTV). This greedy, budget-constrained procedure is exact because the objective is linear: it spends the limited TV budget only on the transfers that increase $u^\top p$ the most.
\end{itemize}

\textbf{Both methods return the exact optimizer $p^\star$ of Eq.~\ref{eq:core-obj-embedded}} (for KL, up to the bisection tolerance $\varepsilon_\alpha$).
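\par\noindent
The exponential form and the monotonicity used by the bisection can be checked in a few lines. The following is a standard derivation sketch; the multipliers $\lambda,\nu$ and the normalizer $Z(\alpha)$ are notation introduced only for this illustration, and we assume $p_{0,c}>0$ for all $c$. Stationarity of the Lagrangian $u^\top p-\lambda\big(\mathrm{KL}(p\|p_0)-\delta'\big)-\nu\big(\sum_c p_c-1\big)$ with respect to $p_c$ gives
\begin{equation*}
u_c-\lambda\Big(\log\frac{p_c}{p_{0,c}}+1\Big)-\nu=0
\quad\Longrightarrow\quad
p_c\ \propto\ p_{0,c}\,\exp(u_c/\lambda),
\end{equation*}
which matches Eq.~\ref{eq:kl-opt} with $\alpha=1/\lambda$. Writing $Z(\alpha)=\sum_j p_{0,j}\exp(\alpha u_j)$ and $p_\alpha$ for the tilted distribution, we have $\mathrm{KL}(p_\alpha\|p_0)=\alpha\,\mathbb{E}_{p_\alpha}[u]-\log Z(\alpha)$. Since $\frac{d}{d\alpha}\log Z(\alpha)=\mathbb{E}_{p_\alpha}[u]$ and $\frac{d}{d\alpha}\mathbb{E}_{p_\alpha}[u]=\mathrm{Var}_{p_\alpha}(u)$,
\begin{equation*}
\frac{d}{d\alpha}\,\mathrm{KL}(p_\alpha\|p_0)=\alpha\,\mathrm{Var}_{p_\alpha}(u)\ \ge\ 0,
\qquad
\frac{d}{d\alpha}\,u^\top p_\alpha=\mathrm{Var}_{p_\alpha}(u)\ \ge\ 0 .
\end{equation*}
Both the divergence and the objective therefore grow with $\alpha$, strictly so for $\alpha>0$ when $u$ is non-constant. As $\alpha\to\infty$, $\mathrm{KL}(p_\alpha\|p_0)\to-\log\sum_{c\in\arg\max_j u_j}p_{0,c}$, so bracketing by doubling succeeds only when $\delta'$ lies below this limit; for larger budgets the constraint is inactive and the optimum places all mass on the highest-utility classes.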


\begin{algorithm}[h]
\caption{Projection Solvers for KL and TV Balls}
\label{alg:side_by_side_projections}
\footnotesize

% --- Left Column: ProjectKL ---
\begin{minipage}[t]{0.49\textwidth}
\vspace{0pt}
\textbf{{ProjectKL (left panel):}} \textsc{ProjectKL}$(p_0,u,\delta')$

\hrule
\begin{algorithmic}[1]
    \State \textbf{function} $\textsc{Tilt}(\alpha)$:
    \State \quad $p_\alpha \gets p_{0}\exp(\alpha u)$; \textbf{return} $p_\alpha/\sum p_\alpha$
    
    \State \textbf{if} $\max u = \min u$ \textbf{then} \Return $p_0$ \textbf{end if}
    \State $\alpha_{\text{lo}}\gets 0$;\, $\alpha_{\text{hi}}\gets 1$
    
    \While{$\mathrm{KL}(\textsc{Tilt}(\alpha_{\text{hi}})\|p_0)<\delta'$}
        \State $\alpha_{\text{hi}}\gets 2\,\alpha_{\text{hi}}$
    \EndWhile

    \While{$\alpha_{\text{hi}}-\alpha_{\text{lo}}>\varepsilon_\alpha$}
        \State $\alpha\gets(\alpha_{\text{lo}}+\alpha_{\text{hi}})/2$;\, $p_\alpha\gets\textsc{Tilt}(\alpha)$
        \If{$\mathrm{KL}(p_\alpha\|p_0)<\delta'$}
            \State $\alpha_{\text{lo}}\gets\alpha$
        \Else
            \State $\alpha_{\text{hi}}\gets\alpha$
        \EndIf
    \EndWhile
    \State \Return $p^\star\gets\textsc{Tilt}(\alpha_{\text{hi}})$
\end{algorithmic}
\end{minipage}%
\hfill
% --- Right Column: ProjectTV ---
\begin{minipage}[t]{0.49\textwidth}
\vspace{0pt}
\textbf{{ProjectTV (right panel):}} \textsc{ProjectTV}$(p_0,u,\delta')$
\hrule
\begin{algorithmic}[1]
    \State $p\gets p_0$;\, $b\gets 2\delta'$
    \State Sort indices: $u_{(1)}\le\cdots\le u_{(C)}$
    \State $\ell\gets 1$ (donor), $r\gets C$ (receiver)
    
    \While{$b>0$ and $\ell<r$}
        \State $\varepsilon\gets \min\{\,p_{(\ell)},\,1-p_{(r)},\,b/2\,\}$
        \State $p_{(\ell)}\gets p_{(\ell)}-\varepsilon$
        \State $p_{(r)}\gets p_{(r)}+\varepsilon$
        \State $b\gets b-2\varepsilon$
        
        \If{$p_{(\ell)}=0$} $\ell\gets \ell+1$ \EndIf 
        \If{$p_{(r)}=1$} $r\gets r-1$ \EndIf
    \EndWhile
    \State \Return $p^\star\gets p$
    
    % Phantom lines to align exact bottom
    \State \phantom{Line 14}
    \State \phantom{Line 15}
\end{algorithmic}
\end{minipage}

\end{algorithm}
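\par\noindent
As a sanity check on ProjectTV (Algorithm~\ref{alg:side_by_side_projections}), the following trace uses a hypothetical three-class instance; the numbers are chosen for illustration and are not taken from our experiments. Let $p_0=(0.5,\,0.3,\,0.2)$, $u=(0,\,1,\,2)$, and $\delta'=0.1$ under TV. The budget is $b=2\delta'=0.2$, the donor is class~1 (lowest $u$), and the receiver is class~3 (highest $u$), so the first iteration moves
\begin{equation*}
\varepsilon=\min\{p_{(1)},\,1-p_{(3)},\,b/2\}=\min\{0.5,\,0.8,\,0.1\}=0.1,
\qquad
p^\star=(0.4,\,0.3,\,0.3),
\end{equation*}
after which $b=0$ and the loop stops. The result satisfies $\mathrm{TV}(p^\star\|p_0)=\tfrac12(0.1+0+0.1)=0.1=\delta'$, and the objective rises from $u^\top p_0=0.7$ to $u^\top p^\star=0.9$. The middle class is untouched because moving mass into or out of it would gain less per unit of budget than the transfer between the two extremes.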

\paragraph{Step C: Budgeted quotas and within-class selection.}
Convert \(p^\star\) to integer quotas \(q \in \mathbb{N}_0^{C_t}\) (Alg.~\ref{alg:amnesia-replay}, Lines~12--14): (i) apply \emph{largest-remainder rounding} (Hamilton method~\citep{JansonLinusson2012}): set $q_c \leftarrow \lfloor m\,p^\star_c\rfloor$, then allocate the remaining $m-\sum_c q_c$ units, one each, to the classes with the largest fractional parts of $m\,p^\star_c$ so that $\sum_c q_c=m$; (ii) \emph{clip to availability} ($q_c \le a_c$, where $a_c$ is the number of class-$c$ items available in $\mathcal{A}$) and redistribute any deficit while preserving the total mass $m$; (iii) run \emph{audit-and-fix} to ensure $\mathrm{Div}(q/m\|p_0)\le \delta'$.

\par\noindent
Rounding alone introduces only small, explicitly bounded distortions: with $\bar p=q/m$ evaluated immediately after step~(i), \(\|\bar p-p^\star\|_\infty \le 1/m\) and \(\|\bar p-p^\star\|_1 \le C_t/m\) (i.e., the bound scales with the current number of seen classes).%



For \textbf{TV}, a single-unit transfer from a class with surplus $\bar p_i-p_{0,i}\ge 1/m$ to a class with deficit $p_{0,j}-\bar p_j\ge 1/m$ decreases TV by exactly $1/m$, so feasibility is reached in finitely many swaps (at most \(2\delta' m\) swaps in the worst case). For \textbf{KL}, we perform discrete ``steepest-decrease'' swaps (from the class with the largest log-ratio $\log(\bar p_c/p_{0,c})$ to the class with the smallest), which strictly decrease KL at each step until $\mathrm{KL}(\bar p\|p_0)\le\delta'$.
Within each class, select indices by \emph{top-\(\tilde u_i\)} or \emph{probabilistically} with \(\Pr(i\!\mid\!y_i=c)\propto e^{\eta_s \tilde u_i}\) (no replacement).
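\par\noindent
To make the rounding in step~(i) concrete, consider a hypothetical instance (illustrative numbers only) with $m=10$, $C_t=3$, and $p^\star=(0.37,\,0.33,\,0.30)$. Then $m\,p^\star=(3.7,\,3.3,\,3.0)$, the floors are $(3,\,3,\,3)$ with sum $9$, and the single remaining unit goes to the class with the largest fractional part ($0.7$), giving
\begin{equation*}
q=(4,\,3,\,3),\qquad q/m=(0.4,\,0.3,\,0.3).
\end{equation*}
The distortion is $\|q/m-p^\star\|_\infty=0.03\le 1/m=0.1$ and $\|q/m-p^\star\|_1=0.03+0.03+0=0.06\le C_t/m=0.3$, consistent with the rounding bounds stated above.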

\paragraph{Step D: Windowed visibility (online scheduler).}
With windowed auditing (length \(W\)), we compute a tightened budget \(\delta'_t\) from the ring buffer $\mathcal R$ (Alg.~\ref{alg:amnesia-replay}, Lines~2--6) and enforce $\mathrm{Div}(\bar p_t\|p_0)\le \delta'_t$ at each step. Intuitively, the scheduler spends only the \emph{residual} budget left after accounting for the last $W{-}1$ steps. In ideal arithmetic, this implies a deterministic sliding-window guarantee:

\smallskip
\noindent\textbf{Proposition (Residual-budget window compliance).}
Let \(\delta'_t\) be chosen as in Alg.~\ref{alg:amnesia-replay} and enforce \(\mathrm{Div}(\bar p_t\|p_0)\le \delta'_t\) at every step. If \(\mathrm{Div}\) is convex in its first argument, then for all \(t\) and all \(1\le L\le W\),
\begin{equation}
\label{eq:window-stealth}
\mathrm{Div}\!\big(\hat p_{t-L+1:t}\ \|\ p_0\big)\ \le\ \delta.
\end{equation}
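\emph{Proof sketch.} The residual update ensures that the sum of per-step divergences over any trailing window of length \(L\le W\) never exceeds \(L\delta\). By Jensen's inequality (convexity), \(\mathrm{Div}(\hat p_{t-L+1:t}\|p_0)\le \frac1L\sum_{s=t-L+1}^{t} \mathrm{Div}(\bar p_s\|p_0)\le \delta\), where $\hat p_{t-L+1:t}$ denotes the window-averaged realized histogram.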
\smallskip

In practice, discretization/availability can introduce rare numerical slack; we therefore also report empirical window-violation rates in \S\ref{sec:visibility}.
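\par\noindent
A small numerical trace may help in reading the residual-budget update (Alg.~\ref{alg:amnesia-replay}, Line~3); the buffer contents below are hypothetical. Take $\delta=0.1$, $W=3$, and $\mathcal R=(0.04,\,0.13)$, where the most recent entry overshoots $\delta$ (for instance, through the slack mentioned above). The candidates in the minimum are
\begin{equation*}
L=0:\ \delta=0.10,\qquad
L=1:\ 2\delta-0.13=0.07,\qquad
L=2:\ 3\delta-(0.04+0.13)=0.13,
\end{equation*}
so $\delta'=\max\{0,\min\{0.10,\,0.07,\,0.13\}\}=0.07$, and the two most recent steps together spend at most $0.13+0.07=2\delta$. With $\mathcal R=(0.04,\,0.09)$ instead, the candidates are $0.10$, $0.11$, and $0.17$, so $\delta'=\delta$. More generally, whenever every stored divergence is at most $\delta$, each additional term contributes $\delta-\mathcal R[\ell]\ge 0$ and the $L=0$ candidate attains the minimum; under this update the budget tightens only after a step that exceeded $\delta$.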

\paragraph{Complexity.}
Scheduler update is \(O(W)\). Projection is \(O(C\cdot\text{iters})\) (KL) or \(O(C\log C)\) (TV, for sorting). Quotas/auditing are \(O(C)\) plus a small number of unit transfers.

\subsection{Visibility and Efficiency Guarantees}
\label{sec:visibility}

Our method provides three key guarantees by construction, ensuring stealth and budget compliance; we also state an explicit dominance property.

\paragraph{Guarantee 1: Per-batch Stealth (hard constraint) and tail reporting (policy).}
By Step~C's audit-and-fix, the realized histogram satisfies \(\mathrm{Div}(\bar p_t\|p_0)\le \delta'\) (hence \(\le\delta\)). For reporting, we define the \emph{normalized} per-batch divergence and its 95th-percentile summary:
\begin{equation}
\label{eq:rbatch95-visibility}
r_t := \frac{\mathrm{Div}(\bar p_t\|p_0)}{\delta}\in[0,1],
\qquad
r_{\text{batch@95}} := \operatorname{Quantile}_{0.95}\{r_t\}.
\end{equation}
We use \(r_{\text{batch@95}}\le 0.05\) as a \emph{calibrated reporting threshold}: it means 95\% of batches spend at most 5\% of the audit radius, leaving headroom for rounding/availability noise. This is a policy choice for distinguishing ``highly stealthy'' regimes (not a requirement implied by the definition of the audit ball). The appendix reports the clean-run calibration motivating the 0.05 band.

\paragraph{Guarantee 2: Windowed Stealth (deterministic scheduler) and empirical slack.}
Under Step~D’s residual-budget scheduler and convexity of \(\mathrm{Div}\), Eq.~\ref{eq:window-stealth} holds for any window \(L \le W\).
We additionally report an empirical \textbf{window violation rate} to capture any residual exceedances due to discretization/measurement noise:
\begin{equation}
\label{eq:rwin-visibility}
r_{\text{win}} \;=\; \frac{1}{T}\sum_{t=1}^{T} \mathbb{1}\!\left\{\mathrm{Div}\!\big(\hat p_{t-W+1:t} \,\|\, p_0\big) > \delta \right\}.
\end{equation}
We treat \(r_{\text{win}}\le 0.05\) as an \emph{operational acceptance band} (clean-calibrated) rather than a theoretical necessity.

\paragraph{Guarantee 3: Budget Conservation.}
The quota generation process (Step~C) is strictly mass-preserving, ensuring \(\sum_{c=1}^C q_{t,c}=m\). This yields a realized keep fraction \(\hat f_t=m/n_{\text{aux}}\) that tightly tracks the target \(f\), with \(|\hat f_t-f|\le 1/n_{\text{aux}}\). We verify this using the \textbf{95th-percentile fraction error}:
\begin{equation}
\label{eq:e95-visibility}
e_{95} \;=\; \operatorname{Quantile}_{0.95}\!\big(|\hat f_t - f|\big).
\end{equation}
Because mass is exact, we have the deterministic bound \(e_{95}\le 1/n_{\text{aux}}\) (e.g., \(\le 0.002\) for \(n_{\text{aux}}{=}500\)); we use \(e_{95}\!\le\!0.02\) only as a loose reporting range.

\paragraph{Dominance property (optimality under the audited constraint).}
Fix any non-constant utility vector $u$, nominal histogram $p_0$, divergence type $\mathrm{Div}\in\{\mathrm{KL},\mathrm{TV}\}$, and active budget $\delta'$. Let $p^\star$ be the solution of Eq.~\ref{eq:core-obj-embedded}. Then for any other feasible sampler-level class mix $p\in\Delta^C$ with $\mathrm{Div}(p\|p_0)\le \delta'$, we have \(u^\top p \le u^\top p^\star\), with strict inequality unless $p$ is also optimal (and in particular if $\delta'>0$ and $u$ is non-constant, the optimizer is unique for KL). Thus, given a specified harm surrogate $u$ and the same auditable constraint set, Amnesia’s projection step is not a heuristic: it is the exact best-response class composition.

\section{Experimental Setup}

\subsection{Datasets}
\label{sec:datasets}
We evaluate on four standard CL benchmarks of increasing difficulty: \textit{Split CIFAR-10}~\citep{cifar} (10 classes, 32\(\times\)32 RGB; 5 tasks \(\times\) 2 classes; the easiest setting), \textit{CORe50}~\citep{core50} (50 classes; class-incremental; 10 tasks \(\times\) 5 classes; harder), \textit{Split CIFAR-100}~\citep{cifar} (100 classes, 32\(\times\)32 RGB; 10 tasks \(\times\) 10 classes), and \textit{Tiny-ImageNet}~\citep{tinyimagenet} (ImageNet subset with 200 classes at 64\(\times\)64; the most challenging setting). For Tiny-ImageNet, we use 100 classes as 5 tasks \(\times\) 20 classes and reserve the remaining 100 for auxiliary out-of-stream ablations.

\subsection{Replay-based Continual Learning Methods}
\label{sec:replay-methods}
Replay CL maintains a small memory \(\mathcal{M}\) and, at step \(t\), optimizes on the union of the current mini-batch \(B_t\) and a memory mini-batch \(M_t\!\subset\!\mathcal{M}\) (typically via reservoir sampling). We attack the canonical \textbf{Experience Replay (ER)}~\citep{er} and three prominent extensions: \textbf{ER-ACE}~\citep{erace} (asymmetric cross-entropy restricting logits on new data to current-task classes to reduce representation drift), \textbf{DER++}~\citep{der} (ER with knowledge distillation by matching current logits to stored past logits for memory samples), and \textbf{SCR}~\citep{scr} (supervised contrastive loss on mixed new+replay batches to learn a more unified representation).

\subsection{Training Protocol}
Unless stated otherwise, we use \textbf{ResNet-18}~\citep{resnet} trained with \textbf{SGD} (learning rate \(0.03\)), mini-batch size \(64\), and buffer size \(500\). We follow the official dataset splits and normalize images to \([0,1]\); CORe50 uses the \emph{New Classes (NC)} scenario. Each task is trained for \textbf{10 epochs} with a \textbf{fixed} task order; results are averaged over \textbf{5} seeds. Default replay/audit settings are keep fraction \(f{=}0.1\), stealth radius \(\delta{=}0.1\), and audit window \(W{=}10\).

\paragraph{Clean vs.\ attacked hyperparameters.}
For every method/dataset, learner-side hyperparameters are \emph{identical} between clean and attacked runs (optimizer, learning rate, epochs, augmentations, buffer size, and method-specific settings), isolating the effect of \emph{sampler-level replay composition} (index selection) from retuning. Since CL performance can be hyperparameter-sensitive, we focus on \emph{relative} degradation under a fixed, standard protocol rather than globally optimal tuning.

\paragraph{Sampler telemetry used to construct utilities.}
Amnesia consumes lightweight, \emph{sampler-readable} utility logs produced by the training job and read asynchronously by the sampler. Unless stated otherwise, \(\tilde u_i\) is the cross-entropy loss of a replayed example the last time it was observed during training, and \(u_c\) is an exponential moving average (EMA) of the mean of \(\tilde u_i\) over samples with label \(c\); thus the sampler requires no real-time per-index model queries and only reads the latest available utility snapshot.

\paragraph{Nominal histogram for auditing.}
Unless stated otherwise, \(p_0\) is the replay-buffer class histogram at sampling time (with standard smoothing to ensure \(p_{0,c}>0\) for KL), and all audit metrics (\(\mathrm{TV}\) or \(\mathrm{KL}\)) are computed between the realized replay histogram and this \(p_0\).


\subsection{Evaluation Criteria}
Beyond the stealth/budget metrics in \S\ref{sec:visibility}, we report final accuracy and backward transfer (BWT). Let \(R_{i,j}\) denote performance on task \(j\) after training through task \(i\), and let \(T\) be the number of tasks:
\begin{equation}
\mathrm{BWT}=\frac{1}{T-1}\sum_{j=1}^{T-1}\!\big(R_{T,j}-R_{j,j}\big),
\qquad
\mathrm{ACC}=\frac{1}{T}\sum_{j=1}^{T} R_{T,j}.
\label{eq:bwt_acc}
\end{equation}
By convention, negative BWT indicates forgetting; we therefore report \(\mathbf{-\mathrm{BWT}}\) so that larger values correspond to more forgetting (stronger attack impact).
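\par\noindent
As a worked instance of Eq.~\ref{eq:bwt_acc}, take a hypothetical $T=3$ run (values invented for illustration, in \%) with $R_{1,1}=90$, $R_{2,2}=85$, and final-row accuracies $R_{3,1}=60$, $R_{3,2}=70$, $R_{3,3}=80$:
\begin{equation*}
\mathrm{BWT}=\tfrac12\big[(60-90)+(70-85)\big]=-22.5,
\qquad
\mathrm{ACC}=\tfrac13\,(60+70+80)=70 .
\end{equation*}
Such a run would be reported as $\mathrm{ACC}=70$ and $-\mathrm{BWT}=22.5$.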

\section{Results}
\label{sec:Results}

\begin{table*}[t]
\caption{\textbf{Amnesia attack results (Impact / Stealth / Budget).}
Base (no attack) vs.\ \textbf{KL} and \textbf{TV} for all models/datasets.
\textbf{Impact:} ACC$\downarrow$, $-\mathrm{BWT}\uparrow$;\;
\textbf{Stealth:} $r_{\text{batch@95}}\downarrow$, $r_{\text{win}}\downarrow$;\;
\textbf{Budget:} $e_{95}\downarrow$.
\emph{Mean{\scriptsize$\pm$}std over 5 seeds. Stealth/Budget are $\times10^{-2}$ (e.g., $5\!=\!0.05$).}
\emph{Red cells exceed conservative, clean-calibrated reporting bands
($r_{\text{batch@95}}\!\le\!0.05$, $r_{\text{win}}\!\le\!0.05$, $e_{95}\!\le\!0.02$); these are reporting policies
(tail headroom / violation-rate targets), not a change to the enforced per-batch audit constraint
$\mathrm{Div}(\bar p\|p_0)\le \delta'$ (guaranteed by construction).}}
\label{tab:impact-stealth-budget}
\centering
{\small
\setlength{\tabcolsep}{2.5pt}
\renewcommand{\arraystretch}{1.15}
\resizebox{\textwidth}{!}{%
\begin{tabular}{l l c | c c c | c c c}
\hline
\multirow{2}{*}[-1ex]{\textbf{Dataset}} &
\multirow{2}{*}[-1ex]{\textbf{Model}} &
\multirow{2}{*}[-2ex]{\makecell{\textbf{No Attack}\\[0.5ex]ACC$\downarrow$ ($-$BWT$\uparrow$)}} &
\multicolumn{3}{c|}{\tstrut\textbf{KL}\bstrut} &
\multicolumn{3}{c}{\tstrut\textbf{TV}\bstrut} \\
\cline{4-9}
& & &
\makecell{\textbf{Impact}\\ ACC$\downarrow$ \;($-$BWT$\uparrow$)} &
\makecell{\textbf{Stealth}\\ $r_{\text{batch@95}}\downarrow$ \,($r_{\text{win}}\downarrow$)} &
\makecell{\textbf{Budget}\\ $e_{95}\downarrow$} &
\makecell{\textbf{Impact}\\ ACC$\downarrow$ \;($-$BWT$\uparrow$)} &
\makecell{\textbf{Stealth}\\ $r_{\text{batch@95}}\downarrow$ \,($r_{\text{win}}\downarrow$)} &
\makecell{\textbf{Budget}\\ $e_{95}\downarrow$} \\
\hline
\multirow{4}{*}{CIFAR-10}
  & ER         & 50.4{\scriptsize$\pm$0.6} (55.1{\scriptsize$\pm$1.3})
               & 29.3{\scriptsize$\pm$0.29} (82.7{\scriptsize$\pm$2.0})
               & 5.0{\scriptsize$\pm$0.4} (0.2{\scriptsize$\pm$0.1})
               & 1.0{\scriptsize$\pm$0.3}
               & 25.1{\scriptsize$\pm$0.25} (88.2{\scriptsize$\pm$2.3})
               & {5.0{\scriptsize$\pm$2.0} (6.0{\scriptsize$\pm$1.5})}
               & 1.5{\scriptsize$\pm$0.5} \\
  & \textbf{SCR} & 59.6{\scriptsize$\pm$0.7} (38.9{\scriptsize$\pm$0.4})
               & \textcolor{darkgreen}{\textbf{31.1{\scriptsize$\pm$0.8} (78.2{\scriptsize$\pm$2.1})}}
               & \textcolor{darkgreen}{\textbf{4.0{\scriptsize$\pm$0.3} (0.7{\scriptsize$\pm$0.2})}}
               & \textcolor{darkgreen}{\textbf{1.5{\scriptsize$\pm$0.4}}}
               & 28.9{\scriptsize$\pm$0.29} (83.4{\scriptsize$\pm$2.3})
               & 4.0{\scriptsize$\pm$0.3} (0.7{\scriptsize$\pm$0.2})
               & {1.8{\scriptsize$\pm$0.5}} \\
  & DER++      & 64.0{\scriptsize$\pm$0.7} (29.1{\scriptsize$\pm$0.3})
               & 33.1{\scriptsize$\pm$0.9} (69.1{\scriptsize$\pm$2.0})
               & 5.0{\scriptsize$\pm$0.5} (0.2{\scriptsize$\pm$0.1})
               & 0.9{\scriptsize$\pm$0.3}
               & 30.7{\scriptsize$\pm$0.90} (75.1{\scriptsize$\pm$2.2})
               & \textcolor{red}{10{\scriptsize$\pm$4.0} (1.0{\scriptsize$\pm$0.5})}
               & 2.0{\scriptsize$\pm$0.7} \\
  & ER-ACE     & 65.2{\scriptsize$\pm$0.8} (15.2{\scriptsize$\pm$0.15})
               & 34.2{\scriptsize$\pm$1.0} (56.9{\scriptsize$\pm$1.7})
               & 1.0{\scriptsize$\pm$0.2} (0.1{\scriptsize$\pm$0.1})
               & 1.0{\scriptsize$\pm$0.3}
               & 32.3{\scriptsize$\pm$0.95} (60.3{\scriptsize$\pm$1.8})
               & 4.0{\scriptsize$\pm$0.4} (3.0{\scriptsize$\pm$0.8})
               & 2.0{\scriptsize$\pm$0.7} \\
\hline

\multirow{4}{*}{CORe50}
  & \textbf{ER} & 55.4{\scriptsize$\pm$0.7} (52.5{\scriptsize$\pm$1.5})
               & 32.5{\scriptsize$\pm$0.8} (84.4{\scriptsize$\pm$2.1})
               & 0.8{\scriptsize$\pm$0.2} (0{\scriptsize$\pm$0.0})
               & 0.9{\scriptsize$\pm$0.3}
               & \textcolor{darkgreen}{\textbf{32.1{\scriptsize$\pm$0.96} (85.8{\scriptsize$\pm$2.2})}}
               & \textcolor{darkgreen}{\textbf{4.0{\scriptsize$\pm$0.3} (0{\scriptsize$\pm$0.0})}}
               & \textcolor{darkgreen}{\textbf{0.8{\scriptsize$\pm$0.3}}} \\
  & SCR        & 52.9{\scriptsize$\pm$0.8} (58.8{\scriptsize$\pm$1.6})
               & 28.5{\scriptsize$\pm$0.28} (87.2{\scriptsize$\pm$2.2})
               & 5.0{\scriptsize$\pm$0.4} (0{\scriptsize$\pm$0.0})
               & 0.9{\scriptsize$\pm$0.3}
               & 33.5{\scriptsize$\pm$1.00} (82.4{\scriptsize$\pm$2.1})
               & 3.0{\scriptsize$\pm$0.3} (5.0{\scriptsize$\pm$0.8})
               & 0.9{\scriptsize$\pm$0.3} \\
  & DER++      & 61.7{\scriptsize$\pm$0.8} (47.3{\scriptsize$\pm$0.47})
               & 30.1{\scriptsize$\pm$0.90} (86.8{\scriptsize$\pm$2.2})
               & \textcolor{red}{8.0{\scriptsize$\pm$3.0} (0.1{\scriptsize$\pm$0.5})}
               & 1.0{\scriptsize$\pm$0.6}
               & 32.2{\scriptsize$\pm$0.96} (84.53{\scriptsize$\pm$2.4})
               & \textcolor{red}{10{\scriptsize$\pm$4.0} (0.3{\scriptsize$\pm$1.0})}
               & 1.0{\scriptsize$\pm$0.8} \\
  & ER-ACE     & 68.7{\scriptsize$\pm$0.9} (24.5{\scriptsize$\pm$0.25})
               & 36.0{\scriptsize$\pm$1.0} (75.7{\scriptsize$\pm$2.0})
               & 1.4{\scriptsize$\pm$0.4} (0.1{\scriptsize$\pm$0.1})
               & 2.0{\scriptsize$\pm$0.8}
               & 34.6{\scriptsize$\pm$1.00} (74.7{\scriptsize$\pm$2.1})
               & {4.0{\scriptsize$\pm$1.5} (3.0{\scriptsize$\pm$2.0})}
               & 0.9{\scriptsize$\pm$0.6} \\
\hline

\multirow{4}{*}{CIFAR-100}
  & ER         & 35.2{\scriptsize$\pm$0.7} (57.4{\scriptsize$\pm$1.7})
               & 20.3{\scriptsize$\pm$0.20} (72.7{\scriptsize$\pm$2.2})
               & 4.0{\scriptsize$\pm$0.6} (2.0{\scriptsize$\pm$0.6})
               & 0.3{\scriptsize$\pm$0.2}
               & 21.8{\scriptsize$\pm$0.22} (73.1{\scriptsize$\pm$2.2})
               & 5.0{\scriptsize$\pm$0.7} (3.0{\scriptsize$\pm$0.6})
               & 1.0{\scriptsize$\pm$0.5} \\
  & SCR        & 33.5{\scriptsize$\pm$0.7} (52.3{\scriptsize$\pm$1.6})
               & 21.5{\scriptsize$\pm$0.22} (75.6{\scriptsize$\pm$2.3})
               & 4.0{\scriptsize$\pm$0.5} (3.0{\scriptsize$\pm$0.6})
               & 1.2{\scriptsize$\pm$0.7}
               & 19.2{\scriptsize$\pm$0.19} (81.4{\scriptsize$\pm$2.4})
               & \textcolor{red}{15{\scriptsize$\pm$4.0} (2.0{\scriptsize$\pm$1.0})}
               & 1.9{\scriptsize$\pm$0.9} \\
  & DER++      & 33.1{\scriptsize$\pm$0.7} (45.2{\scriptsize$\pm$0.45})
               & 18.7{\scriptsize$\pm$0.19} (63.6{\scriptsize$\pm$1.9})
               & \textcolor{red}{13{\scriptsize$\pm$5.0} (2.0{\scriptsize$\pm$1.0})}
               & 1.9{\scriptsize$\pm$0.9}
               & 20.1{\scriptsize$\pm$0.20} (60.5{\scriptsize$\pm$1.8})
               & \textcolor{red}{41{\scriptsize$\pm$5.0} (6{\scriptsize$\pm$5.0})}
               & \textcolor{red}{5.0{\scriptsize$\pm$3.0}} \\
  & \textbf{ER-ACE} & 44.2{\scriptsize$\pm$0.9} (21.3{\scriptsize$\pm$0.21})
               & 23.3{\scriptsize$\pm$0.23} (65.6{\scriptsize$\pm$2.0})
               & 3.7{\scriptsize$\pm$0.6} (0{\scriptsize$\pm$0.0})
               & 0.2{\scriptsize$\pm$0.2}
               & \textcolor{darkgreen}{\textbf{21.5{\scriptsize$\pm$0.22} (70.1{\scriptsize$\pm$2.1})}}
               & \textcolor{darkgreen}{\textbf{5.0{\scriptsize$\pm$0.9} (5.0{\scriptsize$\pm$0.9})}}
               & \textcolor{darkgreen}{\textbf{0.9{\scriptsize$\pm$0.8}}} \\
\hline

\multirow{4}{*}{Tiny-ImageNet}
  & ER         & 17.0{\scriptsize$\pm$0.17} (21.3{\scriptsize$\pm$0.21})
               & 8.2{\scriptsize$\pm$0.08} (39.3{\scriptsize$\pm$0.39})
               & 4.0{\scriptsize$\pm$0.6} (3.0{\scriptsize$\pm$0.6})
               & 0.6{\scriptsize$\pm$0.5}
               & 6.5{\scriptsize$\pm$0.06} (45.9{\scriptsize$\pm$0.45})
               & \textcolor{red}{11{\scriptsize$\pm$4.0} (12{\scriptsize$\pm$5.0})}
               & 1.2{\scriptsize$\pm$0.8} \\
  & SCR        & 22.5{\scriptsize$\pm$0.22} (20.2{\scriptsize$\pm$0.20})
               & 11.4{\scriptsize$\pm$0.11} (33.7{\scriptsize$\pm$0.33})
               & \textcolor{red}{12{\scriptsize$\pm$4.0} (2.0{\scriptsize$\pm$1.0})}
               & 1.1{\scriptsize$\pm$0.8}
               & 9.9{\scriptsize$\pm$0.09} (42.8{\scriptsize$\pm$0.42})
               & 5.0{\scriptsize$\pm$0.8} (4.0{\scriptsize$\pm$0.8})
               & 2.0{\scriptsize$\pm$0.9} \\
  & DER++      & 21.8{\scriptsize$\pm$0.22} (19.5{\scriptsize$\pm$0.20})
               & 10.2{\scriptsize$\pm$0.10} (30.1{\scriptsize$\pm$0.30})
               & \textcolor{red}{11{\scriptsize$\pm$3.0} (2.0{\scriptsize$\pm$1.0})}
               & 1.2{\scriptsize$\pm$0.8}
               & 11.8{\scriptsize$\pm$0.12} (38.4{\scriptsize$\pm$0.38})
               & \textcolor{red}{34{\scriptsize$\pm$5.0} (36{\scriptsize$\pm$5.0})}
               & \textcolor{red}{5.1{\scriptsize$\pm$3.0}} \\
  & \textbf{ER-ACE} & 23.5{\scriptsize$\pm$0.24} (15.7{\scriptsize$\pm$0.16})
               & \textcolor{darkgreen}{\textbf{13.1{\scriptsize$\pm$0.13} (45.1{\scriptsize$\pm$0.45})}}
               & \textcolor{darkgreen}{\textbf{2.8{\scriptsize$\pm$0.6} (1.0{\scriptsize$\pm$0.6})}}
               & \textcolor{darkgreen}{\textbf{0.7{\scriptsize$\pm$0.5}}}
               & 10.5{\scriptsize$\pm$0.10} (52.4{\scriptsize$\pm$1.50})
               & 5.0{\scriptsize$\pm$0.7} (4.0{\scriptsize$\pm$0.8})
               & 1.9{\scriptsize$\pm$0.8} \\
\hline
\end{tabular}
} % end resizebox
} % end small block
\end{table*}

\paragraph{Main impact and audit adherence.}
Table~\ref{tab:impact-stealth-budget} summarizes \emph{impact} (ACC$\downarrow$, $-\mathrm{BWT}\uparrow$) and \emph{audit adherence} (Stealth, Budget). The sampler enforces the core per-batch constraint $\mathrm{Div}(\bar p\|p_0)\le \delta'$ by construction (via audit-and-fix); \textcolor{red}{red} denotes exceeding conservative reporting bands: (i) $r_{\text{batch@95}}\!\le\!0.05$ (95\% of steps spend at most 5\% of the audit radius; tail headroom) and (ii) $r_{\text{win}}\!\le\!0.05$ (an operational target for rare window exceedances under discretization/availability noise). Under these criteria, \textcolor{darkgreen}{green} highlights the best \emph{compliant} entries per setting.

Across datasets and replay methods, Amnesia induces substantial forgetting: ACC drops sharply and $-\mathrm{BWT}$ rises. Canonical ER is often (though not always) among the most vulnerable. ER-ACE, which mitigates representation drift, is typically the most robust \emph{among compliant runs} (e.g., Tiny-ImageNet/KL) while still suffering large degradation. DER++ is particularly sensitive in high-class regimes (CIFAR-100, Tiny-ImageNet), exhibiting both strong impact and frequent red-flagged stealth/budget issues. A plausible explanation is that DER++’s distillation loss sharpens the utility landscape and creates more extreme sampling pressures, which are harder to realize cleanly under integer quotas and limited per-class availability.

\paragraph{When and why do red-flagged audit metrics occur?}
Band exceedances are more common as the \emph{effective granularity} tightens, especially when $m/C \approx 1$ (few items per class per batch). CORe50 ($C{=}50$) is representative when $f{=}0.1$ (small $m$), and CIFAR-100 / Tiny-ImageNet are also tight because $C$ is large. In these regimes, discretization (rounding), clipping to availability, and sparse allocations (especially for TV) can create batch-to-batch spikes in normalized divergence and rare window exceedances. This matches \S\ref{sec:algorithm}: when $m/C$ is small, even a one-unit quota change yields a large change in $\bar p$, so realized histograms become inherently ``chunky,'' increasing the risk of tail spikes in $r_t$ and, for TV, occasional window exceedances under aggressive reallocation.
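\par\noindent
The granularity argument can be quantified with the default settings. As a rough sketch, assume that $n_{\text{aux}}$ equals the buffer size of $500$ and that all $C$ classes have been seen. Then $m=\lfloor 0.1\cdot 500\rfloor=50$, and $m/C$ is $50/10=5$ for Split CIFAR-10, $50/50=1$ for CORe50, and $50/100=0.5$ for Split CIFAR-100 and for the 100-class Tiny-ImageNet stream. A single-unit transfer between two classes shifts each of the two coordinates of $\bar p$ by $1/m=0.02$ and can change TV by up to
\begin{equation*}
\tfrac12\Big(\tfrac1m+\tfrac1m\Big)=\tfrac1m=0.02,
\end{equation*}
which is $20\%$ of the audit radius $\delta=0.1$. Under these assumptions, a one-unit deviation from $p_0$ already corresponds to $r_t=0.2$, above the $0.05$ reporting band, which is consistent with the tail spikes described above.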

\paragraph{KL vs.\ TV trade-off and the greedy budget-aware baseline.}
A consistent KL--TV trade-off emerges: TV often yields slightly stronger impact but triggers more red-flagged stealth/budget entries. This follows from the projectors. KL’s exponential tilt, $p^\star(c)\propto p_0(c)\exp(\alpha u_c)$, preserves positive mass on all classes and discretizes more smoothly, producing fewer tail/window spikes. TV’s two-sided water-filling can push low-utility classes toward $0$, producing a sparser and more extreme $p^\star$ (closer to the unconstrained optimum) that is harder to realize under integerization and availability constraints.


This also addresses whether a simple greedy, budget-aware baseline would behave differently: \emph{the TV variant is itself greedy and budget-aware.}
The TV projector is exactly two-sided water-filling, greedily transferring probability mass from the lowest-utility to the highest-utility classes until the TV budget is exhausted (Algorithm~\ref{alg:side_by_side_projections}, ProjectTV).
For KL, the ``optimization'' is a single-parameter monotone search implementing the exponential-tilt solution (Algorithm~\ref{alg:side_by_side_projections}, ProjectKL).
Thus, the key modeling choice is the harm surrogate $u$ (defined in \S\ref{sec:notation} and instantiated from the sampler telemetry described in the training protocol).

\subsection{Robustness to Backbone Choice}
\label{sec:backbone-robustness}

% \begin{table}[ht]
% \caption{\textbf{CIFAR-100 (ER-ACE, KL):} backbone sensitivity (mean{\scriptsize$\pm$}std over 5 seeds; Stealth/Budget $\times10^{-2}$).}
% \label{tab:cifar100_erace_transposed_splitrows_vit_convnext}
% \centering
% {\small
% \setlength{\tabcolsep}{5pt}
% \renewcommand{\arraystretch}{0.95}
% \begin{tabular}{l c c}
% \hline
% \rowcolor{babyblue}
% \textbf{Metric} & \textbf{VIT tiny} & \textbf{Convnext Base} \\
% \hline
% Baseline ACC & 48.1{\scriptsize$\pm$0.8} & 51.7{\scriptsize$\pm$0.9} \\
% Baseline Forget & 20.0{\scriptsize$\pm$0.20} & 19.1{\scriptsize$\pm$0.18} \\
% Attack ACC & 25.6{\scriptsize$\pm$0.25} & 27.9{\scriptsize$\pm$0.30} \\
% Attack Forget & 63.8{\scriptsize$\pm$1.8} & 61.9{\scriptsize$\pm$1.7} \\
% Stealthy $r_{\text{batch@95}}$ & 3.5{\scriptsize$\pm$0.6} & 3.3{\scriptsize$\pm$0.5} \\
% Stealthy $r_{\text{win}}$ & 0{\scriptsize$\pm$0.0} & 0{\scriptsize$\pm$0.0} \\
% Budget $e_{95}$ & 0.2{\scriptsize$\pm$0.2} & 0.2{\scriptsize$\pm$0.2} \\
% \hline
% \end{tabular}
% }
% \end{table}

Replacing ResNet-18 with ViT-Tiny and ConvNeXt-Base (CIFAR-100, ER-ACE, KL) preserves the qualitative conclusion (Table~\ref{tab:cifar100_erace_transposed_splitrows_vit_convnext}): Amnesia remains strongly effective (large ACC drops and severe increases in forgetting) while staying within the same audit policy (low $r_{\text{batch@95}}$, zero $r_{\text{win}}$, and tight $e_{95}$). Stronger backbones improve clean performance and show slightly higher robustness under attack (higher attacked ACC and slightly lower forgetting), suggesting that increased representational stability can reduce (but not eliminate) the leverage of divergence-constrained replay steering.


\begin{table}[ht]
    \centering
    % Table 1: Backbone Metrics
    \begin{minipage}{0.37\textwidth}
        \centering
        \caption{\textbf{CIFAR-100 (ER-ACE) under the KL attack.} Values are mean{\scriptsize$\pm$}std over 5 seeds.
        Stealth and Budget are reported in units of $10^{-2}$ (e.g., $5$ denotes $0.05$).}
        \label{tab:cifar100_erace_transposed_splitrows_vit_convnext}
        \resizebox{\linewidth}{!}{%
        \begin{tabular}{l c c}
        \hline
        \rowcolor{babyblue}
        \textbf{Metric} & \textbf{ViT-Tiny} & \textbf{ConvNeXt-Base} \\
        \hline
        Baseline ACC & 48.1{\scriptsize$\pm$0.8} & 51.7{\scriptsize$\pm$0.9} \\
        Baseline Forget & 20.0{\scriptsize$\pm$0.20} & 19.1{\scriptsize$\pm$0.18} \\
        Attack ACC & 25.6{\scriptsize$\pm$0.25} & 27.9{\scriptsize$\pm$0.30} \\
        Attack Forget & 63.8{\scriptsize$\pm$1.8} & 61.9{\scriptsize$\pm$1.7} \\
        Stealthy $r_{\text{batch@95}}$ & 3.5{\scriptsize$\pm$0.6} & 3.3{\scriptsize$\pm$0.5} \\
        Stealthy $r_{\text{win}}$ & 0{\scriptsize$\pm$0.0} & 0{\scriptsize$\pm$0.0} \\
        Budget $e_{95}$ & 0.2{\scriptsize$\pm$0.2} & 0.2{\scriptsize$\pm$0.2} \\
        \hline
        \end{tabular}}
    \end{minipage}
    \hfill
    % Table 2: Ablation Study
    \begin{minipage}{0.60\textwidth}
        \centering
        \caption{\textbf{Ablation of preference and projection on ER-ACE (Split CIFAR-10).}
        \checkmark/\xmark\ denote enabled/disabled components. \textbf{aux\_trim}: utility-based prioritization of replay items (preference step); \textbf{PO}: aux\_trim only; \textbf{PrO}: projection only; \textbf{Amnesia}: both.}
        \label{tab:scdt-results}
        \resizebox{\linewidth}{!}{%
        \begin{tabular}{l c c c r r r}
        \hline
        \rowcolor{babyblue}
        \textbf{Model} & \textbf{aux\_trim} & \textbf{PrO} & \textbf{Div.} &
        \textbf{ACC / $-$BWT} &
        $\boldsymbol{r_{\text{batch@95}}(r_{\text{win}})}$ &
        \textbf{$e_{95}$} \\
        \hline

        \rowcolor{AlgBack!45}
        ER-ACE (Baseline) & \xmark & \xmark & N/A & 65.21 / 15.25 & N/A            & N/A \\
        PO        & \checkmark & \xmark & N/A & 68.55 / 12.39 & N/A         & N/A \\
        \rowcolor{AlgBack!45}
        PrO (KL)     & \xmark & \checkmark & KL  & 45.07 / 31.67 & 0.516 (0.224) & 0.013 \\
        PrO (TV)        & \xmark & \checkmark & TV  & 47.38 / 26.06 & 1.183 (0.222) & 0.012 \\
        \rowcolor{AlgBack!45}
        Amnesia (TV)     & \checkmark & \checkmark & TV  & 32.34 / 60.30 & 0.040 (0.085) & 0.020 \\
        \textcolor{darkgreen}{\textbf{Amnesia (KL)}} &
          \textcolor{darkgreen}{\textbf{\checkmark}} &
          \textcolor{darkgreen}{\textbf{\checkmark}} &
          \textcolor{darkgreen}{\textbf{KL}} &
          \textcolor{darkgreen}{\textbf{34.20 / 56.90}} &
          \textcolor{darkgreen}{\textbf{0.010 (0.000)}} &
          \textcolor{darkgreen}{\textbf{0.011}} \\
        \hline
        \end{tabular}}
    \end{minipage}
\end{table}

\paragraph{Ablation insights (ER\mbox{-}ACE, Split CIFAR\mbox{-}10).}
Table~\ref{tab:scdt-results} disentangles \textbf{Preference} (aux\_trim) and \textbf{Projection} (PrO).
\emph{Preference without Projection} (\textbf{PO}) can improve clean CL (higher ACC, lower $-\mathrm{BWT}$): PO resembles prioritized replay / hard-example mining and does not enforce an adversarial class-level reallocation under an audit constraint, so it can act as a benign optimization rather than an attack. \emph{Projection without Preference} (\textbf{PrO}) increases forgetting but is substantially more visible (large $r_{\text{batch@95}}$ and non-trivial $r_{\text{win}}$), showing that spending divergence budget without harm-targeted preference is inefficient and detectable. The full two-stage method restores the intended harm--stealth balance: \textbf{Amnesia (KL)} achieves strong forgetting with clean batch/window compliance, while \textbf{Amnesia (TV)} is more aggressive but generally more brittle because sparse, extreme allocations are harder to realize under discretization and availability constraints.

\begin{figure*}[h!]
    \centering
    \includegraphics[width=\textwidth]{images/Ablation.png}
\caption{\textbf{Ablations (ER-ACE, Split CIFAR-10).} Blue (left axis): \(-\mathrm{BWT}\) (impact). Red (right axis): the audit/budget metric named for each panel.
(a) $x$: keep fraction \(f\); red: \(r_{\text{batch@95}}\) (\(\times10^{-1}\)).
(b) $x$: stealth radius \(\delta\); red: \(r_{\text{batch@95}}\) (\(\times10^{-2}\)).
(c) $x$: audit window \(W\); red: \(r_{\text{win}}\) (\(\times10^{-2}\)).
(d) $x$: buffer size (batch size in parentheses); red: \(e_{95}\) (\(\times10^{-2}\)).}
    \label{fig:ablation}
\end{figure*}

Fig.~\ref{fig:ablation}(a,b) show keep fraction $f$ and stealth radius $\delta$ effects (blue: higher $-\mathrm{BWT}\Rightarrow$ more forgetting; red: batch-level audit pressure). In Fig.~\ref{fig:ablation}(a), increasing \(f\) enlarges replay mass \(m{=}\lfloor f n_{\text{aux}}\rfloor\), letting the sampler realize \(p^\star\) more faithfully and select more harmful items; thus \(-\mathrm{BWT}\) rises, while \(r_{\text{batch@95}}\) falls due to reduced discretization pressure, matching Step~C’s bound \(\|\bar p{-}p^\star\|_1 \!\le\! C/m\) (larger \(m\Rightarrow\) less ``chunkiness''). In Fig.~\ref{fig:ablation}(b), increasing \(\delta\) expands the feasible ball, so the optimizer spends more divergence and both \(-\mathrm{BWT}\) and \(r_{\text{batch@95}}\) increase monotonically (harm--stealth trade-off).

Fig.~\ref{fig:ablation}(c) shows larger \(W\) makes the residual-budget scheduler more conservative (smaller \(\delta_t\)), slightly reducing \(-\mathrm{BWT}\) while driving window violation rate down, consistent with the window-compliance guarantee in \S\ref{sec:visibility}. Fig.~\ref{fig:ablation}(d) shows larger buffers strengthen the attack (more replay candidates and stronger within-class choice) while improving budget tracking, since larger buffers/batches reduce integerization effects and availability-driven failure modes.

Because prior baselines do not enforce auditable stealth/budget constraints, Table~\ref{tab:ours_vs_four_attacks} reports only the shared metrics (ACC, $-\mathrm{BWT}$), and the baseline rows should be read as \emph{impact-at-any-cost} references. The comparison uses task-incremental CIFAR-10 with ER and three data-poisoning baselines. Despite operating under explicit audit-aware constraints and sampler-only access (pixels/labels/parameters untouched), Amnesia induces the highest forgetting of the four methods ($-\mathrm{BWT}$ of 82.7), although its final ACC (29.3) is not the lowest, highlighting index-only replay control as a practical threat surface in monitored deployments.

\begin{table}[t]
    \centering
    \begin{minipage}{0.55\textwidth}
        \centering
        \caption{Comparison between Amnesia and prior poisoning attacks (task-incremental CIFAR-10, ER). Entries are ACC with $-\mathrm{BWT}$ in parentheses.}
        \label{tab:ours_vs_four_attacks}
        \resizebox{\linewidth}{!}{%
        \begin{tabular}{l c}
            \hline
            \rowcolor{babyblue}
            \textbf{Attack Method} & \textbf{ACC ($-$BWT)} \\
            \hline
            \rowcolor{AlgBack!45}
            Targeted Poisoning~\citep{TargetedPoisoningCL2022} & 19.6 (66.3) \\
            PACOL~\citep{PACOL2023}                            & 15.8 (68.7) \\
            BrainWash~\citep{Abbasi2024BrainWash}              & 25.5 (75.4) \\
            \textbf{Ours}                                     & \textbf{29.3 (82.7)} \\
            \hline
        \end{tabular}}
    \end{minipage}
    \hfill
    \begin{minipage}{0.40\textwidth}
        \centering
        \caption{End-to-end training time on CIFAR-10 (mean{\scriptsize$\pm$}std, in minutes{:}seconds).}
        \label{tab:runtime-cifar10-transposed}
        \resizebox{\linewidth}{!}{%
        \begin{tabular}{l r}
            \hline
            \rowcolor{babyblue}
            \textbf{Method} & \textbf{Time (min{:}s)} \\
            \hline
            Baseline (no attack) & 11{:}34 {\scriptsize$\pm$ 0{:}28} \\
            \rowcolor{AlgBack!45}
            Amnesia (KL)         & 11{:}58 {\scriptsize$\pm$ 0{:}49} \\
            Amnesia (TV)         & 12{:}23 {\scriptsize$\pm$ 0{:}58} \\
            \hline
        \end{tabular}}
    \end{minipage}
\end{table}

\noindent\textbf{Runtime overhead.}
Relative to baseline, Amnesia adds modest end-to-end cost: +24\,s (\(\approx\)3.5\%) for KL and +49\,s (\(\approx\)7.1\%) for TV. This matches sampler-side costs: \(O(W)\) scheduling, \(O(C)\) quotas/audit, and \(O(C)\) (KL) vs.\ \(O(C\log C)\) (TV) projection, which remain small relative to forward/backward passes and support the practicality of sampler-only interference.
\par\smallskip\noindent
Appendix~\ref{app:runtime-profile} (Table~\ref{tab:runtime-profile}) provides a component-level breakdown of this overhead into (i) projection, (ii) Rounding/Clipping, and (iii) Audit-and-Fix, highlighting that TV is dominated by sorting in projection and by swap counts in audit-and-fix.

\section{Conclusion}
We identified a realistic, auditable vulnerability in continual learning: \emph{sampler‑level} control of replay indices. \textbf{Amnesia} casts malicious replay composition as a divergence‑constrained program with explicit visibility ($\delta$) and mass ($f$) budgets, combining a harm‑driven \emph{preference} step with an exact \emph{projection} onto a TV/KL stealth ball. The resulting optimizers, a KL \emph{single-tilt} and a TV \emph{two-sided water-filling}, together with a windowed scheduler, are efficient, mass-preserving, and optimal within the audited set. Across Split CIFAR-10/100, CORe50, and TinyImageNet, with strong replay baselines (ER, ER-ACE, SCR, DER++), Amnesia reliably reduces accuracy and increases forgetting while satisfying audits in most settings. KL yields near‑maximal damage with high compliance; TV achieves higher impact but is brittle when the mass‑per‑class ratio is tight. Ablations confirm that both \emph{preference} and \emph{projection} are necessary to attain the intended impact–visibility–budget trade‑off.

A limitation of this study is that it assumes labeled buffers and a fixed nominal histogram $p_0$ for auditing; applicability to unlabeled or evolving label spaces remains unexplored. Moreover, when $m/C\!\approx\!1$, discretization and availability constraints can strain stealth, especially for TV. Future directions can develop sampler‑aware defenses (attested index selection, cryptographic logging, randomness beacons) and \emph{multi‑metric} auditors beyond class histograms (e.g., MMD/CUSUM, gradient telemetry), and extend the framework to generative or unlabeled replay, task‑free online CL, and RL/robotics, alongside theory linking $(\delta,f)$ to bounds on expected $-\mathrm{BWT}$ and detectability. More broadly, future work should test whether analogous selection/composition attack surfaces arise in non-replay CL and continual learning for language models (e.g., retrieval/memory selection, prompt/example selection, or routing). Our findings elevate replay index selection to a first-class security primitive, underscoring the need to secure the data path, {not just the model}, in deployed CL systems.


\bibliography{tmlr}
\bibliographystyle{tmlr}

% \appendix
% \clearpage
% \setcounter{page}{1}
% \maketitlesupplementary

% % ===========================
% % Appendix: Minimal Proofs & Algorithms (Only what is needed)
% % ===========================
\appendix
\section{Methodology Proofs}
\label{app:minimal}


\subsection{KL projector: optimality and monotone root search}
\label{app:kl-proof-min}

\begin{lemma}[KL-ball optimizer]
\label{lem:kl-opt-min}
For $p_0\in\Delta^C$ with $p_{0,c}>0$, $u\in\mathbb{R}^C$, and $\delta'\ge 0$, the problem
\[
\max_{p\in\Delta^C} u^\top p \ \text{s.t.}\ \mathrm{KL}(p\|p_0)\le \delta'
\]
admits an optimizer of the form
\[
p^\star(c)=\dfrac{p_{0,c}e^{\alpha u_c}}{\sum_j p_{0,j}e^{\alpha u_j}}
\]
for some $\alpha\ge 0$. If $u$ is non-constant, this optimizer is unique. If $u$ is constant, any feasible $p$ is optimal and the choice $\alpha=0$ yields $p^\star=p_0$.
\end{lemma}
\begin{proof}[Sketch]
Consider the Lagrangian $\mathcal{L}(p,\lambda,\nu)=u^\top p-\lambda(\sum_c p_c-1)-\nu(\mathrm{KL}(p\|p_0)-\delta')$ with $\nu\ge 0$.
Stationarity gives $u_c-\lambda-\nu(\log\frac{p_c}{p_{0,c}}+1)=0$. When the constraint is active ($\nu>0$), this rearranges to $\log\frac{p_c}{p_{0,c}}=\alpha u_c+\beta$ with $\alpha=1/\nu$ and $\beta=-(\lambda+\nu)/\nu$, so $p_c\propto p_{0,c}e^{\alpha u_c}$. Strict convexity of $\mathrm{KL}(\cdot\|p_0)$ on the simplex implies uniqueness when $u$ is non-constant; if $u$ is constant, $u^\top p$ is constant on the feasible set and any feasible $p$ is optimal (the tilt with $\alpha=0$ returns $p_0$).
\end{proof}

\begin{lemma}[Monotonicity of the KL radius]
\label{lem:kl-mono-min}
Let $p_\alpha(c)=\dfrac{p_{0,c}e^{\alpha u_c}}{\sum_j p_{0,j}e^{\alpha u_j}}$ and $g(\alpha)=\mathrm{KL}(p_\alpha\|p_0)$. Then
\[
g'(\alpha) \;=\; \alpha\,\mathrm{Var}_{p_\alpha}(u)\ \ge\ 0,
\]
with $>$ for $\alpha>0$ when $u$ is non-constant. Thus $g$ is strictly increasing on $(0,\infty)$ and bisection/Newton finds the unique $\alpha$ with $g(\alpha)=\delta'$.
\end{lemma}
\begin{proof}[Sketch]
$g(\alpha)=\alpha\,\mathbb{E}_{p_\alpha}[u]-\log Z(\alpha)$, where $Z(\alpha)=\sum_c p_{0,c}e^{\alpha u_c}$. Using $\frac{d}{d\alpha}\log Z=\mathbb{E}_{p_\alpha}[u]$ and $\frac{d^2}{d\alpha^2}\log Z=\mathrm{Var}_{p_\alpha}(u)$, we get $g'(\alpha)=\alpha\,\mathrm{Var}_{p_\alpha}(u)$.
\end{proof}

The KL-constrained problem maximizes a linear score $u^\top p$ while staying close to $p_0$ in KL. The Lagrange multiplier conditions force the optimal solution to have \emph{log-ratios} $\log\!\frac{p_c}{p_{0,c}}$ that are linear in the utilities $u_c$, which is exactly the \emph{exponential tilt} $p_c \propto p_{0,c}e^{\alpha u_c}$. Because KL is strictly convex, this tilted form is the unique optimizer on the KL ball (when $u$ is non-constant). As $\alpha$ increases, the solution places progressively more mass on higher-utility classes, and the KL radius $g(\alpha)$ grows at rate $g'(\alpha)=\alpha\,\mathrm{Var}_{p_\alpha}(u)$, which is strictly positive for $\alpha>0$ unless $u$ is constant. Hence a simple one-dimensional search (bisection/Newton) finds the unique $\alpha$ that hits the budget.
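
\noindent\emph{Illustrative example.} As a sanity check on Lemmas~\ref{lem:kl-opt-min} and~\ref{lem:kl-mono-min}, consider a two-class toy instance (values chosen for illustration only, not taken from our experiments) with $p_0=(\tfrac12,\tfrac12)$ and $u=(0,1)$. Writing $\sigma(\alpha)=e^{\alpha}/(1+e^{\alpha})$, the tilt and its radius are
\[
p_\alpha=\big(1-\sigma(\alpha),\ \sigma(\alpha)\big),\qquad
Z(\alpha)=\tfrac12\big(1+e^{\alpha}\big),\qquad
g(\alpha)=\alpha\,\sigma(\alpha)-\log\big(1+e^{\alpha}\big)+\log 2,
\]
and differentiating gives
\[
g'(\alpha)=\alpha\,\sigma(\alpha)\big(1-\sigma(\alpha)\big)=\alpha\,\mathrm{Var}_{p_\alpha}(u).
\]
Here $g(0)=0$ and $g(\alpha)\to\log 2$ as $\alpha\to\infty$ (the KL divergence of a point mass from the uniform $p_0$). In this instance a finite root of $g(\alpha)=\delta'$ therefore exists exactly when $\delta'<\log 2$; for $\delta'\ge\log 2$ the optimum is the point mass on the higher-utility class, reached in the limit $\alpha\to\infty$.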


\subsection{TV projector: optimality of two-sided water-filling}
\label{app:tv-proof-min}

With a linear objective and an $\ell_1$ distance budget to $p_0$, the best way to improve $u^\top p$ is always to move probability from \emph{worse} classes to \emph{better} classes. Any plan that moves mass from a not-so-bad donor to a not-so-good receiver can be improved by instead using the \emph{lowest} utility donor and the \emph{highest} utility receiver for the same budget cost. This “exchange” logic justifies sorting once by $u_c$ and then repeatedly transferring mass from the bottom to the top until the budget is exhausted or you hit the $[0,1]$ bounds—exactly the water-filling procedure.

\begin{lemma}[TV-ball optimizer]
\label{lem:tv-opt-min}
For $\delta'\ge 0$, the solution to $\max_{p\in\Delta^C} u^\top p$ subject to $\tfrac12\|p-p_0\|_1\le \delta'$ is obtained by moving probability from the lowest-$u$ classes to the highest-$u$ classes until the $\ell_1$ budget $2\delta'$ is exhausted or box constraints hit.
\end{lemma}
\begin{proof}[Sketch]
This is a linear program over a simplex slice. Suppose a feasible $p$ moves mass from a donor $i$ to a receiver $j$ while a lower-utility donor $i'$ (with $u_{i'}<u_i$ and mass still available) or a higher-utility receiver $j'$ (with $u_{j'}>u_j$ and room below its upper bound) exists. Rerouting that mass through $i'$ and $j'$ improves $u^\top p$ at the same $\ell_1$ cost, so feasibility is unchanged (standard exchange argument). Sorting once and greedily transferring therefore realizes the optimum.
\end{proof}
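
\noindent\emph{Illustrative example.} A toy instance (illustrative values only, not an experimental setting) makes the greedy transfer concrete. Take $C=3$, $p_0=(0.5,0.3,0.2)$, and $u=(1,2,3)$, so that $u^\top p_0=1.7$. With $\delta'=0.25$, water-filling moves $0.25$ of mass from the lowest-utility class to the highest:
\[
p^\star=(0.25,\ 0.3,\ 0.45),\qquad
\tfrac12\|p^\star-p_0\|_1=\tfrac12(0.25+0+0.25)=0.25,\qquad
u^\top p^\star=2.2 .
\]
With $\delta'=0.6$, the first donor is emptied after $0.5$ and the next-lowest class donates the remaining $0.1$:
\[
p^\star=(0,\ 0.2,\ 0.8),\qquad
\tfrac12\|p^\star-p_0\|_1=\tfrac12(0.5+0.1+0.6)=0.6,\qquad
u^\top p^\star=2.8 .
\]
The second case shows the box constraint $p_c\ge 0$ binding on the donor side before the budget is exhausted.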


\subsection{Rounding error bounds (largest remainder / Hamilton)}
\label{app:rounding-min}

We turn real-valued target counts $m p^\star$ into integers by taking floors and then giving the leftover items to the classes with the largest fractional parts. This guarantees each class’s realized fraction $\bar p_c=q_c/m$ is at most one item off from its target, i.e., $|\bar p_c-p^\star_c|\le 1/m$. Since at most $C$ coordinates can differ and each by at most $1/m$, the total $\ell_1$ error is at most $C/m$. In short, rounding introduces only \emph{tiny}, explicitly bounded distortions.

\begin{lemma}[Rounding bounds]
\label{lem:round-bounds-min}
Let $p^\star\in\Delta^C$ and $m\in\mathbb{N}$. Initialize $q_c=\lfloor m p^\star_c\rfloor$ and add one unit to each of the $m-\sum_c q_c$ classes with the largest fractional remainders of $m p^\star_c$. With $\bar p=q/m$ for the resulting counts $q$,
\[
\|\bar p - p^\star\|_\infty \le \frac{1}{m},
\qquad
\|\bar p - p^\star\|_1 \le \frac{C}{m}.
\]
\end{lemma}
\begin{proof}
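Each final count $q_c$ equals $\lfloor m p^\star_c\rfloor$ or $\lfloor m p^\star_c\rfloor+1$, so $|q_c-m p^\star_c|<1$ and every coordinate of $\bar p$ differs from $p^\star$ by less than $1/m$, which gives the $\ell_\infty$ bound. Summing over the $C$ coordinates yields the $\ell_1$ bound.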
\end{proof}
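
\noindent\emph{Illustrative example.} For a concrete check of Lemma~\ref{lem:round-bounds-min} (toy values, chosen only for illustration), let $m=10$ and $p^\star=(0.46,0.33,0.21)$. Then $m p^\star=(4.6,3.3,2.1)$, the floors are $(4,3,2)$ with one leftover unit, and the largest fractional remainder ($0.6$) belongs to the first class, so
\[
q=(5,3,2),\qquad \bar p=(0.5,\ 0.3,\ 0.2),\qquad
\|\bar p-p^\star\|_\infty=0.04\le \tfrac{1}{m}=0.1,\qquad
\|\bar p-p^\star\|_1=0.08\le \tfrac{C}{m}=0.3 .
\]
Both bounds hold with room to spare here; the $C/m$ bound is a worst case.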


\subsection{Audit-and-Fix: termination and decrease}
\label{app:audit-min}


\paragraph{TV case.}
\begin{algorithm}[h]
\caption{\textsc{AuditFixTV}$(q,p_0,\delta')$}
\label{alg:audit-tv-min}
\begin{algorithmic}[1]
\State $\bar p\gets q/m$
\While{$\mathrm{TV}(\bar p\|p_0)>\delta'$}
  \State pick donor $i$ with $\bar p_i>p_{0,i}$ and receiver $j$ with $\bar p_j<p_{0,j}$
  \State $q_i\gets q_i-1$;\ $q_j\gets q_j+1$;\ $\bar p\gets q/m$
\EndWhile
\State \Return $q$
\end{algorithmic}
\end{algorithm}

\emph{TV:} If the realized mix $\bar p$ strays past the TV budget, moving \emph{one} item from a class that is \emph{above} its nominal level $p_{0,c}$ to a class that is \emph{below} nominal shrinks the $\ell_1$ gap by exactly $2/m$ (provided the donor's surplus and the receiver's deficit are each at least $1/m$), so TV drops by $1/m$ per move. After finitely many item swaps, the mix re-enters the budget. \emph{KL:} KL is a sum of convex terms $x\log(x/p_{0,c})$. Moving one item from the class with the largest log-ratio $\log(\bar p_c/p_{0,c})$ to the class with the smallest log-ratio lowers the KL whenever the step $1/m$ does not overshoot the donor's surplus or the receiver's deficit. Repeating this discrete ``steepest decrease'' step drives KL below the threshold in finitely many swaps.

\begin{lemma}[Decrease \& termination for TV]
\label{lem:audit-tv-min}
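Each unit transfer in Alg.~\ref{alg:audit-tv-min} whose donor satisfies $\bar p_i-p_{0,i}\ge 1/m$ and whose receiver satisfies $p_{0,j}-\bar p_j\ge 1/m$ decreases $\|\bar p-p_0\|_1$ by $2/m$, hence $\mathrm{TV}$ by $1/m$. When every transfer is of this form, the loop terminates in at most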
\[
N_{\max}\;=\;\Big\lceil \frac{\big(\|\bar p^{(0)}-p_0\|_1-2\delta'\big)_+}{2/m} \Big\rceil
\]
transfers.
\end{lemma}
\begin{proof}
Moving $1/m$ from a coordinate with $\bar p_i-p_{0,i}\ge 1/m$ to one with $p_{0,j}-\bar p_j\ge 1/m$ reduces each of the two absolute deviations by exactly $1/m$ and leaves the others unchanged.
\end{proof}
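
\noindent\emph{Illustrative example.} To see the transfer count of Lemma~\ref{lem:audit-tv-min} at work, take a toy two-class instance (illustrative values only) with $m=20$, $p_0=(0.5,0.5)$, $q=(14,6)$, and $\delta'=0.1$. Initially $\bar p^{(0)}=(0.7,0.3)$, so $\|\bar p^{(0)}-p_0\|_1=0.4$ and $\mathrm{TV}=0.2>\delta'$. The bound gives
\[
N_{\max}=\Big\lceil \frac{(0.4-2\cdot 0.1)_+}{2/20}\Big\rceil=\Big\lceil \frac{0.2}{0.1}\Big\rceil=2,
\]
and indeed two unit transfers yield $q=(12,8)$, $\bar p=(0.6,0.4)$, and $\mathrm{TV}=0.1\le\delta'$, at which point the loop condition fails and the procedure returns.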

\paragraph{KL case.}
\begin{algorithm}[h]
\caption{\textsc{AuditFixKL}$(q,p_0,\delta')$}
\label{alg:audit-kl-min}
\begin{algorithmic}[1]
\State $\bar p\gets q/m$
\While{$\mathrm{KL}(\bar p\|p_0)>\delta'$}
  \State donor $i\in\arg\max_c \log\!\frac{\bar p_c}{p_{0,c}}$;\quad receiver $j\in\arg\min_c \log\!\frac{\bar p_c}{p_{0,c}}$
  \State $q_i\gets q_i-1$;\ $q_j\gets q_j+1$;\ $\bar p\gets q/m$
\EndWhile
\State \Return $q$
\end{algorithmic}
\end{algorithm}

\begin{lemma}[Strict decrease for KL (discrete step)]
\label{lem:audit-kl-min}
Let $s=1/m$ and $h_c(x)=x\log(x/p_{0,c})$, which is strictly convex in $x$. If $\bar p_i-s\ge p_{0,i}$ and $\bar p_j+s\le p_{0,j}$, then for $p'=\bar p - s e_i + s e_j$,
\[
\mathrm{KL}(p'\|p_0)-\mathrm{KL}(\bar p\|p_0)
= h_i(\bar p_i-s)-h_i(\bar p_i) + h_j(\bar p_j+s)-h_j(\bar p_j) < 0.
\]
Choosing $i,j$ by extreme log-ratios ensures decrease until feasibility $\mathrm{KL}\le\delta'$ holds, so Alg.~\ref{alg:audit-kl-min} terminates in finitely many steps.
\end{lemma}
\begin{proof}[Sketch]
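By strict convexity of each $h_c$, with $h_c'(x)=\log(x/p_{0,c})+1$, the change in $\sum_c h_c(\bar p_c)$ is strictly less than $s\big[\log\tfrac{\bar p_j+s}{p_{0,j}}-\log\tfrac{\bar p_i-s}{p_{0,i}}\big]$, which is nonpositive when the donor stays at or above nominal and the receiver stays at or below nominal after the move. Picking extremes gives the steepest decrease among unit moves.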
\end{proof}
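
\noindent\emph{Illustrative example.} A small numerical trace of Alg.~\ref{alg:audit-kl-min} (toy values chosen for illustration, natural logarithms) shows the decrease. Let $m=10$, $p_0=(0.5,0.5)$, $q=(8,2)$, and $\delta'=0.05$. The first class has the largest log-ratio and is the donor at every step:
\[
\begin{array}{lll}
q=(8,2): & \mathrm{KL}=0.8\log 1.6+0.2\log 0.4 & \approx 0.193,\\
q=(7,3): & \mathrm{KL}=0.7\log 1.4+0.3\log 0.6 & \approx 0.082,\\
q=(6,4): & \mathrm{KL}=0.6\log 1.2+0.4\log 0.8 & \approx 0.020 .
\end{array}
\]
The divergence falls strictly at each transfer and drops below $\delta'=0.05$ after two transfers.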


\subsection{Windowed scheduler guarantee}
\label{app:window-min}

Divergence is convex, so the divergence of a window \emph{average} of batches is at most the \emph{average} of their divergences. The scheduler keeps a running ledger of how much divergence has already been “spent” in recent steps and assigns the next step only the \emph{residual} budget so that, for each window length $L\le W$, the cumulative divergence over the last $L-1$ steps plus the current step is bounded by $L\delta$ (conservative), which implies the window-average divergence stays within $\delta$.

\begin{lemma}[Residual-budget scheduler implies window stealth]
\label{lem:window-min}
Let $r_t=\mathrm{Div}(\bar p_t\|p_0)$ and define
\[
\delta'_t \;=\; \max\!\Big\{0,\ \min_{1\le L\le |\mathcal{R}|}\Big[L\,\delta - \sum_{\ell=t-L+1}^{\,t-1} r_\ell\Big]\Big\}.
\]
If $\mathrm{Div}$ is convex in its first argument and $r_t\le \delta'_t$ for all $t$, then for every window $1\le L\le W$,
\[
\mathrm{Div}\!\Big(\tfrac1L\sum_{s=t-L+1}^{t}\bar p_s\ \Big\|\ p_0\Big) \ \le\ \delta.
\]
\end{lemma}
\begin{proof}
By Jensen, $\mathrm{Div}(\tfrac1L\sum_s \bar p_s\|p_0)\le \tfrac1L\sum_s r_s$. Fix a window length $L$ and take the same $L$ in the minimum that defines $\delta'_t$. The bracket is nonnegative by induction on $t$, since the preceding $L-1$ steps already satisfy $\sum_{s=t-L+1}^{t-1} r_s\le (L-1)\delta$, so
$r_t \le \delta'_t \le L\,\delta - \sum_{s=t-L+1}^{t-1} r_s$, hence
$\sum_{s=t-L+1}^{t} r_s \le L\,\delta$ and
$\mathrm{Div}(\tfrac1L\sum_s \bar p_s\|p_0) \le \tfrac1L\cdot L\,\delta = \delta$.
For $L=1$, the sum is empty, so the definition yields $\delta'_t\le \delta$ and thus $r_t\le \delta$.
\end{proof}
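
\noindent\emph{Illustrative example.} The residual formula can be traced by hand on a toy ledger (hypothetical values, assuming the ledger covers $W=3$ steps and $\delta=0.1$). If the two preceding steps spent $r_{t-2}=0.04$ and $r_{t-1}=0.07$, the three brackets are
\[
L=1:\ 0.1,\qquad L=2:\ 0.2-0.07=0.13,\qquad L=3:\ 0.3-(0.04+0.07)=0.19,
\]
so $\delta'_t=0.1$ and the step may use the full per-step radius. If instead the ledger records an overshoot, say $r_{t-2}=0.10$ and $r_{t-1}=0.18$ (for instance because availability constraints prevented an exact fix), the brackets become
\[
L=1:\ 0.1,\qquad L=2:\ 0.2-0.18=0.02,\qquad L=3:\ 0.3-(0.10+0.18)=0.02,
\]
so $\delta'_t=0.02$: the scheduler shrinks the current budget until the trailing sums return to at most $L\,\delta$.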


\section{Ablation Study}
\subsection{Calibration of Stealth Thresholds}
\label{app:calibration-tables}

\paragraph{Definitions (normalized metrics).}
Let \(r_t := \mathrm{Div}(\bar p_t\|p_0)/\delta\) be the per-batch divergence normalized by the audit radius \(\delta\), so that \(r_t\in[0,1]\) whenever batch \(t\) stays inside the audit ball (this normalized \(r_t\) differs from the unnormalized \(r_t\) of Lemma~\ref{lem:window-min} by the factor \(1/\delta\)).
Define \(r_{\text{batch@95}} := \operatorname{Quantile}_{0.95}\{r_t\}\) and, for a window of length \(L\le W\), the normalized window average
\(\hat r_{t-L+1:t} := \frac{1}{L}\sum_{s=t-L+1}^{t}r_s\) with the window-violation rate
\(r_{\text{win}} := \frac{1}{T}\sum_{t=1}^{T}\mathbb{1}\{\hat r_{t-W+1:t}>1\}\).
We report \(r_{\text{batch@95}}\) and \(r_{\text{win}}\) in the tables in units of \(10^{-2}\) (e.g., 5.0 \(\equiv\) 0.05).

\vspace{0.25em}
\paragraph{Acceptance rules (concise statements).}
\begin{enumerate}[leftmargin=1.4em,itemsep=1pt,topsep=2pt]
\item \textbf{Batch rule.} \(r_{\text{batch@95}}\le 0.05\) means 95\% of batches use at most 5\% of the audit ball. This is a \emph{normalized} (i.e., \(\delta\)-aware) tail‑control convention; it is interpretable across any \(\delta\) and provides headroom for rounding/availability noise. It is \emph{not} a window guarantee by itself.
\item \textbf{Window rule.} \(r_{\text{win}}\le 0.05\) tolerates at most 5\% of steps with window excess, on top of the scheduler that deterministically enforces sliding‑window compliance (stated next). The 0.05 allowance is operational slack for discretization/measurement noise.
\end{enumerate}

\vspace{0.25em}
\paragraph{Deterministic window guarantee (via residual budgets).}
\textbf{Proposition (Residual‑budget window compliance).}
Let \(\delta'_t\) be chosen as in Alg.~\ref{alg:amnesia-replay} (residual budget from the last \(W{-}1\) steps) and enforce
\(\mathrm{Div}(\bar p_t\|p_0)\le \delta'_t\) at every step. Then for all \(t\) and all \(1\le L\le W\),
\[
\sum_{s=t-L+1}^{t}\mathrm{Div}(\bar p_s\|p_0)\ \le\ L\,\delta
\quad\Longrightarrow\quad
\hat r_{t-L+1:t}\ \le\ 1,
\]
hence \(\mathrm{Div}(\hat p_{t-L+1:t}\|p_0)\le \delta\) for the window-average histogram \(\hat p_{t-L+1:t}:=\frac1L\sum_{s=t-L+1}^{t}\bar p_s\), by convexity of \(\mathrm{Div}\) in its first argument.
\emph{Proof sketch.} The residual update ensures the partial sums over any trailing window of length \(L\le W\) never exceed \(L\delta\).
Normalizing by \(\delta\) yields \(\hat r\le 1\), and convexity gives \(\mathrm{Div}(\hat p\|p_0)\le \frac1L\sum \mathrm{Div}(\bar p_s\|p_0)\le \delta\).\hfill\(\square\)



\begin{table}[h!]
    \centering
    % First Minipage: Batch Divergence
    \begin{minipage}{0.48\textwidth}
        \centering
        \caption{\textbf{Batch-divergence calibration} (threshold \(r_{\text{batch@95}}=0.05\), shown as 5.0 in units of \(10^{-2}\)). Entries count the settings per dataset at or below versus above the threshold. KL concentrates below the cutoff; TV splits evenly.}
        \label{tab:calib-batch-threshold}
        \resizebox{\linewidth}{!}{%
        \begin{tabular}{l c c | c c}
            \hline
            \rowcolor{babyblue}
            \textbf{Dataset} &
            \multicolumn{2}{c|}{\textbf{KL}} &
            \multicolumn{2}{c}{\textbf{TV}} \\
            \hhline{~--|--}
            \rowcolor{babyblue}
             & \(\le\) 5.0 & \(>\) 5.0 & \(\le\) 5.0 & \(>\) 5.0 \\
            \hline
            CIFAR-10      & 4 & 0 & 2 & 2 \\
            CORe50        & 3 & 1 & 2 & 2 \\
            CIFAR-100     & 3 & 1 & 2 & 2 \\
            TinyImageNet  & 2 & 2 & 2 & 2 \\
            \hline
            \textbf{Total} & \textbf{12} & \textbf{4} & \textbf{8} & \textbf{8} \\
            \hline
        \end{tabular}}
    \end{minipage}
    \hfill % Gutter space
    % Second Minipage: Window Violation
    \begin{minipage}{0.48\textwidth}
        \centering
        \caption{\textbf{Window-violation calibration} (threshold \(r_{\text{win}}=0.05\), shown as 5.0 in units of \(10^{-2}\)). Entries count the settings per dataset at or below versus above the threshold. All KL settings comply; TV exceeds the threshold in 5 of 16 settings.}
        \label{tab:calib-window-threshold}
        \resizebox{\linewidth}{!}{%
        \begin{tabular}{l c c | c c}
            \hline
            \rowcolor{babyblue}
            \textbf{Dataset} &
            \multicolumn{2}{c|}{\textbf{KL}} &
            \multicolumn{2}{c}{\textbf{TV}} \\
            \hhline{~--|--}
            \rowcolor{babyblue}
             & \(\le\) 5.0 & \(>\) 5.0 & \(\le\) 5.0 & \(>\) 5.0 \\
            \hline
            CIFAR-10      & 4 & 0 & 3 & 1 \\
            CORe50        & 4 & 0 & 3 & 1 \\
            CIFAR-100     & 4 & 0 & 3 & 1 \\
            TinyImageNet  & 4 & 0 & 2 & 2 \\
            \hline
            \textbf{Total} & \textbf{16} & \textbf{0} & \textbf{11} & \textbf{5} \\
            \hline
        \end{tabular}}
    \end{minipage}
\end{table}








\vspace{0.25em}
\paragraph{Why the thresholds are principled and adaptable.}
\begin{itemize}[leftmargin=1.4em,itemsep=1pt,topsep=2pt]
\item \textbf{Normalized to \(\delta\).} Both metrics are ratios to the audit ball; the same numerical acceptance (\(0.05\)) applies for any \(\delta\) and directly expresses “fraction of budget used.”
\item \textbf{Complementary roles.} The batch rule gives \emph{tail control}; the window rule is backed by a \emph{deterministic scheduler} ensuring \(\mathrm{Div}(\hat p\|p_0)\le\delta\) for all \(L\le W\).
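\item \textbf{Empirical separation.} Tables~\ref{tab:calib-batch-threshold} and~\ref{tab:calib-window-threshold} show that \(0.05\) largely separates the stealthy (KL) and aggressive (TV) regimes across datasets/models: KL meets the batch rule in 12 of 16 settings and the window rule in all 16, whereas TV meets them in 8 and 11 of 16, respectively.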
\end{itemize}

\vspace{0.25em}
\paragraph{Dependence on attacker budgets (monotone trends).}
Because the metrics are normalized, acceptance relates transparently to inputs:
\begin{itemize}[leftmargin=1.4em,itemsep=1pt,topsep=2pt]
\item \textbf{Stealth radius \(\delta\).} Larger \(\delta\) enlarges the feasible ball; optimal tilts spend more budget, typically raising normalized \(r_{\text{batch@95}}\) and \(r_{\text{win}}\) (as observed in the ablations).
\item \textbf{Replay mass \(m=\lfloor f\,n_{\text{aux}}\rfloor\) and class count \(C\).} Discretization scales like \(O(C/m)\); increasing \(f\) or \(n_{\text{aux}}\) (or reducing \(C\)) improves batch headroom and lowers violations.
\item \textbf{Window length \(W\).} Larger \(W\) tightens per‑step residuals \(\delta'_t\), reducing \(r_{\text{win}}\) (scheduler conservatism).
\end{itemize}

\noindent
\emph{Practical note.} Practitioners may tighten \(0.05\) if desired; the scheduler and normalization make such policy choices portable across \(\delta\) and datasets while preserving the guarantees above.

\subsection{Sensitivity to Imperfect / Lagged Baselines \texorpdfstring{$p_0$}{p0}}
\label{app:p0-sensitivity}

\paragraph{Setup (baseline mismatch).}
We evaluate the sensitivity of Amnesia to baseline mismatch by simulating an attacker that does not observe the auditor’s instantaneous nominal baseline. Concretely, at step $t$, the auditor computes $p_0(t)$ from the current replay-buffer label histogram (as in \S\ref{sec:datasets}), while the attacker uses a lagged estimate $\hat p_0(t) = p_0(t-k)$ when performing projection and quota realization. We then report impact (ACC, $-\mathrm{BWT}$) and \emph{auditor-side} stealth metrics computed against $p_0(t)$, i.e., $r_{\text{batch@95}}$ and $r_{\text{win}}$ defined in \S\ref{sec:visibility}. Table~\ref{tab:p0-sensitivity} summarizes the resulting impact--stealth trade-off as the lag $k$ increases.

\begin{table}[h!]
\centering
\caption{\textbf{Sensitivity to baseline mismatch (lagged $\hat p_0$).}
Representative setting: Split CIFAR-10 with ER-ACE under default budgets $(f{=}0.1,\ \delta{=}0.1,\ W{=}10)$.
The attacker uses $\hat p_0(t)=p_0(t-k)$ for projection/quota realization, while the auditor evaluates stealth against $p_0(t)$.
\emph{Stealth metrics are reported $\times 10^{-2}$ (e.g., $5.0 \equiv 0.05$).}}
\label{tab:p0-sensitivity}
\small
\setlength{\tabcolsep}{3.5pt}
\renewcommand{\arraystretch}{0.95}
\begin{tabular}{c | c c | c c}
\toprule
\multirow{2}{*}{Lag $k$ (steps)} &
\multicolumn{2}{c|}{\textbf{Amnesia (KL)}} &
\multicolumn{2}{c}{\textbf{Amnesia (TV)}} \\
\cmidrule(lr){2-3}\cmidrule(lr){4-5}
& \makecell{Impact\\ ACC$\downarrow$ \;($-\mathrm{BWT}\uparrow$)} &
\makecell{Stealth\\ $r_{\text{batch@95}}\downarrow$ \;($r_{\text{win}}\downarrow$)} &
\makecell{Impact\\ ACC$\downarrow$ \;($-\mathrm{BWT}\uparrow$)} &
\makecell{Stealth\\ $r_{\text{batch@95}}\downarrow$ \;($r_{\text{win}}\downarrow$)} \\
\midrule
0   & 34.2{\scriptsize$\pm$1.0}\ (56.9{\scriptsize$\pm$1.7}) & 1.0{\scriptsize$\pm$0.2}\ (0.1{\scriptsize$\pm$0.1})
    & 32.3{\scriptsize$\pm$0.95}\ (60.3{\scriptsize$\pm$1.8}) & 4.0{\scriptsize$\pm$0.4}\ (3.0{\scriptsize$\pm$0.8}) \\
50  & 34.5{\scriptsize$\pm$1.0}\ (56.2{\scriptsize$\pm$1.7}) & 1.4{\scriptsize$\pm$0.3}\ (0.2{\scriptsize$\pm$0.1})
    & 32.8{\scriptsize$\pm$1.0}\ (59.2{\scriptsize$\pm$1.9})  & 4.9{\scriptsize$\pm$0.6}\ (3.8{\scriptsize$\pm$0.9}) \\
200 & 35.3{\scriptsize$\pm$1.1}\ (54.5{\scriptsize$\pm$1.8}) & 2.6{\scriptsize$\pm$0.4}\ (0.4{\scriptsize$\pm$0.2})
    & 34.2{\scriptsize$\pm$1.1}\ (56.7{\scriptsize$\pm$2.0})  & 7.2{\scriptsize$\pm$0.9}\ (6.1{\scriptsize$\pm$1.3}) \\
500 & 36.8{\scriptsize$\pm$1.2}\ (51.8{\scriptsize$\pm$2.0}) & 5.4{\scriptsize$\pm$0.7}\ (0.9{\scriptsize$\pm$0.3})
    & 36.0{\scriptsize$\pm$1.2}\ (53.0{\scriptsize$\pm$2.2})  & 11.0{\scriptsize$\pm$1.4}\ (10.5{\scriptsize$\pm$2.0}) \\
\bottomrule
\end{tabular}
\end{table}

\paragraph{Findings.}
Table~\ref{tab:p0-sensitivity} shows a consistent degradation pattern under baseline mismatch: as the lag $k$ increases, attack impact diminishes (ACC increases and $-\mathrm{BWT}$ decreases) while the auditor-side stealth metrics worsen (higher $r_{\text{batch@95}}$ and $r_{\text{win}}$). The effect is mild for small lags (e.g., $k\le 50$), where buffer histograms change slowly and $\hat p_0(t)$ remains close to $p_0(t)$: at $k=50$, ACC rises by at most $0.5$ points and $r_{\text{batch@95}}$ by at most $0.9\times 10^{-2}$ for either variant. The effect becomes more pronounced for larger lags: at $k=500$, ACC is $2.6$ (KL) and $3.7$ (TV) points higher than at $k=0$. The TV variant is more sensitive in absolute terms, with $r_{\text{win}}$ rising from $3.0$ to $10.5$ (in units of $10^{-2}$) versus $0.1$ to $0.9$ for KL, consistent with TV’s more extreme mass transfers. Overall, these results support the interpretation that baseline mismatch effectively consumes part of the usable visibility margin; operating with a more conservative effective radius can mitigate violations at the cost of reduced impact.
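
The margin interpretation can be made concrete for the TV case with a short sketch. It rests on assumptions introduced here for illustration, not on results from the experiments: $\bar p$ denotes the realized replay label distribution, the attacker enforces $\mathrm{TV}(\bar p,\hat p_0(t))\le\delta'$ against its lagged baseline, and $\varepsilon_k(t) := \mathrm{TV}\bigl(p_0(t-k),p_0(t)\bigr)$ is a hypothetical symbol for the baseline drift over $k$ steps. Because TV is a metric, the triangle inequality gives
\[
\mathrm{TV}\bigl(\bar p, p_0(t)\bigr)
\;\le\; \mathrm{TV}\bigl(\bar p, \hat p_0(t)\bigr) + \mathrm{TV}\bigl(\hat p_0(t), p_0(t)\bigr)
\;\le\; \delta' + \varepsilon_k(t).
\]
Hence, if an upper bound $\varepsilon_{\max}\ge\varepsilon_k(t)$ with $\varepsilon_{\max}<\delta'$ were available, projecting with the reduced radius $\delta'-\varepsilon_{\max}$ would keep the auditor-side TV within $\delta'$. KL divergence does not satisfy a triangle inequality, so this argument does not transfer directly to the KL variant.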


\subsection{Runtime Profiling Breakdown}
\label{app:runtime-profile}

\paragraph{Profiling protocol.}
We instrument the sampler-side pipeline with wall-clock timers around three components executed once per replay batch:
(i) \textbf{Projection} (Alg.~\ref{alg:side_by_side_projections}),
(ii) \textbf{Rounding/Clipping} (largest-remainder rounding + availability clipping),
and (iii) \textbf{Audit-and-Fix} (unit-transfer loop ensuring $\mathrm{Div}(\bar p\|p_0)\le\delta'$).
We then sum each component over the full CIFAR-10 training run to obtain a whole-run breakdown that is directly comparable to the end-to-end times in Table~\ref{tab:runtime-cifar10-transposed}.

\begin{table}[h!]
\centering
\caption{\textbf{Whole-run runtime breakdown on CIFAR-10.}
Component-level overhead (seconds) for Amnesia relative to the baseline run in Table~\ref{tab:runtime-cifar10-transposed}.
Total runtime equals baseline plus the sum of the three overhead components.}
\label{tab:runtime-profile}
\small
\setlength{\tabcolsep}{6pt}
\renewcommand{\arraystretch}{0.95}
\begin{tabular}{l r r}
\toprule
\textbf{Component (whole training run)} & \textbf{Amnesia (KL)} & \textbf{Amnesia (TV)} \\
\midrule
Baseline training time (no attack) & 11{:}34 min & 11{:}34 min \\
\midrule
+ Projection & 9.8 s & 16.9 s \\
+ Rounding/Clipping & 4.3 s & 5.9 s \\
+ Audit-and-Fix & 9.9 s & 26.2 s \\
\midrule
Total attack overhead (sum above) & 24.0 s & 49.0 s \\
Total run time (baseline + overhead) & 11{:}58 min & 12{:}23 min \\
\bottomrule
\end{tabular}
\end{table}

\paragraph{Interpretation.}
Table~\ref{tab:runtime-profile} shows that the KL variant’s overhead is split almost evenly between projection (9.8 s) and audit-and-fix (9.9 s), reflecting the KL projector’s $O(C\cdot\text{iters})$ 1-D search and a small number of discrete feasibility swaps. The TV variant roughly doubles the total overhead (49.0 s vs.\ 24.0 s), and its split is skewed toward \textbf{audit-and-fix} (26.2 s), because TV’s more extreme reallocations can require more unit transfers after integer rounding and availability clipping; \textbf{projection} (16.9 s) is also costlier than for KL, due to sorting in two-sided water-filling ($O(C\log C)$). In both cases, rounding/clipping is the smallest component (4.3 s and 5.9 s), and the total added wall-clock time stays under one minute on an 11{:}34 min training run.
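
As a consistency check, the totals and relative overheads in Table~\ref{tab:runtime-profile} can be recomputed from the reported entries alone, converting the baseline of $11{:}34$ min to $694$ s:
\[
\begin{array}{lll}
\text{KL:} & 9.8 + 4.3 + 9.9 = 24.0\ \text{s}, & 24.0/694 \approx 3.5\%, \\
\text{TV:} & 16.9 + 5.9 + 26.2 = 49.0\ \text{s}, & 49.0/694 \approx 7.1\%.
\end{array}
\]
Adding these to the baseline gives $694 + 24 = 718$ s ($11{:}58$ min) and $694 + 49 = 743$ s ($12{:}23$ min), matching the last row of the table.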
\end{document}
