Keywords: Classification, Misclassification repair, Optimization, Control, Stability, Deep learning, Information geometry
TL;DR: DOC stably repairs misclassifications with guarantees and diagnostics, outperforming related methods.
Abstract: In high-risk domains such as autonomous driving and medical diagnosis, classifier misclassifications pose severe risks. Existing repair approaches fall into three categories: test-time adaptation (TTA), adversarial perturbation methods such as PGD and DeepFool, and counterfactual generation (CF). TTA and perturbation methods lack stability guarantees and irreparability diagnosis, while CF targets distributional plausibility rather than direct control. We propose Direct Output Control (DOC), which repairs misclassifications by directly regulating the output distribution without changing model parameters. DOC defines the Fisher–Rao distance as a Lyapunov function, pulls its gradient back through the Jacobian pseudoinverse, and derives minimum-norm input perturbations that monotonically reduce the error. The framework generalizes to other metrics (e.g., $L_2$) and provides both a theoretical irreparability bound based on Jacobian singular values and inter-class margins, and an empirical diagnostic based on Lyapunov decrease. On ImageNet-1k with ResNets and Vision Transformers, DOC outperforms TTA and perturbation methods in repair success while inducing smaller distortions, though at higher inference cost. Our contributions are: (1) a Lyapunov-control formulation with monotonic stability, (2) theoretical analysis covering irreparability, minimum-norm optimality, and the connection to natural gradients, (3) an empirical irreparability diagnostic via Lyapunov decrease, and (4) large-scale validation showing Pareto superiority in the success–distortion trade-off.
Supplementary Material: zip
Primary Area: optimization
Submission Number: 16189