Zero-Direction Probing: A Linear-Algebraic Framework for Deep Analysis of Large-Language-Model Drift

TMLR Paper5724 Authors

24 Aug 2025 (modified: 04 Sept 2025), Under review for TMLR, CC BY 4.0
Abstract: We present Zero-Direction Probing (ZDP), a theoretical framework that characterizes model drift from null directions of transformer activations, requiring no task labels or output evaluations. Under explicit assumptions (A1–A6), we prove: (i) the Variance–Leak Theorem (Thm. 1), (ii) Fisher Null-Conservation (Thm. 3), (iii) a Rank–Leak bound for low-rank updates (Thm. 5), and (iv) a logarithmic-regret guarantee for online null-space trackers (Thm. 4). We further derive a Spectral Null-Leakage (SNL) metric with a non-asymptotic Laurent–Massart tail bound and an MP-edge–style concentration inequality, providing a priori thresholds for drift under a Gaussian null model. Together, these results establish that “listening to silence” (monitoring the right/left null spaces of layer activations and their Fisher geometry) yields concrete, testable guarantees on representational change. The manuscript is intentionally theory-only; empirical validation and benchmarking are deferred to companion work.
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=MLo0rUqkHz&noteId=MLo0rUqkHz
Changes Since Last Submission: Fixed the template, including margin issues, etc.
Assigned Action Editor: ~Sebastian_U_Stich1
Submission Number: 5724