Contextual Learning for Anomaly Detection in Tabular Data

Published: 07 Mar 2026, Last Modified: 07 Mar 2026. Accepted by TMLR. License: CC BY 4.0
Abstract: Anomaly detection is critical in domains such as cybersecurity and finance, especially when working with large-scale tabular data. Yet, unsupervised anomaly detection---where no labeled anomalies are available---remains challenging because traditional deep learning methods model a single global distribution, assuming all samples follow the same behavior. In contrast, real-world data often contain heterogeneous contexts (e.g., different users, accounts, or devices), where globally rare events may be normal within specific conditions. We introduce a \emph{contextual learning framework} that explicitly models how normal behavior varies across contexts by learning conditional data distributions $P(\mathbf{Y} \mid \mathbf{C})$ rather than a global joint distribution $P(\mathbf{X})$. The framework encompasses (1) a probabilistic formulation for context-conditioned learning, (2) a principled bilevel optimization strategy for automatically selecting informative context features using early validation loss, and (3) theoretical grounding through variance decomposition and discriminative learning principles. We instantiate this framework using a novel conditional Wasserstein autoencoder as a simple yet effective model for tabular anomaly detection. Extensive experiments across eight benchmark datasets demonstrate that contextual learning consistently outperforms global approaches---even when the optimal context is not intuitively obvious---establishing a new foundation for anomaly detection in heterogeneous tabular data.
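The contrast between a global model $P(\mathbf{X})$ and a conditional model $P(\mathbf{Y} \mid \mathbf{C})$ can be illustrated with a minimal sketch. This is not the paper's CWAE: as a loudly simplifying assumption, a per-context mean and standard deviation stand in for the learned conditional model, and a plain validation mean squared error stands in for the early validation loss used to rank candidate contexts. The toy data and the names `fit_conditional`, `contextual_score`, `global_score`, `select_context`, `account`, and `weekday` are all hypothetical.

```python
import numpy as np

def fit_conditional(c, y):
    """Per-context (mean, std) of y: a toy stand-in for a learned P(Y | C)."""
    stats = {v: (y[c == v].mean(), y[c == v].std() + 1e-8) for v in np.unique(c)}
    stats[None] = (y.mean(), y.std() + 1e-8)  # fallback for unseen contexts
    return stats

def contextual_score(stats, c, y):
    """Absolute z-score of each sample relative to its own context."""
    params = [stats.get(ci, stats[None]) for ci in c]
    return np.array([abs(yi - m) / s for yi, (m, s) in zip(y, params)])

def global_score(y_train, y):
    """Absolute z-score relative to one global distribution."""
    return np.abs(y - y_train.mean()) / (y_train.std() + 1e-8)

def select_context(candidates, y_train, y_val):
    """Rank candidate context features by validation loss (lower is better)."""
    losses = {}
    for name, (c_train, c_val) in candidates.items():
        stats = fit_conditional(c_train, y_train)
        pred = np.array([stats.get(ci, stats[None])[0] for ci in c_val])
        losses[name] = float(np.mean((y_val - pred) ** 2))
    return min(losses, key=losses.get), losses

# Hypothetical toy data: 5% of rows belong to a high-volume account.
rng = np.random.default_rng(0)
n = 1000
account = (np.arange(n) % 20 == 0).astype(int)  # informative context
weekday = np.arange(n) % 7                       # uninformative context
amount = np.where(account == 1,
                  rng.normal(1000.0, 50.0, n),
                  rng.normal(10.0, 1.0, n))

train, val = slice(0, 800), slice(800, n)
candidates = {
    "account": (account[train], account[val]),
    "weekday": (weekday[train], weekday[val]),
}
best, losses = select_context(candidates, amount[train], amount[val])

# Two queries with the same amount but different contexts.
c_query = np.array([1, 0])
y_query = np.array([1000.0, 1000.0])
stats = fit_conditional(account[train], amount[train])
ctx_z = contextual_score(stats, c_query, y_query)
glob_z = global_score(amount[train], y_query)
print(best, ctx_z, glob_z)
```

In this sketch the global score cannot distinguish the two queries, since they share the same amount, whereas the contextual score treats the amount as ordinary for the high-volume account and as extreme for the low-volume one.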
Submission Type: Long submission (more than 12 pages of main content)
Changes Since Last Submission: We have carefully addressed all reviewer feedback and concerns and have made the following changes since the last submission.

1. Camera-ready formatting and metadata updates
- Switched the paper to the accepted TMLR format.
- Restored the full author list and cleaned up author footnotes.
- Filled in the camera-ready metadata fields, including month, year, and OpenReview link.
- Updated hyperlink formatting to use hidden links.

2. Stronger positioning in related work
- Revised the Related Work section to better situate the paper within conditional/contextual anomaly detection.
- Added or replaced references to more directly relevant prior work, including UCAD, conDENSE, and newer process-monitoring literature.
- Improved the discussion connecting this paper to SPC/MSPC and autoencoder-based process monitoring.

3. Method and theory clarifications
- Clarified the role of a single selected context feature and why the current paper focuses on one context variable for interpretability and stability.
- Refined the variance-reduction argument so it is presented as a heuristic proposition with more careful wording.
- Corrected notation in the CWAE loss description so the context/content variables are stated consistently.
- Revised the explanation of why WAE/CWAE is preferred over CVAE-style stochastic inference.

4. Clearer explanation of context selection
- Expanded the justification for using a small mostly-normal validation subset during bilevel context selection.
- Added references supporting model selection without labeled anomalies.
- Clarified the computational tradeoff behind using one epoch as the proxy for context ranking.

5. Bibliography cleanup
- Standardized several dataset citations in the bibliography.
- Replaced some weaker or less precise references with more appropriate ones.
- Removed a duplicate dataset entry and cleaned up a few bibliography formatting details.
Summary: Relative to the original submission, the camera-ready version keeps the same central method, experimental scope, and main conclusions, but improves the manuscript in four ways: it is now in the final TMLR accepted format, it strengthens the literature review and paper positioning, it clarifies the theoretical and methodological presentation of contextual learning and CWAE, and it fixes a small number of citation and other minor issues. No major change was made to the paper's core claims.
Assigned Action Editor: ~Markus_Lange-Hegermann1
Submission Number: 6544