Keywords: security, anomaly detection, large language models, differential privacy, abstract syntax trees (AST), application
TL;DR: PASTRAL detects suspicious command-lines without exposing raw data, balancing privacy and utility for secure LLM deployment.
Abstract: Command-lines are a common attack surface in cybersecurity.
Yet they often contain sensitive user information, creating a dual challenge: systems must detect suspicious commands accurately while protecting user privacy.
Existing approaches typically address one of these challenges at the expense of the other.
To address this gap, we present PASTRAL, a practical framework for privacy-preserving detection of suspicious command-lines.
Our main insight is that suspicious activities are typically rare and highly diverse in large-scale multi-user environments, which makes them naturally well suited to anomaly detection.
PASTRAL represents command-lines using language-model and Abstract Syntax Tree (AST)-based embeddings, applies differential privacy (DP) noise injection at the embedding layer, and performs detection with a conditional variational autoencoder. By design, only differentially private embeddings are shared, which provide sufficient signal for accurate detection while abstracting away unnecessary details.
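The embedding-layer noise injection described above can be illustrated with a minimal sketch of the standard Gaussian mechanism; the function name, parameter choices, and clipping step here are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def dp_noisy_embedding(embedding, clip_norm=1.0, epsilon=1.0, delta=1e-5, rng=None):
    """Clip an embedding to a bounded L2 norm, then add Gaussian noise
    calibrated to (epsilon, delta)-DP via the classic Gaussian mechanism.
    Hypothetical sketch; PASTRAL's actual mechanism may differ."""
    rng = np.random.default_rng() if rng is None else rng
    emb = np.asarray(embedding, dtype=float)
    # Clip so the per-record L2 sensitivity is at most clip_norm.
    norm = np.linalg.norm(emb)
    if norm > clip_norm:
        emb = emb * (clip_norm / norm)
    # Noise scale for the Gaussian mechanism with L2 sensitivity clip_norm.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return emb + rng.normal(0.0, sigma, size=emb.shape)
```

Only the noised vector would leave the client; the downstream detector (here, a conditional variational autoencoder) operates solely on these privatized embeddings.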
Empirical evaluation demonstrates that PASTRAL achieves strong anomaly detection performance and sustains a favorable privacy-utility trade-off. Our real-world case study outlines practical considerations for deploying secure LLM detection systems in production.
Submission Number: 56