A Test-Time Entropy Minimization Method for Cross-Domain Linguistic Steganalysis

Published: 01 Jan 2024, Last Modified: 19 Feb 2025, IEEE Signal Process. Lett. 2024, CC BY-SA 4.0
Abstract: The growth of social networks has fueled advances in text steganography. As a form of covert communication, text steganography embeds information by adding low-amplitude noise, which significantly complicates detection and makes steganographic texts increasingly difficult to identify. Existing steganalysis models achieve high detection accuracy by assuming that the training and testing sets are independent and identically distributed (i.i.d.). However, meeting the i.i.d. requirement between training and testing datasets is impractical in real-world scenarios, because it is often impossible to know which texts carry steganography or which algorithms produced them. Labeled training data are therefore often unavailable, and training a separate model for each pair of source and target domains severely limits the practical application of steganalysis models. Given these detection challenges, we propose a test-time adaptive steganalysis paradigm to accommodate detection scenarios without training data. Building on a generic pre-trained language model and optimizing it during testing allows the model to self-adjust to new and varied datasets. In this fully test-time adaptation setting, the model relies only on the test data and its own parameters. Because detecting steganographic texts remains highly challenging, we integrate test-time entropy minimization (TTem) to improve detection accuracy. Extensive experiments show that the proposed method achieves good performance on cross-domain linguistic steganalysis under test-time adaptation.
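The core idea of test-time entropy minimization can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`adapt_bias`), the choice of adapting only a scalar bias, and the example logits are all illustrative assumptions. A real system would instead update a subset of a pre-trained language model's parameters (e.g., normalization layers) by backpropagating the entropy of its predictions on the unlabeled test batch.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def entropy(p):
    # Binary Shannon entropy of a prediction probability p.
    eps = 1e-12
    return -(p * math.log(p + eps) + (1 - p) * math.log(1 - p + eps))

def adapt_bias(logits, b=0.0, lr=0.5, steps=50):
    """Minimize the mean prediction entropy over an unlabeled test
    batch by gradient descent on a single bias term b (a stand-in
    for the small set of parameters updated at test time)."""
    for _ in range(steps):
        grad = 0.0
        for z in logits:
            p = sigmoid(z + b)
            # Analytic gradient: dH/db = log((1-p)/p) * p * (1-p)
            grad += math.log((1 - p + 1e-12) / (p + 1e-12)) * p * (1 - p)
        grad /= len(logits)
        b -= lr * grad
    return b

# Hypothetical unlabeled test-batch logits (no labels are used).
logits = [0.3, 0.5, 0.8, 1.1]
before = sum(entropy(sigmoid(z)) for z in logits) / len(logits)
b = adapt_bias(logits)
after = sum(entropy(sigmoid(z + b)) for z in logits) / len(logits)
print(after < before)  # adaptation reduces mean prediction entropy
```

Driving the predictions toward confident (low-entropy) outputs on the test distribution is what lets the model self-adjust without any labels, which is the property the fully test-time setting described above relies on.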