Class-Aware Adversarial Unsupervised Domain Adaptation for Linguistic Steganalysis

Published: 01 Jan 2025 · Last Modified: 25 Jul 2025 · IEEE Trans. Inf. Forensics Secur. 2025 · CC BY-SA 4.0
Abstract: Recent advancements in deep learning have significantly improved linguistic steganalysis, but challenges persist when labeled samples are scarce in the target domain. Existing cross-domain linguistic steganalysis methods seek to improve model generalization by minimizing the domain discrepancy between the source and target domains. However, these methods often misalign stego and cover texts across the two domains, which hampers the generalization of steganalysis models. They also fail to capture domain-specific features of the target domain, reducing their effectiveness in discriminating stego texts. To address these issues, we propose a novel Class-aware Adversarial unsupervised Domain Adaptation (CADA) method, which operates in two stages. In the first stage, Class-aware Adversarial Pre-Training (CAPT), we design the Weighted Class-Aware Domain Distance (WCADD) to leverage class information of stego and cover texts, ensuring accurate class-aware alignment across domains. In the CAPT stage, the steganalysis model is pre-trained with WCADD, Class-Aware Adversarial Training (CAAT), and Class-Aware Label Smoothing (CALS) to enhance its ability to extract domain-invariant features, thereby improving its generalization. In the second stage, Class-aware Fine-Tuning (CFT), we employ the pre-trained steganalysis model alongside the Class-Aware Progressive Strategy (CAPS) to generate pseudo-labels for the target domain. Fine-tuning the model with these pseudo-labels enhances its ability to recognize domain-specific features, thereby improving its performance in discriminating stego texts within the target domain. Extensive experiments demonstrate that our proposed method outperforms the existing baseline methods.
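The abstract does not specify how WCADD is computed. As a rough illustration of the general idea of a class-aware, weighted domain distance, the sketch below compares class-conditional feature statistics (cover = 0, stego = 1) between source samples with ground-truth labels and target samples with pseudo-labels. The function name, the mean-feature discrepancy, and the per-class weights are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def class_aware_domain_distance(src_feats, src_labels,
                                tgt_feats, tgt_pseudo,
                                weights=(0.5, 0.5)):
    """Hypothetical sketch of a class-aware domain distance.

    For each class (cover=0, stego=1), measure the discrepancy between
    the class-conditional feature means of the source and target
    domains, then combine the per-class terms with the given weights.
    This is an illustrative stand-in for WCADD, not its definition.
    """
    total = 0.0
    for c, w in zip((0, 1), weights):
        src_c = src_feats[src_labels == c]
        tgt_c = tgt_feats[tgt_pseudo == c]
        if len(src_c) == 0 or len(tgt_c) == 0:
            continue  # skip a class absent from either domain
        # distance between class-conditional feature means
        total += w * np.linalg.norm(src_c.mean(axis=0) - tgt_c.mean(axis=0))
    return total
```

Aligning per class rather than globally is what prevents the failure mode the abstract describes: a global distance can be small even when source stego texts are matched to target cover texts, whereas a class-conditional distance penalizes exactly that misalignment.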