Semantic Steganography: A Framework for Robust and High-Capacity Information Hiding using Large Language Models
Abstract: In the era of Large Language Models (LLMs), generative linguistic steganography has become a prevalent technique for hiding information within model-generated texts. However, traditional steganography methods struggle to align steganographic texts (stegos) with ordinary model-generated texts because the predicted probability distributions of LLMs have low entropy. This reduces embedding capacity and makes stegos difficult to decode reliably in real-world communication channels.
To address these challenges, we propose a semantic steganography framework based on LLMs that constructs a semantic space and maps secret messages onto it using ontology-entity trees. The framework offers robust and reliable transmission over complex channels, including resistance to text rendering and word blocking. Moreover, the stegos it generates are indistinguishable from covers, and the framework achieves higher embedding capacity and higher text quality than state-of-the-art steganography methods.
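The abstract's core idea, mapping secret bits onto a semantic space through an ontology-entity tree, can be illustrated with a toy sketch. Everything below (the `ONTOLOGY` table, the `encode`/`decode` helpers, and the two-bit branch widths) is a hypothetical illustration, not the authors' implementation: at each tree node the next few secret bits select a child, the leaf entity reached becomes the topic the LLM is prompted to write about, and the receiver recovers the bits by re-tracing the path to whatever entity it extracts from the stego text.

```python
# Hypothetical sketch of bit-to-entity mapping over an ontology tree.
# The tree, names, and helpers are illustrative assumptions only.
from math import log2

# Toy ontology-entity tree: internal nodes list their children;
# leaves are concrete entities the stego text will be written about.
ONTOLOGY = {
    "thing":   ["animal", "vehicle", "food", "place"],
    "animal":  ["cat", "dog", "sparrow", "trout"],
    "vehicle": ["car", "bicycle", "train", "boat"],
    "food":    ["bread", "apple", "noodles", "cheese"],
    "place":   ["park", "harbor", "museum", "bakery"],
}

def _reaches(node: str, entity: str) -> bool:
    """True if `entity` lies in the subtree rooted at `node`."""
    if node == entity:
        return True
    return any(_reaches(c, entity) for c in ONTOLOGY.get(node, []))

def encode(bits: str, root: str = "thing") -> str:
    """Consume secret bits to pick a branch at each node; a node with
    k children embeds floor(log2 k) bits. Returns the leaf entity."""
    node = root
    while node in ONTOLOGY:
        children = ONTOLOGY[node]
        width = int(log2(len(children)))
        index = int(bits[:width] or "0", 2)  # defaults to branch 0 when bits run out
        bits = bits[width:]
        node = children[index]
    return node

def decode(entity: str, root: str = "thing") -> str:
    """Recover the bits by re-tracing the root-to-entity path."""
    bits, node = "", root
    while node in ONTOLOGY:
        children = ONTOLOGY[node]
        width = int(log2(len(children)))
        index = next(i for i, c in enumerate(children) if _reaches(c, entity))
        bits += format(index, f"0{width}b")
        node = children[index]
    return bits

secret = "1001"
leaf = encode(secret)         # "10" -> "food", "01" -> "apple"
assert decode(leaf) == secret
```

Because decoding in such a scheme depends only on which entity the text mentions, not on exact token probabilities, it would survive channel transformations such as re-rendering or word blocking that break token-level methods; this is the robustness property the abstract claims, shown here only in schematic form.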
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: privacy and security, steganography, large language models
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Theory
Languages Studied: English, Chinese
Submission Number: 194