Evaluating Few-Shot Learning Generative Honeypots in A Live Deployment

Jarrod Ragsdale, Rajendra Boppana

Published: 05 Sept 2024, Last Modified: 25 Jan 2026; IEEE Cybersecurity and Resilience, 2024; WM2024 Conference

Abstract: Generative language models have seen an explosion in use for downstream tasks due to their effectiveness in zero-shot and few-shot learning scenarios. One such use case is emulating command terminals, such as a Bash shell on Linux systems. Because these models generate output without actually evaluating commands, they are promising candidates for threat engagement via output generation in honeypots. Studies have proposed generative honeypots, but evaluations in live settings have been limited. We deploy and evaluate generative honeypots with and without a context selection mechanism alongside a control honeypot. We find that generative model deployments significantly increase session length without risking compromise and that limited context selection can substantially reduce token usage. To illustrate the potential of generative honeypots for capturing tactics, techniques, and procedures (TTPs), we dissect some sessions observed during this deployment and discuss how the observed traffic can further refine generative model use and few-shot performance.