Abstract: Generative language models have seen an explosion
in use for downstream tasks due to their effectiveness in zero- or
few-shot learning scenarios. One such use case is emulating
command terminals, such as a Bash shell on a Linux system.
These models’ generative and non-evaluative nature makes them
prospective candidates for use in threat engagement via output
generation for honeypots. Prior studies have proposed generative
honeypots but have evaluated them only to a limited extent in live settings. We
deploy and evaluate generative honeypots with and without a
context selection mechanism alongside a control honeypot. We
found that generative model deployments significantly increased
session length without risking compromise and that limited context
selection can substantially reduce token usage. To illustrate the
TTP-capturing potential of generative honeypots, we dissect some
sessions observed during this deployment and discuss how the
traffic observed can further refine generative model use and
few-shot performance.