Keywords: LLMs, Agents, Penetration Testing, Cybersecurity, Agentic AI, LangGraph, Large Language Models, GAAI
TL;DR: A review of the effectiveness of local LLMs for autonomous cybersecurity penetration testing.
Abstract: Recent advances in large language models (LLMs) have opened
new opportunities for automating complex tasks in cybersecurity,
including offensive operations. However, most existing approaches
to LLM-assisted penetration testing rely on human input or scripted
interactions. This work explores a fully autonomous, local-agent
framework for end-to-end penetration testing, using LangGraph to
structure multi-stage tool-augmented reasoning. Without human
intervention post-launch, selected open-source LLMs were tasked
with scanning, vulnerability analysis, and exploitation against a
standard testbed. Results show that models such as Qwen-14B and
Qwen-32B can successfully execute multiple real-world exploits,
demonstrating that local, API-free LLM agents can move beyond
advisory roles into operational offensive security.
Journal Edition Interest: Yes
Supplementary Material: zip
Submission Number: 46