Abstract: Fuzz testing is a dynamic program analysis technique for discovering vulnerabilities in IoT systems. The core goal is to deliberately feed maliciously crafted inputs into an IoT device or service in order to trigger vulnerabilities such as system crashes, buffer overflows, and memory corruption. Efficiently generating malicious inputs remains challenging, and leading methods often rely on randomly mutating existing valid inputs. In this work, we propose fine-tuned large language models (FuzzCoder) that learn patterns in the input files of successful attacks to guide future fuzzing exploration. Specifically, we develop a framework that leverages code large language models (LLMs) to guide the mutation process toward meaningful input mutations. We formulate the mutation process as sequence-to-sequence modeling, where the LLM receives a sequence of bytes and outputs the mutated byte sequence. FuzzCoder is fine-tuned on an instruction dataset that we created (FuzzInstruct), in which the successful fuzzing history is collected from a heuristic fuzzing tool. FuzzCoder can predict mutation positions and strategies for input files to trigger abnormal program behavior. Most importantly, experiments show that FuzzCoder achieves better fuzzing performance than traditional fuzzers based on American fuzzy lop (AFL), such as AFL, AFL++, and AFLSmart. On average, FuzzCoder improves code coverage by more than 20% and significantly increases the number of crashes.
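To make the idea of predicting "mutation positions and strategies" concrete, the following is a minimal sketch of how such predictions might be applied to a seed input. It is not the authors' implementation: the strategy names (`bit_flip`, `byte_increment`, `byte_insert`, `byte_delete`), the `(position, strategy)` prediction format, and the `predict_mutations` stand-in for the fine-tuned model are all assumptions made for illustration, loosely modeled on the byte-level operations commonly found in AFL-style fuzzers.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical byte-level strategies; the abstract does not enumerate
# the actual strategy set used by FuzzCoder.
def bit_flip(data: bytearray, pos: int) -> None:
    data[pos] ^= 0x01                      # flip the lowest bit

def byte_increment(data: bytearray, pos: int) -> None:
    data[pos] = (data[pos] + 1) % 256      # wrap around at 0xFF

def byte_insert(data: bytearray, pos: int) -> None:
    data.insert(pos, 0x00)                 # insert a zero byte before pos

def byte_delete(data: bytearray, pos: int) -> None:
    del data[pos]                          # drop the byte at pos

STRATEGIES: Dict[str, Callable[[bytearray, int], None]] = {
    "bit_flip": bit_flip,
    "byte_increment": byte_increment,
    "byte_insert": byte_insert,
    "byte_delete": byte_delete,
}

def apply_mutations(seed: bytes, predictions: List[Tuple[int, str]]) -> bytes:
    """Apply (position, strategy) pairs to a seed and return the mutated bytes."""
    data = bytearray(seed)
    # Process higher positions first so that inserts and deletes do not
    # shift the positions of mutations that are still pending.
    for pos, name in sorted(predictions, key=lambda p: p[0], reverse=True):
        if 0 <= pos < len(data) and name in STRATEGIES:
            STRATEGIES[name](data, pos)
    return bytes(data)

def predict_mutations(seed: bytes) -> List[Tuple[int, str]]:
    """Stand-in for the fine-tuned model (assumption): fixed predictions."""
    return [(0, "bit_flip"), (2, "byte_delete")]

if __name__ == "__main__":
    seed = b"\x00\x01\x02\x03"
    mutated = apply_mutations(seed, predict_mutations(seed))
    print(mutated.hex())
```

In a real fuzzing loop, the mutated bytes would be passed to the target program and the resulting coverage or crash signal recorded. Predictions with out-of-range positions or unknown strategy names are silently skipped here, which is one simple way to tolerate malformed model output.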
External IDs:dblp:journals/iotj/YangWYXYLNSL25