Keywords: Robot Planning, Robot Learning: Foundation Models, Assistive, Entertainment and Service Robots
TL;DR: We present Robo-Troj, a backdoor attack on LLM-based task planners in robotics. This attack uses trigger words to activate malicious behaviors, revealing vulnerabilities in LLM-based planning and showcasing the need for stronger security in robotics.
Abstract: Abstract—Robots need task planning methods to achieve goals that require more than one action. Recently, large language
models (LLMs) have demonstrated impressive performance in task planning. LLMs can generate a step-by-step solution using
a description of actions and the goal. Despite the successes of LLMs in long-horizon tasks for robot intelligence, there is
little research studying the security aspects of those systems. In this paper, we develop Robo-Troj, the first backdoor attack
specifically designed for LLM-assisted robot planners. Our attack follows the standard practice of LLM usage in robotics where
the backbone LLM is typically frozen and hosted in a central server limiting attacker’s reach. In contrast, our attack injects
backdoor at the fine-tuning stage using a small set of task-specific parameters for each specific robot. In addition, we develop an
optimization method for selecting multiple-trigger words that are most effective for different robot applications. For instance, one
can use unique trigger words, e.g., “herical”, to activate a specific malicious behavior, e.g., cutting hand on a kitchen robot. Through
demonstrating the vulnerability of current LLM-based planners, we aim to advance secured robot intelligence.
Submission Number: 7
Loading