Abstract: Embodied intelligence equips agents with rich perception, enabling them to respond in ways closely aligned with real-world situations. Large Language Models (LLMs) interpret language instructions in depth and play a crucial role in generating plans for intricate tasks, so LLM-based embodied models further enhance an agent's capacity to comprehend and process information. However, this combination also introduces new security challenges: attackers can manipulate LLMs into producing irrelevant or even malicious outputs by altering their prompts. Confronting this challenge, we observe a notable absence of multi-modal datasets for comprehensively evaluating the robustness of LLM-based embodied models. We therefore construct the Embodied Intelligent Robot Attack Dataset (EIRAD), tailored specifically for robustness evaluation. In addition, we devise two attack strategies, untargeted attacks and targeted attacks, to simulate a range of diverse attack scenarios. To determine more accurately whether an attack on an LLM-based embodied model has succeeded, we design a new attack success evaluation method based on the BLIP2 model. Finally, recognizing that the GCG algorithm is time- and cost-intensive, we devise a prompt suffix initialization scheme based on different target tasks, which accelerates convergence. Experimental results demonstrate that our method achieves a superior attack success rate against LLM-based embodied models, indicating a lower level of decision-level robustness in these models.
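To make the two mechanisms named in the abstract concrete, here are two minimal sketches. Both are illustrative assumptions rather than the submission's released code: the suffix cache, the word-overlap similarity, the LAVIS model identifiers, and the threshold tau are all ours. First, a sketch of initializing the GCG adversarial suffix from a previously optimized suffix for a similar target task, instead of the conventional all-"!" start:

```python
# A minimal sketch of task-based suffix initialization for GCG. The cache,
# the word-overlap similarity, and all names here are illustrative
# assumptions, not the submission's actual scheme.

# Suffixes that previously converged on earlier target tasks
# (task description -> optimized adversarial suffix). Contents are fake.
suffix_cache = {
    "pick up the apple": "<cached optimized suffix tokens>",
    "open the drawer": "<cached optimized suffix tokens>",
}

def init_suffix(target_task: str, length: int = 20) -> str:
    """Seed GCG for a new target task instead of starting from '! ! ... !'."""
    best, overlap = None, 0
    words = set(target_task.lower().split())
    for task, suffix in suffix_cache.items():
        k = len(words & set(task.split()))
        if k > overlap:
            best, overlap = suffix, k
    # Fall back to the conventional all-"!" initialization when no
    # sufficiently similar task has been attacked before.
    return best if best is not None else " ".join(["!"] * length)
```

Second, a sketch of the BLIP2-based attack success check, assuming the LAVIS image-text-matching head; the exact matching formulation and the threshold are assumptions:

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
# BLIP2 image-text matching model from the LAVIS library.
model, vis_proc, txt_proc = load_model_and_preprocess(
    name="blip2_image_text_matching", model_type="pretrain",
    is_eval=True, device=device)

def attack_succeeded(image_path: str, plan: str, target: str | None,
                     tau: float = 0.5) -> bool:
    """Judge one attack: untargeted if `target` is None, else targeted."""
    raw = Image.open(image_path).convert("RGB")
    img = vis_proc["eval"](raw).unsqueeze(0).to(device)

    def itm_prob(text: str) -> float:
        t = txt_proc["eval"](text)
        logits = model({"image": img, "text_input": t}, match_head="itm")
        return torch.softmax(logits, dim=1)[:, 1].item()  # P(match)

    if target is None:
        # Untargeted: success if the generated plan no longer matches
        # the scene the benign instruction refers to.
        return itm_prob(plan) < tau
    # Targeted: success if the plan matches the attacker's target task.
    return itm_prob(target) >= tau
```

In practice tau would be calibrated on clean instruction-scene pairs; the point is only that an image-text matching head gives a task-agnostic judge of whether a generated plan still fits the scene or has drifted toward the attacker's target.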
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Generation] Social Aspects of Generative AI
Relevance To Conference: As far as we know, this work represents the first experiment in exploring the robustness of LLM-based embodied model decision-level processes.
We design a multi-modal dataset consisting of 500 instances of untargeted attack data and 500 instances of targeted attack data to fill the gaps in datasets for robustness evaluation in embodied scenarios.
Extensive experiments show that our method improves attack success rate and attack efficiency.
This work contributes to the fields of multimodal learning and embodied intelligence, both topics of interest at ACM MM.
The development of a specialized dataset for evaluating the robustness of embodied intelligence aligns with the conference's focus on datasets and benchmarking.
The proposed method for assessing attack success in large models using the BLIP2 model introduces novel techniques, which could be of interest to researchers in the community.
The study addresses the pressing need for understanding and defending against adversarial attacks on large models in embodied scenarios, an area of growing concern in AI research.
Supplementary Material: zip
Submission Number: 1248