Multimodal Situational Safety

Published: 10 Oct 2024, Last Modified: 04 Dec 2024
Venue: NeurIPS 2024 Workshop RBFM (Oral)
License: CC BY 4.0
Keywords: Multimodal Situational Safety, Multimodal Large Language Model, Benchmark
Abstract: Multimodal Large Language Models (MLLMs) have emerged as powerful multimodal assistants, capable of interacting with humans and their environments through language and actions. However, these advancements also introduce new safety challenges: whether a user's query is safe or unsafe can depend on the situation in which it is posed. To address this, we introduce the problem of Multimodal Situational Safety, in which the model must judge the safety implications of a language query based on the visual context. To study this problem, we collect a benchmark of 1,840 language queries, each paired with one safe image context and one unsafe image context. Our evaluation shows that current MLLMs struggle with this nuanced safety problem. Moreover, to diagnose how different abilities of MLLMs, such as explicit safety reasoning, visual understanding, and situational safety reasoning, affect their safety performance, we create several evaluation-setting variants. Based on these diagnostic results, we propose a multi-step safety-examination method to mitigate these safety failures and offer insights for future improvement.
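Since the benchmark pairs each language query with one safe and one unsafe image context, a natural way to score a model is to require correct behavior in both contexts of a pair. The sketch below is a hypothetical illustration of such a paired-context evaluation; the `MSSItem` fields and the `judge` callable are assumptions for illustration, not the authors' released data format or metric code.

```python
# Hypothetical sketch of a Multimodal Situational Safety benchmark item
# and a paired-context accuracy metric (assumed structure, not the paper's code).
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class MSSItem:
    query: str        # language query shared by both contexts
    safe_image: str   # path to the image under which the query is benign
    unsafe_image: str # path to the image under which the query is unsafe


def situational_accuracy(
    items: Iterable[MSSItem],
    judge: Callable[[str, str], bool],
) -> float:
    """Fraction of items where the model behaves appropriately in BOTH contexts.

    `judge(query, image)` is assumed to return True when the model's response
    is appropriate for that context: helpful under the safe image, and
    warning/refusing under the unsafe image.
    """
    items = list(items)
    if not items:
        return 0.0
    correct = sum(
        judge(it.query, it.safe_image) and judge(it.query, it.unsafe_image)
        for it in items
    )
    return correct / len(items)
```

Requiring success on both halves of a pair penalizes degenerate strategies such as refusing every query or complying with every query, which is one motivation for pairing each query with contrasting contexts.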
Submission Number: 26