FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs

Zhiting Fan; Ruizhe Chen; Tianxiang Hu; Zuozhu Liu

FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs

Zhiting Fan, Ruizhe Chen, Tianxiang Hu, Zuozhu Liu

Published: 22 Jan 2025, Last Modified: 16 Apr 2025ICLR 2025 SpotlightEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Fairness, Benchmark, Large language model

Abstract: The increasing deployment of large language model (LLM)-based chatbots has raised concerns regarding fairness. Fairness issues in LLMs may result in serious consequences, such as bias amplification, discrimination, and harm to minority groups. Many efforts are dedicated to evaluating and mitigating biases in LLMs. However, existing fairness benchmarks mainly focus on single-turn dialogues, while multi-turn scenarios, which better reflect real-world conversations, pose greater challenges due to conversational complexity and risk for bias accumulation. In this paper, we introduce a comprehensive benchmark for fairness of LLMs in multi-turn scenarios, **FairMT-Bench**. Specifically, We propose a task taxonomy to evaluate fairness of LLMs cross three stages: context understanding, interaction fairness, and fairness trade-offs, each comprising two tasks. To ensure coverage of diverse bias types and attributes, our multi-turn dialogue dataset FairMT-10K is constructed by integrating data from established fairness benchmarks. For evaluation, we employ GPT-4 along with bias classifiers like Llama-Guard-3, and human annotators to ensure robustness. Our experiments and analysis on FairMT-10K reveal that in multi-turn dialogue scenarios, LLMs are more prone to generating biased responses, showing significant variation in performance across different tasks and models. Based on these findings, we develop a more challenging dataset, FairMT-1K, and test 15 current state-of-the-art (SOTA) LLMs on this dataset. The results highlight the current state of fairness in LLMs and demonstrate the value of this benchmark for evaluating fairness of LLMs in more realistic multi-turn dialogue contexts. This underscores the need for future works to enhance LLM fairness and incorporate FairMT-1K in such efforts. Our code and dataset are available at https://github.com/FanZT6/FairMT-bench.

Supplementary Material: zip

Primary Area: datasets and benchmarks

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 7273

Loading