Abstract: In decision making, generating alternative solutions is crucial for solving a problem. However, cognitive biases can impede this process by constraining individual decision makers' creativity. To address this issue, we introduce a new task for automatically generating alternatives, inspired by the process of human "brainstorming". We define alternative options based on atomic action components and present a dataset of 106 annotated Reddit r/Advice posts containing unique alternative options extracted from users' replies. We also introduce new metrics to assess the quality of generated components: distinctiveness, creativity, upvote-weighted, crowd intersection, and final commit intersection scores. As a baseline, we evaluated the large language models (LLMs) LLaMa3:8b, LLaMa3.1:8b, and Gemma 2:9b on the alternative component generation task. On the one hand, the models demonstrated high creativity (the ability to generate options beyond what Reddit users suggested) and performed well at proposing distinct alternatives. A subset of the generated components was manually evaluated and found to be useful overall. This indicates that LLMs might be used to extend lists of alternative options, helping decision makers consider a problem from different perspectives. On the other hand, the LLMs' outputs often failed to align with human suggestions, implying that they still tend to miss important components.
Paper Type: Long
Research Area: Generation
Research Area Keywords: human evaluation, automatic evaluation, efficient models, few-shot generation, analysis, text-to-text generation, inference methods, prompting, applications
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis, Theory
Languages Studied: English
Submission Number: 2103