Did You Get That? Evaluating GPT-4’s Ability to Identify Additional Context

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission
TL;DR: We evaluate GPT-4's performance when extracting additional user context (that is, additional information offered by the user) in goal-oriented conversations.
Abstract: In recent years, large language models (LLMs) have emerged as powerful knowledge bases. Despite increasing adoption, little is known about their true capabilities. We evaluate the strengths and weaknesses of state-of-the-art LLMs in identifying additional context in dialogue. We define additional context as information supplied by the user that is not directly asked of them. We specifically evaluate GPT-4 and its ability to recognize such information. While GPT-4 can accurately identify additional information in some sentences, it fails to identify additional context more than 22% of the time. By understanding these limitations, we can remain aware of pitfalls and harness LLMs within the scope of their abilities.
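
To make the task definition concrete, here is a minimal sketch of how one might prompt GPT-4 to flag additional context in a single goal-oriented exchange. This is not the paper's evaluation setup: the dialogue, the prompt wording, and the use of the OpenAI Python client (openai>=1.0) are assumptions made purely for illustration.

```python
# Illustrative sketch only; prompt wording and settings are assumptions,
# not the evaluation protocol used in the paper.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Example goal-oriented exchange: the agent asks only for a date, but the
# user volunteers extra information (seat preference, traveling companion).
agent_question = "What date would you like to fly to Boston?"
user_reply = (
    "Next Friday, and I'd prefer a window seat since I'm traveling with my daughter."
)

prompt = (
    "In the dialogue below, the agent asks the user a question.\n"
    "List any additional context: information the user supplies that was not "
    "directly asked of them. If there is none, answer 'None'.\n\n"
    f"Agent: {agent_question}\n"
    f"User: {user_reply}"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # keep outputs near-deterministic for evaluation
)
print(response.choices[0].message.content)
# Expected output along the lines of: "window seat preference; traveling with daughter"
```

An evaluation in this spirit would compare the model's extracted items against human-annotated additional context for each turn; the abstract's reported failure rate (missing additional context more than 22% of the time) refers to the authors' own annotated dialogues, not to this sketch.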
Paper Type: long
Research Area: Dialogue and Interactive Systems
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English