Turing Test in the Era of LLM

06 Nov 2023 (modified: 26 Jan 2024), PKU 2023 Fall CoRe Submission, CC BY 4.0
Keywords: communication
Abstract: The Turing test has long been a cornerstone for evaluating machine intelligence through conversational interaction with humans. However, as natural language processing (NLP) has rapidly advanced, concerns have arisen about whether the test assesses the depth of a machine's comprehension or merely its capacity to mimic human speech. This paper reimagines the evaluation of machine understanding around two pivotal properties: pragmatic reasoning and commonsense. Pragmatic reasoning concerns a machine's capacity to engage in real-world scenarios, taking context and practical implications into account, while commonsense concerns an AI system's ability to comprehend the embodied attributes and nuanced meanings of words. Together, these core properties provide a comprehensive and meaningful assessment of machine understanding, addressing the limitations of both the traditional Turing test and contemporary benchmarks. In the era of large language models (LLMs), this reimagined approach guides the development of more sophisticated and human-like AI agents, bringing us closer to truly intelligent artificial agents.
Submission Number: 123