Towards Concept-Aware Large Language Models

Published: 07 Oct 2023, Last Modified: 01 Dec 2023
Venue: EMNLP 2023 Findings
Submission Type: Regular Long Paper
Submission Track: Theme Track: Large Language Models and the Future of NLP
Submission Track 2: Linguistic Theories, Cognitive Modeling, and Psycholinguistics
Keywords: Concepts, Pretrained Large Language Models
TL;DR: Exploring to what extent pretrained LLMs grasp human concepts, and how to enhance their understanding of concepts with and without training.
Abstract: Concepts play a pivotal role in various human cognitive functions, including learning, reasoning, and communication. However, there is very little work on endowing machines with the ability to form and reason with concepts. In particular, state-of-the-art large language models (LLMs) work at the level of tokens, not concepts. In this work, we analyze how well contemporary LLMs capture human concepts and their structure. We then discuss ways to develop concept-aware LLMs, intervening at different stages of the pipeline. We sketch a method for pretraining LLMs using concepts, and also explore a simpler approach that operates on the output of existing LLMs. Despite its simplicity, our proof-of-concept is shown to better match human intuition, as well as improve the robustness of predictions. These preliminary results underscore the promise of concept-aware LLMs.
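The abstract only gestures at what "operating on the output of existing LLMs" might look like. As a rough, hedged illustration (not the paper's actual method), the sketch below aggregates a causal LM's next-token probability mass over a hand-written concept-to-surface-form map, so that a concept like "dog" is scored by the total probability of its members rather than any single token. The model choice (gpt2), the toy concept map, and the first-subtoken aggregation are all illustrative assumptions.

```python
# Hypothetical sketch: turn token-level next-word probabilities into
# concept-level scores. The concept->surface-form mapping below is a
# toy assumption; the paper does not specify this exact scheme.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # illustrative choice; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def concept_scores(prompt: str, concepts: dict[str, list[str]]) -> dict[str, float]:
    """Score each concept by summing next-token probability mass over
    the first subtoken of each of its surface forms."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits at the last position
    probs = torch.softmax(logits, dim=-1)
    scores = {}
    for concept, forms in concepts.items():
        # Leading space matters for GPT-2's BPE; dedupe shared subtokens.
        ids = {tokenizer.encode(" " + form)[0] for form in forms}
        scores[concept] = sum(probs[i].item() for i in ids)
    return scores

# Usage: a concept's mass can exceed that of any single member token.
animals = {"dog": ["dog", "puppy", "hound", "canine"],
           "cat": ["cat", "kitten", "feline"]}
print(concept_scores("My favorite pet is a", animals))
```

Pooling probability over a concept's members is one plausible reading of why such a wrapper could "improve the robustness of predictions": the concept-level decision no longer hinges on which synonym the tokenizer happens to favor.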
Submission Number: 2052