Iterative Theory of Mind Assay of Multimodal AI Models

Rohini Elora Das; Rajarshi Das; Niharika Maity; Sreerupa Das

Published: 18 Jun 2024, Last Modified: 26 Jul 2024
Venue: ICML 2024 Workshop on LLMs and Cognition (Poster)
License: CC BY 4.0

Keywords: Large Language Model, Theory of Mind, Cognition, AI

TL;DR: We use iterative Theory of Mind tests to reveal limitations in current multimodal AI’s ability to create a consistent world model, and we identify new multimodal confabulations.

Abstract: The concept of artificial general intelligence (AGI) has sparked intense debates across various sectors, fueled by the capabilities of Large Language Model-based AI systems like ChatGPT. However, the AI community remains divided on whether such models truly understand language and its contexts. Developing multimodal AI systems, which can engage with the user in multiple input and output modalities, is seen as a crucial step towards AGI. We employ a novel iterated Theory of Mind (iToM) test to reveal limitations of current multimodal LLMs like ChatGPT 4o in converging to coherent and unified internal world models, which results in illogical and inconsistent user interactions both within and across the different input and output modalities. We also identify new multimodal confabulations ("hallucinations"), particularly in languages with less training data, such as Bengali.

Submission Number: 72
