Keywords: large language models, empirical analysis, causal facts
TL;DR: We hypothesize that LLMs exploit correlations between questions about causal relations and their expected (or "right") causal answers.
Abstract: Large Language Models (LLMs) are the subject of an ongoing, heated debate that leaves open the question of progress towards AGI and divides the community into two camps: those who see the arguably impressive results as evidence for the scaling hypothesis, and those who are worried about the lack of interpretability and reasoning capabilities. By investigating to what extent causal representations might be captured by LLMs, we make a humble effort towards resolving this philosophical conflict. We hypothesize that causal facts are part of the training data and that LLMs are capable of picking up correlations between questions about causal relations and their expected (or "right") causal answers. We study this hypothesis in two ways: (1) by analyzing the LLMs' causal question-answering capabilities and (2) by probing the LLMs' embeddings for correlations with the causal facts. Our analyses suggest that LLMs are somewhat capable of answering causal queries the right way by memorizing the corresponding question-answer pairs. More importantly, however, the evidence suggests that LLMs do not perform causal reasoning to arrive at their answers.
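To illustrate what "probing the LLMs' embeddings for correlations with the causal facts" could look like in practice, below is a minimal sketch (not the authors' code) of a linear probe trained on question embeddings to predict the expected causal answer. The choice of GPT-2 as the encoder, the mean-pooling `embed` helper, and the toy question-answer pairs are all illustrative assumptions, not the paper's actual setup.

```python
# Minimal probing sketch: train a linear classifier on LLM question embeddings
# to predict the "right" causal answer. Model choice and data are placeholders.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in LLM encoder
model = AutoModel.from_pretrained("gpt2")

def embed(texts):
    """Mean-pool the last hidden states as a simple sentence embedding."""
    embs = []
    for t in texts:
        ids = tokenizer(t, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids)
        embs.append(out.last_hidden_state.mean(dim=1).squeeze(0).numpy())
    return np.stack(embs)

# Toy causal questions with binary "expected" answers (illustrative only).
questions = [
    "Does smoking cause lung cancer?",
    "Does lung cancer cause smoking?",
    "Does altitude affect temperature?",
    "Does temperature affect altitude?",
]
labels = [1, 0, 1, 0]  # 1 = causal direction considered "right"

X_train, X_test, y_train, y_test = train_test_split(
    embed(questions), labels, test_size=0.5, stratify=labels, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```

If such a probe recovers the expected answers well above chance, that is consistent with the hypothesis that the embeddings encode memorized question-answer correlations rather than the product of causal reasoning.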
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (eg, AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)