- Keywords: Reasoning, Bayesian probability
- TL;DR: A method of multi-hop reasoning analysis
- Abstract: Emerging pre-trained language models (PTLMs), such as BERT and RoBERTa, have achieved great success on many natural language understanding (NLU) tasks, spurring widespread interest in their potential for scientific and social applications, along with criticism of the opacity of their reasoning, especially in multi-hop cases. Concretely, many studies have pointed out that these models lack a true understanding of the reasoning process. In this work, we focus on the multi-hop reasoning processes of PTLMs and perform an analysis on a logical reasoning dataset, Soft Reasoner. We first extend the dataset by constructing, in a semi-automatic way, the implicit intermediate results produced during multi-hop reasoning. Surprisingly, when tested on the extended dataset, PTLMs can predict the correct conclusion even when they cannot judge the corresponding intermediate results. To analyze this phenomenon further, we compare PTLMs' reasoning processes with Bayesian inference processes that simulate human reasoning. Results show that the more closely a model aligns with the Bayesian process, the better its generalization ability tends to be. Our Bayesian comparison can therefore serve as a method for evaluating the generalization ability of models.
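The comparison the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation; the function names, the conditional-independence assumption, and the example probabilities below are all hypothetical, chosen only to show the shape of the idea: chain the model's confidences in the intermediate results into a Bayesian estimate of the conclusion, then measure how far the model's direct confidence in the conclusion departs from that estimate.

```python
# Illustrative sketch only (not the paper's method): compare a model's
# direct confidence in a multi-hop conclusion against the probability
# obtained by chaining its confidences in the intermediate results.

def bayesian_chain(hop_probs):
    """Probability that all intermediate hops hold, under a simplifying
    assumption that hops are conditionally independent."""
    p = 1.0
    for q in hop_probs:
        p *= q
    return p

def bayesian_gap(hop_probs, final_conf):
    """Gap between the chained Bayesian estimate and the model's direct
    confidence in the conclusion; a smaller gap means the model's
    reasoning is more in line with the Bayesian process."""
    return abs(bayesian_chain(hop_probs) - final_conf)

# Hypothetical numbers: a model that is highly confident in the
# conclusion (0.95) despite a weak intermediate judgment (0.5)
# shows a large gap, mirroring the paper's surprising finding that
# PTLMs can predict conclusions without judging the intermediates.
print(bayesian_gap([0.9, 0.5], 0.95))
```

Under this sketch, a model that truly reasons step by step would keep the gap small, while a model that shortcuts to the conclusion would not.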
- Track: Short paper