A Probabilistic Explanation for VoE-based Evaluation

Published: 23 Jan 2023, Last Modified: 05 May 2023, PKU CoRe 22Fall Oral, Readers: Everyone
Keywords: VoE paradigm, physics reasoning, likelihood ratio theory
TL;DR: We propose a theoretical framework for VoE-based evaluation and design one model and two metrics for this problem.
Abstract: The visually grounded \emph{violation of expectations} (VoE) paradigm is widely used to evaluate the physics learning capability of both humans and machines. It does so by measuring the prediction error, or \emph{surprise}, of a physics learning model in a given scene. Despite its intuitive formulation and perfect alignment with developmental psychology, the design of evaluation protocols based on the \textit{surprise} score is empirical. We point out the potential risks behind the traditional \textit{surprise} score design and provide a probabilistic explanation of the VoE paradigm based on \textit{likelihood ratio theory}. Guided by this theoretical framework, we propose two novel and extensible surprise scores that are theoretically sound. Furthermore, we implement a simple yet novel baseline based on PredRNN~\cite{wang2017predrnn} that demonstrates the ability to perform physical reasoning through direct \emph{pixel-level prediction}. Our model outperforms PLATO, a strong \emph{object-level prediction} baseline, achieving an overall accuracy of 90.0\% on the \texttt{Probe} dataset, compared to 73.4\% for PLATO. Additionally, we conduct experiments using our newly proposed metric.
Supplementary Material: zip