Abstract: Explainable artificial intelligence (XAI) methods rely on access to model architecture and parameters that is not always feasible for most users, practitioners, and regulators. Drawing inspiration from cognitive psychology, we present a case for response times (RTs) as a technique for XAI. RTs are observable without access to the model. Moreover, dynamic inference models performing conditional computation generate variable RTs for visual learning tasks depending on hierarchical representations. We show that MSDNet, a conditional computation model with early-exit architecture, exhibits slower RT for images with more complex features in the ObjectNet test set, as well as the human phenomenon of scene grammar, where object recognition depends on intra-scene object-object relationships. These results cast a light on MSDNet's hierarchical feature space without opening the black box and illustrate the promise of RT as a technique for XAI.
0 Replies
Loading