In a previous review cycle, reviewers kindly pointed out several
limitations of the original version of our manuscript. We summarize
these comments below and describe how we address them in the revised version.

(1) “Motivate the proposed approach”: we have restructured the paper 
and now clarify the goals and proposed methodology.

(2) “Compare to other AI models on human reaction time”: we now compare
the proposed technique to an existing computational model of human
reaction time (Spoerer et al., 2020), and extensively evaluate the
dynamic-depth neural networks of Zhang et al. (2019) and Huang et al.
(2018) against this model, as well as against previous and new human
observers. We also collect and evaluate human and network data on blur
and color, along with additional human trials on noise, and we discuss
related techniques and approaches in the related work section.

References: 

[1] Courtney J Spoerer, Tim C Kietzmann, Johannes Mehrer, Ian Charest, 
and Nikolaus Kriegeskorte. Recurrent neural networks can explain 
flexible trading of speed and accuracy in biological vision. PLoS 
computational biology, 16(10):e1008215, 2020.

[2] Linfeng Zhang, Zhanhong Tan, Jiebo Song, Jingwei Chen, Chenglong
Bao, and Kaisheng Ma. SCAN: A scalable neural networks framework towards
compact and efficient models. NeurIPS, 2019.

[3] Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten,
and Kilian Weinberger. Multi-scale dense networks for resource efficient
image classification. ICLR, 2018.
