TL;DR: Reproducibility Report for MLRC Fall 2021 Challenge
Abstract: Reproducibility Summary

Scope of Reproducibility: This report covers our reproduction of the paper "Differentiable Spatial Planning using Transformers" by Chaplot et al., which addresses the problem of spatial path planning in a differentiable way. The authors show that their proposed Spatial Planning Transformer (SPT) outperforms prior data-driven models and, by exploiting its differentiable structure, simultaneously learns mapping without a ground-truth map. We verify these claims by reproducing their experiments and testing their method on new data. We also investigate the stability of planning accuracy on maps with increased obstacle complexity. Our efforts to investigate and verify the learning of the Mapper module failed, owing to a shortage of computational resources and unreachable authors.

Methodology: The authors' source code and datasets are not yet open-source, so we reproduce the original experiments with source code written from scratch. We generate all synthetic datasets ourselves, following parameters similar to those described in the paper. Training the Mapper module required loading our synthetic dataset, over 1.6 TB in size, which could not be completed.

Results: We reproduced the accuracy of the SPT planner module to within 14.7% of the reported value; while the module outperforms the baselines in select cases, this fails to support the paper's conclusion that it outperforms the baselines overall. However, we observe a similar drop-off in accuracy, in percentage points, across the different model settings. We suspect that vagueness in the definition of the accuracy metric accounts for the absolute difference of 14.7% despite the paper otherwise being reproducible. We further improve the reproduced figures by increasing model complexity. The Mapper module's accuracy could not be tested.

What was easy: The model architecture and training details were described in enough detail to reproduce easily.
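The synthetic planner dataset described under Methodology (random obstacle maps plus shortest-path ground truth) can be sketched roughly as follows. All parameters here (map size, obstacle counts and shapes) are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)

def random_map(size=15, num_obstacles=5, max_obstacle=4):
    """Binary occupancy map with random rectangular obstacles (1 = free, 0 = blocked).
    Sizes and counts are illustrative, not the paper's exact generation parameters."""
    grid = np.ones((size, size), dtype=np.int64)
    for _ in range(num_obstacles):
        h, w = rng.integers(1, max_obstacle + 1, size=2)
        r = rng.integers(0, size - h + 1)
        c = rng.integers(0, size - w + 1)
        grid[r:r + h, c:c + w] = 0
    return grid

def ground_truth_distances(grid, goal):
    """BFS shortest-path distance from every reachable free cell to the goal
    (-1 for unreachable or blocked cells), usable as a supervision target."""
    dist = np.full(grid.shape, -1, dtype=np.int64)
    dist[goal] = 0
    q = deque([goal])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < grid.shape[0] and 0 <= nc < grid.shape[1]
                    and grid[nr, nc] == 1 and dist[nr, nc] == -1):
                dist[nr, nc] = dist[r, c] + 1
                q.append((nr, nc))
    return dist

grid = random_map()
free_cells = np.argwhere(grid == 1)
goal = tuple(free_cells[0])
distances = ground_truth_distances(grid, goal)
```

Generating many such (map, goal, distance-map) triples yields a planner training set; the 1.6 TB Mapper dataset additionally required first-person RGB renderings from the Habitat simulator, which this sketch does not cover.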
What was difficult: We lost significant time generating all the synthetic datasets, especially the dataset for the Mapper module, which required us to set up the Habitat Simulator and API. The ImageExtractor API was broken, and workarounds had to be implemented. The final dataset approached 1.6 TB in size, and we could not arrange enough computational resources and expertise to handle the GPU training. Furthermore, the description of the action prediction accuracy metric is vague, which may be one reason the results did not reproduce.

Communication with original authors: The authors of the paper could not be reached despite multiple attempts.
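To illustrate the ambiguity in the action prediction accuracy metric, here is a hypothetical toy example (not the paper's definition, which is exactly what is unclear): given a grid of predicted optimal actions and the ground truth, a per-cell reading and a per-map reading of "accuracy" give very different numbers.

```python
import numpy as np

def per_cell_accuracy(pred, gt, free_mask):
    """Fraction of free cells whose predicted action matches the ground truth."""
    return (pred[free_mask] == gt[free_mask]).mean()

def per_map_accuracy(pred, gt, free_mask):
    """1.0 only if *every* free cell's predicted action is correct, else 0.0."""
    return float(np.array_equal(pred[free_mask], gt[free_mask]))

# Toy 3x3 map: actions encoded 0-3, one obstacle cell, one mispredicted cell.
gt = np.array([[0, 1, 1],
               [0, 2, 1],
               [3, 3, 2]])
pred = gt.copy()
pred[2, 0] = 1                      # a single wrong action
free = np.ones_like(gt, dtype=bool)
free[1, 1] = False                  # obstacle cell excluded from the metric

print(per_cell_accuracy(pred, gt, free))  # 7/8 = 0.875
print(per_map_accuracy(pred, gt, free))   # 0.0
```

With one wrong cell out of eight, the two readings report 87.5% versus 0%; a systematic gap of this kind could plausibly explain an absolute difference like the 14.7% we observed without the reproduction itself being at fault.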
Paper Url: http://proceedings.mlr.press/v139/chaplot21a.html
Paper Venue: ICML 2021