[Re] Differentiable Spatial Planning using Transformers

Blind Submission to the ML Reproducibility Challenge 2021 Fall

Published: 11 Apr 2022, Last Modified: 15 Jun 2025 · RC2021
TL;DR: Reproducibility Report for the MLRC Fall 2021 Challenge
Abstract: Reproducibility Summary

Scope of Reproducibility: This report covers our reproduction effort of the paper "Differentiable Spatial Planning using Transformers" by Chaplot et al., which addresses the problem of spatial path planning in a differentiable way. The authors show that their proposed Spatial Planning Transformers outperform prior data-driven models, and that the differentiable structure simultaneously allows the mapper to be learned without ground-truth maps. We verify these claims by reproducing their experiments and testing their method on new data. We also investigate the stability of planning accuracy on maps with increased obstacle complexity. Our efforts to investigate and verify the learnings of the Mapper module failed owing to a lack of computational resources and our inability to reach the authors.

Methodology: The authors' source code and datasets are not yet open-source, so we reproduce the original experiments with source code written from scratch. We generate all synthetic datasets ourselves, following parameters similar to those described in the paper. Training the Mapper module required loading our synthetic dataset of over 1.6 TB, which we could not complete.

Results: We reproduced the accuracy of the SPT planner module to within 14.7% of the reported value; while it outperforms the baselines in select cases, this fails to support the paper's conclusion that it outperforms the baselines overall. However, we observe a similar drop-off in accuracy, in percentage points, across the different model settings. We suspect that the vagueness of the accuracy metric accounts for the 14.7% absolute difference, and that the paper is otherwise reproducible. We further improve the reproduced figures by increasing model complexity. The Mapper module's accuracy could not be tested.

What was easy: The model architecture and training details were described in enough detail to reproduce easily.

What was difficult: We lost significant time generating all synthetic datasets, especially the dataset for the Mapper module, which required setting up the Habitat Simulator and its API. The ImageExtractor API was broken, and workarounds had to be implemented. The final dataset approached 1.6 TB in size, and we could not arrange enough computational resources and expertise to handle the GPU training. Furthermore, the description of the action prediction accuracy metric is vague (one plausible reading is sketched after this abstract) and could be one of the reasons the results did not reproduce exactly.

Communication with original authors: The authors of the paper could not be reached despite multiple attempts.
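For concreteness, below is a minimal, hypothetical sketch of one way the "action prediction accuracy" metric could be read: for every free cell, derive a greedy action from the predicted distance-to-goal map and compare it with the action derived from the ground-truth map. The 4-connected action set, the tie-breaking rule, and the handling of obstacle cells are our assumptions, not a definition taken from the original paper.

```python
# Hypothetical interpretation of the action prediction accuracy metric.
# All design choices here (4-connected moves, tie-breaking by move order,
# skipping obstacle cells) are assumptions made for illustration only.
import numpy as np

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def greedy_actions(dist, obstacles):
    """For each free cell, return the index of the neighbouring free cell
    with the smallest distance-to-goal (ties broken by move order)."""
    h, w = dist.shape
    actions = np.full((h, w), -1, dtype=np.int64)
    for i in range(h):
        for j in range(w):
            if obstacles[i, j]:
                continue  # no action defined on obstacle cells
            best, best_a = np.inf, -1
            for a, (di, dj) in enumerate(MOVES):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and not obstacles[ni, nj]:
                    if dist[ni, nj] < best:
                        best, best_a = dist[ni, nj], a
            actions[i, j] = best_a
    return actions

def action_accuracy(pred_dist, true_dist, obstacles):
    """Fraction of free cells whose greedy action under the predicted
    distance map matches the greedy action under the ground-truth map."""
    pred_a = greedy_actions(pred_dist, obstacles)
    true_a = greedy_actions(true_dist, obstacles)
    free = ~obstacles.astype(bool)
    return float((pred_a[free] == true_a[free]).mean())
```

Other readings are equally plausible, for example counting a prediction as correct whenever it matches any optimal action, or evaluating only cells with a unique optimal action; each would shift the reported number, which is consistent with the report's point that the metric description leaves room for interpretation.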
Paper Venue: ICML 2021
Community Implementations: 2 code implementations (via CatalyzeX)

Paper Decision

Decision by Program Chairs

ML Reproducibility Challenge 2021 Fall Program Chairs
09 Apr 2022, 20:29
Decision: Accept
Comment: Following the recommendation of reviewers and meta-reviewer, the paper is accepted for ML Reproducibility Challenge 2021, and will be published in the upcoming special edition of ReScience Journal.

Meta Review of Paper76 by Area Chair zD4P

ML Reproducibility Challenge 2021 Fall Paper76 Area Chair zD4P
08 Apr 2022, 04:44
Metareview:

A great reproducibility study. Even though there is some discrepancy between the results presented here and those of the original paper, the report does a good job of explaining the reason behind this behaviour, which is indeed a good contribution.

Confidence: 4: The area chair is confident but not absolutely certain
Recommendation: Accept

Reproducibility Review - Differentiable Spatial Planning using Transformers

Official Review of Paper76 by Reviewer aWkc

ML Reproducibility Challenge 2021 Fall Paper76 Reviewer aWkc
07 Mar 2022, 20:46
Review:

The original authors make three claims, and the reproducibility report sets out to verify these. The authors were unable to reproduce the results of the original paper to high accuracy (results differed by 14.7%, which is a large margin), though they observe a similar drop-off in accuracy across different model settings. They state that this is possibly a result of experimenting on an entirely different dataset. The authors were unable to contact the authors of the original paper, and because the training required large compute, not all of the results were verifiable. However, the results that were verified seem plausible, and the structure and format of the paper are very good and easy to understand. They also attempt to describe their work visually, which should be commended.

Rating: 9: Top 15% of accepted papers, strong accept
Confidence: 2: The reviewer is willing to defend the evaluation, but it is quite likely that the reviewer did not understand central parts of the paper

Review for reproducibility challenge: Differentiable Spatial Planning using Transformers

Official Review of Paper76 by Reviewer pJA8

ML Reproducibility Challenge 2021 Fall Paper76 Reviewer pJA8
07 Mar 2022, 06:31
Review:

The authors have made a reasonable effort at reproducing the original work, "Differentiable Spatial Planning using Transformers". They implemented the algorithm from scratch due to the lack of open-source code from the original authors. A big challenge they encountered was the computational resources required, which is very understandable, and the relevant data they share could be useful for further research in this direction. In terms of reproducing the original results, the authors did reasonable hyper-parameter tuning and also applied the method to new tasks that were not part of the original work. These results help assess the reproducibility of the original work, and thus I think it is a good contribution to the reproducibility challenge.

Rating: 7: Good paper, accept
Confidence: 3: The reviewer is fairly confident that the evaluation is correct