Temporal- and Viewpoint-Invariant Registration for Under-Canopy Footage using Deep-Learning-based Bird's-Eye View Prediction

Published: 01 Jan 2024 · Last Modified: 15 Jan 2025 · IROS 2024 · CC BY-SA 4.0
Abstract: Conducting visual assessments under the canopy with mobile robots is an emerging task in smart farming and forestry. However, registering images across different data-collection days, and especially across seasons, is challenging due to the self-occluding geometry and temporal dynamics of forests and orchards. This paper proposes a new approach for registering under-canopy image sequences, both in general and under these challenging conditions. Our methodology leverages standard GPS data and deep-learning-based perspective-to-bird's-eye-view conversion to provide an initial estimate of the tree positions in the images and their association across datasets. Furthermore, it introduces an innovative strategy for extracting tree trunks and clean ground surfaces from the noisy and sparse 3D reconstructions created from the image sequences, and it uses these features to achieve precise alignment. Our robust alignment method effectively mitigates the position and scale drift that can arise from GPS inaccuracies and the limitations of sparse Structure-from-Motion (SfM). We evaluate our approach on three challenging real-world datasets, demonstrating that our method outperforms ICP-based methods by 50% on average and surpasses FGR and TEASER++ by over 90% in alignment accuracy. These results highlight our method's cost efficiency and robustness, even in the presence of severe outliers and sparsity. Code: https://github.com/VIS4ROB-lab/bev_undercanopy_registration
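The abstract does not spell out the alignment step, so the following is only a minimal Python sketch of the general idea it describes: robustly fitting a similarity transform between two sets of tree-trunk landmarks that have already been associated (e.g., via the GPS/BEV initialization). The function names `umeyama_similarity` and `robust_align`, the index-aligned input arrays, and the RANSAC threshold and iteration count are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def umeyama_similarity(src, dst, with_scale=True):
    """Closed-form similarity transform (Umeyama, 1991) mapping src -> dst.
    src, dst: (N, 3) arrays of corresponding landmark positions."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)          # cross-covariance of the two sets
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))        # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt                            # optimal rotation
    # Optimal scale accounts for SfM scale drift; 1.0 gives a rigid fit.
    s = (S * np.diag(D)).sum() / src_c.var(axis=0).sum() if with_scale else 1.0
    t = mu_d - s * R @ mu_s                   # translation closing the loop
    return s, R, t

def robust_align(src_trees, dst_trees, iters=500, thresh=0.5, rng=None):
    """RANSAC-style robust fit: sample minimal 3-tree subsets, keep the
    hypothesis with the most inliers, then refit on all inliers.
    src_trees, dst_trees: (N, 3) index-aligned tree positions (assumed
    pre-associated, e.g. by the GPS/BEV step)."""
    rng = rng or np.random.default_rng(0)
    n = len(src_trees)
    best_inliers = None
    for _ in range(iters):
        idx = rng.choice(n, size=3, replace=False)
        s, R, t = umeyama_similarity(src_trees[idx], dst_trees[idx])
        residual = np.linalg.norm(
            (s * (R @ src_trees.T)).T + t - dst_trees, axis=1)
        inliers = residual < thresh           # threshold in metres (illustrative)
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return umeyama_similarity(src_trees[best_inliers], dst_trees[best_inliers])

# Usage (hypothetical data): trees_day1, trees_day2 are (N, 3) arrays of
# associated trunk positions from two data-collection days.
# s, R, t = robust_align(trees_day1, trees_day2)
```

A correspondence-based robust fit of this kind tolerates the severe outliers and sparsity the abstract mentions, where pure point-to-point ICP would need a close initial guess.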