Multi-Modal Object Tracking and Image Fusion With Unsupervised Deep Learning

Published: 08 Aug 2019, Last Modified: 26 Jan 2026
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
License: CC BY 4.0
Abstract: The number of different modalities for remote sensors continues to grow, bringing with it an increase in the volume and complexity of the data being collected. Although these datasets individually provide valuable information, in aggregate they provide additional opportunities to discover meaningful patterns on a large scale. However, the ability to combine and analyze disparate datasets is challenged by the potentially vast parameter space that results from aggregation. Each dataset in itself requires instrument-specific and dataset-specific knowledge. If the intention is to use multiple, diverse datasets, one needs an understanding of how to translate and combine these parameters in an efficient and effective manner. While there are established techniques for combining datasets from specific domains or platforms, there is no generic, automated method that can address the problem in general. Here, we discuss the application of deep learning to track objects across different image-like data modalities, given data in a similar spatio-temporal range, and automatically co-register these images. Using deep belief networks combined with unsupervised learning methods, we are able to recognize and separate different objects within image-like data in a structured manner, thus making progress toward the ultimate goal of a generic tracking and fusion pipeline requiring minimal human intervention.
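The abstract pairs deep belief networks with unsupervised learning to separate objects in image-like data. A minimal sketch of that idea, assuming a single RBM layer (the building block of a DBN) for unsupervised feature learning followed by k-means clustering; the synthetic patches, layer sizes, and hyperparameters below are illustrative assumptions, not the paper's actual pipeline:

```python
# Sketch: one RBM layer learns an unsupervised latent representation of
# image-like patches; clustering on those features separates object groups.
# Stacking several such RBM layers would give a deep belief network.
# All data and parameters here are synthetic/assumed, not from the paper.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for 8x8 patches from image-like data:
# bright "object" patches vs. dark "background" patches.
bright = rng.uniform(0.7, 1.0, size=(100, 64))
dark = rng.uniform(0.0, 0.3, size=(100, 64))
patches = np.vstack([bright, dark])

# Unsupervised feature learning: RBM hidden-unit activations as features.
rbm = BernoulliRBM(n_components=16, learning_rate=0.05,
                   n_iter=20, random_state=0)
features = rbm.fit_transform(patches)

# Unsupervised separation of the learned features into two groups.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(features.shape, labels.shape)
```

In a multi-modal setting, the same learned feature space would be applied to patches from each modality so that clusters (candidate objects) can be matched across modalities for tracking and co-registration.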