Abstract: Information from multiple heterogeneous data sources (e.g., visible and infrared) or representations (e.g., intensity and edge) has become increasingly important in many video-based applications. Fusing information from these sources is critical to improving the robustness of the related visual information processing systems. In this paper we propose a data fusion approach via sparse representation, with applications to robust visual tracking. Specifically, the image patches from the different sources of each target candidate are concatenated into a one-dimensional vector, which is then sparsely represented in the target template space. The template space representation, which naturally fuses information from different sources, brings several benefits to visual tracking. First, it inherits robustness to appearance contamination from the previously proposed sparse trackers. Second, it provides a flexible framework that can easily integrate information from different data sources. Third, it can handle a varying number of data sources, which is very useful in situations where the data inputs arrive at different frequencies. Sparsity in the representation is achieved by solving an ℓ₁-regularized least squares problem. The tracking result is then determined by finding the candidate with the smallest approximation error. To propagate the results over time, the sparse solution is combined with the Bayesian state inference framework using the particle filter algorithm. We conducted experiments on several real videos with heterogeneous information sources. The results show that the proposed approach tracks the target more robustly than several state-of-the-art tracking algorithms.
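The fusion and candidate-selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fuse`, `l1_least_squares`, `best_candidate`), the regularization weight `lam`, and the use of ISTA (iterative soft-thresholding) as the ℓ₁ solver are all assumptions made for illustration, random arrays stand in for real image patches, and the particle filter that propagates results over time is omitted.

```python
import numpy as np

def fuse(patches):
    """Concatenate patches from several sources into one unit-norm 1-D vector."""
    v = np.concatenate([np.asarray(p, dtype=float).ravel() for p in patches])
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def l1_least_squares(T, y, lam=0.01, n_iter=500):
    """Approximately minimize 0.5*||T c - y||^2 + lam*||c||_1 using ISTA.

    ISTA is an assumed stand-in solver; the abstract does not name one.
    """
    step = 1.0 / (np.linalg.norm(T, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    c = np.zeros(T.shape[1])
    for _ in range(n_iter):
        z = c - step * (T.T @ (T @ c - y))                         # gradient step
        c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft-threshold
    return c

def best_candidate(T, candidates, lam=0.01):
    """Return the index of the candidate with the smallest approximation error."""
    errors = [np.linalg.norm(T @ l1_least_squares(T, y, lam) - y) for y in candidates]
    return int(np.argmin(errors)), errors

# Toy data: 5 templates, each fused from a 4x4 "visible" and a 4x4 "infrared" patch.
rng = np.random.default_rng(0)
templates = [fuse([rng.standard_normal((4, 4)), rng.standard_normal((4, 4))])
             for _ in range(5)]
T = np.stack(templates, axis=1)  # columns are fused templates, shape (32, 5)

near_target = T[:, 2] + 0.01 * rng.standard_normal(32)  # resembles template 2
near_target /= np.linalg.norm(near_target)
background = fuse([rng.standard_normal((4, 4)), rng.standard_normal((4, 4))])

index, errors = best_candidate(T, [background, near_target])
print(index, errors)
```

In this toy setup the candidate built from a template is reconstructed almost exactly by a single coefficient, while the unrelated candidate leaves a large residual, so the smallest-error rule selects the former. Supporting a different number of sources would only change the length of the fused vectors passed to `fuse`.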