Abstract: Tracking people with multiple cameras is of considerable current interest as a means of providing cues for audio-visual blind source separation in dynamic environments. Here we investigate combining a state-of-the-art object recognition technique with particle filters, one of the most popular methods of modelling object motion, for tracking people. The dictionary learning, or Bag-of-Words, approach to object recognition has proved very effective in recent years, as shown in a number of large comparisons such as the PASCAL Visual Object Classes (VOC) Challenge. In this paper we use this proven object recognition method within the framework of a particle filter, which provides more accurate and robust tracking of people in a multiple-camera environment. We also demonstrate that the dictionary learning approach can provide a principled method for the fusion of multiple features.