Automatic drive annotation via multimodal latent topic model

Takashi Bando, Kazuhito Takenaka, Shogo Nagasaka, Tadahiro Taniguchi

2013 (modified: 03 Nov 2022), IROS 2013

Abstract: Time-series driving behavioral data and image sequences captured with car-mounted video cameras can be annotated automatically in natural language, for example, “in a traffic jam,” “leading vehicle is a truck,” or “there are three and more lanes.” Various driving support systems have been developed for safe and comfortable driving. To develop more effective driving assist systems and achieve fully cooperative driving between the driver and the vehicle, it is important to recognize driving situations abstractly, as a human driver does. To achieve human-like annotation of driving behavioral data and image sequences, we first divided continuous driving behavioral data into discrete symbols that represent driving situations. Then, using multimodal latent Dirichlet allocation, the latent driving topics underlying each driving situation were estimated as a relational model among driving behavioral data, image sequences, and human-annotated tags. Finally, automatic annotation of the behavioral data and image sequences can be achieved by calculating the predictive distribution of the annotations via the estimated latent driving topics. The proposed method intuitively annotated more than 50,000 frames of data, including urban road and expressway data. The effectiveness of the estimated drive topics was also evaluated by analyzing driving-situation classification performance. The topics represented the drive context efficiently: they led to a 95% lower-dimensional feature space and 6% higher accuracy compared with a high-dimensional raw-feature space. Moreover, the drive topics achieved performance almost equivalent to that of human annotators, especially in classifying traffic jams and the number of lanes.
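The final step described in the abstract, predicting annotations through latent topics, can be illustrated with a small sketch. This is not the authors' implementation: it assumes the per-topic distributions for each modality have already been learned, uses a simple EM-style fold-in to estimate the topic proportions of one driving segment from its behavior and image counts, and then marginalizes over topics to score tags. All numbers, the two-topic setup, the tag "smooth traffic", and the names `infer_topic_proportions` and `predict_tags` are hypothetical and chosen only for illustration.

```python
import numpy as np

def infer_topic_proportions(observed, n_iter=100, alpha=0.1):
    """Estimate topic proportions for one segment (EM-style fold-in).

    observed: list of (counts, phi) pairs, one per observed modality, where
      counts has shape (V,) and phi has shape (K, V) with rows p(word | topic).
    alpha: smoothing pseudo-count (an assumed value, not from the paper).
    """
    n_topics = observed[0][1].shape[0]
    theta = np.full(n_topics, 1.0 / n_topics)
    for _ in range(n_iter):
        expected = np.full(n_topics, alpha)
        for counts, phi in observed:
            # Responsibility of each topic for each word: r[k, w] ∝ theta[k] * phi[k, w]
            r = theta[:, None] * phi
            r /= r.sum(axis=0, keepdims=True)
            # Expected number of words assigned to each topic in this modality
            expected += r @ counts
        theta = expected / expected.sum()
    return theta

def predict_tags(theta, phi_tags):
    """Predictive tag distribution: p(tag) = sum_k theta[k] * p(tag | topic k)."""
    return theta @ phi_tags

# Hypothetical learned distributions for K = 2 topics (rows sum to 1).
phi_behavior = np.array([[0.80, 0.15, 0.05],   # topic 0: mostly low-speed symbols
                         [0.05, 0.15, 0.80]])  # topic 1: mostly high-speed symbols
phi_image = np.array([[0.9, 0.1],
                      [0.2, 0.8]])
tags = ["in a traffic jam", "smooth traffic"]  # second tag is made up for this sketch
phi_tags = np.array([[0.9, 0.1],
                     [0.1, 0.9]])

# One unannotated segment: symbol counts from behavior data and image features.
behavior_counts = np.array([9.0, 1.0, 0.0])
image_counts = np.array([5.0, 1.0])

theta = infer_topic_proportions([(behavior_counts, phi_behavior),
                                 (image_counts, phi_image)])
tag_probs = predict_tags(theta, phi_tags)
print(dict(zip(tags, tag_probs.round(3))))
```

Because the tag modality is left out of the fold-in, the topic proportions depend only on the observed behavior and image data, and the tags are then read off the shared topics. A full multimodal LDA would also learn the per-topic distributions themselves, which this sketch takes as given.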
