Context-Informed Sequence Classification: A Multimodal Approach to Vehicle Diagnostics

Published: 01 Mar 2026, Last Modified: 31 Mar 2026, Venue: ICLR 2026 TSALM Workshop Poster, Readers: Everyone, License: CC BY 4.0
Presentation Attendance: No, we cannot present in-person
Keywords: event sequences, multimodal learning, time series, sequence classification
TL;DR: We present BiCarFormer, a multimodal bidirectional Transformer that enhances vehicle fault diagnostics by fusing high-dimensional discrete error codes with continuous environmental sensory data.
Abstract: Effective vehicle diagnostics are critical for safety and predictive maintenance, but they often rely solely on asynchronous discrete sequences of Diagnostic Trouble Codes (DTCs), overlooking valuable environmental context. This paper introduces BiCarFormer, a multimodal bidirectional Transformer that fuses DTC sequences with tokenized sensory data (temperature, pressure, humidity) via a co-attention mechanism and special embeddings. By integrating these heterogeneous modalities, BiCarFormer addresses the complexity and noise inherent in real-world automotive data. Evaluations on a large-scale fleet dataset of 22,137 error codes and 360 error patterns demonstrate that our approach significantly outperforms single-modality baselines. We also show that, in this setting, a Transformer can learn the fluctuations of quantized continuous values through attention.
Track: Industry and Applications Track (max 2 pages)
Submission Number: 18
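
The abstract mentions tokenized sensory data and quantized continuous values but does not describe the quantization scheme. As a rough illustration only, the sketch below assumes equal-width binning; the function names, the token format (`TEMP_BIN_6`), the temperature range, and the bin count are all hypothetical and not taken from the paper.

```python
from bisect import bisect_right


def make_bin_edges(lo, hi, n_bins):
    """Return the n_bins - 1 interior edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def tokenize_reading(name, value, edges):
    """Map a continuous reading to a discrete token string.

    Values below the first edge fall into bin 0 and values above the
    last edge fall into the final bin, so out-of-range readings are clipped.
    """
    bin_index = bisect_right(edges, value)
    return f"{name}_BIN_{bin_index}"


# Assumed range (-40 to 60 degrees Celsius) and bin count (10), for illustration.
temp_edges = make_bin_edges(-40.0, 60.0, 10)
readings = (-45.0, 21.5, 22.0, 75.0)
tokens = [tokenize_reading("TEMP", t, temp_edges) for t in readings]
```

Tokens produced this way could share a vocabulary with DTC tokens or live in a separate one; which option BiCarFormer uses is not stated in the abstract.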