JoLT: Jointly Learned Representations of Language and Time-Series

Published: 27 Oct 2023, Last Modified: 11 Nov 2023 — DGM4H NeurIPS 2023 Poster
Keywords: Time-series and Text Foundation Models, Machine Learning for Healthcare, ECG Interpretation
TL;DR: We introduce a method to jointly learn representations of time-series and text for the purpose of clinical ECG interpretation
Abstract: Time-series and text data are prevalent in healthcare and frequently occur in tandem, e.g., in electrocardiogram (ECG) interpretation reports. Yet these modalities are typically modeled independently, and even studies that jointly model time-series and text do so by first converting the time-series to images or graphs. We hypothesize that explicitly modeling time-series jointly with text can improve tasks such as summarization and question answering for time-series data, which have received little attention so far. To address this gap, we introduce JoLT, which jointly learns representations from pre-trained time-series and text models. JoLT uses a Querying Transformer (Q-Former) to align the time-series and text representations. Our experiments on a large real-world electrocardiography dataset for medical time-series summarization show that JoLT outperforms state-of-the-art image captioning and medical question-answering approaches, and that the decoder architecture, size, and pre-training data affect performance on these tasks.
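The core alignment mechanism the abstract describes — a Q-Former bridging a time-series encoder and a text decoder — can be sketched minimally: a small set of learned query vectors cross-attends over variable-length time-series features to produce a fixed-size representation that a language model can consume. The sketch below is an illustrative assumption about this mechanism, not JoLT's actual implementation; all names and dimensions are hypothetical.

```python
# Hypothetical sketch of Q-Former-style alignment: learned queries
# cross-attend over time-series encoder features, yielding a fixed-length
# summary regardless of the input's number of time steps.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, ts_features):
    """queries: (num_queries, d); ts_features: (T, d) -> (num_queries, d)."""
    d = queries.shape[-1]
    scores = queries @ ts_features.T / np.sqrt(d)   # (num_queries, T)
    weights = softmax(scores, axis=-1)              # attend over time steps
    return weights @ ts_features                    # fixed-size output

rng = np.random.default_rng(0)
ecg_features = rng.normal(size=(1000, 64))   # e.g. ECG encoder outputs over time
learned_queries = rng.normal(size=(32, 64))  # trainable in a real model
aligned = cross_attend(learned_queries, ecg_features)
print(aligned.shape)  # (32, 64): independent of the 1000 input steps
```

The key design point this illustrates is that the number of queries, not the recording length, determines the size of what reaches the decoder, which keeps decoding cost fixed for arbitrarily long ECGs.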
Submission Number: 55