Abstract: This work investigates whether time series of clinical measurements can be understood as being generated by meaningful physiological states whose succession follows compositional principles.
Since there is no obvious definition of elementary components and composition rules in time series, we approach this task by first conceptualizing compositionality in time series data as a property of the data generation process, and then studying data-driven learning procedures that can invert this process by deconstructing time series into elementary states and composition rules.
Our empirical pipeline involves a symbolization of time series and a data augmentation procedure to synthesize full time series in a compositional manner.
We propose two empirically testable conditions for compositionality that are motivated by a domain adaptation perspective.
Both tests infer the similarity of the distributions of clinical time series and of compositionally synthesized data from the expected risk of time series forecasting models trained and tested on original and synthesized data.
Our experimental results show that the test set performance achieved by training on compositionally synthesized data is comparable to that achieved by training on original clinical time series data, and that evaluating models on compositionally synthesized test data yields results similar to evaluating them on original test data.
In both experiments, performance based on compositionally synthesized data by far surpasses that based on synthetic data that were created by randomization-based data augmentation.
This work sheds some light on the compositional nature of clinical time series and introduces a general, theoretically motivated framework to empirically assess the compositionality of an unspecified data-generating process.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Taylor_W._Killian1
Submission Number: 3510