HawkesVAE: Sequential Patient Event Synthesis for Clinical Trials

20 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Synthetic Data Generation, Sequential Event Generation, VAE, Hawkes
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: VAE + Hawkes Process to generate synthetic data
Abstract: Generating sequential event data, such as adverse patient events, can provide valuable insights for clinical trial development, pharmaceutical research, patient modeling, and more. One approach is to use generative AI models, which synthesize data that resembles real-world data. However, in domains such as clinical trials, patient data is especially limited. Data generation methods from the literature, such as LSTM, probabilistic auto-regressive, and diffusion-based generators, struggle with this task off the shelf, as we show empirically. To address this, we propose HawkesVAE, a Variational Autoencoder (VAE) that models events using Hawkes Processes (HP). Hawkes Processes are statistical models designed for event and time-gap prediction, and VAEs enable an end-to-end generative design. Traditional VAEs rely solely on embeddings to decode events, which can make the data difficult to fit in a data-limited setting. We therefore experiment with different ways of allowing the decoder to access varying amounts of information from the input events. Our experiments show that HawkesVAE outperforms other methods in terms of fidelity and generates highly accurate event sequences on multiple real-world sequential event datasets with only a small amount of external information. Furthermore, our experiments demonstrate that data generated by HawkesVAE offers superior utility over existing baselines while maintaining privacy.
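To make the Hawkes Process component mentioned in the abstract concrete, here is a minimal sketch of a textbook univariate Hawkes process with an exponential kernel, simulated with Ogata's thinning algorithm. It is not the HawkesVAE model: the abstract does not specify the kernel or parameterization, so the function names (`intensity`, `simulate_hawkes`) and the parameters `mu` (baseline rate), `alpha` (excitation jump), and `beta` (decay rate) are assumptions chosen for illustration only.

```python
import math
import random


def intensity(t, history, mu, alpha, beta):
    """Conditional intensity: mu + alpha * sum over past events of exp(-beta * (t - t_i))."""
    return mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in history if ti < t)


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Sample event times on [0, horizon) with Ogata's thinning algorithm.

    Assumes an exponential kernel (illustrative choice) and alpha < beta
    so that the process does not explode.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # The intensity only decays until the next event, so its value right
        # at the current time (counting an event at exactly t) is an upper bound.
        lam_bar = mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in events)
        # Propose the next candidate time from a homogeneous Poisson process.
        t += rng.expovariate(lam_bar)
        if t >= horizon:
            break
        # Accept the candidate with probability intensity(t) / lam_bar.
        if rng.random() * lam_bar <= intensity(t, events, mu, alpha, beta):
            events.append(t)
    return events


if __name__ == "__main__":
    times = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.0, horizon=50.0, seed=1)
    gaps = [b - a for a, b in zip(times, times[1:])]
    print(len(times), "events; first time gaps:", [round(g, 3) for g in gaps[:5]])
```

Each accepted event raises the intensity by `alpha`, which then decays at rate `beta`. This self-exciting behavior is what makes Hawkes processes a natural fit for clustered event times and the time gaps between them.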
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2751