Keywords: data scarcity, drug discovery, adverse drug reactions (ADRs), meta-learning
TL;DR: Adverse drug reactions (ADRs) are challenging to predict because datapoints are scarce and labels are incomplete. Meta-learning with neural processes improves ADR classification accuracy and calibration relative to chemoinformatics baselines.
Abstract: Adverse drug reactions (ADRs) are a major source of concern in the development of novel pharmaceuticals. ADRs may be identified in the late stages of development or even after commercialization, which may lead to failure or discontinuation after enormous resources have been spent on candidate molecules. Thus, predicting ADRs early in the process could help reduce costs by avoiding future failures. However, because few drugs have been approved, the number of historical datapoints on ADRs is limited, which makes their prediction challenging for traditional chemoinformatics methods. Interestingly, each approved drug may have been annotated for hundreds of ADRs, which opens the door to framing ADR prediction as a multi-task or meta-learning problem. In this work, we adopt a meta-learning approach to ADR prediction by applying conditional neural processes (CNPs) to the publicly available Side Effect Resource (SIDER). Our results suggest that CNPs are competitive with single-task baselines even when trained on sparse datasets with missing labels. Furthermore, we find that their predictions are well-calibrated. Finally, we evaluate their performance on ADRs associated with different physiological systems and confirm good predictions across organ classes. Our findings suggest that meta-learning strategies may be beneficial for data-limited clinical endpoints like ADRs.
Primary Subject Area: Domain specific data issues
Paper Type: Research paper: up to 8 pages
DMLR For Good Track: Participate in DMLR for Good Track
Participation Mode: In-person
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 69