- Keywords: Compound Repositioning, Model Evaluation, Machine Learning, Text-mined Databases, SemMedDB, UMLS, DrugCentral, PMIDs
- TL;DR: We partitioned a text-mined database by time and performed compound repositioning by training on contemporary data and testing on future indications.
- Abstract: Computational compound repositioning has the potential to identify new uses for existing drugs. New algorithms and data source aggregation strategies provide ever-improving results on in silico metrics. However, the number of compounds successfully repositioned via computational screening remains low. Using a text-mined database, we applied a previously described network-based computational repositioning algorithm, which yielded strong results under cross-validation, averaging 0.95 AUROC on test-set indications. The text-mined data were then used to build networks corresponding to different time points in biomedical knowledge. Training the algorithm on contemporary indications and testing on future indications showed a marked reduction in performance, which peaked with the 1985 network at an AUROC of 0.797. Examining the performance reductions caused by removing specific types of relationships highlighted the importance of drug-drug and disease-disease similarity metrics. Using data from future time points, we demonstrate that further acquisition of these kinds of data may help improve computational results. We also suggest that focusing efforts on improving algorithmic performance in a time-resolved paradigm may further improve computational repositioning predictions.
- Archival status: Non-Archival
- Subject areas: Machine Learning, Applications: Biomedicine
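
The time-resolved evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Edge` and `Indication` records, their year fields, the `time_split` helper, and the toy data are all hypothetical, and the sketch assumes that every network edge and every indication can be assigned a single year. The idea is to build the network and the training indications only from knowledge available up to a cutoff year, and to hold out indications that appear later as the test set.

```python
from typing import List, NamedTuple, Tuple


class Edge(NamedTuple):
    subject: str
    predicate: str
    obj: str
    first_year: int  # assumption: earliest year the relationship is supported


class Indication(NamedTuple):
    drug: str
    disease: str
    year: int  # assumption: year the indication became known


def time_split(
    edges: List[Edge], indications: List[Indication], cutoff: int
) -> Tuple[List[Edge], List[Indication], List[Indication]]:
    """Split data so that nothing dated after `cutoff` leaks into training."""
    network = [e for e in edges if e.first_year <= cutoff]
    train = [i for i in indications if i.year <= cutoff]
    test = [i for i in indications if i.year > cutoff]
    return network, train, test


# Toy, made-up records purely for illustration.
edges = [
    Edge("drug_a", "SIMILAR_TO", "drug_b", 1980),
    Edge("drug_a", "INTERACTS_WITH", "gene_g", 1983),
    Edge("gene_g", "ASSOCIATED_WITH", "disease_y", 1990),
]
indications = [
    Indication("drug_a", "disease_x", 1982),
    Indication("drug_a", "disease_y", 1995),
]

network, train, test = time_split(edges, indications, cutoff=1985)
```

With a cutoff of 1985, the edge first seen in 1990 is excluded from the network and the 1995 indication is held out for testing, so a model trained on `network` and `train` is scored only on indications it could not have seen.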