Advanced ML Approaches to PGx Recommendations in Precision Medicine

Michael Zastrozhin, Danika Gupta, Nisha Talagala, Jason Akram, Roman Grachev, Allan Gobbs, Nasreen Karaf, Alex Timoshenko

Published: 2024, Last Modified: 08 Sept 2025, Venue: COMPSAC 2024, License: CC BY-SA 4.0

Abstract: We present a machine learning model to predict pharmacogenetic (PGx) guidelines using a novel dataset assembled from the PharmGKB knowledge base, incorporating chemical structure, genetic information, and recommendation annotations. To facilitate modeling, free-text recommendations were processed via NLP assisted by domain expertise into three categories: Standard Dose, Adjusted Dose, and Alternate Drug. We compared several models, including Multi-Layer Perceptron, K-Nearest Neighbors, Random Forest, Logistic Regression, Linear SVC, and XGBoost. XGBoost performed best, combining predictive power with explainability and achieving an accuracy of 89.14%, F1 scores of 0.85-0.90, precision of 0.83-0.97, and recall of 0.82-0.97. Such models can accelerate the development of PGx guidelines, enhancing personalized medicine in clinical settings.
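The abstract reports one overall accuracy plus ranges for F1, precision, and recall. In a three-class setting such ranges usually come from per-class scores, though the abstract does not say so explicitly. The sketch below shows how per-class precision, recall, and F1 and overall accuracy could be computed for the three recommendation categories. The function names and the toy labels are hypothetical and purely illustrative; they are not the paper's data, code, or results.

```python
from collections import Counter

# The three recommendation categories named in the abstract.
LABELS = ["Standard Dose", "Adjusted Dose", "Alternate Drug"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return precision, recall, and F1 for each label (one-vs-rest)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but it was wrong
            fn[t] += 1  # true label t was missed
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy labels, invented for illustration only (NOT data from the paper).
y_true = ["Standard Dose"] * 4 + ["Adjusted Dose"] * 3 + ["Alternate Drug"] * 3
y_pred = [
    "Standard Dose", "Standard Dose", "Standard Dose", "Adjusted Dose",
    "Adjusted Dose", "Adjusted Dose", "Alternate Drug",
    "Alternate Drug", "Alternate Drug", "Alternate Drug",
]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
for label, scores in per_class_metrics(y_true, y_pred).items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

Reporting the minimum and maximum of each per-class score would give ranges of the same form as those quoted in the abstract.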

External IDs: dblp:conf/compsac/ZastrozhinGTAGGKT24