Predicting COVID-19 disease severity from SARS-CoV-2 spike protein sequence by mixed effects machine learning
Abstract: Highlights
• We propose to use global SARS-CoV-2 sequence data to predict severe COVID-19 disease.
• Global sequence and patient outcome data vary greatly over time and between regions.
• Mixed effects machine learning with GPBoost outperforms fixed effects methods.
• Trained models can provide early warnings for risks of emerging viral variants.
• GPBoost modeling can also identify key mutations of potential clinical interest.