Supervised learning and model analysis with compositional data

Published: 2023, Last Modified: 01 Jan 2024, PLoS Comput. Biol. 2023, Readers: Everyone
Abstract: Author summary: In recent years, advances in gene sequencing technology have allowed scientists to examine entire microbial communities within genetic samples. These communities interact with their surroundings in complex ways, potentially benefiting or harming the host they inhabit. However, analyzing the microbiome (the measured microbial community) is challenging because the data are compositional and sparse. In this study, we developed a statistical framework called KernelBiome to model the relationship between the microbiome and a target of interest, such as the host’s disease status. We used a class of machine learning models called kernel methods and adapted them to handle the compositional and sparse nature of the data while also incorporating prior expert knowledge. Additionally, we introduced two new measures to help interpret the contributions of individual compositional components. Our approach also demonstrated that kernel methods increase interpretability in analyzing microbiome data. To make KernelBiome as accessible as possible, we have created an easy-to-use software package for researchers and practitioners to apply in their work.