Algebraic shortcuts for leave-one-out cross-validation in supervised network inference.

Michiel Stock, Tapio Pahikkala, Antti Airola, Willem Waegeman, Bernard De Baets

Briefings in Bioinformatics, 2020 (modified: 09 Nov 2022)

Abstract: Supervised machine learning techniques have traditionally been very successful at reconstructing biological networks, such as protein–ligand interaction, protein–protein interaction and gene regulatory networks. Many supervised techniques for network prediction use linear models on a possibly nonlinear pairwise feature representation of edges. Recently, much emphasis has been placed on the correct evaluation of such supervised models. It is vital to distinguish between using a model to predict new interactions in a given network and using it to predict interactions for a new vertex not present in the original network. This distinction matters because (i) performance can differ dramatically between the prediction settings and (ii) the hyperparameter values that yield the best model depend on the setting of interest. Specific cross-validation schemes are therefore needed to assess performance in each prediction setting.
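
The difference between the two prediction settings can be made concrete with a small sketch of how the held-out sets differ. The code below is illustrative only and is not the authors' method: it assumes a bipartite network (for example proteins as rows and ligands as columns), the function names `leave_one_pair_out` and `leave_one_row_vertex_out` are hypothetical, and it enumerates the folds naively rather than using any algebraic shortcut.

```python
from typing import Iterator, List, Tuple

Pair = Tuple[int, int]  # (row vertex index, column vertex index)


def all_pairs(n_rows: int, n_cols: int) -> List[Pair]:
    """Enumerate every candidate edge of a bipartite network (assumed layout)."""
    return [(i, j) for i in range(n_rows) for j in range(n_cols)]


def leave_one_pair_out(n_rows: int, n_cols: int) -> Iterator[Tuple[List[Pair], List[Pair]]]:
    """Setting 'new interaction in a given network': hold out one pair per fold.

    Both vertices of the held-out pair still occur in other training pairs.
    """
    pairs = all_pairs(n_rows, n_cols)
    for held_out in pairs:
        train = [p for p in pairs if p != held_out]
        yield train, [held_out]


def leave_one_row_vertex_out(n_rows: int, n_cols: int) -> Iterator[Tuple[List[Pair], List[Pair]]]:
    """Setting 'new vertex': hold out every pair that involves one row vertex.

    The held-out vertex never occurs in the training pairs of its fold.
    """
    pairs = all_pairs(n_rows, n_cols)
    for vertex in range(n_rows):
        train = [p for p in pairs if p[0] != vertex]
        test = [p for p in pairs if p[0] == vertex]
        yield train, test


# Toy network with 3 row vertices and 4 column vertices (hypothetical sizes).
pair_folds = list(leave_one_pair_out(3, 4))
vertex_folds = list(leave_one_row_vertex_out(3, 4))
```

In the first scheme every fold removes a single edge, so the model is always tested on vertices it has already seen. In the second scheme a whole row of the interaction matrix is removed, which mimics predicting for an unseen vertex and generally gives a harder, differently tuned problem.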
