- Keywords: data imputation, functional connectivity
- Abstract: The best-performing connectome-based predictive models of behavior use multiple sources of data (i.e., they predict latent variables generated from a battery of behavioral measures). However, as the number of sources increases, so does the chance that some of the behavioral measures are missing, hindering downstream analyses. The most common strategy for handling missing data is to remove participants with missing values and run the analysis on the complete cases only. This approach shrinks the training set, which limits predictive modeling algorithms that rely on large data sets. To retain participants with missing data for training, we added a data imputation step to connectome-based predictive modeling (CPM) that estimates missing values in the behavioral measures. We evaluate performance as the improvement in predictive power relative to complete-case analysis. Experimental results show that imputing missing behavioral measures improves CPM performance when the predictability of the behavioral measure is relatively high. Overall, our results suggest that increasing the size of the training data via data imputation may be a valuable step for datasets with missing behavioral data.
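
To make the contrast between complete-case analysis and imputation concrete, the following is a minimal sketch on a toy behavioral matrix. The abstract does not name the imputation method, so column-mean imputation is used here purely as an assumed stand-in; the function names `complete_cases` and `impute_column_means` and the toy scores are hypothetical and not taken from the work.

```python
import numpy as np


def complete_cases(behavior):
    """Return a boolean mask of participants with no missing measures."""
    return ~np.isnan(behavior).any(axis=1)


def impute_column_means(behavior):
    """Fill each missing entry with the mean of the observed values in its column.

    Assumption: mean imputation is only an illustrative choice, not
    necessarily the method used in the work described above.
    """
    filled = behavior.copy()
    col_means = np.nanmean(behavior, axis=0)
    rows, cols = np.where(np.isnan(behavior))
    filled[rows, cols] = col_means[cols]
    return filled


# Toy battery: 5 participants x 3 behavioral measures; NaN marks a missing score.
behavior = np.array([
    [1.0, 2.0, 3.0],
    [2.0, np.nan, 4.0],
    [3.0, 4.0, np.nan],
    [4.0, 6.0, 6.0],
    [np.nan, 8.0, 7.0],
])

mask = complete_cases(behavior)
imputed = impute_column_means(behavior)

n_complete = int(mask.sum())   # participants kept by complete-case analysis
n_imputed = imputed.shape[0]   # participants kept after imputation
print(n_complete, n_imputed)
```

In this toy case, complete-case analysis keeps two of the five participants, while imputation keeps all five. In a cross-validated setting, the imputation statistics would presumably be estimated on the training folds only, so that no information leaks from held-out participants.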