Class imbalance should not throw you off balance: Choosing the right classifiers and performance metrics for brain decoding with imbalanced data

Published: 01 Jan 2023, Last Modified: 25 Jan 2025, Venue: NeuroImage 2023, License: CC BY-SA 4.0
Abstract: Highlights
• Class imbalance is a common issue in the application of machine learning (ML) to neuroscience and can have severe consequences if not handled properly.
• The impact of data imbalance on ML performance is assessed at various levels of imbalance using simulated data as well as EEG, MEG, and fMRI recordings.
• In highly imbalanced data, the commonly used Accuracy metric yields misleadingly high scores that result from systematically predicting the majority class.
• The balanced accuracy (BAcc) metric is recommended as the default evaluation metric for ML when seeking to minimize overall classification error.
• A list of recommendations for dealing with imbalanced data is provided, and open-source code is made available to allow further investigation.
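The gap between Accuracy and BAcc under imbalance can be illustrated with a small sketch. This is not the authors' released code: the 90/10 class split, the always-majority "classifier", and the helper names `accuracy` and `balanced_accuracy` are assumptions made for illustration, and BAcc is taken here in its usual sense as the mean of per-class recalls.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        # Indices of all samples that truly belong to class c
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(y_pred[i] == c for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced dataset: 90 samples of class 0, 10 of class 1
y_true = [0] * 90 + [1] * 10
# A degenerate classifier that always predicts the majority class
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
bacc = balanced_accuracy(y_true, y_pred)
print(f"Accuracy: {acc:.2f}, balanced accuracy: {bacc:.2f}")
```

In this toy case the classifier has learned nothing about the minority class, yet Accuracy simply mirrors the majority-class proportion, while the mean of per-class recalls stays at the two-class chance level.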