Multimodal missing data in healthcare: A comprehensive review and future directions

Published: 01 Jan 2025, Last Modified: 13 May 2025. Comput. Sci. Rev. 2025. License: CC BY-SA 4.0
Abstract: Rapid advances in healthcare data collection technologies, together with the importance of multimodal data for accurate diagnosis, have led to a surge in multimodal data characterized by different types, structures, and missing values. Machine learning algorithms for prediction or analysis usually require complete data. As a result, handling missing data has become a critical concern in the healthcare sector. This survey comprehensively reviews recent work on handling multimodal missing data in healthcare. We emphasize methods that synthesize data from various modalities or multiple sources to impute missing data, including early fusion, late fusion, and intermediate fusion methods. The main objective of this study is to identify gaps in the surveyed area and to list future tasks and challenges in handling multimodal missing data in healthcare. This review is valuable for researchers and practitioners in healthcare data analysis, providing insights into using fusion methods to improve data quality and healthcare outcomes.