Multitree GP-Based Feature Learning for Multimodal Medical Image Classification

Zhicheng Wu, Bing Xue, Mengjie Zhang

Published: 2025, Last Modified: 07 Jan 2026. Venue: CEC 2025. License: CC BY-SA 4.0

Abstract: Multimodal medical image classification (MMIC) extracts and combines information from multiple modalities to classify medical images, with the aim of improving diagnostic accuracy. Most existing methods extract discriminative features from different modalities for a specific task and adapt poorly to other tasks. They also offer poor interpretability, a serious limitation in medical image analysis, where understanding the decision-making process is essential. Genetic programming (GP), particularly multitree GP, provides a flexible framework for evolving multimodal features. However, current multitree GP methods perform only feature-level fusion, and their use in medical multimodal feature learning remains underexplored. To address these issues, this paper proposes a novel multitree GP method, Multimodal Feature GP (MFGP), that automatically extracts informative feature vectors from different modalities. To exploit both modality-specific features and fused multimodal features, the framework integrates feature-level and decision-level fusion strategies. The proposed method is evaluated on two distinct MMIC tasks representing different medical scenarios, polyp classification and glaucoma classification, and is compared with both single-modality and multimodal methods. Experimental results show that it significantly outperforms all single-modality approaches and most multimodal benchmark methods. Further analysis reveals that the evolved models effectively capture the unique characteristics of different modalities.
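
The two fusion strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MFGP implementation: the abstract does not specify the classifier or the combination rule, so the nearest-centroid scoring, the softmax, and the probability averaging below are all assumptions chosen for illustration, and every function name is hypothetical. In MFGP the per-modality feature vectors would be produced by evolved GP trees; here they are simply passed in as lists.

```python
from math import exp


def concat_features(feats_a, feats_b):
    """Feature-level fusion: join two modality feature vectors into one."""
    return list(feats_a) + list(feats_b)


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def centroid_scores(x, centroids):
    """Assumed classifier: negative squared distance to each class centroid."""
    return [-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroids]


def decision_fusion(prob_lists):
    """Decision-level fusion (assumed rule): average class probabilities."""
    n = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[k] for p in prob_lists) / n for k in range(n_classes)]


def classify(feats_a, feats_b, cents_a, cents_b):
    """Combine modality-specific and fused predictions into one decision."""
    fused = concat_features(feats_a, feats_b)
    cents_fused = [concat_features(ca, cb) for ca, cb in zip(cents_a, cents_b)]
    probs = [
        softmax(centroid_scores(feats_a, cents_a)),    # modality A only
        softmax(centroid_scores(feats_b, cents_b)),    # modality B only
        softmax(centroid_scores(fused, cents_fused)),  # fused features
    ]
    combined = decision_fusion(probs)
    label = max(range(len(combined)), key=combined.__getitem__)
    return label, combined


# Toy two-class example with made-up numbers (not data from the paper).
label, combined = classify(
    feats_a=[0.9, 0.1],
    feats_b=[0.2],
    cents_a=[[0.0, 0.0], [1.0, 0.0]],
    cents_b=[[0.0], [1.0]],
)
```

In this toy case the two modalities disagree when used alone, and the averaged decision draws on both the modality-specific and the fused predictions, which is the motivation the abstract gives for combining the two fusion levels.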

External IDs: dblp:conf/cec/Wu0Z25