CoMO-NAS: Core-Structures-Guided Multi-Objective Neural Architecture Search for Multi-Modal Classification
Abstract: Most existing NAS-based multi-modal classification (MMC-NAS) methods are optimized for classification accuracy alone. They cannot simultaneously provide multiple models with diverse preferences, such as different trade-offs between model complexity and classification performance, to meet different users' demands. Combining MMC-NAS with multi-objective optimization is a natural way to address this issue. However, the main challenge of this solution is its high computational cost. For multi-objective optimization, the computing bottleneck is the Pareto front search. Some higher-quality MMC models (namely core structures, CSs), consisting of high-quality features and fusion operators, are easier to identify. We find that CSs are closely related to the Pareto front (PF), i.e., the individuals lying on the PF contain the CSs. Based on this finding, we propose an efficient multi-objective neural architecture search method for multi-modal classification that applies CSs to guide the PF search (CoMO-NAS). Experimental results demonstrate the effectiveness of CoMO-NAS. Compared to state-of-the-art competitors on benchmark multi-modal tasks, we achieve comparable performance with lower model complexity in shorter search time.
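To make the notion of a Pareto front over accuracy and model complexity concrete, the following is a minimal sketch of non-dominated filtering for two objectives that are both minimized. It is not the CoMO-NAS search procedure; the names `dominates` and `pareto_front`, the candidate names, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (name, classification error, parameter count in millions); both objectives are minimized.
# All candidates and values below are made up for illustration.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (brute-force O(n^2) check)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates: List[Candidate] = [
    ("arch_a", 0.10, 8.0),
    ("arch_b", 0.12, 3.0),
    ("arch_c", 0.15, 5.0),  # worse than arch_b on both objectives
    ("arch_d", 0.20, 1.0),
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
```

Each architecture left on the front represents a different accuracy/complexity trade-off, which is what allows a multi-objective search to serve users with different preferences from a single run.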
Primary Subject Area: [Content] Multimodal Fusion
Secondary Subject Area: [Content] Multimodal Fusion
Relevance To Conference: The focus of this paper is to efficiently obtain a variety of multi-modal classification models tailored to different user preferences using multi-objective neural architecture search (NAS). To address the time-consuming nature of traditional multi-objective algorithms, we found that core structures (CSs) from MMC-NAS, comprising high-performing features and fusion operators, can guide the Pareto front (PF) search in multi-objective optimization. This strategy significantly enhances search efficiency and solution quality. Based on these findings, we propose a method called Core-Structures-Guided Multi-Objective Neural Architecture Search (CoMO-NAS). To the best of our knowledge, CoMO-NAS is the first to introduce multi-objective algorithms into the field of MMC-NAS. It can efficiently provide multiple optimized solutions for different scenarios. We conducted extensive experimental comparisons on multiple multi-modal tasks, and the results demonstrate that CoMO-NAS outperforms state-of-the-art multi-modal feature fusion methods in terms of search time and the number of model parameters.
Supplementary Material: zip
Submission Number: 3521