Online Multimodal Learning with Human-in-the-Loop

17 Sept 2025 (modified: 13 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | Readers: Everyone | CC BY 4.0
Keywords: Multimodal learning, online learning, human-in-the-loop learning, reference understanding
Abstract: We study the online multimodal learning (OML) problem, in which a model is never frozen but instead dynamically adapts its structure and parameters throughout its lifetime, learning new multimodal concepts and associations without forgetting those it has already learned. To address this challenge, we propose a brain-inspired neural network with a hierarchical and modular architecture, also named OML. We design different types of artificial neuron models to match the characteristics of the individual hierarchies and modules. The network includes ascending, descending, and lateral pathways, which allow all modalities to cooperate and interact with each other during online learning. Additionally, we develop a reference extraction algorithm that autonomously identifies the precise features to which a word refers. During online learning, the network checks for conflicts between the current input and the knowledge learned from previous data. If a conflict occurs, the network poses appropriate questions to the user and updates itself based on the user's answers. Together, these designs let our method learn in a way that resembles human learning. Experimental results demonstrate that our method handles online multimodal learning effectively.
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 9106
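
The conflict-checking step described in the abstract can be illustrated with a toy sketch. This is not the paper's network: it assumes knowledge is a plain word-to-feature dictionary and that a conflict is a simple mismatch with the stored feature. The names `ToyAssociationMemory`, `observe`, and `ask_user` are hypothetical and introduced only for illustration.

```python
from typing import Callable, Dict


class ToyAssociationMemory:
    """Hypothetical word -> feature store; a stand-in, not the OML network."""

    def __init__(self, ask_user: Callable[[str], str]) -> None:
        self.known: Dict[str, str] = {}
        # ask_user is an assumed callback that returns the user's answer.
        self.ask_user = ask_user

    def observe(self, word: str, feature: str) -> str:
        """Learn one (word, feature) pair online, asking the user on conflict."""
        previous = self.known.get(word)
        if previous is None:
            # New concept: store it without touching earlier associations.
            self.known[word] = feature
            return "learned"
        if previous == feature:
            # Input agrees with prior knowledge: nothing to change.
            return "consistent"
        # Conflict: the input contradicts what was learned before.
        question = (
            f"'{word}' was learned as '{previous}' but now appears "
            f"with '{feature}'. Which is correct?"
        )
        answer = self.ask_user(question)
        if answer == feature:
            self.known[word] = feature
            return "updated"
        return "kept"


# A scripted "user" that always answers "green" stands in for a real person.
memory = ToyAssociationMemory(ask_user=lambda question: "green")
memory.observe("apple", "red")
memory.observe("apple", "green")
```

In an interactive setting, `ask_user` could wrap `input`. The sketch only shows the control flow of checking, asking, and updating, and says nothing about how the actual model represents features or detects conflicts.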