A Mathematical Framework to Characterize the Dependency Structures in Multimodal Learning with Minimax Principle

Abstract: Multimodal learning is an increasingly important research topic. Exploiting conditional dependency across multiple modalities has been shown to be useful for estimating the multimodal joint distribution, especially when the number of training samples is insufficient. However, it is difficult to characterize such a conditional dependency structure theoretically. To address this issue, we establish a mathematical framework and formulate the estimation problem based on the minimax principle. We then propose an estimator that is close to the analytical solution of the problem under a mild assumption on the sample size. Moreover, the proposed estimator is a linear combination of the learning results from the true dependency structure and the conditional one. The combining coefficient depends on three factors: the number of training samples, the fitness of the conditional dependency structure, and the cardinality of each modality. Finally, numerical simulations verify our theoretical result that the proposed estimator is close to the optimal solution of the formulated problem.