Abstract: Multimodal knowledge graph completion (MKGC) has become a popular research topic in recent years. However, existing methods rarely consider the alignment of different entity modalities during multimodal fusion and often pay insufficient attention to the semantic information conveyed by relations, resulting in unsatisfactory completion performance. To address these two issues, we propose a new MKGC model called C2RS. The model first designs a cross-modal consistency contrastive learning task that aligns different entity modalities to obtain accurate entity representations. C2RS then develops a relation semantic encoding module, based on the distributions of knowledge graph (KG) triples, that extracts the semantic information of relations for comprehensive relation representations. Finally, we encode candidate triples with a triple encoder and identify the correct entities through a scoring function to complete the multimodal KG. In extensive experiments on three public MKGC datasets, C2RS clearly outperforms the baseline methods.
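As a rough illustration of the cross-modal consistency idea the abstract mentions, the sketch below implements a generic symmetric InfoNCE-style contrastive loss that pulls together two modality embeddings of the same entity and pushes apart embeddings of different entities. This is a hypothetical formulation, not the paper's actual loss; the function name, temperature value, and NumPy implementation are all assumptions for illustration.

```python
# Hypothetical sketch of cross-modal contrastive alignment (not C2RS's code).
import numpy as np

def contrastive_alignment_loss(vis, txt, temperature=0.1):
    """vis, txt: (n_entities, dim) embeddings of the same entities in two
    modalities; row i of each matrix is a positive pair."""
    # L2-normalize so the dot product is cosine similarity.
    vis = vis / np.linalg.norm(vis, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = (vis @ txt.T) / temperature  # pairwise similarity matrix

    def ce(l):
        # Cross-entropy with the diagonal (matching entity) as the target.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # Symmetric: vision-to-text and text-to-vision directions averaged.
    return 0.5 * (ce(logits) + ce(logits.T))

rng = np.random.default_rng(0)
v = rng.normal(size=(8, 16))
# Random, unrelated modality embeddings give a high loss ...
loss_mismatch = contrastive_alignment_loss(v, rng.normal(size=(8, 16)))
# ... while well-aligned modalities give a low loss.
loss_match = contrastive_alignment_loss(v, v + 0.01 * rng.normal(size=(8, 16)))
```

Minimizing such a loss encourages the two modality encoders to place the same entity at nearby points in a shared space, which is the alignment property the fusion step can then rely on.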
DOI: 10.1109/TAI.2025.3548621