Abstract: When integrated with multi-modal learning, knowledge graphs (KGs), as structured knowledge repositories, enhance AI's capability to process and understand complex, real-world data. This paper presents a comprehensive survey of cutting-edge research on KG-aware multi-modal learning, covering task definitions, evaluation benchmarks, and detailed insights into key breakthroughs. We also discuss current challenges, highlighting emerging trends and future research directions.
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: knowledge graphs, multimodality, knowledge augmented, knowledge base QA, vision question answering
Contribution Types: Surveys
Languages Studied: English
Submission Number: 167