Automated Concept Map Extraction from Text

ACL ARR 2024 June Submission2382 Authors

15 Jun 2024 (modified: 19 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Concept maps summarise the concepts (nodes) and relations found in a text as a directed graph, a format that can foster students' learning and understanding. However, constructing them manually is a challenging task. Automatic concept map extraction methods have emerged, typically following a pipeline approach that combines methods for extracting entities and their relations. Yet existing methods face efficiency limitations: 1) they cannot handle large corpora, 2) they are not open-access architectures, and 3) they rely on the existence of annotated datasets. To bridge these gaps, we introduce a novel, modularized and open-source method for concept map extraction that addresses efficiency by using semantic and sub-symbolic techniques together with a new preliminary summarisation component. Moreover, we compare the pipeline approaches with three end-to-end Large Language Model methods. The best models for our pipeline and our end-to-end baseline achieve state-of-the-art results on METEOR, with F1 scores of $25.69$ and $28.5$ respectively, and on ROUGE-2 recall, with scores of $24.26$ and $24.3$. This contribution advances the task of automated concept map extraction, opening doors to wider applications supporting learning. The code is open-access and available
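For readers unfamiliar with the ROUGE-2 recall figure quoted in the abstract, the following is a minimal sketch of how such a score could be computed over a concept map linearised as text. It assumes lowercased whitespace tokenisation with no stemming, and the names `triples_to_text`, `bigram_counts` and `rouge2_recall` are hypothetical helpers introduced here for illustration; it is not the evaluation script used in the submission.

```python
from collections import Counter


def triples_to_text(triples):
    """Linearise (subject, relation, object) edges of a concept map into one string.

    Assumption: edges are simply joined with spaces; the submission may
    linearise its concept maps differently.
    """
    return " ".join(f"{s} {r} {o}" for s, r, o in triples)


def bigram_counts(text):
    """Count adjacent token pairs using lowercased whitespace tokenisation."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_recall(candidate, reference):
    """Clipped bigram overlap divided by the number of reference bigrams."""
    ref = bigram_counts(reference)
    cand = bigram_counts(candidate)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared bigram.
    overlap = sum((ref & cand).values())
    return overlap / total


# Toy concept maps (invented for illustration, not taken from the paper's data).
reference_map = [("concept maps", "foster", "student learning")]
predicted_map = [("concept maps", "foster", "learning")]

score = rouge2_recall(triples_to_text(predicted_map), triples_to_text(reference_map))
print(f"ROUGE-2 recall: {score:.2f}")
```

In this toy case the reference contains four bigrams and the prediction recovers two of them, so recall is 0.5. Scores such as those in the abstract are conventionally reported on a 0 to 100 scale.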
Paper Type: Long
Research Area: Summarization
Research Area Keywords: Summarization, Machine Learning for NLP, Language Modeling, Information Extraction
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 2382