A Multi-Granularity Semantic-Enhanced Model for Concept Extraction on Chinese MOOCs


16 Dec 2023 | ACL ARR 2023 December Blind Submission | Readers: Everyone
TL;DR: This paper proposes a multi-granularity semantic-enhanced model for Chinese course concept extraction.
Abstract: As online education grows in popularity, open course platforms such as MOOCs have accumulated a large number of course videos. Accurately identifying and extracting course concepts from MOOC videos has become a fundamental problem in course content analysis and recommendation. However, because the course concepts in video subtitles are complex and diverse, character features alone are not enough to capture concept semantics and identify concept boundaries. We therefore propose a Multi-Granularity Semantic-Enhanced (MGSE) model, which unifies information at word and context granularity to enhance the character representations encoded by a pre-trained language model. For word granularity, we design a word assignment policy and a word quality evaluation strategy. For context granularity, we devise a dual-channel attention module that fuses global context and similar-context information relevant to course concepts. Experimental results on computer science courses and economics courses in MoocData show that MGSE outperforms the baselines significantly. The ablation experiment shows that semantics at multiple granularities help course concept extraction.
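The abstract does not say how the dual-channel attention module is implemented. The sketch below is only an illustration of the general idea, under assumptions that are not from the paper: each character vector attends separately to a set of global context vectors and to a set of similar-context vectors using scaled dot-product attention, and the two summaries are concatenated onto the character vector. The names `attend`, `dual_channel_fuse`, `global_ctx`, and `similar_ctx` are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, memory):
    # Scaled dot-product attention of one query vector over memory vectors.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, m) / scale for m in memory])
    dim = len(memory[0])
    return [sum(w * m[i] for w, m in zip(weights, memory)) for i in range(dim)]

def dual_channel_fuse(char_vecs, global_ctx, similar_ctx):
    # Assumption: fusion is plain concatenation [char; global; similar].
    # The actual MGSE fusion may differ.
    fused = []
    for h in char_vecs:
        g = attend(h, global_ctx)    # channel 1: global context
        s = attend(h, similar_ctx)   # channel 2: similar context
        fused.append(h + g + s)      # list concatenation, not addition
    return fused

# Toy inputs: two 2-dimensional character vectors.
chars = [[1.0, 0.0], [0.0, 1.0]]
global_ctx = [[0.5, 0.5], [1.0, 0.0]]
similar_ctx = [[0.0, 1.0]]
out = dual_channel_fuse(chars, global_ctx, similar_ctx)
```

With these toy inputs, each enhanced character vector has three times the original dimension; a downstream tagger would consume `out` in place of the raw character vectors.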
Paper Type: long
Research Area: Information Extraction
Contribution Types: NLP engineering experiment
Languages Studied: Chinese
