Abstract: Competency tagging is essential in both academic and industrial domains, as it aligns learning content, job postings, and resumes with specific competencies. However, manual tagging is time-consuming, labor-intensive, and expensive. In this study, we propose a semantic retrieval-based method for automated competency tagging. In particular, we explore the potential of large language models (LLMs) to encode text from learning content and competency descriptions; we then employ similarity search to retrieve the competency tags most pertinent to a given learning-content document. We investigate semantic search at three levels of granularity: per document, per paragraph, and per sentence. We further fine-tune the LLM using the Low-Rank Adaptation (LoRA) technique. Our method yields promising results, achieving a recall@10 of 80.29% when tested on 164 pages of learning content associated with 96 competencies. These findings highlight the effectiveness of fine-tuned LLMs, which improved recall@10 by 6%.
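The retrieval step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy vectors stand in for LLM embeddings, and all tag names, dimensions, and function names are hypothetical. It shows cosine-similarity ranking of competency tags against a document embedding and the recall@k metric reported in the paper.

```python
import math

def cosine(a, b):
    # cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k_tags(doc_vec, tag_vecs, k=10):
    # rank competency tags by similarity to the document embedding
    # and return the k most pertinent ones
    ranked = sorted(tag_vecs.items(),
                    key=lambda kv: cosine(doc_vec, kv[1]),
                    reverse=True)
    return [tag for tag, _ in ranked[:k]]

def recall_at_k(retrieved, relevant, k=10):
    # fraction of ground-truth competencies found in the top-k results
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

# hypothetical 2-d embeddings; a real system would use LLM encodings
tag_vecs = {"python": [1.0, 0.0], "sql": [0.0, 1.0], "ml": [0.7, 0.7]}
doc_vec = [0.9, 0.1]
retrieved = top_k_tags(doc_vec, tag_vecs, k=2)  # → ["python", "ml"]
```

In the paper's setting, the same ranking would be computed per document, per paragraph, or per sentence, with paragraph- and sentence-level scores aggregated back to the document before evaluating recall@10.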
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: Information Retrieval, Tagging, NLP Applications, Semantics
Contribution Types: NLP engineering experiment, Reproduction study
Languages Studied: English
Submission Number: 3520