Abstract: We propose a compositional entity modeling framework for requirement extraction from Online Job Advertisements (OJAs), reframing the task from token classification to joint entity and relation classification to capture complex, multi-component requirement structures. Using an annotated dataset of 500 German OJAs, our empirical analysis reveals the prevalence of conjoined requirement structures and the importance of modeling complex semantic relationships between requirement components. Transformer-based models trained on our data achieve F1-scores of 0.856 for entity extraction and 0.911 for relation classification, demonstrating the effectiveness of our approach. This framework offers analytical benefits for labor market research and for applications such as skills monitoring and job-to-candidate matching. We release our dataset to foster further research.
Paper Type: Long
Research Area: Information Extraction
Research Area Keywords: Computational Social Science, Information Extraction
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis
Languages Studied: German
Submission Number: 4518