SkillBERT: “Skilling” the BERT to classify skills!Download PDF

28 Sept 2020 (modified: 05 May 2023)ICLR 2021 Conference Blind SubmissionReaders: Everyone
Abstract: In the age of digital recruitment, job posts can attract a large number of applications, and screening them manually can become a very tedious task. These recruitment records are stored in the form of tables in our recruitment database (Electronic Recruitment Records, referred to as ERRs). We have released a de-identified ERR dataset to the public domain. We also propose a BERT-based model, SkillBERT, the embeddings of which are used as features for classifying skills present in the ERRs into groups referred to as "competency groups". A competency group is a group of similar skills and it is used as matching criteria (instead of matching on skills) for finding the overlap of skills between the candidates and the jobs. This proxy match takes advantage of the BERT's capability of deriving meaning from the structure of competency groups present in the skill dataset. In our experiments, the SkillBERT, which is trained from scratch on the skills present in job requisitions, is shown to be better performing than the pre-trained BERT and the Word2Vec. We have also explored K-means clustering and spectral clustering on SkillBERT embeddings to generate cluster-based features. Both algorithms provide similar performance benefits. Last, we have experimented with different machine learning algorithms like Random Forest, XGBoost, and a deep learning algorithm Bi-LSTM . We did not observe a significant performance difference among the algorithms, although XGBoost and Bi-LSTM perform slightly better than Random Forest. The features created using SkillBERT are most predictive in the classification task, which demonstrates that the SkillBERT is able to capture information about the skills' ontology from the data. We have made the source code and the trained models of our experiments publicly available.
One-sentence Summary: Learning semantic information from the skill-information of both candidates and jobs that could make hiring an efficient and intuitive process
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=kymWU15YXH
7 Replies

Loading