Abstract: In typical search engines, users are exposed to an interface that combines free-form textual fields and hard filters. After the initial query, users implicitly express their preferences through clicks, purchases, or connection requests. In this paper, we propose leveraging implicit user feedback to learn similarity metrics between search results that receive attention together. The learned similarity model can notably be used to serve recommendations: it captures a form of latent similarity that is not known to the search engine in advance but is learned from implicit user feedback. We introduce a method for building a dataset suited to similarity learning from search data. Using this dataset, we apply contrastive learning to train a similarity model that outputs large scores for pairs of results that tend to receive the same attention from users, and low scores otherwise. Our experiments show that BERT models and their siamese counterparts (Sentence-BERT) produce meaningful similarity metrics when fine-tuned on the dataset built from search data.
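The abstract describes a contrastive objective that pulls together embeddings of co-attended results and pushes apart unrelated ones. As a minimal sketch of that idea, the snippet below implements a margin-based contrastive loss over cosine similarities of fixed embedding vectors; the margin value, the specific loss form, and the toy vectors are illustrative assumptions, not the paper's exact training setup.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(score, label, margin=0.5):
    """Margin-based contrastive loss on a similarity score in [-1, 1].

    label == 1: the pair received attention together (pull scores toward 1).
    label == 0: the pair is unrelated (push scores below the margin).
    Note: the margin and quadratic form are illustrative assumptions.
    """
    if label == 1:
        return (1.0 - score) ** 2
    return max(0.0, score - margin) ** 2

# Toy embeddings standing in for encoder outputs (hypothetical values).
co_clicked_a, co_clicked_b = [0.9, 0.1, 0.0], [0.8, 0.2, 0.1]
random_pair = [0.0, 0.1, 0.95]

pos_loss = contrastive_loss(cosine(co_clicked_a, co_clicked_b), label=1)
neg_loss = contrastive_loss(cosine(co_clicked_a, random_pair), label=0)
# A co-attended pair with similar embeddings incurs a small loss;
# dissimilar embeddings for an unrelated pair also incur a small loss.
```

In practice the embeddings would come from a siamese encoder (e.g. Sentence-BERT) and the loss would be backpropagated through it; this sketch only illustrates the shape of the objective.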