Abstract: We live in one of the most prolific eras of scientific research. More than 15,000 research papers are submitted to the arXiv preprint server every month, making it impossible for any researcher to keep track of or search for all the papers relevant to their field of study. In this project, we address this embarrassment of riches by exploring six different document-embedding approaches: TF-IDF, Doc2Vec, LSTM, RoBERTa, GPT-2, and Sentence-BERT, to build an arXiv paper recommender. Our central aim is to create a meaningful document embedding for each paper from its abstract. These embeddings are used to curate recommendations based on the cosine similarity between a given paper and the rest of the corpus. Our fine-tuned transformer architectures produce recommendations on par with the current state of the art. We argue that our citation-agnostic, content-based models lead to more democratic and meaningful recommendations.
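A minimal sketch of the embed-then-rank pipeline the abstract describes, using the TF-IDF baseline (one of the six approaches listed); the toy abstracts and the `recommend` helper are illustrative assumptions, not code or data from the paper, and the same ranking step applies unchanged if the vectors come instead from Doc2Vec or a fine-tuned Sentence-BERT model.

```python
# Illustrative sketch: embed each abstract, then recommend by cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder abstracts standing in for the arXiv corpus.
abstracts = [
    "We study transformer architectures for document retrieval.",
    "A survey of convolutional networks for image classification.",
    "Sentence embeddings improve semantic search over scientific text.",
]

# One embedding per paper; swap in any other encoder's vectors here.
embeddings = TfidfVectorizer(stop_words="english").fit_transform(abstracts)

def recommend(query_idx: int, top_k: int = 2) -> list[int]:
    """Rank the rest of the corpus by cosine similarity to the query paper."""
    sims = cosine_similarity(embeddings[query_idx], embeddings).ravel()
    ranked = sims.argsort()[::-1]
    return [i for i in ranked if i != query_idx][:top_k]

print(recommend(0))  # indices of the abstracts most similar to paper 0
```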