Tracing architecture of machine learning models through their mentions in scholarly articles

Published: 01 Jan 2024, Last Modified: 20 May 2025 · DSIT 2024 · CC BY-SA 4.0
Abstract: Relation extraction, a pivotal task in NLP, impacts information retrieval, natural language understanding (NLU), and knowledge generation. The machine learning model has established itself as one of the most influential concepts in this era of deep learning and LLMs. How machine learning models relate to other key entities in scientific text is a perennially interesting question, and knowing the architectural origins of a machine learning model opens a crucial avenue toward understanding its characteristics. In this paper we experiment with tracing the architecture of machine learning models from their mentions in scholarly texts. We approach this problem in a supervised manner: first we identify the machine-learning-model-oriented entities present in a sentence, and then we determine whether each such entity is based on another through a binary classification task ('based on' vs. other relation). We report our findings with four state-of-the-art baseline models, among which LUKE performs best. The 'based on' relation has quite low evidence support in the data, which affected the models' performance and motivates further exploration for improvement.
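The two-step setup described above (entity identification followed by pairwise binary relation classification) can be sketched as follows. This is a hypothetical illustration, not the authors' exact preprocessing: the entity spans are assumed to be already identified, and the `<e1>`/`<e2>` marker tokens are a common but assumed input convention for feeding entity pairs to a binary 'based on' classifier.

```python
from itertools import permutations

def mark_pair(sentence: str, head: str, tail: str) -> str:
    """Wrap the head and tail entity mentions in marker tokens,
    producing the input a binary relation classifier would score."""
    marked = sentence.replace(head, f"<e1>{head}</e1>", 1)
    marked = marked.replace(tail, f"<e2>{tail}</e2>", 1)
    return marked

def candidate_pairs(sentence: str, entities: list[str]) -> list[tuple[str, str, str]]:
    """Enumerate all ordered (head, tail, marked_sentence) candidates;
    each would receive a 'based on' / other label from the classifier."""
    return [(h, t, mark_pair(sentence, h, t))
            for h, t in permutations(entities, 2)]

# Illustrative sentence with two machine-learning-model entities.
sent = "GPT-2 builds on the Transformer architecture."
pairs = candidate_pairs(sent, ["GPT-2", "Transformer"])
# Two ordered candidates: (GPT-2, Transformer) and (Transformer, GPT-2);
# only the first should be labelled 'based on' by a trained classifier.
```

Because the relation is directional, both orderings of each entity pair are classified; with the 'based on' class being rare, this enumeration also shows why the label distribution ends up heavily skewed toward the 'other' class.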