Comparison of Different Approaches to Patent Search

Published: 01 Jan 2024, Last Modified: 21 Oct 2024, SIU 2024, CC BY-SA 4.0
Abstract: Patents are an important source of technical information for inventors and engineers. Patent search gives interested parties quick access to the information found in patents. Because patent search is a time-consuming task that requires expertise, conventional information retrieval methods are insufficient and computer-aided solutions are needed. This study compares the performance of TF-IDF (Term Frequency - Inverse Document Frequency) and vector embedding approaches in patent retrieval using precision, recall, and mean reciprocal rank (MRR) metrics. The dataset was limited to patents in the aerospace field, and an empirical evaluation was carried out using sample queries obtained from engineers working in this field. For the vector embedding approach, two models were tested: a general-purpose language model and a patent-specific language model. The results show that the vector embedding approach implemented with BERT-based language models gives better results than the basic TF-IDF approach in the field of patent mining.
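To make the comparison setup concrete, the following is a minimal sketch of a TF-IDF ranker and the three metrics named in the abstract, written in plain Python. It is not the study's implementation: the smoothed IDF weighting, the whitespace tokenizer, the cutoff k, the toy documents in DOCS, and all function names are assumptions made for illustration. The vector embedding approach would follow the same evaluation path, with the sparse TF-IDF vectors replaced by dense vectors produced by a BERT-based model.

```python
import math
from collections import Counter

# Toy aerospace-flavoured documents (hypothetical, not from the study's dataset).
DOCS = [
    "turbine blade cooling channel for jet engine",
    "landing gear shock absorber assembly",
    "composite wing spar with cooling duct",
]

def tokenize(text):
    # Naive whitespace tokenizer (assumption; real systems use richer preprocessing).
    return text.lower().split()

def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (assumption; the exact weighting used in the study is not stated).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vecs, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, docs):
    """Return document indices ordered by descending similarity to the query."""
    vecs, idf = tfidf_vectors(docs)
    # Query terms never seen in the corpus are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(cosine(q, v), i) for i, v in enumerate(vecs)]
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return sum(1 for d in ranked[:k] if d in relevant) / len(relevant)

def mean_reciprocal_rank(rankings, relevants):
    """Average of 1/rank of the first relevant result per query (0 if none found)."""
    total = 0.0
    for ranked, relevant in zip(rankings, relevants):
        for pos, d in enumerate(ranked, start=1):
            if d in relevant:
                total += 1.0 / pos
                break
    return total / len(rankings)
```

In use, each engineer-supplied query would be passed to `rank`, and the resulting ordering would be scored against that query's set of relevant patents. Keeping the metric functions independent of the ranker means the same scoring code can evaluate both the TF-IDF and the embedding-based rankings.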