Vapur: A Search Engine to Find Related Protein - Compound Pairs in COVID-19 Literature

Sep 03, 2020 (edited Oct 10, 2020) | EMNLP 2020 Workshop NLP-COVID Submission | Readers: Everyone
  • Keywords: biochemical relation extraction, text mining, search engine, cord-19
  • TL;DR: We present a relation-extraction-based search engine to find related protein-chemical pairs in CORD-19.
  • Abstract: Coronavirus Disease 2019 (COVID-19) has had dire consequences globally and triggered an intense scientific effort across different domains. The resulting publications form a huge text collection in which finding the studies related to a biomolecule of interest is challenging for general-purpose search engines, because the publications are rich in domain-specific terminology. Here, we present Vapur: an online COVID-19 search engine specifically designed to find related protein-chemical pairs. Vapur is powered by a relation-oriented inverted index that retrieves and groups studies for a query biomolecule with respect to its related entities. The inverted index of Vapur is created automatically with a BioNLP pipeline and integrated with an online user interface. The online interface is designed for smooth traversal of the current literature by domain researchers and is publicly available at
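
The relation-oriented inverted index described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout `(paper_id, protein, chemical)`, the function names `build_relation_index` and `search`, and all entity and paper names are hypothetical placeholders. It only shows the general idea of retrieving studies for a query biomolecule and grouping them by the related entity.

```python
from collections import defaultdict

# Hypothetical relation records, standing in for the output of a
# relation-extraction pipeline: (paper_id, protein, chemical).
# All names are invented placeholders, not real findings.
RELATIONS = [
    ("paper-1", "ProteinA", "CompoundX"),
    ("paper-2", "ProteinA", "CompoundX"),
    ("paper-3", "ProteinA", "CompoundY"),
    ("paper-3", "ProteinB", "CompoundZ"),
]


def build_relation_index(relations):
    """Map each entity to {related entity: sorted list of paper ids}."""
    index = defaultdict(lambda: defaultdict(set))
    for paper_id, protein, chemical in relations:
        # Index both directions so either a protein or a chemical
        # can be used as the query term.
        index[protein.lower()][chemical.lower()].add(paper_id)
        index[chemical.lower()][protein.lower()].add(paper_id)
    # Freeze into plain dicts with sorted paper lists for stable output.
    return {
        entity: {rel: sorted(papers) for rel, papers in related.items()}
        for entity, related in index.items()
    }


def search(index, query):
    """Return papers for the query entity, grouped by related entity."""
    return index.get(query.lower(), {})


index = build_relation_index(RELATIONS)
print(search(index, "ProteinA"))
# {'compoundx': ['paper-1', 'paper-2'], 'compoundy': ['paper-3']}
```

Keying the index by entity and then by related entity means a single lookup returns results already grouped by relation partner, which is the grouping behavior the abstract attributes to Vapur. Entity normalization here is reduced to lowercasing, which is an assumption made only to keep the sketch short.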
