Information Retrieval and Extraction on COVID-19 Clinical Articles Using Graph Community Detection and Bio-BERT Embeddings
Keywords: NLP, BERT, Bio-BERT, Document Embeddings, Graph Theory, Network Analysis, Community Detection, Extractive Summarization, PageRank Algorithm
TL;DR: An information retrieval system and Question-Answer bot for a corpus of scientific articles related to COVID-19
Abstract: In this paper, we present an information retrieval system for a corpus of scientific articles related to COVID-19. We build a similarity network over the articles, where similarity is determined by shared citations and by sentence embeddings specific to the biological domain. We cluster the articles by applying ego-splitting community detection to this network, and then match queries to the resulting clusters. Responses to a query are generated by extractive summarization using BERT and PageRank. We also provide a Question-Answer bot covering a small set of intents to demonstrate the efficacy of our model as an information extraction module.
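To make the summarization step concrete, the following is a minimal sketch of PageRank-based extractive summarization over a sentence similarity graph. It is not the authors' implementation: the abstract describes BERT-based embeddings, whereas this sketch substitutes a simple bag-of-words vector so that it runs with the standard library only. The function names (`embed`, `cosine`, `pagerank`, `summarize`) and the parameters `damping`, `iterations`, and `top_k` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def embed(sentence):
    """Assumed stand-in for a Bio-BERT sentence embedding: word counts."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    """Power iteration on a weighted graph given as an adjacency matrix."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if out_sums[j] > 0:
                    # Node j passes its score along edges in proportion to weight.
                    rank += scores[j] * weights[j][i] / out_sums[j]
                else:
                    # A node with no edges spreads its score uniformly.
                    rank += scores[j] / n
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(sentences, top_k=2):
    """Return the top_k highest-ranked sentences, kept in original order."""
    vectors = [embed(s) for s in sentences]
    n = len(sentences)
    # Edge weight = similarity between two different sentences.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `summarize(sentences, top_k=2)` on the sentences of a retrieved cluster returns the two sentences most central to the similarity graph. Replacing `embed` and `cosine` with similarities computed from domain-specific embeddings would bring the sketch closer to the pipeline described in the abstract.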