Abstract: In this paper, we develop a self-supervised grounding of COVID-related medical text based on the actual spatial relationships between the referenced anatomical concepts. More specifically, we learn to project sentences into a physical space defined by a three-dimensional anatomical atlas, enabling a visual approach to navigating COVID-related literature. We design a straightforward and empirically effective training objective that reduces the dependence on curated data. We use BERT as the main building block of our model and compare two variants pre-trained on general-purpose text, BERT-Base and BERT-Small, with three domain-specific pre-trained alternatives: BioBERT, SciBERT, and ClinicalBERT. Our quantitative analysis demonstrates that the model learns a context-aware mapping while being trained with self-supervision in the form of medical term occurrences. We illustrate two potential use cases for our approach: interactive 3D data exploration and document retrieval. To accelerate research in this direction, we make all trained models, the data we use, and our codebase publicly available. Finally, we also release a web tool for document retrieval and a visualization tool.
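To make the described setup concrete, below is a minimal sketch of what such a model and training step could look like. It is not the authors' implementation: the class name AnatomicalGrounder, the term_to_xyz lookup, and the coordinates are all hypothetical, and it assumes a BERT encoder with a linear regression head onto (x, y, z) atlas coordinates, supervised by the atlas position of an anatomical term occurring in the sentence.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class AnatomicalGrounder(nn.Module):
    """Hypothetical sketch: a BERT encoder followed by a linear
    head that regresses a sentence onto 3D atlas coordinates."""

    def __init__(self, encoder_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        # Project the [CLS] representation to (x, y, z) in atlas space.
        self.head = nn.Linear(self.encoder.config.hidden_size, 3)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token embedding
        return self.head(cls)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AnatomicalGrounder()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Self-supervised target: the atlas coordinate of a medical term that
# occurs in the sentence. term_to_xyz and its coordinates are made up
# for illustration; in practice they would come from the 3D atlas.
sentence = "Bilateral ground-glass opacities were seen in the lower lobes."
term_to_xyz = {"lower lobes": torch.tensor([[0.12, -0.34, 0.57]])}

batch = tokenizer(sentence, return_tensors="pt", truncation=True)
pred = model(batch["input_ids"], batch["attention_mask"])
loss = nn.functional.mse_loss(pred, term_to_xyz["lower lobes"])
loss.backward()
optimizer.step()
```

Under this reading, no manual annotation is needed: detected term occurrences and their atlas coordinates serve as free supervision, which is consistent with the abstract's claim of reduced dependence on curated data.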