Health Data Information Retrieval For Improved Simulation

Mario Ciampi, Giuseppe De Pietro, Elio Masciari, Stefano Silvestri

Published: 2020, Last Modified: 27 Jun 2023, PDP 2020

Abstract: In this paper we propose an architecture devoted to the analysis of very large natural language biomedical text collections. Its purpose is to search for semantic similarity in order to obtain useful hints for effective simulation that could help physicians in diagnosis tasks. We leverage Word Embedding models trained with the word2vec algorithm, and a Big Data architecture for their processing and management. We performed some preliminary analyses using a dataset extracted from the whole PubMed library, and we developed a web front-end to show the usability of this methodology in a real context.
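
The semantic similarity search described in the abstract can be illustrated with a minimal sketch. It assumes that similarity between terms is measured as the cosine of the angle between their embedding vectors, which is the usual choice for word2vec models but is not stated in the abstract. The vectors, the vocabulary, and the function names `cosine_similarity` and `most_similar` are hypothetical and chosen only for illustration; they are not the authors' implementation, and real embeddings would be learned from the PubMed corpus rather than written by hand.

```python
import math

# Toy 3-dimensional vectors invented for illustration only (assumption).
# Real word2vec embeddings are learned from a corpus and usually have
# hundreds of dimensions.
TOY_EMBEDDINGS = {
    "fever": [0.9, 0.1, 0.0],
    "pyrexia": [0.85, 0.15, 0.05],
    "cough": [0.6, 0.5, 0.1],
    "fracture": [0.0, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(term, embeddings, top_n=3):
    """Rank all other terms by cosine similarity to `term`, best first."""
    query = embeddings[term]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for neighbour, score in most_similar("fever", TOY_EMBEDDINGS):
        print(f"{neighbour}: {score:.3f}")
```

With these toy vectors, the synonym "pyrexia" ranks closest to "fever" and the unrelated term "fracture" ranks last. At the scale of the whole PubMed library, a linear scan like this one would presumably be replaced by distributed or indexed nearest-neighbour search.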
