Summarizing the content of Electronic Health Records and medical reports with a Large Language Model and Vision Language Model-Based processing of data
Abstract: From a patient's arrival at a health facility to discharge, a large amount of highly valuable medical data is collected in Electronic Health Records (EHRs). However, current data management faces several limitations, linked to the volume and variety of the data (tables, reports, images, etc.), that can hinder the efficiency of medical services. As a consequence, reviewing records can be long and laborious for medical staff, whereas admitting patients in emergency situations calls for efficiency.
This paper presents a flexible Generative Artificial Intelligence-based framework for processing EHR data. Using Large Language Models (LLMs) and Vision Language Models (VLMs), medical data are analyzed and aggregated into a single document that summarizes a patient's key information based on their medical history. This multimodal framework takes advantage of the strengths of language models to process structured data, medical reports, and medical images using text analysis, image processing, and Optical Character Recognition (OCR).
Experiments were conducted using hospital EHR data from the Medical Information Mart for Intensive Care IV (compiling data from Beth Israel Deaconess Medical Center, Boston) and language models (including Mistral, DeepSeek, LLaMA, Gemma, and LLaVA models) executed locally to preserve medical data confidentiality. They show promising results towards automated multimodal processing of EHRs: reports are condensed into summaries 11 times shorter (for the best LLMs), and image descriptions are generated together with text extracted by OCR.
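To make the "11 times shorter" figure concrete, the following is a minimal sketch of how such a compression ratio could be computed. It assumes length is measured in whitespace-separated words; the abstract does not state the unit actually used (words, tokens, or characters), and the function name `compression_ratio` and the sample texts are hypothetical.

```python
def compression_ratio(source: str, summary: str) -> float:
    """Return how many times shorter the summary is than the source.

    Assumption: length is counted in whitespace-separated words.
    """
    source_words = len(source.split())
    summary_words = len(summary.split())
    if summary_words == 0:
        raise ValueError("summary is empty")
    return source_words / summary_words


# Hypothetical placeholder texts, not data from the paper.
report = " ".join(["finding"] * 220)   # 220-word report
summary = " ".join(["finding"] * 20)   # 20-word summary
ratio = compression_ratio(report, summary)
print(f"Summary is {ratio:.1f} times shorter than the report")
```

Under this definition, a ratio of 11 means the summary contains one eleventh as many words as the original report.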
Paper Type: Long
Research Area: Summarization
Research Area Keywords: multi-document summarization, multimodal summarization, extractive summarisation, sentence compression
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 6162