Open Platform for the De-identification of Burned-in Texts in Medical Images using Deep Learning

Quentin Langlois, Nicolas Szelagowski, Jean Vanderdonckt, Sébastien Jodogne

Published: 2024, Last Modified: 07 May 2026, BIOSTEC (1) 2024, CC BY-SA 4.0

Abstract: While the de-identification of DICOM tags is a standardized, well-established practice, the removal of protected health information that is burned into the pixels of medical images is a more complex challenge for which Deep Learning is especially well adapted. Unfortunately, there is currently a lack of accurate, effective, and freely available tools for this purpose. This motivates the release of a new benchmark dataset, together with free and open-source software that implements suitable Deep Learning algorithms, with the objective of improving patient confidentiality. The proposed methods consist of adapting regular scene-text detection models (SSD and TextBoxes) to the task of image de-identification. It is shown that fine-tuning such generic scene-text detection models on medical images significantly improves performance. The developed algorithms can be applied either from the command line or using a Web interface that is tightly integrated with a free and open-source PACS ser

External IDs: dblp:conf/biostec/LangloisSVJ24