Inference of captions from histopathological patches

  • Keywords: Histopathology, caption prediction, adenocarcinoma, stomach
  • TL;DR: We compiled and publicly released a captioned dataset of gastric adenocarcinoma H&E-stained patches, and we trained a baseline attention model to predict the captions.
  • Abstract: Computational histopathology has made significant strides in the past few years and is slowly getting closer to clinical adoption. One area of benefit would be the automatic generation of diagnostic reports from H&E-stained whole slide images, which would further increase the efficiency of pathologists' routine diagnostic workflows. In this study, we compiled a dataset of histopathological captions of stomach adenocarcinoma endoscopic biopsy specimens, which we extracted from diagnostic reports and paired with patches extracted from the associated whole slide images. The dataset contains a variety of gastric adenocarcinoma subtypes. We trained a baseline attention-based model to predict the captions from features extracted from the patches and obtained promising results. We make the captioned dataset of 260K patches publicly available.
