LILE: Look In-Depth before Looking Elsewhere -- A Dual Attention Network using Transformers for Cross-Modal Information Retrieval in Histopathology Archives

Danial Maleki; Hamid Tizhoosh

Published: 28 Feb 2022, Last Modified: 20 Jul 2025

Venue: MIDL 2022

Readers: Everyone

Keywords: cross-modality retrieval, histopathology, Attention, Transformer

TL;DR: A novel dual attention network for cross-modal retrieval.

Abstract: The volume of available data has grown dramatically in recent years across many applications. Moreover, the era of networks that process each modality separately has practically ended. Bidirectional cross-modality data retrieval has therefore become a requirement in many domains and disciplines of research. This is especially true in the medical field, where data comes in a multitude of types, including various kinds of images and reports as well as molecular data. Most contemporary works apply cross attention to highlight the essential elements of an image or text in relation to the other modality and then try to match them. However, these approaches usually weight the features of each modality equally, regardless of their importance within their own modality. In this study, self-attention is proposed as an additional loss term to enrich the internal representation fed into the cross-attention module. This work presents a novel architecture with a new loss term that helps represent images and texts in a joint latent space. Experimental results on two benchmark datasets, MS-COCO and ARCH, show the effectiveness of the proposed method.
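To make the idea in the abstract concrete, here is a minimal sketch of self-attention within each modality followed by cross attention between modalities, with a self-attention term added to the matching loss. It is not the authors' implementation. The mean pooling, cosine similarity, hinge margin, `self_weight` factor, and all function names are illustrative assumptions that do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v, eps=1e-12):
    """Cosine similarity; eps guards against zero-length vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + eps)


def attend(queries, keys, values):
    """Scaled dot-product attention: each query gets a weighted mix of values."""
    scale = math.sqrt(len(keys[0]))
    mixed = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        mixed.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return mixed


def mean_pool(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def similarities(image_regions, text_tokens):
    """Return (self-attention similarity, cross-attention similarity)."""
    # Look in depth: refine each modality with self-attention.
    img = attend(image_regions, image_regions, image_regions)
    txt = attend(text_tokens, text_tokens, text_tokens)
    s_self = cosine(mean_pool(img), mean_pool(txt))
    # Look elsewhere: each text token attends over the image regions.
    txt_from_img = attend(txt, img, img)
    s_cross = sum(cosine(t, c) for t, c in zip(txt, txt_from_img)) / len(txt)
    return s_self, s_cross


def hinge(pos, neg, margin=0.2):
    """Triplet hinge: zero once the match beats the mismatch by the margin."""
    return max(0.0, margin - pos + neg)


def total_loss(image, matching_text, mismatched_text, self_weight=1.0):
    """Cross-attention ranking loss plus a weighted self-attention term
    (an assumed form, chosen only for illustration)."""
    self_pos, cross_pos = similarities(image, matching_text)
    self_neg, cross_neg = similarities(image, mismatched_text)
    return hinge(cross_pos, cross_neg) + self_weight * hinge(self_pos, self_neg)


# Toy usage: two image-region vectors, one matching and one mismatched caption.
image = [[1.0, 0.0], [0.9, 0.1]]
loss = total_loss(image, [[1.0, 0.1]], [[0.0, 1.0]])
```

In this sketch the self-attention term penalizes a mismatched pair that looks similar even before any cross-modal interaction. This is one plausible reading of "self-attention as an additional loss term"; the paper itself defines the actual formulation.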

Registration: I acknowledge that publication of this at MIDL and in the proceedings requires at least one of the authors to register and present the work during the conference.

Authorship: I confirm that I am the author of this work and that it has not been submitted to another publication before.

Paper Type: both

Primary Subject Area: Integration of Imaging and Clinical Data

Secondary Subject Area: Application: Histopathology

Confidentiality And Author Instructions: I read the call for papers and author instructions. I acknowledge that exceeding the page limit and/or altering the latex template can result in desk rejection.

Code And Data:
ARCH dataset: https://warwick.ac.uk/fac/cross_fac/tia/data/arch
MS-COCO dataset: https://cocodataset.org/#home
The GitHub page for the code implementation is almost ready and will be released soon.

Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/lile-look-in-depth-before-looking-elsewhere-a/code)
