Vision Intelligence Assisted Lung Function Estimation Based on Transformer Encoder-Decoder Network With Invertible Modeling

Liuyin Chen, Di Lu, Jianxue Zhai, Kaican Cai, Long Wang, Zijun Zhang

Published: 01 Jan 2024, Last Modified: 13 Nov 2024. IEEE Trans. Artif. Intell. 2024. License: CC BY-SA 4.0

Abstract: Lung function evaluation is important to many medical applications, but conducting pulmonary function tests is subject to various constraints. This article presents a pioneering study of an integrated invertible deep learning method for estimating lung function from computed tomography (CT) images. First, a projection method is proposed to flatten the three-dimensional (3-D) image onto a two-dimensional (2-D) plane while preserving 3-D location information. Next, an MBConv transformer-based encoder–decoder structure is developed to extract latent features. Finally, we develop an invertible normalizing flow (NF) model to infer lung function from the extracted features and design two loss functions, one for each direction. The method enables both estimating lung function from CT images and metadata and generating the corresponding simulated CT image from a given lung function. Computational studies show that the proposed regression model outperforms all state-of-the-art image regression models compared. A comprehensive comparative analysis also demonstrates the effectiveness of using generated images and confirms the superiority of the proposed method. To the best of our knowledge, this work is the first to combine an encoder–decoder network with NFs to ensure the effectiveness of a fully invertible framework, especially in lung CT image analysis.
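To illustrate what "invertible" means for a normalizing flow, the sketch below implements a single affine coupling layer, a common NF building block. It is not the authors' architecture: the abstract does not specify the flow design, so the coupling form, the function name `make_coupling`, the random placeholder weights, and the 8-dimensional toy feature vector are all assumptions made for illustration.

```python
import math
import random


def make_coupling(dim, seed=0):
    """Build one affine coupling layer with random placeholder weights (hypothetical)."""
    rng = random.Random(seed)
    d = dim // 2          # first d entries pass through unchanged
    rest = dim - d        # remaining entries are scaled and shifted
    w_s = [[rng.gauss(0, 0.1) for _ in range(rest)] for _ in range(d)]
    w_t = [[rng.gauss(0, 0.1) for _ in range(rest)] for _ in range(d)]

    def scale_shift(x_a):
        # Scale and shift depend only on the untouched half, which is
        # what makes the layer invertible in closed form.
        s = [math.tanh(sum(x_a[i] * w_s[i][j] for i in range(d))) for j in range(rest)]
        t = [sum(x_a[i] * w_t[i][j] for i in range(d)) for j in range(rest)]
        return s, t

    def forward(x):
        x_a, x_b = x[:d], x[d:]
        s, t = scale_shift(x_a)
        y_b = [b * math.exp(si) + ti for b, si, ti in zip(x_b, s, t)]
        log_det = sum(s)  # log |det Jacobian| of the transform
        return x_a + y_b, log_det

    def inverse(y):
        y_a, y_b = y[:d], y[d:]
        s, t = scale_shift(y_a)
        x_b = [(b - ti) * math.exp(-si) for b, si, ti in zip(y_b, s, t)]
        return y_a + x_b

    return forward, inverse


# Toy stand-in for extracted latent features (not real CT features).
data_rng = random.Random(1)
features = [data_rng.gauss(0, 1) for _ in range(8)]

forward, inverse = make_coupling(dim=8)
z, log_det = forward(features)
recovered = inverse(z)

# The inverse pass reproduces the input up to floating-point error.
max_error = max(abs(a - b) for a, b in zip(recovered, features))
print(max_error < 1e-9)
```

Because the same layer maps in both directions, a model stacked from such layers can in principle be trained with one loss on the forward output and another on the inverse output, which is the general idea behind designing a loss per direction.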