XRayWizard: Reconstructing 3-D lung surfaces from a single 2-D chest x-ray image via Vision Transformer.

Shi, Zhiyi; Geng, Kaiwen; Zhao, Xiaoyan; Mahmoudi, Farhad; Haas, Christopher J; Leader, Joseph K; Duman, Emrah; Pu, Jiantao

Affiliation

Shi Z; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Geng K; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Zhao X; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Mahmoudi F; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Haas CJ; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Leader JK; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Duman E; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Pu J; Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

Med Phys; 51(4): 2806-2816, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37819009

ABSTRACT

BACKGROUND: Chest x-ray is widely used to evaluate pulmonary conditions because of its technical simplicity, cost-effectiveness, and portability. However, as a two-dimensional (2-D) imaging modality, it depicts limited anatomical detail and is challenging to interpret.
PURPOSE: To validate the feasibility of reconstructing three-dimensional (3-D) lungs from a single 2-D chest x-ray image via Vision Transformer (ViT).
METHODS: We created a cohort of 2525 paired chest x-ray images (scout images) and computed tomography (CT) scans acquired on different subjects, and randomly partitioned them into (1) a training set of 1800, (2) a validation set of 200, and (3) a testing set of 525. The 3-D lung volumes segmented from the chest CT scans were used as the ground truth for supervised learning. We developed a novel model termed XRayWizard that employed ViT blocks to encode the 2-D chest x-ray image. The aim was to capture global information and establish long-range relationships, thereby improving the performance of 3-D reconstruction. Additionally, a pooling layer was introduced at the end of each transformer block to extract feature information. To produce smoother and more realistic 3-D models, a set of patch discriminators was incorporated. We also devised a novel method to incorporate subject demographics as an auxiliary input to further improve the accuracy of 3-D lung reconstruction. Dice coefficient and mean volume error were used as performance metrics to assess the agreement between the computerized results and the ground truth.
RESULTS: Without subject demographics, the mean Dice coefficient for the generated 3-D lung volumes was 0.738 ± 0.091. When subject demographics were included as an auxiliary input, the mean Dice coefficient improved significantly to 0.769 ± 0.089 (p < 0.001), and the volume prediction error was reduced from 23.5 ± 2.7% to 15.7 ± 2.9%.
CONCLUSION: Our experiment demonstrated the feasibility of reconstructing 3-D lung volumes from 2-D chest x-ray images, and the inclusion of subject demographics as additional inputs can significantly improve the accuracy of 3-D lung volume reconstruction.
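
NOTE: The abstract names the Dice coefficient and volume error as its performance metrics but does not give their formulas. The following is a minimal sketch, assuming the standard Dice definition 2|A∩B| / (|A| + |B|) on binary voxel masks and a volume error defined as the absolute voxel-count difference relative to the reference volume. The function names and the toy arrays are hypothetical illustrations, not the authors' code or data.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice overlap between two binary volumes: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = int(pred.sum()) + int(truth.sum())
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    overlap = int(np.logical_and(pred, truth).sum())
    return 2.0 * overlap / total


def volume_error_percent(pred, truth):
    """Absolute voxel-count difference as a percentage of the reference volume.

    Assumed definition; the abstract does not state the exact formula.
    """
    pred_vol = int(np.asarray(pred, dtype=bool).sum())
    truth_vol = int(np.asarray(truth, dtype=bool).sum())
    if truth_vol == 0:
        raise ValueError("reference volume is empty")
    return 100.0 * abs(pred_vol - truth_vol) / truth_vol


# Toy 3-D masks (hypothetical data): the reference holds 8 voxels,
# the prediction holds 12 voxels and fully covers the reference.
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True
pred = np.zeros_like(truth)
pred[1:3, 1:3, 1:4] = True

print(dice_coefficient(pred, truth))      # 0.8
print(volume_error_percent(pred, truth))  # 50.0
```

In this toy case the prediction over-segments by 4 voxels, so the overlap is 8 and the Dice value is 2·8 / (12 + 8) = 0.8, while the volume error is 4 / 8 = 50%. The two metrics are complementary: a prediction can match the reference volume exactly yet overlap it poorly, which is presumably why the abstract reports both.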

Subjects

Lung; Thorax; Humans; X-Rays; Lung/diagnostic imaging; Tomography, X-Ray Computed/methods; Image Processing, Computer-Assisted/methods

Keywords

2D chest xray; 3D reconstruction; deep learning; lung reconstruction; vision transformer

Full text: 1 | Database: MEDLINE | Main subject: Thorax / Lung | Language: English | Publication year: 2024 | Document type: Article