ABSTRACT
PURPOSE: Measuring the size of nodules on chest CT is important for lung cancer staging and measuring therapy response. 3D volumetry has been proposed as a more robust alternative to 1D and 2D sizing methods. There have also been substantial advances in methods to reduce radiation dose in CT. The purpose of this work was to investigate the effect of dose reduction and reconstruction methods on variability in 3D lung-nodule volumetry. METHODS: Reduced-dose CT scans were simulated by applying a noise-addition tool to the raw (sinogram) data from clinically indicated patient scans acquired on a multidetector-row CT scanner (Definition Flash, Siemens Healthcare). Scans were simulated at 25%, 10%, and 3% of the dose of their clinical protocol (CTDIvol of 20.9 mGy), corresponding to CTDIvol values of 5.2, 2.1, and 0.6 mGy. Simulated reduced-dose data were reconstructed with both conventional filtered backprojection (B45 kernel) and iterative reconstruction methods (SAFIRE: I44 strength 3 and I50 strength 3). Three lab technologist readers contoured "measurable" nodules in 33 patients under each of the different acquisition/reconstruction conditions in a blinded study design. Of the 33 measurable nodules, 17 were used to estimate repeatability with their clinical reference protocol, as well as interdose and inter-reconstruction-method reproducibilities. The authors compared the resulting distributions of proportional differences across dose and reconstruction methods by analyzing their means, standard deviations (SDs), and t-test and F-test results. RESULTS: The clinical-dose repeatability experiment yielded a mean proportional difference of 1.1% and SD of 5.5%. The interdose reproducibility experiments gave mean differences ranging from -5.6% to -1.7% and SDs ranging from 6.3% to 9.9%. The inter-reconstruction-method reproducibility experiments gave mean differences of 2.0% (I44 strength 3) and -0.3% (I50 strength 3), and SDs were identical at 7.3%. 
For the subset of repeatability cases, inter-reconstruction-method mean/SD pairs were (1.4%, 6.3%) and (-0.7%, 7.2%) for I44 strength 3 and I50 strength 3, respectively. Analysis of representative nodules confirmed that reader variability appeared unaffected by dose or reconstruction method. CONCLUSIONS: Lung-nodule volumetry was extremely robust to the radiation-dose level, down to the minimum scanner-supported dose settings. In addition, volumetry was robust to the reconstruction methods used in this study, which included both conventional filtered backprojection and iterative methods.
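As a rough illustration of the summary statistics reported above, the sketch below computes per-nodule proportional differences and their mean and SD. The abstract does not state the exact definition of "proportional difference"; the form (test - reference) / reference used here is an assumption, and the function name and volumes are hypothetical, not study data.

```python
from statistics import mean, stdev


def proportional_differences(reference, test):
    """Per-nodule (test - reference) / reference, in percent.

    Assumed definition; the abstract does not specify the formula.
    """
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    return [100.0 * (t - r) / r for r, t in zip(reference, test)]


# Hypothetical nodule volumes in mm^3 (illustrative only, not study data).
ref_volumes = [500.0, 1200.0, 250.0, 800.0]
low_dose_volumes = [490.0, 1230.0, 240.0, 808.0]

diffs = proportional_differences(ref_volumes, low_dose_volumes)
mean_diff = mean(diffs)  # systematic shift between the two conditions
sd_diff = stdev(diffs)   # sample SD: spread of the differences

print(f"mean = {mean_diff:.2f}%, SD = {sd_diff:.2f}%")
```

A mean near zero would suggest no systematic shift between conditions, while the SD reflects how much individual nodule measurements vary.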
Subjects
Lung/diagnostic imaging , X-Ray Computed Tomography/methods , Computer Simulation , Humans , Imaging Phantoms , Radiation Dosage , Reproducibility of Results , X-Ray Computed Tomography/instrumentation
ABSTRACT
Quantitative imaging biomarkers are being used increasingly in medicine to diagnose and monitor patients' disease. The computer algorithms that measure quantitative imaging biomarkers have different technical performance characteristics. In this paper we illustrate the appropriate statistical methods for assessing and comparing the bias, precision, and agreement of computer algorithms. We use data from three studies of pulmonary nodules. The first study is a small phantom study used to illustrate metrics for assessing repeatability. The second study is a large phantom study allowing assessment of four algorithms' bias and reproducibility for measuring tumor volume and the change in tumor volume. The third study is a small clinical study of patients whose tumors were measured on two occasions. This study allows a direct assessment of six algorithms' performance for measuring tumor change. With these three examples we compare and contrast study designs and performance metrics, and we illustrate the advantages and limitations of various common statistical methods for quantitative imaging biomarker studies.
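The abstract does not name the repeatability metrics it illustrates. One commonly used pair is the within-subject standard deviation (wSD) and the repeatability coefficient, RC = 1.96 * sqrt(2) * wSD. The sketch below assumes duplicate measurements per case; the function names and data are hypothetical.

```python
import math


def within_subject_sd(pairs):
    """wSD from duplicate measurements: sqrt(sum(d^2) / (2n))."""
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one pair of measurements")
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * n))


def repeatability_coefficient(pairs):
    """RC = 1.96 * sqrt(2) * wSD (about 2.77 * wSD)."""
    return 1.96 * math.sqrt(2) * within_subject_sd(pairs)


# Hypothetical repeated volume measurements (arbitrary units).
pairs = [(10.0, 12.0), (20.0, 18.0)]
wsd = within_subject_sd(pairs)
rc = repeatability_coefficient(pairs)
print(f"wSD = {wsd:.3f}, RC = {rc:.3f}")
```

Under the usual normality assumptions, the absolute difference between two repeated measurements on the same case is expected to fall below RC about 95% of the time.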
Subjects
Algorithms , Biomarkers , Diagnostic Imaging , Solitary Pulmonary Nodule/diagnosis , Statistics as Topic , Bias , Humans , Imaging Phantoms , Reproducibility of Results , Research Design
ABSTRACT
Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.
Subjects
Algorithms , Biomarkers , Diagnostic Imaging , Research Design , Statistics as Topic , Bias , Computer Simulation , Humans , Imaging Phantoms , Reference Standards , Reproducibility of Results
ABSTRACT
RATIONALE AND OBJECTIVES: To estimate and statistically compare the bias and variance of radiologists measuring the size of spherical and complex synthetic nodules. MATERIALS AND METHODS: This study did not require institutional review board approval. Six radiologists estimated the size of 10 synthetic nodules embedded within an anthropomorphic thorax phantom from computed tomography scans at 0.8- and 5-mm slice thicknesses. The readers measured the nodule size using unidimensional (1D) longest in-slice dimension, bidimensional (2D) area from longest in-slice and longest perpendicular dimension, and three-dimensional (3D) semiautomated volume. Intercomparisons of bias (difference between average and true size) and variance among methods were performed after converting the 2D and 3D estimates to a compatible 1D scale. RESULTS: The relative biases of radiologists with the 3D tool were -1.8%, -0.4%, -0.7%, -0.4%, and -1.6% for 10-mm spherical, 20-mm spherical, 20-mm elliptical, 10-mm lobulated, and 10-mm spiculated nodules, compared to 1.4%, -0.1%, -26.5%, -7.8%, and -39.8% for 1D. The 3D measurements were significantly less biased than 1D for elliptical, lobulated, and spiculated nodules. The relative standard deviations for 3D were 7.5%, 3.9%, 3.6%, 9.7%, and 8.3%, compared to 5.7%, 2.6%, 20.3%, 5.3%, and 16.4% for 1D. Unidimensional sizing was significantly less variable than 3D for the lobulated nodule and significantly more variable for the elliptical and spiculated nodules. Three-dimensional bias and variability were smaller for thin 0.8-mm slice data than for thick 5.0-mm data. CONCLUSIONS: The study shows that radiologist-controlled 3D volumetric lesion sizing can achieve not only smaller bias but also similar or smaller variability compared to 1D sizing, especially for complex lesion shapes.
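The abstract does not give the formula used to put 3D estimates on a 1D scale. One plausible choice, assumed here, is the diameter of a sphere with the same volume, d = (6V / pi)^(1/3). The sketch below applies that conversion and then computes relative bias and relative SD against a known true size. Normalizing the SD by the true size is also an assumption, and all reader values are hypothetical.

```python
import math
from statistics import mean, stdev


def volume_to_diameter(volume_mm3):
    """Diameter (mm) of the sphere with the given volume (assumed conversion)."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)


def relative_bias_and_sd(estimates, true_size):
    """Return (relative bias %, relative SD %), both normalized by true_size."""
    bias = 100.0 * (mean(estimates) - true_size) / true_size
    rel_sd = 100.0 * stdev(estimates) / true_size
    return bias, rel_sd


# A 10-mm sphere has volume pi/6 * 10^3 mm^3; the conversion should recover 10 mm.
sphere_volume = math.pi / 6.0 * 10.0 ** 3
diameter = volume_to_diameter(sphere_volume)

# Hypothetical 1D-equivalent reader estimates (mm) for a 10-mm nodule.
reader_estimates = [9.8, 10.0, 10.2]
bias, rel_sd = relative_bias_and_sd(reader_estimates, 10.0)
print(f"d = {diameter:.2f} mm, bias = {bias:.1f}%, SD = {rel_sd:.1f}%")
```

Putting every method on the same length scale lets bias and variability be compared directly across 1D, 2D, and 3D sizing.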