ABSTRACT
Background Digital breast tomosynthesis (DBT) helps reduce recall rates and improve cancer detection compared with two-dimensional (2D) mammography but has a longer interpretation time. Purpose To evaluate the effect of DBT slab thickness and overlap on reader performance and interpretation time in the absence of 1-mm slices. Materials and Methods In this retrospective HIPAA-compliant multireader study of DBT examinations performed between August 2013 and July 2017, four fellowship-trained breast imaging radiologists blinded to final histologic findings interpreted DBT examinations by using a standard protocol (10-mm slabs with 5-mm overlap, 1-mm slices, synthetic 2D mammogram) and an experimental protocol (6-mm slabs with 3-mm overlap, synthetic 2D mammogram) with a crossover design. Among the 122 DBT examinations, 74 mammographic findings had final histologic findings, including 31 masses (26 malignant), 20 groups of calcifications (12 malignant), 18 architectural distortions (15 malignant), and five asymmetries (two malignant). Durations of reader interpretations were recorded. Comparisons were made by using receiver operating characteristic curves for diagnostic performance and paired t tests for continuous variables. Results Among 122 women, mean age was 58.6 years ± 10.1 (standard deviation). For detection of malignancy, areas under the receiver operating characteristic curves were similar between protocols (range, 0.83-0.94 vs 0.84-0.92; P ≥ .63). Mean DBT interpretation time was shorter with the experimental protocol for three of four readers (reader 1, 5.6 minutes ± 1.7 vs 4.7 minutes ± 1.4 [P < .001]; reader 2, 2.8 minutes ± 1.1 vs 2.3 minutes ± 1.0 [P = .001]; reader 3, 3.6 minutes ± 1.4 vs 3.3 minutes ± 1.3 [P = .17]; reader 4, 4.3 minutes ± 1.0 vs 3.8 minutes ± 1.1 [P ≤ .001]), with 72% reduction in both mean number of images and mean file size (P < .001 for both). 
Conclusion A digital breast tomosynthesis reconstruction protocol that uses 6-mm slabs with 3-mm overlap, without 1-mm slices, had similar diagnostic performance compared with the standard protocol and led to a reduced interpretation time for three of four readers. © RSNA, 2020 See also the editorial by Chang in this issue.
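The reported 72% reduction in image count can be sanity-checked with simple arithmetic. The sketch below is not taken from the study: it assumes that consecutive slabs advance by (slab thickness minus overlap), ignores edge effects and the synthetic 2D image, and uses a hypothetical 60-mm compressed breast thickness. The function names are illustrative only.

```python
def images_per_stack(thickness_mm: float, slab_mm: float, overlap_mm: float) -> float:
    """Approximate image count for one reconstruction stack.

    Assumption (not stated in the abstract): each slab starts
    (slab_mm - overlap_mm) after the previous one; edge effects are ignored.
    """
    step_mm = slab_mm - overlap_mm
    if step_mm <= 0:
        raise ValueError("overlap must be smaller than the slab thickness")
    return thickness_mm / step_mm


def fractional_reduction(thickness_mm: float) -> float:
    """Fractional drop in image count, experimental vs standard protocol."""
    # Standard protocol: 10-mm slabs with 5-mm overlap plus 1-mm slices.
    standard = (images_per_stack(thickness_mm, 10.0, 5.0)
                + images_per_stack(thickness_mm, 1.0, 0.0))
    # Experimental protocol: 6-mm slabs with 3-mm overlap only.
    experimental = images_per_stack(thickness_mm, 6.0, 3.0)
    return 1.0 - experimental / standard


if __name__ == "__main__":
    # 60 mm is a hypothetical thickness; under these assumptions the ratio
    # does not depend on it.
    print(f"{fractional_reduction(60.0):.1%}")
```

Under these assumptions the reduction works out to about 72%, in line with the figure in the abstract, although the actual reconstruction geometry may differ.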
Subject(s)
Breast Neoplasms/diagnostic imaging; Clinical Competence; Mammography/methods; Radiographic Image Enhancement/methods; Radiographic Image Interpretation, Computer-Assisted/methods; Aged; Early Detection of Cancer/methods; Female; Humans; Middle Aged; Quality Improvement; Retrospective Studies
ABSTRACT
Deep learning is the state-of-the-art machine learning approach. The success of deep learning in many pattern recognition applications has brought excitement and high expectations that deep learning, or artificial intelligence (AI), can bring revolutionary changes in health care. Early studies of deep learning applied to lesion detection or classification have reported superior performance compared to those by conventional techniques or even better than radiologists in some tasks. The potential of applying deep-learning-based medical image analysis to computer-aided diagnosis (CAD), thus providing decision support to clinicians and improving the accuracy and efficiency of various diagnostic and treatment processes, has spurred new research and development efforts in CAD. Despite the optimism in this new era of machine learning, the development and implementation of CAD or AI tools in clinical practice face many challenges. In this chapter, we will discuss some of these issues and efforts needed to develop robust deep-learning-based CAD tools and integrate these tools into the clinical workflow, thereby advancing towards the goal of providing reliable intelligent aids for patient care.
Subject(s)
Deep Learning; Diagnosis, Computer-Assisted; Diagnostic Imaging; Image Interpretation, Computer-Assisted; Humans
ABSTRACT
Annotating lesion locations by radiologists' manual marking is a key step to provide reference standard for the training and testing of a computer-aided detection system by supervised machine learning. Inter-reader variability is not uncommon in readings even by expert radiologists. This study evaluated the variability of the radiologist-identified pulmonary emboli (PEs) to demonstrate the importance of improving the reliability of the reference standard by a multi-step process for performance evaluation. In an initial reading of 40 CTPA PE cases, two experienced thoracic radiologists independently marked the PE locations. For markings from the two radiologists that did not agree, each radiologist re-read the cases independently to assess the discordant markings. Finally, for markings that still disagreed after the second reading, the two radiologists read together to reach a consensus. The variability of radiologists was evaluated by analyzing the agreement between two radiologists. For the 40 cases, 475 and 514 PEs were identified by radiologists R1 and R2 in the initial independent readings, respectively. For a total of 545 marks by the two radiologists, 81.5% (444/545) of the marks agreed but 101 marks in 36 cases differed. After consensus, 65 (64.4%) and 36 (35.6%) of the 101 marks were determined to be true PEs and false positives (FPs), respectively. Of these, 48 and 17 were false negatives (FNs) and 14 and 22 were FPs by R1 and R2, respectively. Our study demonstrated that there is substantial variability in reference standards provided by radiologists, which impacts the performance assessment of a lesion detection system. Combination of multiple radiologists' readings and consensus is needed to improve the reliability of a reference standard.
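The mark counts in this abstract are internally consistent, which a few lines of bookkeeping can confirm. The sketch below only recombines numbers stated above; the function and variable names are illustrative and not part of the study.

```python
def reconcile_marks(r1_marks: int, r2_marks: int, agreed: int) -> dict:
    """Combine two readers' mark counts into union and discordant counts."""
    union = r1_marks + r2_marks - agreed  # each agreed mark is counted once
    return {
        "union": union,
        "only_r1": r1_marks - agreed,   # marked by R1 but not by R2
        "only_r2": r2_marks - agreed,   # marked by R2 but not by R1
        "discordant": union - agreed,
        "agreement": agreed / union,
    }


counts = reconcile_marks(r1_marks=475, r2_marks=514, agreed=444)

# Consensus outcome for the discordant marks, as reported in the abstract.
fn_r1, fn_r2 = 48, 17   # true PEs missed by R1 and by R2
fp_r1, fp_r2 = 14, 22   # false-positive marks by R1 and by R2

# A mark made only by R1 is either an R1 false positive or an R2 miss,
# and the same logic applies to marks made only by R2.
r1_only_explained = fp_r1 + fn_r2
r2_only_explained = fp_r2 + fn_r1

if __name__ == "__main__":
    print(counts)
    print(r1_only_explained, r2_only_explained)
```

The union of 545 marks, the 101 discordant marks, and the split into 65 true PEs and 36 false positives all follow from the per-reader counts.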
Subject(s)
Computed Tomography Angiography/methods; Pulmonary Embolism/diagnostic imaging; Humans; Observer Variation; Pulmonary Artery/diagnostic imaging; Radiologists; Reference Standards; Reproducibility of Results; Retrospective Studies; Sensitivity and Specificity
ABSTRACT
OBJECTIVE: The purpose of this study is to evaluate the diagnostic accuracy of a process incorporating computer-aided detection (CAD) for the detection and prevention of retained surgical instruments using a novel nondeformable radiopaque µTag. MATERIALS AND METHODS: A high-specificity CAD system was developed iteratively from a training set (n = 540 radiographs) and a validation set (n = 560 radiographs). A novel test set composed of 700 thoracoabdominal radiographs (410 with a randomly placed µTag and 290 without a µTag) was obtained from 10 cadavers embedded with confounding iatrogenic objects. Data were analyzed first by the blinded CAD system; radiographs coded as negative (n = 373) were then independently reviewed by five blinded radiologists. The reference standard was the presence of a µTag. Sensitivity and specificity were calculated. Interrater agreement was assessed with Cohen kappa values. Mean (± SD) image analysis times were calculated. RESULTS: The high-specificity CAD system had one false-positive (sensitivity, 79.5% [326/410]; specificity, 99.7% [289/290]). A combination of the CAD system and one failsafe radiologist had superior sensitivity (98.5% [404/410] to 100% [410/410]) and specificity (99.0% [287/290] to 99.7% [289/290]), with 327 (47%) radiographs not requiring immediate radiologist review. Interrater agreement was almost perfect for all radiologist pairwise comparisons (κ = 0.921-0.992). Cumulative mean image analysis time was less than one minute (CAD, 29 ± 2 seconds; radiologists, 26 ± 16 seconds). CONCLUSION: The combination of a high-specificity CAD system with a failsafe radiologist had excellent diagnostic accuracy in the rapid detection of a nondeformable radiopaque µTag.
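The stand-alone CAD figures can be reproduced from the counts given in the abstract. The sketch below uses the standard definitions of sensitivity and specificity; the helper names are illustrative, and the false-negative count of 84 is derived here (410 minus 326) rather than quoted.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True-positive fraction among cases that contain the target."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True-negative fraction among cases without the target."""
    return tn / (tn + fp)


# Stand-alone CAD results on the 700-radiograph test set.
tp, fn = 326, 410 - 326   # radiographs with a tag: detected vs missed
tn, fp = 289, 290 - 289   # radiographs without a tag: cleared vs flagged

cad_sensitivity = sensitivity(tp, fn)
cad_specificity = specificity(tn, fp)
cad_negative = fn + tn    # radiographs passed on to the radiologists
cad_positive = tp + fp    # radiographs not needing immediate review
fraction_not_reviewed = cad_positive / (cad_positive + cad_negative)

if __name__ == "__main__":
    print(f"sensitivity {cad_sensitivity:.1%}, specificity {cad_specificity:.1%}")
    print(f"CAD-negative: {cad_negative}, not reviewed: {fraction_not_reviewed:.0%}")
```

The derived counts match the 373 CAD-negative radiographs sent for radiologist review and the 327 (47%) that did not require immediate review.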
Subject(s)
Diagnosis, Computer-Assisted; Foreign Bodies/diagnostic imaging; Radiography, Abdominal/methods; Aged, 80 and over; Cadaver; Humans; Sensitivity and Specificity
ABSTRACT
PURPOSE: To develop a quantitative measure of bone marrow changes in magnetic resonance (MR) images and investigate its capability for assessment of treatment response for patients with multiple myeloma (MM). MATERIALS AND METHODS: This study was retrospective, institutional review board approved, and HIPAA compliant. Informed consent was waived. Patients (n = 64; mean age, 58.8 years [age range, 27-75 years]) who were diagnosed with MM and underwent autologous bone marrow stem cell transplantation (BMT) were evaluated. A pair of spinal MR examinations performed before and after BMT was collected from each patient's records. A three-dimensional dynamic intensity entropy transformation (DIET) method was developed to transform MR T1-weighted signal voxel by voxel to a quantitative entropy enhancement value (qEEV), from which predictor variables were derived to train a linear discriminant analysis classifier by using a leave-one-out method. The output of the linear discriminant analysis provided a qEEV-based response index for quantitative assessment of treatment response. The performance of quantitative response index for the discrimination of responder and nonresponder patients was evaluated by receiver operating characteristic curve analysis. RESULTS: Among the 46 and 18 clinically diagnosed responder and nonresponder patients, the quantitative response index at a chosen decision threshold correctly identified 42 responder and 17 nonresponder patients. The agreement between the DIET method and the clinical outcome reached 0.922 (59 of 64; κ = 0.816; area under the receiver operating characteristic curve, 0.886 ± 0.042). CONCLUSION: This study demonstrated the feasibility of quantitative response index to differentiate responder and nonresponder patients and had substantial agreement with clinical outcomes, which indicated that this quantitative measure has the potential to be an image biomarker to assess MM treatment response.
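The agreement and kappa values follow from the 2 x 2 table implied by the reported counts. The sketch below applies the standard Cohen kappa formula; the cell counts for misclassified patients (4 responders and 1 nonresponder) are derived from the abstract, and the function name is illustrative.

```python
def cohen_kappa(table):
    """Cohen kappa for a square agreement table (rows: reference, columns: test)."""
    n = sum(sum(row) for row in table)
    size = len(table)
    observed = sum(table[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    # Chance agreement expected from the marginal totals alone.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Rows: clinical responder, clinical nonresponder.
# Columns: index says responder, index says nonresponder.
table = [
    [42, 46 - 42],   # 46 responders, 42 correctly identified
    [18 - 17, 17],   # 18 nonresponders, 17 correctly identified
]

agreement = (table[0][0] + table[1][1]) / 64
kappa = cohen_kappa(table)

if __name__ == "__main__":
    print(f"agreement {agreement:.3f}, kappa {kappa:.3f}")
```

This reproduces the reported agreement of 0.922 and kappa of 0.816.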
Subject(s)
Image Interpretation, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Multiple Myeloma/pathology; Multiple Myeloma/therapy; Adult; Aged; Female; Humans; Imaging, Three-Dimensional; Male; Middle Aged; Retrospective Studies; Treatment Outcome
ABSTRACT
OBJECTIVE: The purpose of this study was to evaluate the accuracy of our autoinitialized cascaded level set 3D segmentation system as compared with the World Health Organization (WHO) criteria and the Response Evaluation Criteria In Solid Tumors (RECIST) for estimation of treatment response of bladder cancer in CT urography. MATERIALS AND METHODS: CT urograms before and after neoadjuvant chemotherapy treatment were collected from 18 patients with muscle-invasive localized or locally advanced bladder cancers. The disease stage as determined on pathologic samples at cystectomy after chemotherapy was considered the reference standard of treatment response. Two radiologists measured the longest diameter and its perpendicular on the pre- and posttreatment scans. Full 3D contours for all tumors were manually outlined by one radiologist. The autoinitialized cascaded level set method was used to automatically extract the 3D tumor boundary. The prediction accuracy of pT0 disease (complete response) at cystectomy was estimated by the manual, autoinitialized cascaded level set, WHO, and RECIST methods on the basis of the AUC. RESULTS: The AUC for prediction of pT0 disease at cystectomy was 0.78 ± 0.11 for autoinitialized cascaded level set compared with 0.82 ± 0.10 for manual segmentation. The difference did not reach statistical significance (p = 0.67). The AUCs using RECIST criteria were 0.62 ± 0.16 and 0.71 ± 0.12 for the two radiologists, both lower than those of the two 3D methods. The AUCs using WHO criteria were 0.56 ± 0.15 and 0.60 ± 0.13 and thus were lower than all other methods. CONCLUSION: The pre- and posttreatment 3D volume change estimates obtained by the radiologist's manual outlines and the autoinitialized cascaded level set segmentation were more accurate for irregularly shaped tumors than were those based on RECIST and WHO criteria.
Subject(s)
Tomography, X-Ray Computed/methods; Urinary Bladder Neoplasms/diagnostic imaging; Urography/methods; Adult; Aged; Cystectomy; Female; Humans; Imaging, Three-Dimensional; Male; Middle Aged; Neoadjuvant Therapy; Neoplasm Invasiveness; Neoplasm Staging; Predictive Value of Tests; Retrospective Studies; Treatment Outcome; Urinary Bladder Neoplasms/drug therapy; Urinary Bladder Neoplasms/pathology; Urinary Bladder Neoplasms/surgery; World Health Organization
ABSTRACT
PURPOSE: To investigate the dependence of microcalcification cluster detectability on tomographic scan angle, angular increment, and number of projection views acquired at digital breast tomosynthesis (DBT). MATERIALS AND METHODS: A prototype DBT system operated in step-and-shoot mode was used to image breast phantoms. Four 5-cm-thick phantoms embedded with 81 simulated microcalcification clusters of three speck sizes (subtle, medium, and obvious) were imaged by using a rhodium target and rhodium filter with 29 kV, 50 mAs, and seven acquisition protocols. Fixed angular increments were used in four protocols (denoted as scan angle, angular increment, and number of projection views, respectively: 16°, 1°, and 17; 24°, 3°, and nine; 30°, 3°, and 11; and 60°, 3°, and 21), and variable increments were used in three (40°, variable, and 13; 40°, variable, and 15; and 60°, variable, and 21). The reconstructed DBT images were interpreted by six radiologists who located the microcalcification clusters and rated their conspicuity. RESULTS: The mean sensitivity for detection of subtle clusters ranged from 80% (22.5 of 28) to 96% (26.8 of 28) for the seven DBT protocols; the highest sensitivity was achieved with the 16°, 1°, and 17 protocol (96%), but the difference was significant only for the 60°, 3°, and 21 protocol (80%, P < .002) and did not reach significance for the other five protocols (P = .01-.15). The mean sensitivity for detection of medium and obvious clusters ranged from 97% (28.2 of 29) to 100% (24 of 24), but the differences fell short of significance (P = .08 to >.99).
The conspicuity of subtle and medium clusters with the 16°, 1°, and 17 protocol was rated higher than those with other protocols; the differences were significant for subtle clusters with the 24°, 3°, and nine protocol and for medium clusters with 24°, 3°, and nine; 30°, 3°, and 11; 60°, 3°, and 21; and 60°, variable, and 21 protocols (P < .002). CONCLUSION: With imaging that did not include x-ray source motion or patient motion during acquisition of the projection views, narrow-angle DBT provided higher sensitivity and conspicuity than wide-angle DBT for subtle microcalcification clusters.
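For the four fixed-increment protocols, the number of projection views follows from the scan angle and angular increment. The sketch below assumes views are acquired at both ends of the scan arc, so the count is the number of intervals plus one; the function name is illustrative, and the relation does not apply to the variable-increment protocols.

```python
def projection_views(scan_angle_deg: int, increment_deg: int) -> int:
    """Views for a fixed-increment scan that includes both end positions."""
    if increment_deg <= 0 or scan_angle_deg % increment_deg != 0:
        raise ValueError("scan angle must be a positive multiple of the increment")
    return scan_angle_deg // increment_deg + 1


# (scan angle, angular increment) for the fixed-increment protocols listed above.
fixed_protocols = [(16, 1), (24, 3), (30, 3), (60, 3)]
view_counts = [projection_views(angle, step) for angle, step in fixed_protocols]

if __name__ == "__main__":
    for (angle, step), views in zip(fixed_protocols, view_counts):
        print(f"{angle} deg scan, {step} deg increment: {views} views")
```

The computed counts (17, 9, 11, and 21) match the protocol descriptions in the abstract.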
Subject(s)
Breast Diseases/diagnostic imaging; Calcinosis/diagnostic imaging; Radiographic Image Enhancement/methods; Radiographic Image Interpretation, Computer-Assisted/methods; Female; Humans; Phantoms, Imaging; Radiographic Image Enhancement/instrumentation; Sensitivity and Specificity; User-Computer Interface
ABSTRACT
Objective. Digital breast tomosynthesis (DBT) has significantly improved the diagnosis of breast cancer due to its high sensitivity and specificity in detecting breast lesions compared to two-dimensional mammography. However, one of the primary challenges in DBT is the image blur resulting from x-ray source motion, particularly in DBT systems with a source in continuous-motion mode. This motion-induced blur can degrade the spatial resolution of DBT images, potentially affecting the visibility of subtle lesions such as microcalcifications. Approach. We addressed this issue by deriving an analytical in-plane source blur kernel for DBT images based on imaging geometry and proposing a post-processing image deblurring method with a generative diffusion model as an image prior. Main results. We showed that the source blur could be approximated by a shift-invariant kernel over the DBT slice at a given height above the detector, and we validated the accuracy of our blur kernel modeling through simulation. We also demonstrated the ability of the diffusion model to generate realistic DBT images. The proposed deblurring method successfully enhanced spatial resolution when applied to DBT images reconstructed with detector blur and correlated noise modeling. Significance. Our study demonstrated the advantages of modeling the imaging system components such as source motion blur for improving DBT image quality.
Subject(s)
Mammography; Mammography/methods; Humans; Diffusion; Image Processing, Computer-Assisted/methods; Breast/diagnostic imaging; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/physiopathology; X-Rays; Movement; Female; Motion
ABSTRACT
Survival prediction post-cystectomy is essential for the follow-up care of bladder cancer patients. This study aimed to evaluate artificial intelligence (AI)-large language models (LLMs) for extracting clinical information and improving image analysis, with an initial application involving predicting five-year survival rates of patients after radical cystectomy for bladder cancer. Data were retrospectively collected from medical records and CT urograms (CTUs) of bladder cancer patients between 2001 and 2020. Of 781 patients, 163 underwent chemotherapy, had pre- and post-chemotherapy CTUs, underwent radical cystectomy, and had an available post-surgery five-year survival follow-up. Five AI-LLMs (Dolly-v2, Vicuna-13b, Llama-2.0-13b, GPT-3.5, and GPT-4.0) were used to extract clinical descriptors from each patient's medical records. As a reference standard, clinical descriptors were also extracted manually. Radiomics and deep learning descriptors were extracted from CTU images. The developed multi-modal predictive model, CRD, was based on the clinical (C), radiomics (R), and deep learning (D) descriptors. The LLM retrieval accuracy was assessed. The performances of the survival predictive models were evaluated using AUC and Kaplan-Meier analysis. For the 163 patients (mean age 64 ± 9 years; M:F 131:32), the LLMs achieved extraction accuracies of 74%-87% (Dolly), 76%-83% (Vicuna), 82%-93% (Llama), 85%-91% (GPT-3.5), and 94%-97% (GPT-4.0). For a test dataset of 64 patients, the CRD model achieved AUCs of 0.89 ± 0.04 (manually extracted information), 0.87 ± 0.05 (Dolly), 0.83 ± 0.06 to 0.84 ± 0.05 (Vicuna), 0.81 ± 0.06 to 0.86 ± 0.05 (Llama), 0.85 ± 0.05 to 0.88 ± 0.05 (GPT-3.5), and 0.87 ± 0.05 to 0.88 ± 0.05 (GPT-4.0). This study demonstrates the use of LLM model-extracted clinical information, in conjunction with imaging analysis, to improve the prediction of clinical outcomes, with bladder cancer as an initial example.
ABSTRACT
Purpose To evaluate the feasibility of leveraging serial low-dose CT (LDCT) scans to develop a radiomics-based reinforcement learning (RRL) model for improving early diagnosis of lung cancer at baseline screening. Materials and Methods In this retrospective study, 1951 participants (female patients, 822; median age, 61 years [range, 55-74 years]) (male patients, 1129; median age, 62 years [range, 55-74 years]) were randomly selected from the National Lung Screening Trial between August 2002 and April 2004. An RRL model using serial LDCT scans (S-RRL) was trained and validated using data from 1404 participants (372 with lung cancer) containing 2525 available serial LDCT scans up to 3 years. A baseline RRL (B-RRL) model was trained with only LDCT scans acquired at baseline screening for comparison. The 547 held-out individuals (150 with lung cancer) were used as an independent test set for performance evaluation. The area under the receiver operating characteristic curve (AUC) and the net reclassification index (NRI) were used to assess the performances of the models in the classification of screen-detected nodules. Results Deployment to the held-out baseline scans showed that the S-RRL model achieved a significantly higher test AUC (0.88 [95% CI: 0.85, 0.91]) than both the Brock model (AUC, 0.84 [95% CI: 0.81, 0.88]; P = .02) and the B-RRL model (AUC, 0.86 [95% CI: 0.83, 0.90]; P = .02). Lung cancer risk stratification was significantly improved by the S-RRL model as compared with Lung CT Screening Reporting and Data System (NRI, 0.29; P < .001) and the Brock model (NRI, 0.12; P = .008). Conclusion The S-RRL model demonstrated the potential to improve early diagnosis and risk stratification for lung cancer at baseline screening as compared with the B-RRL model and clinical models. Keywords: Radiomics-based Reinforcement Learning, Lung Cancer Screening, Low-Dose CT, Machine Learning © RSNA, 2024 Supplemental material is available for this article.
Subject(s)
Early Detection of Cancer; Lung Neoplasms; Tomography, X-Ray Computed; Humans; Lung Neoplasms/diagnostic imaging; Lung Neoplasms/diagnosis; Middle Aged; Male; Female; Early Detection of Cancer/methods; Aged; Tomography, X-Ray Computed/methods; Retrospective Studies; Radiation Dosage; Feasibility Studies; Machine Learning; Mass Screening/methods; Lung/diagnostic imaging; Radiomics
ABSTRACT
Early diagnosis of lung cancer can significantly improve patient outcomes. We developed a Growth Predictive model based on the Wasserstein Generative Adversarial Network framework (GP-WGAN) to predict the nodule growth patterns in the follow-up LDCT scans. The GP-WGAN was trained with a training set (N = 776) containing 1121 pairs of nodule images with about 1-year intervals and deployed to an independent test set of 450 nodules on baseline LDCT scans to predict nodule images (GP-nodules) in their 1-year follow-up scans. The 450 GP-nodules were finally classified as malignant or benign by a lung cancer risk prediction (LCRP) model, achieving a test AUC of 0.827 ± 0.028, which was comparable to the AUC of 0.862 ± 0.028 achieved by the same LCRP model classifying real follow-up nodule images (p = 0.071). The net reclassification index yielded consistent outcomes (NRI = 0.04; p = 0.62). Other baseline methods, including Lung-RADS and the Brock model, achieved significantly lower performance (p < 0.05). The results demonstrated that the GP-nodules predicted by our GP-WGAN model achieved comparable performance with the nodules in the real follow-up scans for lung cancer diagnosis, indicating the potential to detect lung cancer earlier when coupled with accelerated clinical management versus the current approach of waiting until the next screening exam.
ABSTRACT
The diagnosis of severe COVID-19 lung infection is important because it carries a higher risk for the patient and requires prompt treatment with oxygen therapy and hospitalization while those with less severe lung infection often stay on observation. Also, severe infections are more likely to have long-standing residual changes in their lungs and may need follow-up imaging. We have developed deep learning neural network models for classifying severe vs. non-severe lung infections in COVID-19 patients on chest radiographs (CXR). A deep learning U-Net model was developed to segment the lungs. Inception-v1 and Inception-v4 models were trained for the classification of severe vs. non-severe COVID-19 infection. Four CXR datasets from multi-country and multi-institutional sources were used to develop and evaluate the models. The combined dataset consisted of 5748 cases and 6193 CXR images with physicians' severity ratings as reference standard. The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. We studied the reproducibility of classification performance using the different combinations of training and validation data sets. We also evaluated the generalizability of the trained deep learning models using both independent internal and external test sets. The Inception-v1 based models achieved AUC ranging between 0.81 ± 0.02 and 0.84 ± 0.0, while the Inception-v4 models achieved AUC in the range of 0.85 ± 0.06 and 0.89 ± 0.01, on the independent test sets, respectively. These results demonstrate the promise of using deep learning models in differentiating COVID-19 patients with severe from non-severe lung infection on chest radiographs.
ABSTRACT
Innovation in medical imaging artificial intelligence (AI)/machine learning (ML) demands extensive data collection, algorithmic advancements, and rigorous performance assessments encompassing aspects such as generalizability, uncertainty, bias, fairness, trustworthiness, and interpretability. Achieving widespread integration of AI/ML algorithms into diverse clinical tasks will demand a steadfast commitment to overcoming issues in model design, development, and performance assessment. The complexities of AI/ML clinical translation present substantial challenges, requiring engagement with relevant stakeholders, assessment of cost-effectiveness for user and patient benefit, timely dissemination of information relevant to robust functioning throughout the AI/ML lifecycle, consideration of regulatory compliance, and feedback loops for real-world performance evidence. This commentary addresses several hurdles for the development and adoption of AI/ML technologies in medical imaging. Comprehensive attention to these underlying and often subtle factors is critical not only for tackling the challenges but also for exploring novel opportunities for the advancement of AI in radiology.
ABSTRACT
The adoption of artificial intelligence (AI) tools in medicine poses challenges to existing clinical workflows. This commentary discusses the necessity of context-specific quality assurance (QA), emphasizing the need for robust QA measures with quality control (QC) procedures that encompass (1) acceptance testing (AT) before clinical use, (2) continuous QC monitoring, and (3) adequate user training. The discussion also covers essential components of AT and QA, illustrated with real-world examples. We also highlight what we see as the shared responsibility of manufacturers or vendors, regulators, healthcare systems, medical physicists, and clinicians to enact appropriate testing and oversight to ensure a safe and equitable transformation of medicine through AI.
ABSTRACT
OBJECTIVES: The purpose of this study was to retrospectively evaluate the effect of 3-dimensional automated ultrasound (3D-AUS) as an adjunct to digital breast tomosynthesis (DBT) on radiologists' performance and confidence in discriminating malignant and benign breast masses. METHODS: Two-view DBT (craniocaudal and mediolateral oblique or lateral) and single-view 3D-AUS images were acquired from 51 patients with subsequently biopsy-proven masses (13 malignant and 38 benign). Six experienced radiologists rated, on a 13-point scale, the likelihood of malignancy of an identified mass, first by reading the DBT images alone, followed immediately by reading the DBT images with automatically coregistered 3D-AUS images. The diagnostic performance of each method was measured using receiver operating characteristic (ROC) curve analysis and changes in sensitivity and specificity with the McNemar test. After each reading, radiologists took a survey to rate their confidence level in using DBT alone versus combined DBT/3D-AUS as potential screening modalities. RESULTS: The 6 radiologists had an average area under the ROC curve of 0.92 for both modalities (range, 0.89-0.97 for DBT and 0.90-0.94 for DBT/3D-AUS). With a Breast Imaging Reporting and Data System rating of 4 as the threshold for biopsy recommendation, the average sensitivity of the radiologists increased from 96% to 100% (P > .08) with 3D-AUS, whereas the specificity decreased from 33% to 25% (P > .28). Survey responses indicated increased confidence in potentially using DBT for screening when 3D-AUS was added (P < .05 for each reader). CONCLUSIONS: In this initial reader study, no significant difference in ROC performance was found with the addition of 3D-AUS to DBT. However, a trend to improved discrimination of malignancy was observed when adding 3D-AUS. Radiologists' confidence also improved with DBT/3D-AUS compared to DBT alone.
Subject(s)
Breast Neoplasms/diagnostic imaging; Imaging, Three-Dimensional; Ultrasonography, Mammary/methods; Adult; Aged; Biopsy; Female; Humans; Middle Aged; Phantoms, Imaging; Pilot Projects; ROC Curve; Radiographic Image Enhancement/methods; Retrospective Studies; Sensitivity and Specificity; Software
ABSTRACT
Objective. Digital breast tomosynthesis (DBT) is a quasi-three-dimensional breast imaging modality that improves breast cancer screening and diagnosis because it reduces fibroglandular tissue overlap compared with 2D mammography. However, DBT suffers from noise and blur problems that can lower the detectability of subtle signs of cancers such as microcalcifications (MCs). Our goal is to improve the image quality of DBT in terms of image noise and MC conspicuity. Approach. We proposed a model-based deep convolutional neural network (deep CNN or DCNN) regularized reconstruction (MDR) for DBT. It combined a model-based iterative reconstruction (MBIR) method that models the detector blur and correlated noise of the DBT system and the learning-based DCNN denoiser using the regularization-by-denoising framework. To facilitate the task-based image quality assessment, we also proposed two DCNN tools for image evaluation: a noise estimator (CNN-NE) trained to estimate the root-mean-square (RMS) noise of the images, and an MC classifier (CNN-MC) as a DCNN model observer to evaluate the detectability of clustered MCs in human subject DBTs. Main results. We demonstrated the efficacies of CNN-NE and CNN-MC on a set of physical phantom DBTs. The MDR method achieved low RMS noise and the highest detection area under the receiver operating characteristic curve (AUC) rankings evaluated by CNN-NE and CNN-MC among the reconstruction methods studied on an independent test set of human subject DBTs. Significance. The CNN-NE and CNN-MC may serve as a cost-effective surrogate for human observers to provide task-specific metrics for image quality comparisons. The proposed reconstruction method shows the promise of combining physics-based MBIR and learning-based DCNNs for DBT image reconstruction, which may potentially lead to lower dose and higher sensitivity and specificity for MC detection in breast cancer screening and diagnosis.
Subject(s)
Breast Neoplasms; Calcinosis; Humans; Female; Mammography/methods; Breast/diagnostic imaging; Breast Neoplasms/diagnostic imaging; Neural Networks, Computer; Sensitivity and Specificity; Calcinosis/diagnostic imaging
ABSTRACT
Accurate survival prediction for bladder cancer patients who have undergone radical cystectomy can improve their treatment management. However, the existing predictive models do not take advantage of both clinical and radiological imaging data. This study aimed to fill this gap by developing an approach that leverages the strengths of clinical (C), radiomics (R), and deep-learning (D) descriptors to improve survival prediction. The dataset comprised 163 patients, including clinical, histopathological information, and CT urography scans. The data were divided by patient into training, validation, and test sets. We analyzed the clinical data by a nomogram and the image data by radiomics and deep-learning models. The descriptors were input into a BPNN model for survival prediction. The AUCs on the test set were (C): 0.82 ± 0.06, (R): 0.73 ± 0.07, (D): 0.71 ± 0.07, (CR): 0.86 ± 0.05, (CD): 0.86 ± 0.05, and (CRD): 0.87 ± 0.05. The predictions based on D and CRD descriptors showed a significant difference (p = 0.007). For Kaplan-Meier survival analysis, the deceased and alive groups were stratified successfully by C (p < 0.001) and CRD (p < 0.001), with CRD predicting the alive group more accurately. The results highlight the potential of combining C, R, and D descriptors to accurately predict the survival of bladder cancer patients after cystectomy.