Results 1 - 12 of 12
1.
J Am Coll Radiol ; 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38969253

ABSTRACT

OBJECTIVE: Mammography and MRI screening typically occur in combination or in alternating sequence. We compared multimodality screening performance accounting for the relative timing of mammography and MRI and overlapping follow-up periods. METHODS: We identified 8,260 screening mammograms performed 2005 to 2017 in the Breast Cancer Surveillance Consortium, paired with screening MRIs within ±90 days (combined screening) or 91 to 270 days (alternating screening). Performance for combined screening (cancer detection rate [CDR] per 1,000 examinations and sensitivity) was calculated with 1-year follow-up for each modality, and with a single follow-up period treating the two tests as a single test. Alternating screening performance was calculated with 1-year follow-up for each modality and also with follow-up ending at the next screen if within 1 year (truncated follow-up). RESULTS: For 3,810 combined screening pairs, CDR per 1,000 screens was 6.8 (95% confidence interval [CI]: 4.6-10.0) for mammography and 12.3 (95% CI: 9.3-16.4) for MRI as separate tests compared with 13.1 (95% CI: 10.0-17.3) as a single combined test. Sensitivity of each test was 48.1% (95% CI: 35.0%-61.5%) for mammography and 79.7% (95% CI: 67.7%-88.0%) for MRI compared with 96.2% (95% CI: 85.9%-99.0%) for combined screening. For 4,450 alternating screening pairs, mammography CDR per 1,000 screens changed from 3.6 (95% CI: 2.2-5.9) to zero with truncated follow-up; sensitivity was incalculable (denominator = 0). MRI CDR per 1,000 screens changed from 12.1 (95% CI: 9.3-15.8) to 11.7 (95% CI: 8.9-15.3) with truncated follow-up; sensitivity changed from 75.0% (95% CI: 63.8%-83.6%) to 86.7% (95% CI: 75.5%-93.2%). DISCUSSION: Updating auditing approaches to account for combined and alternating screening sequencing and to address outcome attribution issues arising from overlapping follow-up periods can improve the accuracy of multimodality screening performance evaluation.
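
The two metrics reported in this abstract are simple ratios, and the "incalculable" sensitivity follows directly from a zero denominator. The sketch below is a minimal illustration only: the function names are hypothetical and the counts are made-up round numbers, since the abstract reports rates rather than the underlying cancer counts.

```python
from typing import Optional


def cdr_per_1000(screen_detected: int, exams: int) -> float:
    """Cancer detection rate: screen-detected cancers per 1,000 examinations."""
    return 1000.0 * screen_detected / exams


def sensitivity(true_pos: int, false_neg: int) -> Optional[float]:
    """Sensitivity = TP / (TP + FN); None when no cancers fall in the follow-up window."""
    denominator = true_pos + false_neg
    if denominator == 0:
        # Mirrors the abstract's "incalculable (denominator = 0)" case.
        return None
    return true_pos / denominator


# Illustrative counts only (not taken from the study).
print(cdr_per_1000(13, 1000))   # 13 cancers in 1,000 screens
print(sensitivity(3, 1))        # 3 detected, 1 missed
print(sensitivity(0, 0))        # no cancers attributed to this test
```

Returning `None` rather than 0 keeps "no cancers attributed to the test" distinct from "cancers present but all missed", which is the distinction the truncated follow-up analysis turns on.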

4.
J Am Coll Radiol ; 2024 May 22.
Article in English | MEDLINE | ID: mdl-38789066

ABSTRACT

With promising artificial intelligence (AI) algorithms receiving FDA clearance, the potential impact of these models on clinical outcomes must be evaluated locally before their integration into routine workflows. Robust validation infrastructures are pivotal to inspecting the accuracy and generalizability of these deep learning algorithms to ensure both patient safety and health equity. Protected health information concerns, intellectual property rights, and diverse requirements of models impede the development of rigorous external validation infrastructures. The authors propose various suggestions for addressing the challenges associated with the development of efficient, customizable, and cost-effective infrastructures for the external validation of AI models at large medical centers and institutions. The authors present comprehensive steps to establish an AI inferencing infrastructure outside clinical systems to examine the local performance of AI algorithms before health practice or systemwide implementation and promote an evidence-based approach for adopting AI models that can enhance radiology workflows and improve patient outcomes.

6.
Radiol Artif Intell ; 6(3): e230375, 2024 May.
Article in English | MEDLINE | ID: mdl-38597784

ABSTRACT

Purpose To explore the stand-alone breast cancer detection performance, at different risk score thresholds, of a commercially available artificial intelligence (AI) system. Materials and Methods This retrospective study included information from 661 695 digital mammographic examinations performed among 242 629 female individuals screened as a part of BreastScreen Norway, 2004-2018. The study sample included 3807 screen-detected cancers and 1110 interval breast cancers. A continuous examination-level risk score by the AI system was used to measure performance as the area under the receiver operating characteristic curve (AUC) with 95% CIs and cancer detection at different AI risk score thresholds. Results The AUC of the AI system was 0.93 (95% CI: 0.92, 0.93) for screen-detected cancers and interval breast cancers combined and 0.97 (95% CI: 0.97, 0.97) for screen-detected cancers. In a setting where 10% of the examinations with the highest AI risk scores were defined as positive and 90% with the lowest scores as negative, 92.0% (3502 of 3807) of the screen-detected cancers and 44.6% (495 of 1110) of the interval breast cancers were identified with AI. In this scenario, 68.5% (10 987 of 16 040) of false-positive screening results (negative recall assessment) were considered negative by AI. When 50% was used as the cutoff, 99.3% (3781 of 3807) of the screen-detected cancers and 85.2% (946 of 1110) of the interval breast cancers were identified as positive by AI, whereas 17.0% (2725 of 16 040) of the false-positive results were considered negative. Conclusion The AI system showed high performance in detecting breast cancers within 2 years of screening mammography and a potential for use to triage low-risk mammograms to reduce radiologist workload. Keywords: Mammography, Breast, Screening, Convolutional Neural Network (CNN), Deep Learning Algorithms. Supplemental material is available for this article. © RSNA, 2024. See also commentary by Bahl and Do in this issue.


Subject(s)
Artificial Intelligence; Breast Neoplasms; Early Detection of Cancer; Mammography; Humans; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/epidemiology; Breast Neoplasms/diagnosis; Female; Mammography/methods; Norway/epidemiology; Retrospective Studies; Middle Aged; Early Detection of Cancer/methods; Aged; Adult; Mass Screening/methods; Radiographic Image Interpretation, Computer-Assisted/methods
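
The threshold scenarios in this abstract amount to ranking examinations by AI risk score, calling the top fraction positive, and reporting what share of each cancer group lands above the cutoff. The sketch below is a hedged illustration: `pct` and `flag_top_fraction` are hypothetical helper names, the toy scores are invented, and only the numerators and denominators passed to `pct` are quoted from the abstract.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal, as reported in the abstract."""
    return round(100.0 * numerator / denominator, 1)


def flag_top_fraction(scores, fraction):
    """Mark the highest-scoring `fraction` of examinations as AI-positive.

    Assumption: ties at the cutoff are broken by list order; the abstract
    does not say how the study handled ties.
    """
    n_positive = int(round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    positive = set(ranked[:n_positive])
    return [i in positive for i in range(len(scores))]


# Counts quoted in the abstract for the 10% and 50% cutoffs.
print(pct(3502, 3807), pct(495, 1110), pct(10987, 16040))
print(pct(3781, 3807), pct(946, 1110), pct(2725, 16040))

# Toy scores (invented) to show the thresholding step itself.
toy_scores = [0.10, 0.90, 0.50, 0.30, 0.20, 0.80, 0.40, 0.60, 0.70, 0.05]
print(flag_top_fraction(toy_scores, 0.10))
```

Recomputing the percentages from the quoted counts is a quick consistency check on the reported figures.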
8.
Eur Radiol ; 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38528136

ABSTRACT

OBJECTIVE: To explore the ability of artificial intelligence (AI) to classify breast cancer by mammographic density in an organized screening program. MATERIALS AND METHOD: We included information about 99,489 examinations from 74,941 women who participated in BreastScreen Norway, 2013-2019. All examinations were analyzed with an AI system that assigned a malignancy risk score (AI score) from 1 (lowest) to 10 (highest) for each examination. Mammographic density was classified into Volpara density grade (VDG), VDG1-4; VDG1 indicated fatty and VDG4 extremely dense breasts. Screen-detected and interval cancers with an AI score of 1-10 were stratified by VDG. RESULTS: We found 10,406 (10.5% of the total) examinations to have an AI risk score of 10, of which 6.7% (704/10,406) was breast cancer. The cancers represented 89.7% (617/688) of the screen-detected and 44.6% (87/195) of the interval cancers. 20.3% (20,178/99,489) of the examinations were classified as VDG1 and 6.1% (6047/99,489) as VDG4. For screen-detected cancers, 84.0% (68/81, 95% CI, 74.1-91.2) had an AI score of 10 for VDG1, 88.9% (328/369, 95% CI, 85.2-91.9) for VDG2, 92.5% (185/200, 95% CI, 87.9-95.7) for VDG3, and 94.7% (36/38, 95% CI, 82.3-99.4) for VDG4. For interval cancers, the percentages with an AI score of 10 were 33.3% (3/9, 95% CI, 7.5-70.1) for VDG1 and 48.0% (12/25, 95% CI, 27.8-68.7) for VDG4. CONCLUSION: The tested AI system performed well according to cancer detection across all density categories, especially for extremely dense breasts. The highest proportion of screen-detected cancers with an AI score of 10 was observed for women classified as VDG4. CLINICAL RELEVANCE STATEMENT: Our study demonstrates that AI can correctly classify the majority of screen-detected and about half of the interval breast cancers, regardless of breast density. KEY POINTS: • Mammographic density is important to consider in the evaluation of artificial intelligence in mammographic screening. 
• Given a threshold representing about 10% of those with the highest malignancy risk score by an AI system, we found an increasing percentage of cancers with increasing mammographic density.
• Artificial intelligence risk score and mammographic density combined may help triage examinations to reduce workload for radiologists.
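
The per-density proportions above come with wide confidence intervals where counts are small (for example 3 of 9 interval cancers in VDG1). The abstract does not state which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, solved by bisection with the standard library only. The function names are hypothetical, and the approach is practical only for small counts like those in the density strata.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k: int, n: int, alpha: float = 0.05, iters: int = 60):
    """Clopper-Pearson interval for k successes in n trials (assumed method)."""

    def solve(p_is_too_low):
        # Bisection on [0, 1]; p_is_too_low(p) is True left of the root.
        lo, hi = 0.0, 1.0
        for _ in range(iters):
            mid = (lo + hi) / 2.0
            if p_is_too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: smallest p with P(X >= k) = alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: 1.0 - binom_cdf(k - 1, n, p) < alpha / 2
    )
    # Upper bound: largest p with P(X <= k) = alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p) > alpha / 2
    )
    return lower, upper


low, high = exact_ci(3, 9)
print(f"{100 * 3 / 9:.1f}% ({100 * low:.1f}-{100 * high:.1f})")
```

For 3 of 9, this interval comes out close to the 7.5-70.1 range quoted in the abstract, which is consistent with (though not proof of) an exact method having been used.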

9.
Insights Imaging ; 15(1): 38, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38332187

ABSTRACT

OBJECTIVES: The randomized controlled trial comparing digital breast tomosynthesis and synthetic 2D mammograms (DBT + SM) versus digital mammography (DM) (the To-Be 1 trial), 2016-2017, did not result in higher cancer detection for DBT + SM. We aimed to determine if negative cases prior to interval and consecutive screen-detected cancers from DBT + SM were due to interpretive error. METHODS: Five external breast radiologists performed the individual blinded review of 239 screening examinations (90 true negative, 39 false positive, 19 prior to interval cancer, and 91 prior to consecutive screen-detected cancer) and the informed consensus review of examinations prior to interval and screen-detected cancers (n = 110). The reviewers marked suspicious findings with a score of 1-5 (probability of malignancy). A case was false negative if ≥ 2 radiologists assigned the cancer site with a score of ≥ 2 in the blinded review and if the case was assigned as false negative by a consensus in the informed review. RESULTS: In the informed review, 5.3% of examinations prior to interval cancer and 18.7% prior to consecutive round screen-detected cancer were considered false negative. In the blinded review, 10.6% of examinations prior to interval cancer and 42.9% prior to consecutive round screen-detected cancer were scored ≥ 2. A score of ≥ 2 was assigned to 47.8% of negative and 89.7% of false positive examinations. CONCLUSIONS: The false negative rates were consistent with those of prior DM reviews, indicating that the lack of higher cancer detection for DBT + SM versus DM in the To-Be 1 trial is complex and not due to interpretive error alone. CRITICAL RELEVANCE STATEMENT: The randomized controlled trial on digital breast tomosynthesis and synthetic 2D mammograms (DBT) and digital mammography (DM), 2016-2017, showed no difference in cancer detection for the two techniques. 
The rates of false negative screening examinations prior to interval and consecutive screen-detected cancer for DBT were consistent with the rates in prior DM reviews, indicating that the non-superior DBT performance in the trial might not be due to interpretive error alone.
KEY POINTS:
• Screening with digital breast tomosynthesis (DBT) did not result in a higher breast cancer detection rate compared to screening with digital mammography (DM) in the To-Be 1 trial.
• The false negative rates for examinations prior to interval and consecutive screen-detected cancer for DBT were determined in the trial to test if the lack of differences was due to interpretive error.
• The false negative rates were consistent with those of prior DM reviews, indicating that the lack of higher cancer detection for DBT versus DM was complex and not due to interpretive error alone.
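
The false-negative definition in the METHODS of this abstract is a two-part rule: at least two radiologists scoring the cancer site 2 or higher in the blinded review, and a false-negative verdict in the informed consensus review. A minimal sketch of that rule follows; the function name, the argument layout, and the use of `None` for "reader did not mark the cancer site" are assumptions made for illustration.

```python
from typing import Optional, Sequence


def is_false_negative(
    blinded_scores_at_cancer_site: Sequence[Optional[int]],
    consensus_false_negative: bool,
    min_readers: int = 2,
    min_score: int = 2,
) -> bool:
    """Apply the two-part false-negative rule described in the abstract.

    blinded_scores_at_cancer_site: one entry per reviewer, a 1-5 score if the
    reviewer marked the later cancer site, or None if they did not (assumed
    encoding).
    """
    readers_flagging = sum(
        1
        for score in blinded_scores_at_cancer_site
        if score is not None and score >= min_score
    )
    return readers_flagging >= min_readers and consensus_false_negative


# Five reviewers; two scored the site >= 2 and consensus agreed.
print(is_false_negative([None, 3, 1, 2, None], True))
# Same blinded scores, but the consensus review did not call it false negative.
print(is_false_negative([None, 3, 1, 2, None], False))
```

Requiring both conditions explains why the informed-review false-negative percentages are lower than the share of examinations scored 2 or higher in the blinded review.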

10.
Radiol Artif Intell ; 6(2): e230137, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38323914

ABSTRACT

Purpose To evaluate performance improvements of general radiologists and breast imaging specialists when interpreting a set of diverse digital breast tomosynthesis (DBT) examinations with the aid of a custom-built categorical artificial intelligence (AI) system. Materials and Methods A fully balanced multireader, multicase reader study was conducted to compare the performance of 18 radiologists (nine general radiologists and nine breast imaging specialists) reading 240 retrospectively collected screening DBT mammograms (mean patient age, 59.8 years ± 11.3 [SD]; 100% women), acquired between August 2016 and March 2019, with and without the aid of a custom-built categorical AI system. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity across general radiologists and breast imaging specialists reading with versus without AI were assessed. Reader performance was also analyzed as a function of breast cancer characteristics and patient subgroups. Results Every radiologist demonstrated improved interpretation performance when reading with versus without AI, with an average AUC of 0.93 versus 0.87, demonstrating a difference in AUC of 0.06 (95% CI: 0.04, 0.08; P < .001). Improvement in AUC was observed for both general radiologists (difference of 0.08; P < .001) and breast imaging specialists (difference of 0.04; P < .001) and across all cancer characteristics (lesion type, lesion size, and pathology) and patient subgroups (race and ethnicity, age, and breast density) examined. Conclusion A categorical AI system helped improve overall radiologist interpretation performance of DBT screening mammograms for both general radiologists and breast imaging specialists and across various patient subgroups and breast cancer characteristics. 
Keywords: Computer-aided Diagnosis, Screening Mammography, Digital Breast Tomosynthesis, Breast Cancer, Screening, Convolutional Neural Network (CNN), Artificial Intelligence. Supplemental material is available for this article. © RSNA, 2024.


Subject(s)
Breast Neoplasms; Female; Humans; Middle Aged; Breast Neoplasms/diagnostic imaging; Mammography/methods; Retrospective Studies; Artificial Intelligence; Early Detection of Cancer/methods; Radiologists
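
The reader study above summarizes performance as AUC with and without AI. As a rough illustration of what a single reader's AUC measures, the sketch below computes the empirical AUC as the probability that a randomly chosen cancer case is scored higher than a randomly chosen non-cancer case, with ties counted as half. This is not the multireader, multicase analysis used in the study; the function name and the toy scores are hypothetical.

```python
def empirical_auc(cancer_scores, normal_scores):
    """Empirical AUC via pairwise comparison (Mann-Whitney formulation)."""
    wins = 0.0
    for c in cancer_scores:
        for n in normal_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(cancer_scores) * len(normal_scores))


# Toy suspicion scores (invented) for one reader, without and with AI aid.
without_ai = empirical_auc([0.8, 0.4], [0.6, 0.2])
with_ai = empirical_auc([0.9, 0.7], [0.6, 0.2])
print(without_ai, with_ai, with_ai - without_ai)
```

The pairwise form is quadratic in the number of cases, which is fine for a reader-study-sized case set such as the 240 examinations described here.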