Results 1 - 20 of 188
1.
J Am Coll Radiol ; 2024 May 22.
Article in English | MEDLINE | ID: mdl-38789066

ABSTRACT

With promising Artificial Intelligence (AI) algorithms receiving FDA (Food and Drug Administration) clearance, the potential impact of these models on clinical outcomes must be evaluated locally before their integration into routine workflows. Robust validation infrastructures are pivotal to inspecting the accuracy and generalizability of these deep learning algorithms to ensure both patient safety and health equity. Protected Health Information (PHI) concerns, intellectual property rights, and diverse requirements of models impede the development of rigorous external validation infrastructures. Our work proposes various suggestions for addressing the challenges associated with the development of efficient, customizable, and cost-effective infrastructures for the external validation of AI models at large medical centers and institutions. We present comprehensive steps to establish an AI inferencing infrastructure outside clinical systems to examine the local performance of AI algorithms before health practice- or system-wide implementation and promote an evidence-based approach for adopting AI models that can enhance radiology workflows and improve patient outcomes.

2.
Radiol Artif Intell ; 6(3): e230375, 2024 May.
Article in English | MEDLINE | ID: mdl-38597784

ABSTRACT

Purpose To explore the stand-alone breast cancer detection performance, at different risk score thresholds, of a commercially available artificial intelligence (AI) system. Materials and Methods This retrospective study included information from 661 695 digital mammographic examinations performed among 242 629 female individuals screened as a part of BreastScreen Norway, 2004-2018. The study sample included 3807 screen-detected cancers and 1110 interval breast cancers. A continuous examination-level risk score by the AI system was used to measure performance as the area under the receiver operating characteristic curve (AUC) with 95% CIs and cancer detection at different AI risk score thresholds. Results The AUC of the AI system was 0.93 (95% CI: 0.92, 0.93) for screen-detected cancers and interval breast cancers combined and 0.97 (95% CI: 0.97, 0.97) for screen-detected cancers. In a setting where 10% of the examinations with the highest AI risk scores were defined as positive and 90% with the lowest scores as negative, 92.0% (3502 of 3807) of the screen-detected cancers and 44.6% (495 of 1110) of the interval breast cancers were identified with AI. In this scenario, 68.5% (10 987 of 16 040) of false-positive screening results (negative recall assessment) were considered negative by AI. When 50% was used as the cutoff, 99.3% (3781 of 3807) of the screen-detected cancers and 85.2% (946 of 1110) of the interval breast cancers were identified as positive by AI, whereas 17.0% (2725 of 16 040) of the false-positive results were considered negative. Conclusion The AI system showed high performance in detecting breast cancers within 2 years of screening mammography and a potential for use to triage low-risk mammograms to reduce radiologist workload. Keywords: Mammography, Breast, Screening, Convolutional Neural Network (CNN), Deep Learning Algorithms Supplemental material is available for this article. © RSNA, 2024 See also commentary by Bahl and Do in this issue.


Subject(s)
Artificial Intelligence , Breast Neoplasms , Early Detection of Cancer , Mammography , Humans , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Breast Neoplasms/diagnosis , Female , Mammography/methods , Norway/epidemiology , Retrospective Studies , Middle Aged , Early Detection of Cancer/methods , Aged , Adult , Mass Screening/methods , Radiographic Image Interpretation, Computer-Assisted/methods
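The threshold results in the abstract of entry 2 can be re-derived from the counts it reports. The following is a minimal sketch that recomputes each percentage; the dictionary layout and the helper name `pct` are illustrative choices, not part of the study.

```python
# Counts copied from the abstract: (cases meeting the criterion, total cases).
# The structure and names below are hypothetical, chosen only for illustration.
COUNTS = {
    "top 10% of AI scores defined as positive": {
        "screen-detected cancers identified": (3502, 3807),
        "interval cancers identified": (495, 1110),
        "false-positive recalls scored negative by AI": (10987, 16040),
    },
    "top 50% of AI scores defined as positive": {
        "screen-detected cancers identified": (3781, 3807),
        "interval cancers identified": (946, 1110),
        "false-positive recalls scored negative by AI": (2725, 16040),
    },
}


def pct(numerator: int, denominator: int) -> float:
    """Return a proportion as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


for threshold, rows in COUNTS.items():
    print(threshold)
    for label, (num, den) in rows.items():
        print(f"  {label}: {pct(num, den)}% ({num} of {den})")
```

Lowering the cutoff from the top 10% to the top 50% raises the share of cancers flagged but sharply reduces the share of false-positive recalls that the AI would have scored negative, which is the trade-off the abstract describes.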
3.
JAMA ; 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38687474
4.
Eur Radiol ; 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38528136

ABSTRACT

OBJECTIVE: To explore the ability of artificial intelligence (AI) to classify breast cancer by mammographic density in an organized screening program. MATERIALS AND METHODS: We included information about 99,489 examinations from 74,941 women who participated in BreastScreen Norway, 2013-2019. All examinations were analyzed with an AI system that assigned a malignancy risk score (AI score) from 1 (lowest) to 10 (highest) for each examination. Mammographic density was classified by Volpara density grade (VDG), VDG1-4; VDG1 indicated fatty and VDG4 extremely dense breasts. Screen-detected and interval cancers with an AI score of 1-10 were stratified by VDG. RESULTS: We found 10,406 (10.5% of the total) examinations to have an AI risk score of 10, of which 6.7% (704/10,406) were breast cancers. These cancers represented 89.7% (617/688) of the screen-detected and 44.6% (87/195) of the interval cancers. Of the examinations, 20.3% (20,178/99,489) were classified as VDG1 and 6.1% (6047/99,489) as VDG4. For screen-detected cancers, 84.0% (68/81, 95% CI, 74.1-91.2) had an AI score of 10 for VDG1, 88.9% (328/369, 95% CI, 85.2-91.9) for VDG2, 92.5% (185/200, 95% CI, 87.9-95.7) for VDG3, and 94.7% (36/38, 95% CI, 82.3-99.4) for VDG4. For interval cancers, the percentages with an AI score of 10 were 33.3% (3/9, 95% CI, 7.5-70.1) for VDG1 and 48.0% (12/25, 95% CI, 27.8-68.7) for VDG4. CONCLUSION: The tested AI system performed well in terms of cancer detection across all density categories, especially for extremely dense breasts. The highest proportion of screen-detected cancers with an AI score of 10 was observed for women classified as VDG4. CLINICAL RELEVANCE STATEMENT: Our study demonstrates that AI can correctly classify the majority of screen-detected and about half of the interval breast cancers, regardless of breast density. KEY POINTS: • Mammographic density is important to consider in the evaluation of artificial intelligence in mammographic screening.
• Given a threshold representing about 10% of those with the highest malignancy risk score by an AI system, we found an increasing percentage of cancers with increasing mammographic density. • Artificial intelligence risk score and mammographic density combined may help triage examinations to reduce workload for radiologists.
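The abstract of entry 4 reports 95% CIs for proportions based on small counts (for example, 3 of 9 interval cancers in VDG1). It does not say which interval method was used. The sketch below assumes the exact (Clopper-Pearson) binomial interval and solves for the bounds by bisection using only the standard library; the function names are illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, iterations: int = 60) -> float:
    """Bisection on (0, 1) for a function g that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# Counts taken from the abstract (cancers with an AI score of 10, total cancers).
for label, x, n in [("interval, VDG1", 3, 9), ("screen-detected, VDG4", 36, 38)]:
    low, high = clopper_pearson(x, n)
    print(f"{label}: {100 * x / n:.1f}% (95% CI, {100 * low:.1f}-{100 * high:.1f})")
```

For 3 of 9, this assumed method gives an interval of approximately 7.5% to 70.1%, which agrees with the reported values and illustrates how wide the uncertainty is for the smallest density subgroups.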

6.
Insights Imaging ; 15(1): 38, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38332187

ABSTRACT

OBJECTIVES: The randomized controlled trial comparing digital breast tomosynthesis and synthetic 2D mammograms (DBT + SM) versus digital mammography (DM) (the To-Be 1 trial), 2016-2017, did not result in higher cancer detection for DBT + SM. We aimed to determine if negative cases prior to interval and consecutive screen-detected cancers from DBT + SM were due to interpretive error. METHODS: Five external breast radiologists performed the individual blinded review of 239 screening examinations (90 true negative, 39 false positive, 19 prior to interval cancer, and 91 prior to consecutive screen-detected cancer) and the informed consensus review of examinations prior to interval and screen-detected cancers (n = 110). The reviewers marked suspicious findings with a score of 1-5 (probability of malignancy). A case was false negative if ≥ 2 radiologists assigned the cancer site with a score of ≥ 2 in the blinded review and if the case was assigned as false negative by a consensus in the informed review. RESULTS: In the informed review, 5.3% of examinations prior to interval cancer and 18.7% prior to consecutive round screen-detected cancer were considered false negative. In the blinded review, 10.6% of examinations prior to interval cancer and 42.9% prior to consecutive round screen-detected cancer were scored ≥ 2. A score of ≥ 2 was assigned to 47.8% of negative and 89.7% of false positive examinations. CONCLUSIONS: The false negative rates were consistent with those of prior DM reviews, indicating that the lack of higher cancer detection for DBT + SM versus DM in the To-Be 1 trial is complex and not due to interpretive error alone. CRITICAL RELEVANCE STATEMENT: The randomized controlled trial on digital breast tomosynthesis and synthetic 2D mammograms (DBT) and digital mammography (DM), 2016-2017, showed no difference in cancer detection for the two techniques. 
The rates of false negative screening examinations prior to interval and consecutive screen-detected cancer for DBT were consistent with the rates in prior DM reviews, indicating that the non-superior DBT performance in the trial might not be due to interpretive error alone. KEY POINTS: • Screening with digital breast tomosynthesis (DBT) did not result in a higher breast cancer detection rate compared to screening with digital mammography (DM) in the To-Be 1 trial. • The false negative rates for examinations prior to interval and consecutive screen-detected cancer for DBT were determined in the trial to test if the lack of differences was due to interpretive error. • The false negative rates were consistent with those of prior DM reviews, indicating that the lack of higher cancer detection for DBT versus DM was complex and not due to interpretive error alone.

7.
Radiol Artif Intell ; 6(2): e230137, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38323914

ABSTRACT

Purpose To evaluate performance improvements of general radiologists and breast imaging specialists when interpreting a set of diverse digital breast tomosynthesis (DBT) examinations with the aid of a custom-built categorical artificial intelligence (AI) system. Materials and Methods A fully balanced multireader, multicase reader study was conducted to compare the performance of 18 radiologists (nine general radiologists and nine breast imaging specialists) reading 240 retrospectively collected screening DBT mammograms (mean patient age, 59.8 years ± 11.3 [SD]; 100% women), acquired between August 2016 and March 2019, with and without the aid of a custom-built categorical AI system. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity across general radiologists and breast imaging specialists reading with versus without AI were assessed. Reader performance was also analyzed as a function of breast cancer characteristics and patient subgroups. Results Every radiologist demonstrated improved interpretation performance when reading with versus without AI, with an average AUC of 0.93 versus 0.87, demonstrating a difference in AUC of 0.06 (95% CI: 0.04, 0.08; P < .001). Improvement in AUC was observed for both general radiologists (difference of 0.08; P < .001) and breast imaging specialists (difference of 0.04; P < .001) and across all cancer characteristics (lesion type, lesion size, and pathology) and patient subgroups (race and ethnicity, age, and breast density) examined. Conclusion A categorical AI system helped improve overall radiologist interpretation performance of DBT screening mammograms for both general radiologists and breast imaging specialists and across various patient subgroups and breast cancer characteristics. 
Keywords: Computer-aided Diagnosis, Screening Mammography, Digital Breast Tomosynthesis, Breast Cancer, Screening, Convolutional Neural Network (CNN), Artificial Intelligence Supplemental material is available for this article. © RSNA, 2024.


Subject(s)
Breast Neoplasms , Female , Humans , Middle Aged , Male , Breast Neoplasms/diagnostic imaging , Mammography/methods , Retrospective Studies , Artificial Intelligence , Early Detection of Cancer/methods , Radiologists
9.
JAMA Oncol ; 10(2): 167-175, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38060241

ABSTRACT

Importance: Advanced-stage breast cancer rates vary by race and ethnicity, with Black women having a 2-fold higher rate than White women among regular screeners. Clinical risk factors that explain a large proportion of advanced breast cancers by race and ethnicity are unknown. Objective: To evaluate the population attributable risk proportions (PARPs) for advanced-stage breast cancer (prognostic pathologic stage IIA or higher) associated with clinical risk factors among routinely screened premenopausal and postmenopausal women by race and ethnicity. Design, Setting, and Participants: This cohort study used data collected prospectively from Breast Cancer Surveillance Consortium community-based breast imaging facilities from January 2005 to June 2018. Participants were women aged 40 to 74 years undergoing 3 331 740 annual (prior screening within 11-18 months) or biennial (prior screening within 19-30 months) screening mammograms associated with 1815 advanced breast cancers diagnosed within 2 years of screening examinations. Data analysis was performed from September 2022 to August 2023. Exposures: Heterogeneously or extremely dense breasts, first-degree family history of breast cancer, overweight/obesity (body mass index >25.0), history of benign breast biopsy, and screening interval (biennial vs annual) stratified by menopausal status and race and ethnicity (Asian or Pacific Islander, Black, Hispanic/Latinx, White, other/multiracial). Main Outcomes and Measures: PARPs for advanced breast cancer. Results: Among 904 615 women, median (IQR) age was 57 (50-64) years. Of the 3 331 740 annual or biennial screening mammograms, 10.8% were for Asian or Pacific Islander women; 9.5% were for Black women; 5.3% were for Hispanic/Latinx women; 72.0% were for White women; and 2.0% were for women of other races and ethnicities, including those who were Alaska Native, American Indian, 2 or more reported races, or other. 
Body mass index PARPs were larger for postmenopausal vs premenopausal women (30% vs 22%) and highest for postmenopausal Black (38.6%; 95% CI, 32.0%-44.8%) and Hispanic/Latinx women (31.8%; 95% CI, 25.3%-38.0%) and premenopausal Black women (30.3%; 95% CI, 17.7%-42.0%), with overall prevalence of having overweight/obesity highest in premenopausal Black (84.4%) and postmenopausal Black (85.1%) and Hispanic/Latinx women (72.4%). Breast density PARPs were larger for premenopausal vs postmenopausal women (37% vs 24%, respectively) and highest among premenopausal Asian or Pacific Islander (46.6%; 95% CI, 37.9%-54.4%) and White women (39.8%; 95% CI, 31.7%-47.3%) whose prevalence of dense breasts was high (62%-79%). For premenopausal and postmenopausal women, PARPs were small for family history of breast cancer (5%-8%), history of breast biopsy (7%-12%), and screening interval (2.1%-2.3%). Conclusions and Relevance: In this cohort study among routinely screened women, the proportion of advanced breast cancers attributed to biennial vs annual screening was small. To reduce the number of advanced breast cancer diagnoses, primary prevention should focus on interventions that shift patients with overweight and obesity to normal weight.


Subject(s)
Breast Neoplasms , Female , Humans , Male , Breast Neoplasms/diagnosis , Breast Neoplasms/epidemiology , Breast Neoplasms/pathology , Ethnicity , Cohort Studies , Overweight , Obesity/epidemiology , Obesity/diagnosis
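For orientation, the textbook (Levin) form of a population attributable risk proportion for a single binary exposure is given below. The symbols $p_e$ (prevalence of the exposure in the population) and $RR$ (relative risk associated with the exposure) are introduced here for illustration only; the abstract of entry 9 does not state which estimator the study used, and a covariate-adjusted estimator may differ from this form.

```latex
\mathrm{PARP} = \frac{p_e\,(RR - 1)}{1 + p_e\,(RR - 1)}
```

Under this form, a larger PARP can arise from a higher exposure prevalence, a higher relative risk, or both, which is why the abstract reports prevalence figures alongside the PARPs.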
10.
J Am Coll Radiol ; 21(2): 319-328, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37949155

ABSTRACT

PURPOSE: To summarize the literature regarding the performance of mammography image-based artificial intelligence (AI) algorithms, with and without additional clinical data, for future breast cancer risk prediction. MATERIALS AND METHODS: A systematic literature review was performed using six databases (medRxiv, bioRxiv, Embase, Engineering Village, IEEE Xplore, and PubMed) from 2012 through September 30, 2022. Studies were included if they used real-world screening mammography examinations to validate AI algorithms for future risk prediction based on images alone or in combination with clinical risk factors. The quality of studies was assessed, and predictive accuracy was recorded as the area under the receiver operating characteristic curve (AUC). RESULTS: Sixteen studies met inclusion and exclusion criteria, of which 14 studies provided AUC values. The median AUC performance of AI image-only models was 0.72 (range 0.62-0.90) compared with 0.61 for breast density or clinical risk factor-based tools (range 0.54-0.69). Of the seven studies that compared AI image-only performance directly to combined image + clinical risk factor performance, six demonstrated no significant improvement, and one demonstrated improvement. CONCLUSIONS: Early efforts for predicting future breast cancer risk based on mammography images alone demonstrate accuracy comparable to or better than that of traditional risk tools, with little or no improvement when adding clinical risk factor data. Transitioning from clinical risk factor-based to AI image-based risk models may lead to more accurate, personalized risk-based screening approaches.


Subject(s)
Breast Neoplasms , Humans , Female , Breast Neoplasms/diagnostic imaging , Mammography/methods , Artificial Intelligence , Early Detection of Cancer/methods , Breast/diagnostic imaging , Retrospective Studies
12.
J Am Coll Radiol ; 21(3): 503-504, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37813226
14.
Eur J Radiol ; 167: 111069, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37708674

ABSTRACT

PURPOSE: To describe and compare early screening outcomes before, during and after a randomized controlled trial with digital breast tomosynthesis (DBT) including synthetic 2D mammography versus standard digital mammography (DM) (To-Be 1) and a follow-up cohort study using DBT (To-Be 2). METHODS: Retrospective results of 125,020 screening examinations from four consecutive screening rounds performed in 2014-2021 were described and compared for pre-To-Be 1 (DM), To-Be 1 (DM or DBT), To-Be 2 (DBT), and post-To-Be 2 (DM) cohorts. Descriptive analyses of rates of recall, biopsy, screen-detected and interval cancer, distribution of histopathologic tumor characteristics and time spent on image interpretation and consensus were presented for the four rounds including five cohorts, one cohort in each screening round except for the To-Be 1 trial, which included a DBT and a DM cohort. Odds ratios (OR) with 95% CIs were calculated for recall and cancer detection rates. RESULTS: Rate of screen-detected cancer was 0.90% for women screened with DBT in To-Be 2 and 0.64% for DM in pre-To-Be 1. The rates did not differ for the To-Be 1 DM (0.61%), To-Be 1 DBT (0.66%) and post-To-Be 2 DM (0.67%) cohorts. The interval cancer rates ranged between 0.13% and 0.20%. The distribution of histopathologic tumor characteristics did not differ between the cohorts. CONCLUSIONS: Screening all women with DBT following a randomized controlled trial in an organized, population-based screening program showed a temporary increase in the rate of screen-detected cancer.


Subject(s)
Mammography , Humans , Female , Follow-Up Studies , Retrospective Studies , Biopsy , Consensus
15.
JAMA Health Forum ; 4(9): e232801, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37682552

ABSTRACT

This Viewpoint describes new federal updates to screening mammography rules and recommends strategies for mitigating potential consequences of the rules.


Subject(s)
Breast Density , Humans , Female
16.
Breast Cancer Res Treat ; 202(3): 505-514, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37697031

ABSTRACT

PURPOSE: Invasive lobular carcinoma (ILC) is a distinct histological subtype of breast cancer that can make early detection with mammography challenging. We compared imaging performance of digital breast tomosynthesis (DBT) to digital mammography (DM) for diagnoses of ILC, invasive ductal carcinoma (IDC), and invasive mixed carcinoma (IMC) in a screening population. METHODS: We included screening exams (DM; n = 1,715,249 or DBT; n = 414,793) from 2011 to 2018 among 839,801 women in the Breast Cancer Surveillance Consortium. Examinations were followed for one year to ascertain incident ILC, IDC, or IMC. We measured cancer detection rate (CDR) and interval invasive cancer rate per 1000 screening examinations for each histological subtype and stratified by breast density and modality. We calculated relative risk (RR) for DM vs. DBT using log-binomial models to adjust for the propensity of receiving DBT vs. DM. RESULTS: Unadjusted CDR per 1000 mammograms of ILC overall was 0.33 (95% CI: 0.30-0.36) for DM and 0.45 (95% CI: 0.39-0.52) for DBT; for women with dense breasts, it was 0.33 (95% CI: 0.29-0.37) for DM and 0.54 (95% CI: 0.43-0.66) for DBT. Similar results were noted for IDC and IMC. Adjusted models showed a significantly increased RR for cancer detection with DBT compared to DM among women with dense breasts for all three histologies (RR; 95% CI: ILC 1.53; 1.09-2.14, IDC 1.21; 1.02-1.44, IMC 1.76; 1.30-2.38), but no significant increase among women with non-dense breasts. CONCLUSION: DBT was associated with higher CDR for ILC, IDC, and IMC for women with dense breasts. Early detection of ILC with DBT may improve outcomes for this distinct clinical entity.


Subject(s)
Breast Neoplasms , Carcinoma, Ductal, Breast , Female , Humans , Breast Neoplasms/pathology , Early Detection of Cancer/methods , Mammography/methods , Breast Density , Carcinoma, Ductal, Breast/diagnostic imaging , Mass Screening/methods , Retrospective Studies
17.
Radiology ; 308(2): e230576, 2023 08.
Article in English | MEDLINE | ID: mdl-37581498

ABSTRACT

Background Contrast-enhanced mammography (CEM) and abbreviated breast MRI (ABMRI) are emerging alternatives to standard MRI for supplemental breast cancer screening. Purpose To compare the diagnostic performance of CEM, ABMRI, and standard MRI. Materials and Methods This single-institution, prospective, blinded reader study included female participants referred for breast MRI from January 2018 to June 2021. CEM was performed within 14 days of standard MRI; ABMRI was produced from standard MRI images. Two readers independently interpreted each CEM and ABMRI after a washout period. Examination-level performance metrics calculated were recall rate, cancer detection, and false-positive biopsy recommendation rates per 1000 examinations and sensitivity, specificity, and positive predictive value of biopsy recommendation. Bootstrap and permutation tests were used to calculate 95% CIs and compare modalities. Results Evaluated were 492 paired CEM and ABMRI interpretations from 246 participants (median age, 51 years; IQR, 43-61 years). On 49 MRI scans with lesions recommended for biopsy, nine lesions showed malignant pathology. No differences in ABMRI and standard MRI performance were identified. Compared with standard MRI, CEM demonstrated significantly lower recall rate (14.0% vs 22.8%; difference, -8.7%; 95% CI: -14.0, -3.5), lower false-positive biopsy recommendation rate per 1000 examinations (65.0 vs 162.6; difference, -97.6; 95% CI: -146.3, -50.8), and higher specificity (87.8% vs 80.2%; difference, 7.6%; 95% CI: 2.3, 13.1). Compared with standard MRI, CEM had significantly lower cancer detection rate (22.4 vs 36.6; difference, -14.2; 95% CI: -28.5, -2.0) and sensitivity (61.1% vs 100%; difference, -38.9%; 95% CI: -66.7, -12.5). The performance differences between CEM and ABMRI were similar to those observed between CEM and standard MRI. Conclusion ABMRI had comparable performance to standard MRI and may support more efficient MRI screening. 
CEM had lower recall and higher specificity compared with standard MRI or ABMRI, offset by lower cancer detection rate and sensitivity compared with standard MRI. These trade-offs warrant further consideration of patient population characteristics before widespread screening with CEM. Clinical trial registration no. NCT03517813 © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Chang in this issue.


Subject(s)
Breast Neoplasms , Female , Humans , Middle Aged , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Prospective Studies , Sensitivity and Specificity , Early Detection of Cancer/methods , Mammography/methods , Magnetic Resonance Imaging/methods
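The per-1000 rates reported for standard MRI in the abstract of entry 17 can be reproduced from the counts it states. The sketch below assumes one standard MRI examination per participant and that each of the 49 biopsy-recommended scans counts once; both assumptions, and all names in the code, are illustrative rather than taken from the study's methods.

```python
# Counts stated in the abstract.
PARTICIPANTS = 246            # assumed here to equal the number of standard MRI exams
MRI_BIOPSY_RECOMMENDED = 49   # MRI scans with lesions recommended for biopsy
MRI_MALIGNANT = 9             # lesions with malignant pathology


def per_1000(events: int, examinations: int) -> float:
    """Rate per 1000 examinations, rounded to one decimal place."""
    return round(1000 * events / examinations, 1)


# Cancer detection rate: malignant findings per 1000 examinations.
mri_cancer_detection_rate = per_1000(MRI_MALIGNANT, PARTICIPANTS)

# False-positive biopsy recommendations: recommended but not malignant.
mri_false_positive_biopsy_rate = per_1000(
    MRI_BIOPSY_RECOMMENDED - MRI_MALIGNANT, PARTICIPANTS
)

print(f"MRI cancer detection rate per 1000: {mri_cancer_detection_rate}")
print(f"MRI false-positive biopsy recommendation rate per 1000: "
      f"{mri_false_positive_biopsy_rate}")
```

Under these assumptions the two values match the 36.6 and 162.6 per 1000 that the abstract reports for standard MRI, which makes the size of the trade-off against CEM (22.4 and 65.0 per 1000) easier to interpret.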
20.
Cancer Epidemiol Biomarkers Prev ; 32(11): 1542-1551, 2023 11 01.
Article in English | MEDLINE | ID: mdl-37440458

ABSTRACT

BACKGROUND: We evaluated diagnostic mammography among women with a breast lump to determine whether performance varied across racial and ethnic groups. METHODS: This study included 51,014 diagnostic mammograms performed between 2005 and 2018 in the Breast Cancer Surveillance Consortium among Asian/Pacific Islander (12%), Black (7%), Hispanic/Latina (6%), and White (75%) women reporting a lump. Breast cancers occurring within 1 year were ascertained from cancer registry linkages. Multivariable regression was used to adjust performance statistic comparisons for breast cancer risk factors, mammogram modality, demographics, additional imaging, and imaging facility. RESULTS: Cancer detection rates were highest among Asian/Pacific Islander [per 1,000 exams, 84.2 (95% confidence interval (CI): 72.0-98.2)] and Black women [81.4 (95% CI: 69.4-95.2)] and lowest among Hispanic/Latina women [42.9 (95% CI: 34.2-53.6)]. Positive predictive values (PPV) were higher among Black [37.0% (95% CI: 31.2-43.3)] and White [37.0% (95% CI: 30.0-44.6)] women and lowest among Hispanic/Latina women [22.0% (95% CI: 17.2-27.7)]. False-positive results were most common among Asian/Pacific Islander women [per 1,000 exams, 183.9 (95% CI: 126.7-259.2)] and lowest among White women [112.4 (95% CI: 86.1-145.5)]. After adjustment, false-positive and cancer detection rates remained higher for Asian/Pacific Islander and Black women (vs. Hispanic/Latina and White). Adjusted PPV was highest among Asian/Pacific Islander women. CONCLUSIONS: Among women with a lump, Asian/Pacific Islander and Black women were more likely to have cancer detected and more likely to receive a false-positive result compared with White and Hispanic/Latina women. IMPACT: Strategies for optimizing diagnostic mammography among women with a lump may vary by racial/ethnic group, but additional factors that influence performance differences need to be identified. See related In the Spotlight, p. 1479.


Subject(s)
Breast Neoplasms , Racial Groups , Female , Humans , United States , Male , Ethnicity , Mammography , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , White