Results 1 - 20 of 122
1.
J Am Coll Radiol ; 21(3): 376-386, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37922974

ABSTRACT

PURPOSE: Cancer detection rate (CDR), an important metric in the mammography screening audit, is designed to ensure adequate sensitivity. Most practices use biopsy results as the reference standard; however, ascertainment of biopsy results is commonly incomplete. We used simulation to determine the relationship between the cancer ascertainment rate of biopsy (AR-biopsy), CDR estimation, and associated error rates in classifying whether practices and radiologists meet the established ACR benchmark of 2.5 per 1,000. MATERIALS AND METHODS: We simulated screening mammography volume, number of cancers detected, and CDR, using negative binomial and beta-binomial distributions, respectively. Simulations were performed at both the practice and radiologist level. Average CDR was based on linearly rescaling a published CDR by the AR-biopsy. CDR distributions were simulated for AR-biopsy between 5% and 100% in steps of five percentage points and were summarized with boxplots and smoothed histograms over the range of AR-biopsy, to quantify the proportion of practices and radiologists meeting the ACR benchmark at each level of AR-biopsy. RESULTS: Decreasing AR-biopsy led to an increasing probability of categorizing CDR performance as being below the ACR benchmark. Our simulation predicts that at the practice level, an AR-biopsy of 65% categorizes 17.6% below the benchmark (compared to 1.6% at an AR-biopsy of 100%), and at the radiologist level, an AR-biopsy of 65% categorizes 34.7% as being below the benchmark (compared to 11.6% at an AR-biopsy of 100%). CONCLUSIONS: Our simulation demonstrates that decreasing the AR-biopsy (in currently clinically relevant ranges) has the potential to artifactually lower the assessed CDR on both the practice and radiologist levels and may, in turn, increase the chance of erroneous categorization of underperformance per the ACR benchmark.


Subjects
Early Detection of Cancer; Neoplasms; Humans; Mammography; Benchmarking; Biopsy
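The simulation design summarized in entry 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the screening volume is held fixed rather than drawn from a negative binomial distribution, and the values `true_cdr_per_1000`, `concentration`, `exams`, and `n_readers` are hypothetical placeholders. Only the 2.5 per 1,000 benchmark and the linear rescaling by AR-biopsy come from the abstract.

```python
import math
import random

BENCHMARK_PER_1000 = 2.5  # ACR benchmark quoted in the abstract


def sample_binomial(n, p, rng):
    """Draw Binomial(n, p) by jumping between successes with geometric gaps."""
    if p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    log_q = math.log1p(-p)
    successes, position = 0, 0
    while True:
        # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
        position += int(math.log(1.0 - rng.random()) / log_q) + 1
        if position > n:
            return successes
        successes += 1


def fraction_below_benchmark(ar_biopsy, rng, n_readers=2000, exams=3000,
                             true_cdr_per_1000=4.0, concentration=2000.0):
    """Share of simulated readers whose observed CDR falls under the benchmark.

    All default values are hypothetical; exams per reader is fixed for brevity.
    """
    # Linear rescaling of the average detection probability by AR-biopsy.
    mean_p = ar_biopsy * true_cdr_per_1000 / 1000.0
    a = mean_p * concentration
    b = (1.0 - mean_p) * concentration
    below = 0
    for _ in range(n_readers):
        p = rng.betavariate(a, b)                 # reader-specific probability
        cancers = sample_binomial(exams, p, rng)  # beta-binomial count overall
        if 1000.0 * cancers / exams < BENCHMARK_PER_1000:
            below += 1
    return below / n_readers


rng = random.Random(0)  # fixed seed so the sketch is reproducible
results = {ar: fraction_below_benchmark(ar, rng) for ar in (1.0, 0.65)}
for ar, share in results.items():
    print(f"AR-biopsy {ar:.0%}: {share:.1%} of readers below benchmark")
```

With these placeholder parameters the share below the benchmark grows as AR-biopsy falls, which mirrors the direction of the reported result; the specific percentages will not match the paper's.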
2.
Radiology ; 309(2): e230530, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37962503

ABSTRACT

Background Despite growing interest in using contrast-enhanced mammography (CEM) for breast cancer screening as an alternative to breast MRI, limited literature is available. Purpose To determine whether CEM is noninferior to breast MRI or abbreviated breast MRI (AB MRI) and superior to two-dimensional mammography in an asymptomatic population simulating those who would present for screening and then undergo diagnostic work-up. Materials and Methods This enriched reader study used CEM and MRI data prospectively collected from asymptomatic individuals at a single institution from December 2014 to March 2020. Case sets were obtained at screening, as part of work-up for a screening-detected finding, or before biopsy of a screening-detected abnormality. All images were anonymized and randomized, and all 12 radiologists interpreted them. For CEM interpretation, readers were first shown low-energy images as a surrogate for digital mammography and asked to give a forced Breast Imaging Reporting and Data System score for up to three abnormalities. The highest score was used as the case score. Readers then reviewed the full CEM examination and scored it similarly. After a minimum 1-month washout, the readers similarly interpreted AB MRI and full MRI examinations. Receiver operating characteristic analysis, powered to test CEM noninferiority to full MRI, was performed. Results The study included 132 case sets (14 negative, 74 benign, and 44 malignant; all female participants; mean age, 54 years ± 12 [SD]). The mean areas under the receiver operating characteristic curve (AUCs) for digital mammography, CEM, AB MRI, and full MRI were 0.79, 0.91, 0.89, and 0.91, respectively. CEM was superior to digital mammography (P < .001). No evidence of a difference in AUC was found between CEM and AB MRI and MRI. Conclusion In an asymptomatic study sample, CEM was noninferior to full MRI and AB MRI and was superior to digital mammography. Clinical trial registration nos. NCT03482557 and NCT02275871. © RSNA, 2023. Supplemental material is available for this article.


Subjects
Breast Neoplasms; Female; Humans; Middle Aged; Area Under Curve; Breast Neoplasms/diagnostic imaging; Magnetic Resonance Imaging; Mammography; Physical Examination
3.
Med Phys ; 50(12): 7427-7440, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37824821

ABSTRACT

PURPOSE: A comprehensive, centrally-monitored physics quality control (QC) program was developed for the Tomosynthesis Imaging Screening Trial (TMIST), a randomized controlled trial of digital breast tomosynthesis (TM) versus digital mammography (DM) for cancer screening. As part of the program, in addition to a set of phantom-based tests, de-identified data on image acquisition and processing parameters were captured from the DICOM headers of all individual patient images in the trial. These data were analyzed to assess the potential usefulness of header data from digital mammograms and tomosynthesis images of patients for quality assurance in breast imaging. METHODS: Data were automatically extracted from the headers of all de-identified patient mammograms and tomosynthesis images in the TMIST study. Image acquisition parameters and estimated radiation doses were tracked for individual sites, systems and across system types. These parameters included (among others) kV, target/filter use, number of acquired views per examination, AEC mode, compression thickness and force and detector temperature. Consistency of manually entered study data parameters (subject ID, screening time-point) from TMIST was evaluated. Preliminary observations from the program are presented. RESULTS: We report on data from 812 651 images from 135 525 examinations acquired between October, 2017 and December, 2022. Data came from 6 system models from 3 manufacturers. There was greater variability both in the number of views used and in the estimated (proxy) doses received in DM exams compared to TM. Mean proxy doses per examination varied among manufacturers from 2.76-4.54 mGy for DM and 3-4.84 mGy for the tomosynthesis component in the TM arm with maximum examination proxy doses of 20 and 26 mGy for DM and TM respectively. 
Mean proxy doses per examination for the combination examination in TM (tomosynthesis plus digital mammography) varied from 6.6 to 7.6 mGy among manufacturers with a maximum of 44.5 mGy. CONCLUSIONS: Overall, modern digital mammography and tomosynthesis systems used in TMIST have operated very reliably. Doses vary considerably due to variation in the number of views per examination, thickness and fibro-glandularity of the breast, and choices in the use of synthesized versus actual 2D mammography in the TM examination. These data may also be useful in predicting equipment problems. Header information is valuable not only for automated QC, but also for cross-checking accuracy and consistency of data in a clinical study.


Subjects
Breast Neoplasms; Early Detection of Cancer; Humans; Female; Radiation Dosage; Mammography/methods; Breast/diagnostic imaging; Phantoms, Imaging; Breast Neoplasms/diagnostic imaging
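The per-examination dose summaries reported in entry 3 suggest an aggregation step along the following lines. This is only a sketch under stated assumptions: the record fields (`manufacturer`, `exam_id`, `dose_mGy`) and all values are hypothetical stand-ins for data that a real pipeline would read from DICOM headers, and summing per-image doses into an examination dose is an assumed reading of "proxy dose per examination", not a method confirmed by the abstract.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-image records standing in for extracted header data.
images = [
    {"manufacturer": "VendorA", "exam_id": "exam-1", "dose_mGy": 1.5},
    {"manufacturer": "VendorA", "exam_id": "exam-1", "dose_mGy": 1.5},
    {"manufacturer": "VendorA", "exam_id": "exam-1", "dose_mGy": 1.0},
    {"manufacturer": "VendorA", "exam_id": "exam-1", "dose_mGy": 1.0},
    {"manufacturer": "VendorA", "exam_id": "exam-2", "dose_mGy": 1.0},
    {"manufacturer": "VendorA", "exam_id": "exam-2", "dose_mGy": 1.0},
    {"manufacturer": "VendorB", "exam_id": "exam-3", "dose_mGy": 2.0},
    {"manufacturer": "VendorB", "exam_id": "exam-3", "dose_mGy": 2.0},
]


def mean_exam_dose_by_manufacturer(records):
    """Sum image doses within each examination, then average per manufacturer."""
    exam_dose = defaultdict(float)
    exam_vendor = {}
    for rec in records:
        exam_dose[rec["exam_id"]] += rec["dose_mGy"]
        exam_vendor[rec["exam_id"]] = rec["manufacturer"]
    by_vendor = defaultdict(list)
    for exam_id, dose in exam_dose.items():
        by_vendor[exam_vendor[exam_id]].append(dose)
    return {vendor: mean(doses) for vendor, doses in by_vendor.items()}


summary = mean_exam_dose_by_manufacturer(images)
print(summary)
```

Grouping by examination first makes the effect of extra views visible: exam-1 has four views and therefore a higher total than the two-view exam-2, which is one of the sources of dose variation the abstract names.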
4.
Med Phys ; 50(12): 7441-7461, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37830895

ABSTRACT

BACKGROUND: The Tomosynthesis Mammography Imaging Screening Trial (TMIST), EA1151 conducted by the Eastern Cooperative Oncology Group (ECOG)/American College of Radiology Imaging Network (ACRIN) is a randomized clinical trial designed to assess the effectiveness for breast cancer screening of digital breast tomosynthesis (TM) compared to digital mammography (DM). Equipment from multiple vendors is being used in the study. PURPOSE: For the findings of the study to be valid and capture the true capacities of the two technology types, it is important that all equipment is operated within appropriate parameters with regard to image quality and dose. A harmonized QC program was established by a core physics team. Since there are over 120 trial sites, a centralized, automated QC program was chosen as the most practical design. This report presents results of the weekly QC testing program. A companion paper will review quality monitoring based on data from the headers of the patient images. METHODS: Study images are collected centrally after de-identification using the "TRIAD" application developed by ACR. The core physics team devised and implemented a minimal set of quality control (QC) tests to evaluate the tomosynthesis and 2D mammography systems. Weekly, monthly and annual testing is performed by the site mammography technologists with images submitted directly to the physics core. The weekly physics QC tests are described: SDNR of a low-contrast mass object, artifact spread, spatial resolution, tracking of technical factors, and in-slice noise power spectra. RESULTS: As of December 31, 2022 (5 years), 145 sites with 411 machines had submitted QC data. A total of 136 742 TMIST participant screening imaging studies had been performed. The 5th and 95th percentile mean glandular doses for a single tomosynthesis exposure to a 4.0 cm thick PMMA phantom ("standard breast phantom") were 1.24 and 1.68 mGy respectively. 
The largest sources of QC non-conformance were: operator error, not following the QC protocol exactly, unreported software updates and preventive maintenance activities that affected QC setpoints. Noise power spectra were measured; however, standardization of performance targets across machine types and software revisions was difficult. Nevertheless, for each machine type, test measurement results were very consistent when the protocol was followed. Deviations in test results were mostly related to software and hardware changes. CONCLUSION: Most systems performed very consistently. Although this is a harmonized program using identical phantoms and testing protocols, it is not appropriate to apply universal threshold or target metrics across the machine types because the systems have different non-linear reconstruction algorithms and image display filters. It was found to be more useful to assess pass/fail criteria in terms of relative deviations from baseline values established when a system is first characterized and after equipment is changed. Generally, systems which needed repair failed suddenly, but in retrospect, for a few cases, drops in SDNR and increases in mAs were observed prior to tube failure. TMIST is registered as NCT03233191 by ClinicalTrials.gov.


Subjects
Breast Neoplasms; Mammography; Humans; Female; Mammography/methods; Breast; Breast Neoplasms/diagnostic imaging; Algorithms; Quality Control; Phantoms, Imaging
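The conclusion of entry 4, that pass/fail is better judged by relative deviation from a per-system baseline than by a universal threshold, might look like this in code. The 10% tolerance and the SDNR values are hypothetical; the abstract does not state the trial's actual limits.

```python
def relative_deviation(value, baseline):
    """Signed fractional change of a QC measurement from its baseline."""
    return (value - baseline) / baseline


def qc_status(value, baseline, tolerance=0.10):
    """Return 'pass' if the measurement stays within the tolerance band.

    The tolerance is a hypothetical placeholder, not a TMIST criterion.
    """
    within = abs(relative_deviation(value, baseline)) <= tolerance
    return "pass" if within else "fail"


# Hypothetical weekly SDNR readings for one system against its own baseline.
baseline_sdnr = 10.0
weekly_sdnr = [10.1, 9.8, 10.3, 8.5]
statuses = [qc_status(v, baseline_sdnr) for v in weekly_sdnr]
print(statuses)
```

Because each system is compared with its own baseline, the same function can be reused across machine types with different absolute SDNR levels; the baseline would be re-established after equipment changes, as the abstract describes.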
5.
J Breast Imaging ; 5(5): 546-554, 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-38416918

ABSTRACT

OBJECTIVE: Measuring the cost of performing breast imaging is difficult in healthcare systems. The purpose of our study was to evaluate this cost using time-driven activity-based costing (TDABC) and to evaluate cost drivers for different exams. METHODS: An IRB-approved, single-center prospective study was performed on 80 female patients presenting for breast screening, diagnostic or biopsy exams from July 2020 to April 2021. Using TDABC, data were collected for each exam type. Included were full-field digital mammography (FFDM), digital breast tomosynthesis (DBT), contrast-enhanced mammography (CEM), US and MRI exams, and stereotactic, US-guided and MRI-guided biopsies. For each exam type, mean cost and relative contributions of equipment, personnel and supplies were calculated. RESULTS: Screening MRI, CEM, US, DBT, and FFDM costs were $249, $120, $83, $28, and $30. Personnel was the major contributor to cost (60.0%-87.0%) for all screening exams except MRI where equipment was the major contributor (62.2%). Diagnostic MRI, CEM, US, and FFDM costs were $241, $123, $70, and $43. Personnel was the major contributor to cost (60.5%-88.6%) for all diagnostic exams except MRI where equipment was the major contributor (61.8%). Costs of MRI-guided, stereotactic and US-guided biopsy were $1611, $826, and $356. Supplies contributed 40.5%-49.8% and personnel contributed 30.7%-55.6% to the total cost of biopsies. CONCLUSION: TDABC provides assessment of actual costs of performing breast imaging. Costs and contributors varied across screening, diagnostic and biopsy exams and modalities. Practices may consider this methodology in understanding costs and making changes directed at cost savings.


Subjects
Breast; Mammography; Female; Humans; Prospective Studies; Breast/diagnostic imaging; Mammography/methods; Image-Guided Biopsy; Magnetic Resonance Imaging
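Time-driven activity-based costing, as used in entry 5, multiplies the time each resource spends on an exam by that resource's cost per unit time and adds consumables. The sketch below illustrates the arithmetic only; every resource name, duration, and rate is hypothetical and unrelated to the dollar figures in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Step:
    resource: str
    category: str           # "personnel" or "equipment"
    minutes: float
    rate_per_minute: float  # capacity cost rate in dollars per minute


def tdabc_cost(steps, supplies_cost):
    """Return total cost and each category's share of that total."""
    totals = {"personnel": 0.0, "equipment": 0.0, "supplies": supplies_cost}
    for step in steps:
        totals[step.category] += step.minutes * step.rate_per_minute
    total = sum(totals.values())
    shares = {name: amount / total for name, amount in totals.items()}
    return total, shares


# Hypothetical process map for a single screening exam.
steps = [
    Step("technologist", "personnel", 20, 1.00),
    Step("radiologist", "personnel", 5, 4.00),
    Step("mammography unit", "equipment", 20, 0.50),
]
total, shares = tdabc_cost(steps, supplies_cost=2.0)
print(f"total ${total:.2f}; personnel share {shares['personnel']:.1%}")
```

Reporting shares alongside the total is what lets a practice see whether personnel, equipment, or supplies drives the cost of a given exam type.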
8.
Med Phys ; 48(7): 3623-3629, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33931863

ABSTRACT

PURPOSE: In the reconstruction of volume breast images from x-ray projections in breast tomosynthesis, some tomographic systems truncate the image data presented to the radiologist such that a non-negligible amount of tissue may be missing from the breast image. QC tests were conducted to determine if this problem existed in imaging in the TMIST study. METHODS: Test tools developed for TMIST containing small objects at known heights were used in routine weekly and annual QC testing of tomosynthesis units to assess the degree to which phantom material that was irradiated in imaging was excluded from the reconstructed image. Results from 318 tests on five system types from three manufacturers are reported. RESULTS: The presence and extent of this problem varied among system types. The cause was most frequently related to machine errors in the determination of breast thickness or to deflection of components during breast compression. In particular, the problem occurred when a compression paddle other than the one calibrated for tomosynthesis was used for the tests. This was also verified to have occurred in some clinical imaging. CONCLUSIONS: Missing volume can be avoided by intentionally reconstructing additional image slices above and below the presumed locations of the breast support and compression plate. A compression paddle which has been calibrated for tomosynthesis should be used both for clinical imaging and testing. The prevalence of this phenomenon suggests that more frequent testing for volume coverage may be advisable.


Subjects
Breast; Data Compression; Breast/diagnostic imaging; Mammography; Phantoms, Imaging
10.
BMC Med Imaging ; 20(1): 61, 2020 Jun 09.
Article in English | MEDLINE | ID: mdl-32517657

ABSTRACT

BACKGROUND: There is an increasing interest in non-contrast-enhanced magnetic resonance imaging (MRI) for detecting and evaluating breast lesions. We present a methodology utilizing lesion core and periphery region of interest (ROI) features derived from directional diffusion-weighted imaging (DWI) data to evaluate performance in discriminating benign from malignant lesions in dense breasts. METHODS: We accrued 55 dense-breast cases with 69 lesions (31 benign; 38 cancer) at a single institution in a prospective study; cases with ROIs exceeding 7.50 cm2 were excluded, resulting in analysis of 50 cases with 63 lesions (29 benign, 34 cancers). Spin-echo echo-planar imaging DWI was acquired at 1.5 T and 3 T. Data from three diffusion encoding gradient directions were exported and processed independently. Lesion ROIs were hand-drawn on DWI images by two radiologists. A region growing algorithm generated 3D lesion models on augmented apparent-diffusion coefficient (ADC) maps and defined lesion core and lesion periphery sub-ROIs. A lesion-core and a lesion-periphery feature were defined and combined into an overall classifier whose performance was compared to that of mean ADC using receiver operating characteristic (ROC) analysis. Inter-observer variability in ROI definition was measured using Dice Similarity Coefficient (DSC). RESULTS: The region-growing algorithm for 3D lesion model generation improved inter-observer variability over hand drawn ROIs (DSC: 0.66 vs 0.56 (p < 0.001) with substantial agreement (DSC > 0.8) in 46% vs 13% of cases, respectively (p < 0.001)). The overall classifier improved discrimination over mean ADC, (ROC- area under the curve (AUC): 0.85 vs 0.75 and 0.83 vs 0.74 respectively for the two readers). 
CONCLUSIONS: A classifier generated from directional DWI information using lesion core and lesion periphery information separately can improve lesion discrimination in dense breasts over mean ADC and should be considered for inclusion in computer-aided diagnosis algorithms. Our model-based ROIs could facilitate standardization of breast MRI computer-aided diagnostics (CADx).


Subjects
Breast Neoplasms/diagnostic imaging; Breast/diagnostic imaging; Radiographic Image Interpretation, Computer-Assisted/methods; Breast/pathology; Breast Density; Diagnosis, Differential; Diffusion Magnetic Resonance Imaging; Female; Humans; Observer Variation; Sensitivity and Specificity
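Entry 10 measures inter-observer agreement with the Dice Similarity Coefficient, DSC = 2|A ∩ B| / (|A| + |B|), where A and B are the pixel or voxel sets of two ROIs. A minimal sketch follows; the two toy ROIs are hypothetical, and treating two empty ROIs as perfect agreement is a convention chosen here, not something the abstract specifies.

```python
def dice_coefficient(roi_a, roi_b):
    """Dice Similarity Coefficient between two collections of pixel coordinates."""
    a, b = set(roi_a), set(roi_b)
    if not a and not b:
        return 1.0  # convention assumed here for two empty ROIs
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical 2D ROIs drawn by two readers, as (row, column) pixels.
reader_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
reader_2 = {(0, 1), (1, 1), (0, 2), (1, 2)}
dsc = dice_coefficient(reader_1, reader_2)
print(dsc)  # 0.5: two shared pixels out of four in each ROI
```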
11.
J Am Coll Radiol ; 17(12): 1653-1662, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32592660

ABSTRACT

OBJECTIVE: We developed deep learning algorithms to automatically assess BI-RADS breast density. METHODS: Using a large multi-institution patient cohort of 108,230 digital screening mammograms from the Digital Mammographic Imaging Screening Trial, we investigated the effect of data, model, and training parameters on overall model performance and provided crowdsourcing evaluation from the attendees of the ACR 2019 Annual Meeting. RESULTS: Our best-performing algorithm achieved good agreement with radiologists who were qualified interpreters of mammograms, with a four-class κ of 0.667. When training was performed with randomly sampled images from the data set versus sampling equal number of images from each density category, the model predictions were biased away from the low-prevalence categories such as extremely dense breasts. The net result was an increase in sensitivity and a decrease in specificity for predicting dense breasts for equal class compared with random sampling. We also found that the performance of the model degrades when we evaluate on digital mammography data formats that differ from the one that we trained on, emphasizing the importance of multi-institutional training sets. Lastly, we showed that crowdsourced annotations, including those from attendees who routinely read mammograms, had higher agreement with our algorithm than with the original interpreting radiologists. CONCLUSION: We demonstrated the possible parameters that can influence the performance of the model and how crowdsourcing can be used for evaluation. This study was performed in tandem with the development of the ACR AI-LAB, a platform for democratizing artificial intelligence.


Subjects
Breast Neoplasms; Crowdsourcing; Deep Learning; Artificial Intelligence; Breast Density; Breast Neoplasms/diagnostic imaging; Female; Humans; Mammography
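Entry 11 reports agreement as a four-class κ. The abstract does not say whether the statistic is weighted, so the sketch below assumes plain unweighted Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), with observed agreement p_o and chance agreement p_e. The density labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty case list")
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the product of each rater's marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical BI-RADS density categories (a-d) for eight mammograms.
radiologist = ["a", "b", "c", "d", "b", "c", "b", "c"]
model = ["a", "b", "c", "d", "c", "b", "b", "c"]
kappa = cohen_kappa(radiologist, model)
print(round(kappa, 3))
```

Here six of eight cases agree (p_o = 0.75) and chance agreement is 20/64, giving κ = 7/11. A weighted variant would penalize a/d disagreements more than adjacent-category ones.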
13.
J Am Coll Radiol ; 17(3): 368-376, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31541655

ABSTRACT

OBJECTIVE: There is insufficient large-scale evidence for screening mammography in women <40 years at elevated risk. This study compares risk-based screening of women aged 30 to 39 with risk factors versus women aged 40 to 49 without risk factors in the National Mammography Database (NMD). METHODS: This retrospective, HIPAA-compliant, institutional review board-exempt study analyzed data from 150 NMD mammography facilities in 31 states. Patients were stratified by 5-year age intervals, availability of prior mammograms, and specific risk factors for breast cancer: family history of breast cancer, personal history of breast cancer, and dense breasts. Four screening performance metrics were calculated for each age and risk group: recall rate (RR), cancer detection rate (CDR), and positive predictive values for biopsy recommended (PPV2) and biopsy performed (PPV3). RESULTS: Data from 5,986,131 screening mammograms performed between January 2008 and December 2015 in 2,647,315 women were evaluated. Overall, mean CDR was 3.69 of 1,000 (95% confidence interval: 3.64-3.74), RR was 9.89% (9.87%-9.92%), PPV2 was 20.1% (19.9%-20.4%), and PPV3 was 28.2% (27.0%-28.5%). Women aged 30 to 34 and 35 to 39 had similar CDR, RR, and PPVs, with the presence of the three evaluated risk factors associated with significantly higher CDR. Moreover, compared with a population currently recommended for screening mammography in the United States (aged 40-49 at average risk), incidence screening (at least one prior screening examination) of women aged 30 to 39 with the three evaluated risk factors has similar cancer detection rates and recall rates. DISCUSSION: Women with one or more of these three specific risk factors likely benefit from screening commencing at age 30 instead of age 40.


Subjects
Breast Neoplasms; Mammography; Adult; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/epidemiology; Early Detection of Cancer; Female; Humans; Mass Screening; Retrospective Studies; United States/epidemiology
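The four audit metrics named in entry 13 are simple ratios of screening counts. The sketch below uses hypothetical counts and one simplification that should be read as an assumption: a single cancer count serves as the numerator for CDR, PPV2, and PPV3, whereas a real audit may track these numerators separately.

```python
def screening_audit(screens, recalls, biopsies_recommended,
                    biopsies_performed, cancers):
    """Compute recall rate, CDR per 1,000, PPV2, and PPV3 from raw counts."""
    return {
        "recall_rate": recalls / screens,
        "cdr_per_1000": 1000.0 * cancers / screens,
        "ppv2": cancers / biopsies_recommended,  # biopsy recommended
        "ppv3": cancers / biopsies_performed,    # biopsy performed
    }


# Hypothetical counts for one age and risk group.
metrics = screening_audit(
    screens=10_000,
    recalls=1_000,
    biopsies_recommended=200,
    biopsies_performed=160,
    cancers=40,
)
print(metrics)
```

Computing the same dictionary per age and risk stratum would reproduce the kind of comparison the study makes between women aged 30 to 39 with risk factors and women aged 40 to 49 without them.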
14.
J Med Imaging (Bellingham) ; 6(1): 015501, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30713851

ABSTRACT

We investigated effects of prevalence and case distribution on radiologist diagnostic performance as measured by area under the receiver operating characteristic curve (AUC) and sensitivity-specificity in lab-based reader studies evaluating imaging devices. Our retrospective reader studies compared full-field digital mammography (FFDM) to screen-film mammography (SFM) for women with dense breasts. Mammograms were acquired from the prospective Digital Mammographic Imaging Screening Trial. We performed five reader studies that differed in terms of cancer prevalence and the distribution of noncancers. Twenty radiologists participated in each reader study. Using split-plot study designs, we collected recall decisions and multilevel scores from the radiologists for calculating sensitivity, specificity, and AUC. Differences in reader-averaged AUCs slightly favored SFM over FFDM (biggest AUC difference: 0.047, SE = 0.023, p = 0.047), where standard error accounts for reader and case variability. The differences were not significant at a level of 0.01 (0.05/5 reader studies). The differences in sensitivities and specificities were also indeterminate. Prevalence had little effect on AUC (largest difference: 0.02), whereas sensitivity increased and specificity decreased as prevalence increased. We found that AUC is robust to changes in prevalence, while radiologists were more aggressive with recall decisions as prevalence increased.
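The AUC used in entry 14 can be estimated empirically as the probability that a randomly chosen cancer case receives a higher score than a randomly chosen noncancer case, with ties counted as one half. The sketch below uses hypothetical reader scores and illustrates only the arithmetic side of the prevalence finding: duplicating the noncancer cases changes prevalence but leaves the empirical AUC unchanged. It does not model the change in reader behavior that the study observed.

```python
def empirical_auc(cancer_scores, noncancer_scores):
    """Mann-Whitney estimate of AUC from two lists of reader scores."""
    wins = 0.0
    for c in cancer_scores:
        for n in noncancer_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(cancer_scores) * len(noncancer_scores))


# Hypothetical multilevel scores from one reader.
cancers = [3, 4, 5]
noncancers = [1, 2, 3, 4]

auc = empirical_auc(cancers, noncancers)
# Tripling the noncancers lowers prevalence from 3/7 to 3/15.
auc_low_prevalence = empirical_auc(cancers, noncancers * 3)
print(auc, auc_low_prevalence)
```

The significance level quoted in the abstract follows the same kind of simple arithmetic: a Bonferroni-style division of 0.05 by the five reader studies gives 0.01.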

16.
J Oncol Pract ; : JOP1800092, 2018 Oct 04.
Article in English | MEDLINE | ID: mdl-30285529

ABSTRACT

PURPOSE: Research biopsy specimens collected in clinical trials often present requirements beyond those of tumor biopsy specimens collected for diagnostic purposes. Research biopsies underpin hypothesis-driven drug development, pharmacodynamic assessment of molecularly targeted anticancer agents, and, increasingly, genomic assessment for precision medicine; insufficient biopsy specimen quality or quantity therefore compromises the scientific value of a study and the resources devoted to it, as well as each patient's contribution to and potential benefit from a clinical trial. METHODS: To improve research biopsy specimen quality, we consulted with other translational oncology teams and reviewed current best practices. RESULTS: Among the recommendations were improving communication between oncologists and interventional radiologists, providing feedback on specimen sufficiency, increasing academic recognition and financial support for the time investment required by radiologists to collect and preserve research biopsy specimens, and improving real-time assessment of tissue quality. CONCLUSION: Implementing these recommendations at the National Cancer Institute's Developmental Therapeutics Clinic has demonstrably improved the quality of biopsy specimens collected; more widespread dissemination of these recommendations beyond large clinical cancer centers is possible and will be of value to the community in improving clinical research and, ultimately, patient care.

17.
AJR Am J Roentgenol ; 210(6): 1376-1385, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29708782

ABSTRACT

OBJECTIVE: The objective of our study was to determine the accuracy of preoperative measurements for detecting pathologic complete response (CR) and assessing residual disease after neoadjuvant chemotherapy (NACT) in patients with locally advanced breast cancer. SUBJECTS AND METHODS: The American College of Radiology Imaging Network 6657 Trial prospectively enrolled women with ≥ 3 cm invasive breast cancer receiving NACT. Preoperative measurements of residual disease included longest diameter by mammography, MRI, and clinical examination and functional volume on MRI. The accuracy of preoperative measurements for detecting pathologic CR and the association with final pathology size were assessed for all lesions, separately for single masses and nonmass enhancements (NMEs), multiple masses, and lesions without ductal carcinoma in situ (DCIS). RESULTS: In the 138 women with all four preoperative measures, longest diameter by MRI showed the highest accuracy for detecting pathologic CR for all lesions and NME (AUC = 0.76 and 0.84, respectively). There was little difference across preoperative measurements in the accuracy of detecting pathologic CR for single masses (AUC = 0.69-0.72). Longest diameter by MRI and longest diameter by clinical examination showed moderate ability for detecting pathologic CR for multiple masses (AUC = 0.78 and 0.74), and longest diameter by MRI and longest diameter by mammography showed moderate ability for detecting pathologic CR for tumors without DCIS (AUC = 0.74 and 0.71). In subjects with residual disease, longest diameter by MRI exhibited the strongest association with pathology size for all lesions and single masses (r = 0.33 and 0.47). Associations between preoperative measures and pathology results were not significantly influenced by tumor subtype or mammographic density. 
CONCLUSION: Our results indicate that measurement of longest diameter by MRI is more accurate than by mammography and clinical examination for preoperative assessment of tumor residua after NACT and may improve surgical planning.


Subjects
Breast Neoplasms/diagnostic imaging; Breast Neoplasms/drug therapy; Magnetic Resonance Imaging/methods; Neoadjuvant Therapy; Neoplasm, Residual/diagnostic imaging; Adult; Breast Neoplasms/pathology; Breast Neoplasms/surgery; Female; Humans; Mammography; Middle Aged; Neoplasm Invasiveness/diagnostic imaging; Neoplasm Invasiveness/pathology; Neoplasm, Residual/drug therapy; Neoplasm, Residual/pathology; Neoplasm, Residual/surgery; Physical Examination; Preoperative Care; Prospective Studies; Treatment Outcome; Tumor Burden
20.
Article in English | MEDLINE | ID: mdl-28845078

ABSTRACT

The FDA recently completed a study on design methodologies surrounding the Validation of Imaging Premarket Evaluation and Regulation called VIPER. VIPER consisted of five large reader sub-studies to compare the impact of different study populations on reader behavior as seen by sensitivity, specificity, and AUC, the area under the ROC curve (receiver operating characteristic curve). The study investigated different prevalence levels and two kinds of sampling of non-cancer patients: a screening population and a challenge population. The VIPER study compared full-field digital mammography (FFDM) to screen-film mammography (SFM) for women with heterogeneously dense or extremely dense breasts. All cases and corresponding images were sampled from Digital Mammographic Imaging Screening Trial (DMIST) archives. There were 20 readers (American Board Certified radiologists) for each sub-study, and instead of every reader reading every case (fully-crossed study), readers and cases were split into groups to reduce reader workload and the total number of observations (split-plot study). For data collection, readers first decided whether or not they would recall a patient. Following that decision, they provided an ROC score for how close or far that patient was from the recall decision threshold. Performance results for FFDM show that as prevalence increases to 50%, there is a moderate increase in sensitivity and decrease in specificity, whereas AUC is mainly flat. Regarding precision, the statistical efficiency (ratio of variances) of sensitivity and specificity relative to AUC are 0.66 at best and decrease with prevalence. Analyses comparing modalities and the study populations (screening vs. challenge) are still ongoing.
