Results 1 - 5 of 5
1.
Radiology ; 293(1): 38-46, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31385754

ABSTRACT

Background Recent deep learning (DL) approaches have shown promise in improving sensitivity but have not addressed limitations in radiologist specificity or efficiency. Purpose To develop a DL model to triage a portion of mammograms as cancer free, improving performance and workflow efficiency. Materials and Methods In this retrospective study, 223 109 consecutive screening mammograms performed in 66 661 women from January 2009 to December 2016 were collected with cancer outcomes obtained through linkage to a regional tumor registry. This cohort was split by patient into 212 272, 25 999, and 26 540 mammograms from 56 831, 7021, and 7176 patients for training, validation, and testing, respectively. A DL model was developed to triage mammograms as cancer free and evaluated on the test set. A DL-triage workflow was simulated in which radiologists skipped mammograms triaged as cancer free (interpreting them as negative for cancer) and read mammograms not triaged as cancer free by using the original interpreting radiologists' assessments. Sensitivities, specificities, and percentage of mammograms read were calculated, with and without the DL-triage-simulated workflow. Statistics were computed across 5000 bootstrap samples to assess confidence intervals (CIs). Specificities were compared by using a two-tailed t test (P < .05) and sensitivities were compared by using a one-sided t test with a noninferiority margin of 5% (P < .05). Results The test set included 7176 women (mean age, 57.8 years ± 10.9 [standard deviation]). When reading all mammograms, radiologists obtained a sensitivity and specificity of 90.6% (173 of 191; 95% CI: 86.6%, 94.7%) and 93.5% (24 625 of 26 349; 95% CI: 93.3%, 93.9%). In the DL-simulated workflow, the radiologists obtained a sensitivity and specificity of 90.1% (172 of 191; 95% CI: 86.0%, 94.3%) and 94.2% (24 814 of 26 349; 95% CI: 94.0%, 94.6%) while reading 80.7% (21 420 of 26 540) of the mammograms. 
The simulated workflow improved specificity (P = .002) and obtained a noninferior sensitivity with a margin of 5% (P < .001). Conclusion This deep learning model has the potential to reduce radiologist workload and significantly improve specificity without harming sensitivity. © RSNA, 2019 Online supplemental material is available for this article. See also the editorial by Kontos and Conant in this issue.
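
The simulated workflow described in this abstract can be illustrated with a minimal sketch. The `Exam` record, the `dl_score` field, the threshold value, and the toy data are hypothetical and are not taken from the study; only the counts in `reported` come from the abstract. The sketch recomputes the reported percentages from those counts and shows how skipping low-scoring exams changes sensitivity, specificity, and the fraction of exams read.

```python
from dataclasses import dataclass


@dataclass
class Exam:
    has_cancer: bool            # outcome from tumor registry linkage
    radiologist_positive: bool  # original interpreting radiologist's assessment
    dl_score: float             # hypothetical model score (higher = more suspicious)


def sens_spec(exams, calls):
    """Return (sensitivity, specificity) for a list of positive/negative calls.

    Assumes at least one cancer exam and one cancer-free exam are present.
    """
    tp = sum(1 for e, c in zip(exams, calls) if e.has_cancer and c)
    tn = sum(1 for e, c in zip(exams, calls) if not e.has_cancer and not c)
    n_pos = sum(1 for e in exams if e.has_cancer)
    n_neg = len(exams) - n_pos
    return tp / n_pos, tn / n_neg


def triage_calls(exams, threshold):
    """Exams scoring below the threshold are skipped and called negative;
    all other exams keep the radiologist's original assessment."""
    read = [e.dl_score >= threshold for e in exams]
    calls = [e.radiologist_positive if r else False for e, r in zip(exams, read)]
    return calls, sum(read) / len(exams)


# Toy data (invented for illustration only).
toy = [
    Exam(True, True, 0.90),
    Exam(True, True, 0.10),    # cancer triaged away: a lost true positive
    Exam(False, True, 0.05),   # false positive removed by triage
    Exam(False, False, 0.60),
    Exam(False, True, 0.70),
    Exam(False, False, 0.02),
]
baseline = sens_spec(toy, [e.radiologist_positive for e in toy])
calls, read_fraction = triage_calls(toy, threshold=0.2)
triaged = sens_spec(toy, calls)

# Percentages recomputed from the counts given in the abstract.
reported = {
    "sensitivity, all read": (173, 191),
    "specificity, all read": (24625, 26349),
    "sensitivity, DL triage": (172, 191),
    "specificity, DL triage": (24814, 26349),
    "mammograms read": (21420, 26540),
}
percentages = {name: round(100 * num / den, 1) for name, (num, den) in reported.items()}
```

In the toy data, triage trades one missed cancer for one removed false positive while halving the reading load. This is the same trade-off the abstract quantifies with its noninferiority margin.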


Subjects
Breast Neoplasms/diagnostic imaging; Deep Learning; Image Interpretation, Computer-Assisted/methods; Mammography/methods; Triage/methods; Adult; Aged; Aged, 80 and over; Breast/diagnostic imaging; Cohort Studies; Computer Simulation; Female; Humans; Middle Aged; Registries; Retrospective Studies
2.
Radiology ; 292(1): 60-66, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31063083

ABSTRACT

Background Mammographic density improves the accuracy of breast cancer risk models. However, the use of breast density is limited by subjective assessment, variation across radiologists, and restricted data. A mammography-based deep learning (DL) model may provide more accurate risk prediction. Purpose To develop a mammography-based DL breast cancer risk model that is more accurate than established clinical breast cancer risk models. Materials and Methods This retrospective study included 88 994 consecutive screening mammograms in 39 571 women between January 1, 2009, and December 31, 2012. For each patient, all examinations were assigned to either training, validation, or test sets, resulting in 71 689, 8554, and 8751 examinations, respectively. Cancer outcomes were obtained through linkage to a regional tumor registry. By using risk factor information from patient questionnaires and electronic medical records review, three models were developed to assess breast cancer risk within 5 years: a risk-factor-based logistic regression model (RF-LR) that used traditional risk factors, a DL model (image-only DL) that used mammograms alone, and a hybrid DL model that used both traditional risk factors and mammograms. Comparisons were made to an established breast cancer risk model that included breast density (Tyrer-Cuzick model, version 8 [TC]). Model performance was compared by using areas under the receiver operating characteristic curve (AUCs) with DeLong test (P < .05). Results The test set included 3937 women, aged 56.20 years ± 10.04. Hybrid DL and image-only DL showed AUCs of 0.70 (95% confidence interval [CI]: 0.66, 0.75) and 0.68 (95% CI: 0.64, 0.73), respectively. RF-LR and TC showed AUCs of 0.67 (95% CI: 0.62, 0.72) and 0.62 (95% CI: 0.57, 0.66), respectively. Hybrid DL showed a significantly higher AUC (0.70) than TC (0.62; P < .001) and RF-LR (0.67; P = .01). 
Conclusion Deep learning models that use full-field mammograms yield substantially improved risk discrimination compared with the Tyrer-Cuzick (version 8) model. © RSNA, 2019 Online supplemental material is available for this article. See also the editorial by Sitek and Wolfe in this issue.
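
The models in this abstract are compared by AUC. As a minimal sketch of what that statistic measures, the function below computes AUC as the probability that a randomly chosen cancer case receives a higher risk score than a randomly chosen non-case, counting ties as one half. The function name and the toy scores are illustrative assumptions. The DeLong test used in the study to compare AUCs is not implemented here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: predicted risks; labels: 1 for cases, 0 for non-cases.
    Runs in O(cases * non-cases) time, which is fine for a small illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: three of the four case/non-case pairs are ranked correctly.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level ranking, which puts the reported values of 0.62 to 0.70 in context.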


Subjects
Breast Neoplasms/diagnostic imaging; Deep Learning; Mammography/methods; Radiographic Image Interpretation, Computer-Assisted/methods; Adult; Aged; Aged, 80 and over; Breast/diagnostic imaging; Female; Humans; Middle Aged; Reproducibility of Results; Retrospective Studies; Risk Assessment
3.
Radiology ; 290(1): 52-58, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30325282

ABSTRACT

Purpose To develop a deep learning (DL) algorithm to assess mammographic breast density. Materials and Methods In this retrospective study, a deep convolutional neural network was trained to assess Breast Imaging Reporting and Data System (BI-RADS) breast density based on the original interpretation by an experienced radiologist of 41 479 digital screening mammograms obtained in 27 684 women from January 2009 to May 2011. The resulting algorithm was tested on a held-out test set of 8677 mammograms in 5741 women. In addition, five radiologists performed a reader study on 500 mammograms randomly selected from the test set. Finally, the algorithm was implemented in routine clinical practice, where eight radiologists reviewed 10 763 consecutive mammograms assessed with the model. Agreement on BI-RADS category for the DL model and for three sets of readings, namely (a) radiologists in the test set, (b) radiologists working in consensus in the reader study set, and (c) radiologists in the clinical implementation set, were estimated with linear-weighted κ statistics and were compared across 5000 bootstrap samples to assess significance. Results The DL model showed good agreement with radiologists in the test set (κ = 0.67; 95% confidence interval [CI]: 0.66, 0.68) and with radiologists in consensus in the reader study set (κ = 0.78; 95% CI: 0.73, 0.82). There was very good agreement (κ = 0.85; 95% CI: 0.84, 0.86) with radiologists in the clinical implementation set; for binary categorization of dense or nondense breasts, 10 149 of 10 763 (94%; 95% CI: 94%, 95%) DL assessments were accepted by the interpreting radiologist. Conclusion This DL model can be used to assess mammographic breast density at the level of an experienced mammographer. © RSNA, 2018 Online supplemental material is available for this article. See also the editorial by Chan and Helvie in this issue.
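
Agreement in this abstract is measured with linear-weighted κ. The sketch below shows one standard way to compute it, assuming the four BI-RADS density categories are encoded as the integers 0 to 3. That encoding, the function name, and the toy ratings are assumptions for illustration. The statistic is one minus the ratio of observed to chance-expected disagreement, where a disagreement between categories i and j is weighted by |i - j| / (k - 1).

```python
def linear_weighted_kappa(rater_a, rater_b, n_categories):
    """Linear-weighted Cohen's kappa for two raters on ordinal categories 0..k-1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n, k = len(rater_a), n_categories
    # Joint distribution of the two raters' categories.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Disagreement weights grow linearly with the distance between categories.
    obs_dis = sum(abs(i - j) / (k - 1) * observed[i][j]
                  for i in range(k) for j in range(k))
    exp_dis = sum(abs(i - j) / (k - 1) * marg_a[i] * marg_b[j]
                  for i in range(k) for j in range(k))
    if exp_dis == 0:
        # Both raters used a single identical category; treated here as perfect agreement.
        return 1.0
    return 1.0 - obs_dis / exp_dis


# Toy ratings: the raters differ by one category on one of four exams.
kappa = linear_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 2], n_categories=4)
```

Because the weights are linear, a disagreement between adjacent density categories is penalized one third as much as a disagreement between the two extreme categories.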


Subjects
Breast/diagnostic imaging; Deep Learning; Mammography/methods; Radiographic Image Interpretation, Computer-Assisted/methods; Adult; Aged; Aged, 80 and over; Algorithms; Breast Density/physiology; Databases, Factual; Female; Humans; Middle Aged
4.
AJR Am J Roentgenol ; 213(1): 227-233, 2019 Jul.
Article in English | MEDLINE | ID: mdl-30933651

ABSTRACT

OBJECTIVE. The purpose of this study is to develop an image-based deep learning (DL) model to predict the 5-year risk of breast cancer on the basis of a single breast MR image from a screening examination. MATERIALS AND METHODS. We collected 1656 consecutive breast MR images from screening examinations performed for 1183 high-risk women from January 2011 to June 2013, to predict the risk of cancer developing within 5 years of the screening. Women who lacked a 5-year screening follow-up examination and women who had cancer other than primary breast cancer develop in their breast were excluded from the study. We developed a logistic regression model based on traditional risk factors (the risk factor logistic regression [RF-LR] model) and a DL model based on the MR image alone (the Image-DL model). Examinations occurring within 6 months of a cancer diagnosis were excluded from the testing sets in each fold of cross-validation. We compared our models against the Tyrer-Cuzick (TC) model. All models were evaluated using mean (± SD) AUC values and observed-to-expected (OE) ratios across 10-fold cross-validation. RESULTS. The RF-LR and Image-DL models achieved mean AUC values of 0.558 ± 0.108 and 0.638 ± 0.094, respectively. In contrast, the TC model achieved an AUC value of 0.493 ± 0.092. The Image-DL and RF-LR models achieved mean OE ratios of 0.993 ± 0.658 and 0.828 ± 0.181, compared with the mean OE ratio of 1.091 ± 0.255 obtained using the TC model. CONCLUSION. Our DL model can assess the 5-year cancer risk on the basis of a breast MR image alone, and it showed improved individual risk discrimination when compared with a state-of-the-art risk assessment model. These results offer promising preliminary data regarding the potential of image-based risk assessment models to support more personalized care.
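
This abstract reports calibration as an observed-to-expected (OE) ratio, summarized as mean ± SD across cross-validation folds. The sketch below assumes the common definition: observed cancers divided by the sum of predicted risks. It also assumes the sample standard deviation, since the abstract does not say which form was used. The function names and toy numbers are illustrative and not from the study.

```python
from statistics import mean, stdev


def oe_ratio(outcomes, predicted_risks):
    """Observed-to-expected ratio: observed cases / sum of predicted probabilities.

    A value near 1 means the model predicts about as many cancers as occurred;
    above 1 means it underpredicts, below 1 means it overpredicts.
    """
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("expected number of cases is zero")
    return sum(outcomes) / expected


def summarize_folds(values):
    """Mean and sample standard deviation of a per-fold metric."""
    return mean(values), stdev(values)


# Toy fold: 2 observed cancers against 2.0 expected gives a ratio of 1.0.
toy_oe = oe_ratio([1, 0, 0, 1], [0.5, 0.5, 0.5, 0.5])
fold_mean, fold_sd = summarize_folds([0.5, 0.7])
```

A mean OE ratio close to 1 with a large SD, such as the 0.993 ± 0.658 reported for the Image-DL model, indicates good calibration on average but wide variation between folds.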

5.
JCO Clin Cancer Inform ; 4: 865-874, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33006906

ABSTRACT

PURPOSE: Literature on clinical note mining has highlighted the superiority of machine learning (ML) over hand-crafted rules. Nevertheless, most studies assume the availability of large training sets, which is rarely the case. For this reason, in the clinical setting, rules are still common. We suggest 2 methods to leverage the knowledge encoded in pre-existing rules to inform ML decisions and obtain high performance, even with scarce annotations. METHODS: We collected 501 prostate pathology reports from 6 American hospitals. Reports were split into 2,711 core segments, annotated with 20 attributes describing the histology, grade, extension, and location of tumors. The data set was split by institutions to generate a cross-institutional evaluation setting. We assessed 4 systems, namely a rule-based approach, an ML model, and 2 hybrid systems integrating the previous methods: a Rule as Feature model and a Classifier Confidence model. Several ML algorithms were tested, including logistic regression (LR), support vector machine (SVM), and eXtreme gradient boosting (XGB). RESULTS: When training on data from a single institution, LR lags behind the rules by 3.5% (F1 score: 92.2% v 95.7%). Hybrid models, instead, obtain competitive results, with Classifier Confidence outperforming the rules by +0.5% (96.2%). When a larger amount of data from multiple institutions is used, LR improves by +1.5% over the rules (97.2%), whereas hybrid systems obtain +2.2% for Rule as Feature (97.7%) and +2.6% for Classifier Confidence (98.3%). Replacing LR with SVM or XGB yielded similar performance gains. CONCLUSION: We developed methods to use pre-existing handcrafted rules to inform ML algorithms. These hybrid systems obtain better performance than either rules or ML models alone, even when training data are limited.
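
The abstract names two hybrid systems but does not describe how they are built. The sketch below shows one plausible reading of each name and should not be taken as the authors' implementation. In "Rule as Feature", the rule's output is appended to the feature vector given to the ML model. In "Classifier Confidence", the ML prediction is used only when its confidence clears a threshold, and the rule's prediction is used otherwise. All function names, the binary labels, and the 0.8 threshold are hypothetical. The F1 helper uses the standard binary definition, whereas the study scores 20 attributes and may aggregate differently.

```python
def rule_as_feature(features, rule_label):
    """Assumed reading: append the rule's output as one extra input feature."""
    return list(features) + [1.0 if rule_label else 0.0]


def classifier_confidence(ml_label, ml_confidence, rule_label, threshold=0.8):
    """Assumed reading: trust the classifier when it is confident, else fall back to the rule."""
    return ml_label if ml_confidence >= threshold else rule_label


def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Toy usage: the low-confidence ML prediction is overridden by the rule.
augmented = rule_as_feature([0.2, 1.5], rule_label=True)
fallback = classifier_confidence(ml_label=False, ml_confidence=0.55, rule_label=True)
confident = classifier_confidence(ml_label=False, ml_confidence=0.95, rule_label=True)
toy_f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

Either design lets existing rules contribute when annotated data are scarce, which matches the setting the abstract describes.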


Subjects
Machine Learning; Prostate; Algorithms; Humans; Logistic Models; Male; Support Vector Machine; United States