Results 1 - 20 of 59
1.
Cell ; 180(4): 688-702.e13, 2020 02 20.
Article in English | MEDLINE | ID: mdl-32084340

ABSTRACT

Due to the rapid emergence of antibiotic-resistant bacteria, there is a growing need to discover new antibiotics. To address this challenge, we trained a deep neural network capable of predicting molecules with antibacterial activity. We performed predictions on multiple chemical libraries and discovered a molecule from the Drug Repurposing Hub, halicin, that is structurally divergent from conventional antibiotics and displays bactericidal activity against a wide phylogenetic spectrum of pathogens including Mycobacterium tuberculosis and carbapenem-resistant Enterobacteriaceae. Halicin also effectively treated Clostridioides difficile and pan-resistant Acinetobacter baumannii infections in murine models. Additionally, from a discrete set of 23 empirically tested predictions from >107 million molecules curated from the ZINC15 database, our model identified eight antibacterial compounds that are structurally distant from known antibiotics. This work highlights the utility of deep learning approaches to expand our antibiotic arsenal through the discovery of structurally distinct antibacterial molecules.


Subject(s)
Anti-Bacterial Agents/pharmacology , Drug Discovery/methods , Machine Learning , Thiadiazoles/pharmacology , Acinetobacter baumannii/drug effects , Animals , Anti-Bacterial Agents/chemistry , Cheminformatics/methods , Clostridioides difficile/drug effects , Databases, Chemical , Mice , Mice, Inbred BALB C , Mice, Inbred C57BL , Mycobacterium tuberculosis/drug effects , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Thiadiazoles/chemistry
3.
Nature ; 620(7976): 1089-1100, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37433327

ABSTRACT

There has been considerable recent progress in designing new proteins using deep-learning methods [1-9]. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models [10,11] have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.


Subject(s)
Deep Learning , Proteins , Catalytic Domain , Cryoelectron Microscopy , Hemagglutinin Glycoproteins, Influenza Virus/chemistry , Hemagglutinin Glycoproteins, Influenza Virus/metabolism , Hemagglutinin Glycoproteins, Influenza Virus/ultrastructure , Protein Binding , Proteins/chemistry , Proteins/metabolism , Proteins/ultrastructure
4.
Nat Chem Biol ; 20(3): 291-301, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37770698

ABSTRACT

Diverse mechanisms have been described for selective enrichment of biomolecules in membrane-bound organelles, but less is known about mechanisms by which molecules are selectively incorporated into biomolecular assemblies such as condensates that lack surrounding membranes. The chemical environments within condensates may differ from those outside these bodies, and if these differed among various types of condensate, then the different solvation environments would provide a mechanism for selective distribution among these intracellular bodies. Here we use small molecule probes to show that different condensates have distinct chemical solvating properties and that selective partitioning of probes in condensates can be predicted with deep learning approaches. Our results demonstrate that different condensates harbor distinct chemical environments that influence the distribution of molecules, show that clues to condensate chemical grammar can be ascertained by machine learning and suggest approaches to facilitate development of small molecule therapeutics with optimal subcellular distribution and therapeutic benefit.


Subject(s)
Biomolecular Condensates , Machine Learning
5.
Nat Chem Biol ; 19(11): 1342-1350, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37231267

ABSTRACT

Acinetobacter baumannii is a nosocomial Gram-negative pathogen that often displays multidrug resistance. Discovering new antibiotics against A. baumannii has proven challenging through conventional screening approaches. Fortunately, machine learning methods allow for the rapid exploration of chemical space, increasing the probability of discovering new antibacterial molecules. Here we screened ~7,500 molecules for those that inhibited the growth of A. baumannii in vitro. We trained a neural network with this growth inhibition dataset and performed in silico predictions for structurally new molecules with activity against A. baumannii. Through this approach, we discovered abaucin, an antibacterial compound with narrow-spectrum activity against A. baumannii. Further investigations revealed that abaucin perturbs lipoprotein trafficking through a mechanism involving LolE. Moreover, abaucin could control an A. baumannii infection in a mouse wound model. This work highlights the utility of machine learning in antibiotic discovery and describes a promising lead with targeted activity against a challenging Gram-negative pathogen.


Subject(s)
Acinetobacter baumannii , Deep Learning , Animals , Mice , Anti-Bacterial Agents/pharmacology , Drug Resistance, Multiple, Bacterial , Microbial Sensitivity Tests
6.
J Chem Inf Model ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38950894

ABSTRACT

Information extraction from chemistry literature is vital for constructing up-to-date reaction databases for data-driven chemistry. Complete extraction requires combining information across text, tables, and figures, whereas prior work has mainly investigated extracting reactions from single modalities. In this paper, we present OpenChemIE to address this complex challenge and enable the extraction of reaction data at the document level. OpenChemIE approaches the problem in two steps: extracting relevant information from individual modalities and then integrating the results to obtain a final list of reactions. For the first step, we employ specialized neural models that each address a specific task for chemistry information extraction, such as parsing molecules or reactions from text or figures. We then integrate the information from these modules using chemistry-informed algorithms, allowing for the extraction of fine-grained reaction data from reaction condition and substrate scope investigations. Our machine learning models attain state-of-the-art performance when evaluated individually, and we meticulously annotate a challenging dataset of reaction schemes with R-groups to evaluate our pipeline as a whole, achieving an F1 score of 69.5%. Additionally, the reaction extraction results of OpenChemIE attain an accuracy score of 64.3% when directly compared against the Reaxys chemical database. OpenChemIE is most suited for information extraction on organic chemistry literature, where molecules are generally depicted as planar graphs or written in text and can be consolidated into a SMILES format. We provide OpenChemIE freely to the public as an open-source package, as well as through a web interface.

7.
Proc Natl Acad Sci U S A ; 118(39)2021 09 28.
Article in English | MEDLINE | ID: mdl-34526388

ABSTRACT

Effective treatments for COVID-19 are urgently needed. However, discovering single-agent therapies with activity against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been challenging. Combination therapies play an important role in antiviral therapies, due to their improved efficacy and reduced toxicity. Recent approaches have applied deep learning to identify synergistic drug combinations for diseases with vast preexisting datasets, but these are not applicable to new diseases with limited combination data, such as COVID-19. Given that drug synergy often occurs through inhibition of discrete biological targets, here we propose a neural network architecture that jointly learns drug-target interaction and drug-drug synergy. The model consists of two parts: a drug-target interaction module and a target-disease association module. This design enables the model to utilize drug-target interaction data and single-agent antiviral activity data, in addition to available drug-drug combination datasets, which may be small in nature. By incorporating additional biological information, our model performs significantly better in synergy prediction accuracy than previous methods with limited drug combination training data. We empirically validated our model predictions and discovered two drug combinations, remdesivir and reserpine as well as remdesivir and IQ-1S, which display strong antiviral SARS-CoV-2 synergy in vitro. Our approach, which was applied here to address the urgent threat of COVID-19, can be readily extended to other diseases for which a dearth of chemical-chemical combination data exists.


Subject(s)
Antiviral Agents/pharmacology , COVID-19 Drug Treatment , Deep Learning , Adenosine Monophosphate/analogs & derivatives , Alanine/analogs & derivatives , Cell Survival/drug effects , Drug Combinations , Drug Interactions , Drug Synergism , Humans , SARS-CoV-2
8.
J Chem Inf Model ; 63(13): 4030-4041, 2023 07 10.
Article in English | MEDLINE | ID: mdl-37368970

ABSTRACT

Reaction diagram parsing is the task of extracting reaction schemes from a diagram in the chemistry literature. The reaction diagrams can be arbitrarily complex; thus, robustly parsing them into structured data is an open challenge. In this paper, we present RxnScribe, a machine learning model for parsing reaction diagrams of varying styles. We formulate this structured prediction task with a sequence generation approach, which condenses the traditional pipeline into an end-to-end model. We train RxnScribe on a dataset of 1378 diagrams and evaluate it with cross validation, achieving an 80.0% soft match F1 score, with significant improvements over previous models. Our code and data are publicly available at https://github.com/thomas0809/RxnScribe.


Subject(s)
Machine Learning
9.
J Chem Inf Model ; 63(7): 1925-1934, 2023 04 10.
Article in English | MEDLINE | ID: mdl-36971363

ABSTRACT

Molecular structure recognition is the task of translating a molecular image into its graph structure. Significant variation in drawing styles and conventions exhibited in chemical literature poses a significant challenge for automating this task. In this paper, we propose MolScribe, a novel image-to-graph generation model that explicitly predicts atoms and bonds, along with their geometric layouts, to construct the molecular structure. Our model flexibly incorporates symbolic chemistry constraints to recognize chirality and expand abbreviated structures. We further develop data augmentation strategies to enhance the model robustness against domain shifts. In experiments on both synthetic and realistic molecular images, MolScribe significantly outperforms previous models, achieving 76-93% accuracy on public benchmarks. Chemists can also easily verify MolScribe's prediction, informed by its confidence estimation and atom-level alignment with the input image. MolScribe is publicly available through Python and web interfaces: https://github.com/thomas0809/MolScribe.


Subject(s)
Benchmarking , Molecular Structure
10.
Acc Chem Res ; 54(2): 263-270, 2021 01 19.
Article in English | MEDLINE | ID: mdl-33370107

ABSTRACT

Recent advances in computer hardware and software have led to a revolution in deep neural networks that has impacted fields ranging from language translation to computer vision. Deep learning has also impacted a number of areas in drug discovery, including the analysis of cellular images and the design of novel routes for the synthesis of organic molecules. While work in these areas has been impactful, a complete review of the applications of deep learning in drug discovery would be beyond the scope of a single Account. In this Account, we will focus on two key areas where deep learning has impacted molecular design: the prediction of molecular properties and the de novo generation of suggestions for new molecules. One of the most significant advances in the development of quantitative structure-activity relationships (QSARs) has come from the application of deep learning methods to the prediction of the biological activity and physical properties of molecules in drug discovery programs. Rather than employing the expert-derived chemical features typically used to build predictive models, researchers are now using deep learning to develop novel molecular representations. These representations, coupled with the ability of deep neural networks to uncover complex, nonlinear relationships, have led to state-of-the-art performance. While deep learning has changed the way that many researchers approach QSARs, it is not a panacea. As with any other machine learning task, the design of predictive models is dependent on the quality, quantity, and relevance of available data. Seemingly fundamental issues, such as optimal methods for creating a training set, are still open questions for the field. Another critical area that is still the subject of multiple research efforts is the development of methods for assessing the confidence in a model. Deep learning has also contributed to a renaissance in the application of de novo molecule generation. Rather than relying on manually defined heuristics, deep learning methods learn to generate new molecules based on sets of existing molecules. Techniques that were originally developed for areas such as image generation and language translation have been adapted to the generation of molecules. These deep learning methods have been coupled with the predictive models described above and are being used to generate new molecules with specific predicted biological activity profiles. While these generative algorithms appear promising, there have been only a few reports on the synthesis and testing of molecules based on designs proposed by generative models. The evaluation of the diversity, quality, and ultimate value of molecules produced by generative models is still an open question. While the field has produced a number of benchmarks, it has yet to agree on how one should ultimately assess molecules "invented" by an algorithm.

11.
J Chem Inf Model ; 62(9): 2035-2045, 2022 05 09.
Article in English | MEDLINE | ID: mdl-34115937

ABSTRACT

Access to structured chemical reaction data is of key importance for chemists in performing bench experiments and in modern applications like computer-aided drug design. Existing reaction databases are generally populated by human curators through manual abstraction from published literature (e.g., patents and journals), which is time consuming and labor intensive, especially with the exponential growth of chemical literature in recent years. In this study, we focus on developing automated methods for extracting reactions from chemical literature. We consider journal publications as the target source of information, which are more comprehensive and better represent the latest developments in chemistry compared to patents; however, they are less formulaic in their descriptions of reactions. To implement the reaction extraction system, we first devised a chemical reaction schema, primarily including a central product, and a set of associated reaction roles such as reactants, catalyst, solvent, and so on. We formulate the task as a structure prediction problem and solve it with a two-stage deep learning framework consisting of product extraction and reaction role labeling. Both models are built upon Transformer-based encoders, which are adaptively pretrained using domain and task-relevant unlabeled data. Our models are shown to be both effective and data efficient, achieving an F1 score of 76.2% in product extraction and 78.7% in role extraction, with only hundreds of annotated reactions.
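
Note: the F1 scores quoted in this abstract combine precision and recall. The following is a minimal sketch of that calculation for exact-match extraction, not the authors' evaluation code. The function name `f1_score` and the (role, text) pairs are hypothetical and are not taken from the paper's data.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall for exact-match extraction."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # correct share of what was extracted
    recall = true_positives / len(gold)          # extracted share of what was annotated
    return 2 * precision * recall / (precision + recall)


# Hypothetical (role, text) pairs for a single reaction; illustrative only.
gold = {
    ("product", "compound 3a"),
    ("catalyst", "Pd(OAc)2"),
    ("solvent", "THF"),
    ("reactant", "aryl bromide"),
}
predicted = {
    ("product", "compound 3a"),
    ("catalyst", "Pd(OAc)2"),
    ("solvent", "toluene"),  # wrong solvent: counts against precision
}

score = f1_score(predicted, gold)
print(f"F1 = {score:.3f}")
```

Here two of three predictions are correct (precision 2/3) and two of four gold items are recovered (recall 1/2), so the F1 is 4/7.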


Subject(s)
Databases, Factual , Humans
13.
J Chem Inf Model ; 60(8): 3770-3780, 2020 08 24.
Article in English | MEDLINE | ID: mdl-32702986

ABSTRACT

Uncertainty quantification (UQ) is an important component of molecular property prediction, particularly for drug discovery applications where model predictions direct experimental design and where unanticipated imprecision wastes valuable time and resources. The need for UQ is especially acute for neural models, which are becoming increasingly standard yet are challenging to interpret. While several approaches to UQ have been proposed in the literature, there is no clear consensus on the comparative performance of these models. In this paper, we study this question in the context of regression tasks. We systematically evaluate several methods on five regression data sets using multiple complementary performance metrics. Our experiments show that none of the methods we tested is unequivocally superior to all others, and none produces a particularly reliable ranking of errors across multiple data sets. While we believe that these results show that existing UQ methods are not sufficient for all common use cases and further research is needed, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.
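
Note: the abstract does not name its performance metrics. One common way to check whether an uncertainty estimate produces a reliable "ranking of errors" is the Spearman rank correlation between predicted uncertainty and absolute error. That choice is an assumption here, and the numbers below are hypothetical. This sketch assumes no tied values.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks of values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = float(rank)
    return result


def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-molecule values; not from the paper.
predicted_std = [0.1, 0.4, 0.2, 0.8, 0.5]      # model's uncertainty estimate
absolute_error = [0.05, 0.30, 0.35, 0.60, 0.20]  # |prediction - measurement|

rho = spearman(predicted_std, absolute_error)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 would mean the uncertainty estimate orders molecules by error almost perfectly; a value near 0 would mean it carries little ranking information.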


Subject(s)
Drug Discovery , Neural Networks, Computer , Uncertainty
14.
Breast Cancer Res Treat ; 175(1): 1-4, 2019 May.
Article in English | MEDLINE | ID: mdl-30666539

ABSTRACT

PURPOSE: Atypical ductal hyperplasia (ADH) significantly increases the risk of breast cancer in women. However, little is known about the implications of ADH in men. METHODS: Review of 932 males with breast pathology was performed to identify cases of ADH. Patients were excluded if ADH was upgraded to cancer on excision, or if they had contralateral breast cancer. Cases were reviewed to determine whether any male with ADH developed breast cancer. RESULTS: Nineteen males were diagnosed with ADH from June 2003 to September 2018. All had gynecomastia. Surgical procedure was mastectomy in 8 patients and excision/reduction in 11. One patient had their nipple areola complex removed, and 1 required a free nipple graft. Median patient age at ADH diagnosis was 25 years (range 18-72 years). Of the 14 patients with bilateral gynecomastia, 10 had bilateral ADH and 4 had unilateral. Five cases of ADH were described as severe, bordering on ductal carcinoma in situ. No patient reported a family history of breast cancer. No patient took tamoxifen. At a mean follow-up of 75 months (range 4-185 months), no patient developed breast cancer. CONCLUSION: Our study is the first to provide follow-up information for males with ADH. With 6 years of mean follow-up, no male in our series has developed breast cancer. This suggests that either ADH in men does not pose the same risk as ADH in women or that surgical excision of symptomatic gynecomastia in men effectively reduces the risk of breast cancer.


Subject(s)
Gynecomastia/epidemiology , Gynecomastia/pathology , Mammary Glands, Human/pathology , Adolescent , Adult , Aged , Breast Neoplasms, Male/epidemiology , Breast Neoplasms, Male/etiology , Breast Neoplasms, Male/pathology , Carcinoma, Ductal, Breast/epidemiology , Carcinoma, Ductal, Breast/etiology , Follow-Up Studies , Gynecomastia/surgery , Humans , Hyperplasia , Male , Mastectomy , Middle Aged , Public Health Surveillance , Risk , Young Adult
15.
Breast Cancer Res Treat ; 177(3): 741-748, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31317348

ABSTRACT

INTRODUCTION: Bilateral reduction mammoplasty is one of the most common plastic surgery procedures performed in the U.S. This study examines the incidence, management, and prognosis of incidental breast cancer identified in reduction specimens from a large cohort of reduction mammoplasty patients. METHODS: Breast pathology reports were retrospectively reviewed for evidence of incidental cancers in bilateral reduction mammoplasty specimens from five institutions between 1990 and 2017. RESULTS: A total of 4804 women met the inclusion criteria of this study; incidental cancer was identified in 45 breasts of 39 (0.8%) patients. Six patients (15%) had bilateral cancer. Overall, the maximum diagnosis by breast was 16 invasive cancers and 29 ductal carcinomas in situ. Thirty-three patients had unilateral cancer, 15 (45.5%) of whom had high-risk lesions in the contralateral breast. Twenty-one patients underwent mastectomy (12 bilateral and nine unilateral); residual cancer was found in 10 of 25 (40%) therapeutic mastectomies. Seven patients who did not undergo mastectomy received breast radiation. The median follow-up was 92 months. No local recurrences were observed in the patients undergoing mastectomy or radiation. Three of 11 (27%) patients who did not undergo mastectomy or radiation developed a local recurrence. The overall survival rate was 87.2% and disease-free survival was 82.1%. CONCLUSIONS: Patients undergoing reduction mammoplasty for macromastia have a small but definite risk of incidental breast cancer. The high rate of bilateral cancer, contralateral high-risk lesions, and residual disease at mastectomy mandates thorough pathologic evaluation and careful follow-up of these patients. Mastectomy or breast radiation is recommended for local control given the high likelihood of local recurrence without either.


Subject(s)
Breast Neoplasms/epidemiology , Adult , Aged , Breast Neoplasms/diagnosis , Breast Neoplasms/etiology , Breast Neoplasms/surgery , Disease Management , Female , Humans , Incidence , Mammaplasty/methods , Middle Aged , Neoplasm Grading , Public Health Surveillance , Retrospective Studies , Treatment Outcome , Tumor Burden
16.
Breast Cancer Res Treat ; 173(1): 201-207, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30238276

ABSTRACT

PURPOSE: Mammoplasty removes random samples of breast tissue from asymptomatic women, providing a unique method for evaluating the background prevalence of breast pathology in the normal population. Our goal was to identify the rate of atypical breast lesions and cancers in women of various ages in the largest mammoplasty cohort reported to date. METHODS: We analyzed pathologic reports from patients undergoing bilateral mammoplasty, using a natural language processing algorithm, verified by human review. Patients with a prior history of breast cancer or atypia were excluded. RESULTS: A total of 4775 patients were deemed eligible. Median age was 40 (range 13-86) and was higher in patients with any incidental finding compared to patients with normal reports (52 vs. 39 years, p = 0.0001). Pathological findings were detected in 7.06% (337) of procedures. Benign high-risk lesions were found in 299 patients (6.26%). Invasive carcinoma and ductal carcinoma in situ were detected in 15 (0.31%) and 23 (0.48%) patients, respectively. The rate of atypias and cancers increased with age. CONCLUSION: The overall rate of abnormal findings in asymptomatic patients undergoing mammoplasty was 7.06%, increasing with age. As these results are based on a random sample of breast tissue, they likely underestimate the prevalence of abnormal findings in asymptomatic women.
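
Note: the percentages in the RESULTS section follow directly from the reported counts and the 4775 eligible patients. The short sketch below reproduces that arithmetic. The variable names are illustrative, and the counts are copied from the abstract.

```python
ELIGIBLE_PATIENTS = 4775  # denominator reported in the abstract

# Counts reported in the abstract.
counts = {
    "any pathological finding": 337,
    "benign high-risk lesion": 299,
    "invasive carcinoma": 15,
    "ductal carcinoma in situ": 23,
}

# Rate per 100 eligible patients, rounded to two decimals as in the abstract.
rates = {name: round(100 * n / ELIGIBLE_PATIENTS, 2) for name, n in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}%")

# The three specific categories add up to the overall count.
specific_total = sum(n for name, n in counts.items() if name != "any pathological finding")
```

The three specific categories (299 + 15 + 23) sum to the 337 overall findings, which is consistent with each patient being counted under a single most significant diagnosis.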


Subject(s)
Breast Neoplasms/epidemiology , Mammaplasty , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Breast/pathology , Breast Neoplasms/pathology , Cohort Studies , Female , Humans , Incidental Findings , Massachusetts/epidemiology , Middle Aged , Precancerous Conditions/pathology , Prevalence
17.
Radiology ; 293(1): 38-46, 2019 10.
Article in English | MEDLINE | ID: mdl-31385754

ABSTRACT

Background Recent deep learning (DL) approaches have shown promise in improving sensitivity but have not addressed limitations in radiologist specificity or efficiency. Purpose To develop a DL model to triage a portion of mammograms as cancer free, improving performance and workflow efficiency. Materials and Methods In this retrospective study, 223 109 consecutive screening mammograms performed in 66 661 women from January 2009 to December 2016 were collected with cancer outcomes obtained through linkage to a regional tumor registry. This cohort was split by patient into 212 272, 25 999, and 26 540 mammograms from 56 831, 7021, and 7176 patients for training, validation, and testing, respectively. A DL model was developed to triage mammograms as cancer free and evaluated on the test set. A DL-triage workflow was simulated in which radiologists skipped mammograms triaged as cancer free (interpreting them as negative for cancer) and read mammograms not triaged as cancer free by using the original interpreting radiologists' assessments. Sensitivities, specificities, and percentage of mammograms read were calculated, with and without the DL-triage-simulated workflow. Statistics were computed across 5000 bootstrap samples to assess confidence intervals (CIs). Specificities were compared by using a two-tailed t test (P < .05) and sensitivities were compared by using a one-sided t test with a noninferiority margin of 5% (P < .05). Results The test set included 7176 women (mean age, 57.8 years ± 10.9 [standard deviation]). When reading all mammograms, radiologists obtained a sensitivity and specificity of 90.6% (173 of 191; 95% CI: 86.6%, 94.7%) and 93.5% (24 625 of 26 349; 95% CI: 93.3%, 93.9%). In the DL-simulated workflow, the radiologists obtained a sensitivity and specificity of 90.1% (172 of 191; 95% CI: 86.0%, 94.3%) and 94.2% (24 814 of 26 349; 95% CI: 94.0%, 94.6%) while reading 80.7% (21 420 of 26 540) of the mammograms. 
The simulated workflow improved specificity (P = .002) and obtained a noninferior sensitivity with a margin of 5% (P < .001). Conclusion This deep learning model has the potential to reduce radiologist workload and significantly improve specificity without harming sensitivity. © RSNA, 2019 Online supplemental material is available for this article. See also the editorial by Kontos and Conant in this issue.
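
Note: the sensitivity, specificity, and reading-volume figures above can be recomputed from the counts given in the abstract. This is a minimal arithmetic sketch, not the study's analysis code. The helper `pct` and the dictionary names are illustrative.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


# Counts from the abstract: 191 cancers and 26 349 cancer-free examinations in the test set.
all_read = {
    "sensitivity": pct(173, 191),      # cancers flagged when every mammogram is read
    "specificity": pct(24625, 26349),  # cancer-free examinations correctly called negative
}
dl_triage = {
    "sensitivity": pct(172, 191),
    "specificity": pct(24814, 26349),
    "mammograms_read": pct(21420, 26540),
}

extra_true_negatives = 24814 - 24625  # fewer false positives under triage
extra_missed_cancers = 173 - 172      # cancers lost by skipping triaged examinations

print(all_read, dl_triage, extra_true_negatives, extra_missed_cancers)
```

In these counts the simulated workflow trades one additional missed cancer for 189 fewer false-positive assessments while radiologists read about four fifths of the examinations.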


Subject(s)
Breast Neoplasms/diagnostic imaging , Deep Learning , Image Interpretation, Computer-Assisted/methods , Mammography/methods , Triage/methods , Adult , Aged , Aged, 80 and over , Breast/diagnostic imaging , Cohort Studies , Computer Simulation , Female , Humans , Middle Aged , Registries , Retrospective Studies
18.
Radiology ; 292(1): 60-66, 2019 07.
Article in English | MEDLINE | ID: mdl-31063083

ABSTRACT

Background Mammographic density improves the accuracy of breast cancer risk models. However, the use of breast density is limited by subjective assessment, variation across radiologists, and restricted data. A mammography-based deep learning (DL) model may provide more accurate risk prediction. Purpose To develop a mammography-based DL breast cancer risk model that is more accurate than established clinical breast cancer risk models. Materials and Methods This retrospective study included 88 994 consecutive screening mammograms in 39 571 women between January 1, 2009, and December 31, 2012. For each patient, all examinations were assigned to either training, validation, or test sets, resulting in 71 689, 8554, and 8751 examinations, respectively. Cancer outcomes were obtained through linkage to a regional tumor registry. By using risk factor information from patient questionnaires and electronic medical records review, three models were developed to assess breast cancer risk within 5 years: a risk-factor-based logistic regression model (RF-LR) that used traditional risk factors, a DL model (image-only DL) that used mammograms alone, and a hybrid DL model that used both traditional risk factors and mammograms. Comparisons were made to an established breast cancer risk model that included breast density (Tyrer-Cuzick model, version 8 [TC]). Model performance was compared by using areas under the receiver operating characteristic curve (AUCs) with DeLong test (P < .05). Results The test set included 3937 women, aged 56.20 years ± 10.04. Hybrid DL and image-only DL showed AUCs of 0.70 (95% confidence interval [CI]: 0.66, 0.75) and 0.68 (95% CI: 0.64, 0.73), respectively. RF-LR and TC showed AUCs of 0.67 (95% CI: 0.62, 0.72) and 0.62 (95% CI: 0.57, 0.66), respectively. Hybrid DL showed a significantly higher AUC (0.70) than TC (0.62; P < .001) and RF-LR (0.67; P = .01). 
Conclusion Deep learning models that use full-field mammograms yield substantially improved risk discrimination compared with the Tyrer-Cuzick (version 8) model. © RSNA, 2019 Online supplemental material is available for this article. See also the editorial by Sitek and Wolfe in this issue.
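
Note: the AUC values compared in this abstract measure how often a model ranks a woman who develops cancer above one who does not. The sketch below computes AUC from that pairwise definition. The risk scores are hypothetical, and the DeLong test used in the study is not reproduced here.

```python
def auc(scores_positive, scores_negative):
    """Probability that a random positive case outscores a random negative case.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted 5-year risks; not taken from the study.
cancer_within_5_years = [0.9, 0.6, 0.4]
no_cancer = [0.7, 0.5, 0.3, 0.2]

value = auc(cancer_within_5_years, no_cancer)
print(f"AUC = {value:.2f}")
```

Of the 12 positive-negative pairs here, 9 are ranked correctly, giving an AUC of 0.75. An AUC of 0.5 corresponds to random ranking.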


Subject(s)
Breast Neoplasms/diagnostic imaging , Deep Learning , Mammography/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Adult , Aged , Aged, 80 and over , Breast/diagnostic imaging , Female , Humans , Middle Aged , Reproducibility of Results , Retrospective Studies , Risk Assessment
19.
Radiology ; 290(1): 52-58, 2019 01.
Article in English | MEDLINE | ID: mdl-30325282

ABSTRACT

Purpose To develop a deep learning (DL) algorithm to assess mammographic breast density. Materials and Methods In this retrospective study, a deep convolutional neural network was trained to assess Breast Imaging Reporting and Data System (BI-RADS) breast density based on the original interpretation by an experienced radiologist of 41 479 digital screening mammograms obtained in 27 684 women from January 2009 to May 2011. The resulting algorithm was tested on a held-out test set of 8677 mammograms in 5741 women. In addition, five radiologists performed a reader study on 500 mammograms randomly selected from the test set. Finally, the algorithm was implemented in routine clinical practice, where eight radiologists reviewed 10 763 consecutive mammograms assessed with the model. Agreement on BI-RADS category for the DL model and for three sets of readings, (a) radiologists in the test set, (b) radiologists working in consensus in the reader study set, and (c) radiologists in the clinical implementation set, were estimated with linear-weighted κ statistics and were compared across 5000 bootstrap samples to assess significance. Results The DL model showed good agreement with radiologists in the test set (κ = 0.67; 95% confidence interval [CI]: 0.66, 0.68) and with radiologists in consensus in the reader study set (κ = 0.78; 95% CI: 0.73, 0.82). There was very good agreement (κ = 0.85; 95% CI: 0.84, 0.86) with radiologists in the clinical implementation set; for binary categorization of dense or nondense breasts, 10 149 of 10 763 (94%; 95% CI: 94%, 95%) DL assessments were accepted by the interpreting radiologist. Conclusion This DL model can be used to assess mammographic breast density at the level of an experienced mammographer. © RSNA, 2018 Online supplemental material is available for this article. See also the editorial by Chan and Helvie in this issue.
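
Note: linear-weighted κ penalizes a disagreement in proportion to how many ordered categories apart the two readings are. The sketch below implements the standard formula for two raters. The encoding of the four BI-RADS density categories as integers 0 to 3 and the example readings are assumptions for illustration, and the bootstrap comparison from the study is omitted.

```python
def linear_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with linear weights |i - j| / (k - 1) on disagreements."""
    n = len(rater_a)
    k = n_categories
    # Joint distribution of the two raters' category assignments.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Weighted disagreement actually seen versus expected by chance.
    disagree_obs = sum(
        abs(i - j) / (k - 1) * observed[i][j] for i in range(k) for j in range(k)
    )
    disagree_exp = sum(
        abs(i - j) / (k - 1) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    return 1.0 - disagree_obs / disagree_exp


# Hypothetical density readings (0 = least dense, 3 = most dense).
model_reading = [0, 1, 1, 2, 2, 3, 3, 1]
radiologist_reading = [0, 1, 2, 2, 2, 3, 2, 1]

kappa = linear_weighted_kappa(model_reading, radiologist_reading, n_categories=4)
print(f"linear-weighted kappa = {kappa:.3f}")
```

Both disagreements in this example are off by a single category, so they are penalized at one third of the weight that a three-category miss would receive.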


Subject(s)
Breast/diagnostic imaging , Deep Learning , Mammography/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Adult , Aged , Aged, 80 and over , Algorithms , Breast Density/physiology , Databases, Factual , Female , Humans , Middle Aged
20.
AJR Am J Roentgenol ; 213(1): 227-233, 2019 Jul.
Article in English | MEDLINE | ID: mdl-30933651

ABSTRACT

OBJECTIVE. The purpose of this study is to develop an image-based deep learning (DL) model to predict the 5-year risk of breast cancer on the basis of a single breast MR image from a screening examination. MATERIALS AND METHODS. We collected 1656 consecutive breast MR images from screening examinations performed for 1183 high-risk women from January 2011 to June 2013, to predict the risk of cancer developing within 5 years of the screening. Women who lacked a 5-year screening follow-up examination and women who had cancer other than primary breast cancer develop in their breast were excluded from the study. We developed a logistic regression model based on traditional risk factors (the risk factor logistic regression [RF-LR] model) and a DL model based on the MR image alone (the Image-DL model). Examinations occurring within 6 months of a cancer diagnosis were excluded from the testing sets in each fold of cross-validation. We compared our models against the Tyrer-Cuzick (TC) model. All models were evaluated using mean (± SD) AUC values and observed-to-expected (OE) ratios across 10-fold cross-validation. RESULTS. The RF-LR and Image-DL models achieved mean AUC values of 0.558 ± 0.108 and 0.638 ± 0.094, respectively. In contrast, the TC model achieved an AUC value of 0.493 ± 0.092. The Image-DL and RF-LR models achieved mean OE ratios of 0.993 ± 0.658 and 0.828 ± 0.181, compared with the mean OE ratio of 1.091 ± 0.255 obtained using the TC model. CONCLUSION. Our DL model can assess the 5-year cancer risk on the basis of a breast MR image alone, and it showed improved individual risk discrimination when compared with a state-of-the-art risk assessment model. These results offer promising preliminary data regarding the potential of image-based risk assessment models to support more personalized care.
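
Note: an observed-to-expected (OE) ratio compares the number of cancers that occurred with the number a model predicted, and the abstract reports its mean ± SD across cross-validation folds. The sketch below shows one plausible way to compute it. The fold data are hypothetical, only three folds are shown instead of ten, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def oe_ratio(outcomes, predicted_risks):
    """Observed event count divided by the sum of predicted probabilities."""
    return sum(outcomes) / sum(predicted_risks)


# Hypothetical folds: (observed 0/1 outcomes, predicted 5-year risks).
folds = [
    ([1, 0, 0, 0], [0.5, 0.25, 0.125, 0.125]),  # 1 observed, 1.0 expected
    ([0, 1, 1, 0], [0.5, 0.5, 0.25, 0.25]),     # 2 observed, 1.5 expected
    ([0, 0, 1, 0], [0.25, 0.25, 0.5, 0.25]),    # 1 observed, 1.25 expected
]

ratios = [oe_ratio(outcomes, risks) for outcomes, risks in folds]
mean_ratio = mean(ratios)
sd_ratio = stdev(ratios)  # sample SD across folds (assumed convention)

print(f"OE ratio = {mean_ratio:.3f} ± {sd_ratio:.3f}")
```

A ratio near 1 indicates good calibration on average; values above 1 mean more cancers occurred than predicted, and values below 1 mean fewer.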
