ABSTRACT
Although a few cancer genes are mutated in a high proportion of tumours of a given type (>20%), most are mutated at intermediate frequencies (2-20%). To explore the feasibility of creating a comprehensive catalogue of cancer genes, we analysed somatic point mutations in exome sequences from 4,742 human cancers and their matched normal-tissue samples across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Our analysis also identified 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600-5,000 samples per tumour type, depending on background mutation frequency. The results may help to guide the next stage of cancer genomics.
Subjects
Genes, Neoplasm/genetics , Neoplasms/classification , Neoplasms/genetics , Apoptosis/genetics , Case-Control Studies , Cell Proliferation , Chromatin/genetics , DNA Mutational Analysis , Exome/genetics , Genome, Human/genetics , Genomic Instability/genetics , Genomics , Humans , Immune Evasion/genetics , Mutation Rate , Neoplasms/pathology , Point Mutation/genetics , RNA Processing, Post-Transcriptional/genetics , Sample Size
ABSTRACT
Major international projects are underway that are aimed at creating a comprehensive catalogue of all the genes responsible for the initiation and progression of cancer. These studies involve the sequencing of matched tumour-normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false-positive findings that overshadow true driver events. We show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumour-normal pairs and discover extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology, and in mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer.
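The idea of flagging genes in which mutations "occur more frequently than expected by random chance" can be illustrated with a toy calculation. The sketch below is not MutSigCV: it assumes a simple Poisson model, and the gene length, cohort size, mutation count and background rates are invented for illustration. It only shows why a single average background rate can make a gene look significant when its local background rate is in fact higher.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), summing the upper tail directly."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k <= 0:
        return 1.0
    total = 0.0
    i = k
    while True:
        # Work with the log of the probability mass to avoid overflow.
        log_pmf = -lam + i * math.log(lam) - math.lgamma(i + 1)
        term = math.exp(log_pmf)
        total += term
        # Beyond the mean the terms shrink monotonically; stop once negligible.
        if i > lam and term <= 1e-17 * total:
            break
        i += 1
    return min(total, 1.0)


def gene_p_value(n_mutations, coding_length_bp, n_samples, rate_per_bp):
    """P-value for seeing at least n_mutations under a given background rate."""
    expected = coding_length_bp * n_samples * rate_per_bp
    return poisson_tail(n_mutations, expected)


# Hypothetical gene: 3,000 coding bases, 500 tumours, 20 observed mutations.
p_uniform = gene_p_value(20, 3000, 500, 3e-6)    # cohort-wide average rate
p_local = gene_p_value(20, 3000, 500, 1.2e-5)    # assumed 4x higher local rate
print(p_uniform, p_local)
```

Under the average rate the hypothetical gene appears strongly significant, whereas under the higher local rate the same mutation count is unremarkable.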
Subjects
Genetic Heterogeneity , Mutation/genetics , Neoplasms/genetics , Oncogenes/genetics , Artifacts , DNA Replication Timing , Exome/genetics , False Positive Reactions , Gene Expression , Genome, Human/genetics , Humans , Lung Neoplasms/genetics , Mutation Rate , Neoplasms/classification , Neoplasms/pathology , Neoplasms, Squamous Cell/genetics , Reproducibility of Results , Sample Size
ABSTRACT
BACKGROUND: The incidence of hematologic cancers increases with age. These cancers are associated with recurrent somatic mutations in specific genes. We hypothesized that such mutations would be detectable in the blood of some persons who are not known to have hematologic disorders. METHODS: We analyzed whole-exome sequencing data from DNA in the peripheral-blood cells of 17,182 persons who were unselected for hematologic phenotypes. We looked for somatic mutations by identifying previously characterized single-nucleotide variants and small insertions or deletions in 160 genes that are recurrently mutated in hematologic cancers. The presence of mutations was analyzed for an association with hematologic phenotypes, survival, and cardiovascular events. RESULTS: Detectable somatic mutations were rare in persons younger than 40 years of age but rose appreciably in frequency with age. Among persons 70 to 79 years of age, 80 to 89 years of age, and 90 to 108 years of age, these clonal mutations were observed in 9.5% (219 of 2300 persons), 11.7% (37 of 317), and 18.4% (19 of 103), respectively. The majority of the variants occurred in three genes: DNMT3A, TET2, and ASXL1. The presence of a somatic mutation was associated with an increase in the risk of hematologic cancer (hazard ratio, 11.1; 95% confidence interval [CI], 3.9 to 32.6), an increase in all-cause mortality (hazard ratio, 1.4; 95% CI, 1.1 to 1.8), and increases in the risks of incident coronary heart disease (hazard ratio, 2.0; 95% CI, 1.2 to 3.4) and ischemic stroke (hazard ratio, 2.6; 95% CI, 1.4 to 4.8). CONCLUSIONS: Age-related clonal hematopoiesis is a common condition that is associated with increases in the risk of hematologic cancer and in all-cause mortality, with the latter possibly due to an increased risk of cardiovascular disease. (Funded by the National Institutes of Health and others.).
Subjects
Blood , Cell Transformation, Neoplastic/genetics , Hematologic Neoplasms/genetics , Hematopoiesis , Hematopoietic Stem Cells/physiology , Mutation , Adult , Age Factors , Aged , Aged, 80 and over , Clone Cells , DNA Mutational Analysis , Exome , Humans , Middle Aged , Risk Factors , Young Adult
ABSTRACT
The most common mutation in human melanoma, BRAF(V600E), activates the serine/threonine kinase BRAF and causes excessive activity in the mitogen-activated protein kinase pathway. BRAF(V600E) mutations are also present in benign melanocytic naevi, highlighting the importance of additional genetic alterations in the genesis of malignant tumours. Such changes include recurrent copy number variations that result in the amplification of oncogenes. For certain amplifications, the large number of genes in the interval has precluded an understanding of the cooperating oncogenic events. Here we have used a zebrafish melanoma model to test genes in a recurrently amplified region of chromosome 1 for the ability to cooperate with BRAF(V600E) and accelerate melanoma. SETDB1, an enzyme that methylates histone H3 on lysine 9 (H3K9), was found to accelerate melanoma formation significantly in zebrafish. Chromatin immunoprecipitation coupled with massively parallel DNA sequencing and gene expression analyses uncovered genes, including HOX genes, that are transcriptionally dysregulated in response to increased levels of SETDB1. Our studies establish SETDB1 as an oncogene in melanoma and underscore the role of chromatin factors in regulating tumorigenesis.
Subjects
DNA Copy Number Variations/genetics , Gene Amplification/genetics , Histone-Lysine N-Methyltransferase/genetics , Melanoma/genetics , Melanoma/pathology , Protein Methyltransferases/genetics , Protein Methyltransferases/metabolism , Age of Onset , Amino Acid Substitution , Animals , Animals, Genetically Modified , Cell Transformation, Neoplastic/genetics , Chromatin Immunoprecipitation , Chromosomes, Human, Pair 1/genetics , Disease Models, Animal , Gene Expression Profiling , Gene Expression Regulation, Neoplastic/genetics , Genes, Homeobox/genetics , Histone Methyltransferases , Histone-Lysine N-Methyltransferase/metabolism , Humans , Melanocytes/cytology , Melanocytes/enzymology , Melanocytes/metabolism , Melanocytes/pathology , Melanoma/enzymology , Nevus/enzymology , Oncogenes/genetics , Proto-Oncogene Proteins B-raf/chemistry , Proto-Oncogene Proteins B-raf/genetics , Proto-Oncogene Proteins B-raf/metabolism , Zebrafish/genetics
ABSTRACT
Genomic studies have identified somatic alterations in the majority of myeloproliferative neoplasm (MPN) patients, including JAK2 mutations in the majority of MPN patients and CALR mutations in JAK2-negative MPN patients. However, the role of JAK-STAT pathway activation in different MPNs, and in patients without JAK2 mutations, has not been definitively delineated. We used expression profiling, single nucleotide polymorphism arrays, and mutational profiling to investigate a well-characterized cohort of MPN patients. MPN patients with homozygous JAK2V617F mutations were characterized by a distinctive transcriptional profile. Notably, a transcriptional signature consistent with activated JAK2 signaling is seen in all MPN patients regardless of clinical phenotype or mutational status. In addition, the activated JAK2 signature was present in patients with somatic CALR mutations. Conversely, we identified a gene expression signature of CALR mutations; this signature was significantly enriched in JAK2-mutant MPN patients, consistent with a shared mechanism of transformation by JAK2 and CALR mutations. We also identified a transcriptional signature of TET2 mutations in MPN patient samples. Our data indicate that MPN patients, regardless of diagnosis or JAK2 mutational status, are characterized by a distinct gene expression signature with upregulation of JAK-STAT target genes, demonstrating the central importance of the JAK-STAT pathway in MPN pathogenesis.
Subjects
Genomics , Janus Kinases/metabolism , Myeloproliferative Disorders/genetics , Myeloproliferative Disorders/metabolism , STAT Transcription Factors/metabolism , Signal Transduction , Calreticulin , Case-Control Studies , Cell Transformation, Neoplastic/genetics , Cell Transformation, Neoplastic/metabolism , Cluster Analysis , Female , Gene Expression Profiling , Homozygote , Humans , Janus Kinase 2/genetics , Janus Kinases/genetics , Male , Mutation , STAT Transcription Factors/genetics , Transcriptome
ABSTRACT
A powerful way to discover key genes with causal roles in oncogenesis is to identify genomic regions that undergo frequent alteration in human cancers. Here we present high-resolution analyses of somatic copy-number alterations (SCNAs) from 3,131 cancer specimens, belonging largely to 26 histological types. We identify 158 regions of focal SCNA that are altered at significant frequency across several cancer types, of which 122 cannot be explained by the presence of a known cancer target gene located within these regions. Several gene families are enriched among these regions of focal SCNA, including the BCL2 family of apoptosis regulators and the NF-κB pathway. We show that cancer cells containing amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend on the expression of these genes for survival. Finally, we demonstrate that a large majority of SCNAs identified in individual cancer types are present in several cancer types.
Subjects
DNA Copy Number Variations/genetics , Gene Dosage/genetics , Neoplasms/genetics , Apoptosis/genetics , Cell Line, Tumor , Cell Survival/genetics , Gene Amplification/genetics , Genomics , Humans , Multigene Family/genetics , Myeloid Cell Leukemia Sequence 1 Protein , Neoplasms/classification , Neoplasms/pathology , Proto-Oncogene Proteins c-bcl-2/genetics , Signal Transduction , bcl-X Protein/genetics
ABSTRACT
A comprehensive understanding of the molecular vulnerabilities of every type of cancer will provide a powerful roadmap to guide therapeutic approaches. Efforts such as The Cancer Genome Atlas Project will identify genes with aberrant copy number, sequence, or expression in various cancer types, providing a survey of the genes that may have a causal role in cancer. A complementary approach is to perform systematic loss-of-function studies to identify essential genes in particular cancer cell types. We have begun a systematic effort, termed Project Achilles, aimed at identifying genetic vulnerabilities across large numbers of cancer cell lines. Here, we report the assessment of the essentiality of 11,194 genes in 102 human cancer cell lines. We show that the integration of these functional data with information derived from surveying cancer genomes pinpoints known and previously undescribed lineage-specific dependencies across a wide spectrum of cancers. In particular, we found 54 genes that are specifically essential for the proliferation and viability of ovarian cancer cells and also amplified in primary tumors or differentially overexpressed in ovarian cancer cell lines. One such gene, PAX8, is focally amplified in 16% of high-grade serous ovarian cancers and expressed at higher levels in ovarian tumors. Suppression of PAX8 selectively induces apoptotic cell death of ovarian cancer cells. These results identify PAX8 as an ovarian lineage-specific dependency. More generally, these observations demonstrate that the integration of genome-scale functional and structural studies provides an efficient path to identify dependencies of specific cancer types on particular genes and pathways.
Subjects
Ovarian Neoplasms/genetics , Alcohol Oxidoreductases , Base Sequence , Cell Line, Tumor , Cell Proliferation , Cell Survival/genetics , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Oncogenes , Ovarian Neoplasms/pathology , PAX8 Transcription Factor , Paired Box Transcription Factors/genetics , RNA, Neoplasm/genetics , RNA, Small Interfering/genetics
ABSTRACT
Background: Artificial intelligence (AI) has repeatedly been shown to encode historical inequities in healthcare. We aimed to develop a framework to quantitatively assess the performance equity of health AI technologies and to illustrate its utility via a case study. Methods: Here, we propose a methodology to assess whether health AI technologies prioritise performance for patient populations experiencing worse outcomes, that is complementary to existing fairness metrics. We developed the Health Equity Assessment of machine Learning performance (HEAL) framework designed to quantitatively assess the performance equity of health AI technologies via a four-step interdisciplinary process to understand and quantify domain-specific criteria, and the resulting HEAL metric. As an illustrative case study (analysis conducted between October 2022 and January 2023), we applied the HEAL framework to a dermatology AI model. A set of 5420 teledermatology cases (store-and-forward cases from patients of 20 years or older, submitted from primary care providers in the USA and skin cancer clinics in Australia), enriched for diversity in age, sex and race/ethnicity, was used to retrospectively evaluate the AI model's HEAL metric, defined as the likelihood that the AI model performs better for subpopulations with worse average health outcomes as compared to others. The likelihood that AI performance was anticorrelated to pre-existing health outcomes was estimated using bootstrap methods as the probability that the negated Spearman's rank correlation coefficient (i.e., "R") was greater than zero. Positive values of R suggest that subpopulations with poorer health outcomes have better AI model performance. Thus, the HEAL metric, defined as p (R >0), measures how likely the AI technology is to prioritise performance for subpopulations with worse average health outcomes as compared to others (presented as a percentage below). 
Health outcomes were quantified as disability-adjusted life years (DALYs) when grouping by sex and age, and years of life lost (YLLs) when grouping by race/ethnicity. AI performance was measured as top-3 agreement with the reference diagnosis from a panel of 3 dermatologists per case. Findings: Across all dermatologic conditions, the HEAL metric was 80.5% for prioritizing AI performance of racial/ethnic subpopulations based on YLLs, and 92.1% and 0.0% respectively for prioritizing AI performance of sex and age subpopulations based on DALYs. Certain dermatologic conditions were significantly associated with greater AI model performance compared to a reference category of less common conditions. For skin cancer conditions, the HEAL metric was 73.8% for prioritizing AI performance of age subpopulations based on DALYs. Interpretation: Analysis using the proposed HEAL framework showed that the dermatology AI model prioritised performance for race/ethnicity, sex (all conditions) and age (cancer conditions) subpopulations with respect to pre-existing health disparities. More work is needed to investigate ways of promoting equitable AI performance across age for non-cancer conditions and to better understand how AI models can contribute towards improving equity in health outcomes. Funding: Google LLC.
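The HEAL metric as defined in this abstract, p(R > 0) with R the negated Spearman rank correlation between health outcomes and AI performance, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the bootstrap resamples cases within each subpopulation, that performance is the fraction of correct cases, and that the health score is oriented so that higher means better health (for a burden measure such as DALYs, pass the negated value). The function names and the example numbers are hypothetical.

```python
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # correlation undefined; treated as zero here (assumption)
    return cov / (vx * vy) ** 0.5


def heal_metric(health, correct_by_group, n_boot=1000, seed=0):
    """Estimate p(R > 0), where R = -spearman(health score, AI performance).

    health: dict mapping group -> health score (higher = better health).
    correct_by_group: dict mapping group -> list of 0/1 per-case correctness.
    """
    rng = random.Random(seed)
    groups = sorted(health)
    outcome = [health[g] for g in groups]
    hits = 0
    for _ in range(n_boot):
        performance = []
        for g in groups:
            cases = correct_by_group[g]
            resample = rng.choices(cases, k=len(cases))
            performance.append(sum(resample) / len(resample))
        if -spearman(outcome, performance) > 0:
            hits += 1
    return hits / n_boot


# Hypothetical burden (e.g. DALYs) per group, negated so higher = better health.
health = {"A": -10.0, "B": -20.0, "C": -30.0}
correct = {
    "A": [1] * 100 + [0] * 100,  # 50% correct in the healthiest group
    "B": [1] * 140 + [0] * 60,   # 70% correct
    "C": [1] * 180 + [0] * 20,   # 90% correct in the group with worst outcomes
}
print(heal_metric(health, correct))
```

In this invented example the model performs best for the group with the worst outcomes, so the estimated metric is close to 1; swapping the performance of groups A and C drives it towards 0.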
ABSTRACT
BACKGROUND: Presence of lymph node metastasis (LNM) influences prognosis and clinical decision-making in colorectal cancer. However, detection of LNM is variable and depends on a number of external factors. Deep learning has shown success in computational pathology, but has struggled to boost performance when combined with known predictors. METHODS: Machine-learned features are created by clustering deep learning embeddings of small patches of tumor in colorectal cancer via k-means, and then selecting the top clusters that add predictive value to a logistic regression model when combined with known baseline clinicopathological variables. We then analyze performance of logistic regression models trained with and without these machine-learned features in combination with the baseline variables. RESULTS: The machine-learned extracted features provide independent signal for the presence of LNM (AUROC: 0.638, 95% CI: [0.590, 0.683]). Furthermore, the machine-learned features add predictive value to the set of 6 clinicopathologic variables in an external validation set (likelihood ratio test, p < 0.00032; AUROC: 0.740, 95% CI: [0.701, 0.780]). A model incorporating these features can also further risk-stratify patients with and without identified metastasis (p < 0.001 for both stage II and stage III). CONCLUSION: This work demonstrates an effective approach to combine deep learning with established clinicopathologic factors in order to identify independently informative features associated with LNM. Further work building on these specific results may have important impact in prognostication and therapeutic decision making for LNM. Additionally, this general computational approach may prove useful in other contexts.
When colorectal cancers spread to the lymph nodes, it can indicate a poorer prognosis. However, detecting lymph node metastasis (spread) can be difficult and depends on a number of factors such as how samples are taken and processed. Here, we show that machine learning, which involves computer software learning from patterns in data, can predict lymph node metastasis in patients with colorectal cancer from the microscopic appearance of their primary tumor and the clinical characteristics of the patients. We also show that the same approach can predict patient survival. With further work, our approach may help clinicians to inform patients about their prognosis and decide on appropriate treatments.
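One plausible way to turn patch-level cluster assignments into case-level features for a logistic regression model is to record, for each case, the fraction of its patches that fall into each cluster. The sketch below assumes that the k-means centroids have already been fitted and that the case-level feature is this fraction; both are assumptions for illustration, since the abstract does not specify how clusters are quantified. The function names and the toy two-dimensional embeddings are hypothetical.

```python
def nearest_centroid(embedding, centroids):
    """Return the index of the centroid closest to the embedding (squared L2)."""
    best_index, best_distance = 0, float("inf")
    for index, centroid in enumerate(centroids):
        distance = sum((e - c) ** 2 for e, c in zip(embedding, centroid))
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def case_cluster_fractions(patch_embeddings, centroids):
    """Fraction of a case's patches assigned to each cluster."""
    counts = [0] * len(centroids)
    for embedding in patch_embeddings:
        counts[nearest_centroid(embedding, centroids)] += 1
    total = len(patch_embeddings)
    if total == 0:
        return [0.0] * len(centroids)
    return [count / total for count in counts]


# Toy example: two centroids and four patch embeddings from one case.
centroids = [(0.0, 0.0), (10.0, 10.0)]
patches = [(0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (0.0, 0.0)]
features = case_cluster_fractions(patches, centroids)
print(features)
```

The resulting vector could then be appended to the baseline clinicopathological variables before fitting the logistic regression.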
ABSTRACT
Histologic grading of breast cancer involves review and scoring of three well-established morphologic features: mitotic count, nuclear pleomorphism, and tubule formation. Taken together, these features form the basis of the Nottingham Grading System which is used to inform breast cancer characterization and prognosis. In this study, we develop deep learning models to perform histologic scoring of all three components using digitized hematoxylin and eosin-stained slides containing invasive breast carcinoma. We first evaluate model performance using pathologist-based reference standards for each component. To complement this typical approach to evaluation, we further evaluate the deep learning models via prognostic analyses. The individual component models perform at or above published benchmarks for algorithm-based grading approaches, achieving high concordance rates with pathologist grading. Further, prognostic performance using deep learning-based grading is on par with that of pathologists performing review of matched slides. By providing scores for each component feature, the deep-learning based approach also provides the potential to identify the grading components contributing most to prognostic value. This may enable optimized prognostic models, opportunities to improve access to consistent grading, and approaches to better understand the links between histologic features and clinical outcomes in breast cancer.
ABSTRACT
Artificial intelligence (AI) has shown promise for diagnosing prostate cancer in biopsies. However, results have been limited to individual studies, lacking validation in multinational settings. Competitions have been shown to be accelerators for medical imaging innovations, but their impact is hindered by lack of reproducibility and independent validation. With this in mind, we organized the PANDA challenge (the largest histopathology competition to date, joined by 1,290 developers) to catalyze development of reproducible AI algorithms for Gleason grading using 10,616 digitized prostate biopsies. We validated that a diverse set of submitted algorithms reached pathologist-level performance on independent cross-continental cohorts, fully blinded to the algorithm developers. On United States and European external validation sets, the algorithms achieved agreements of 0.862 (quadratically weighted κ, 95% confidence interval (CI), 0.840-0.884) and 0.868 (95% CI, 0.835-0.900) with expert uropathologists. Successful generalization across different patient populations, laboratories and reference standards, achieved by a variety of algorithmic approaches, warrants evaluating AI-based Gleason grading in prospective clinical trials.
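The agreement statistic reported here, the quadratically weighted κ, penalises disagreements by the squared distance between ordinal grades. A minimal sketch of the standard formula follows; the function name and the example grades are illustrative and are not taken from the challenge code.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with quadratic weights for grades coded 0..n_categories-1."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    k = n_categories
    # Confusion matrix of observed grade pairs.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # chance agreement count
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    if weighted_expected == 0:
        return 1.0  # both raters used one identical category throughout
    return 1.0 - weighted_observed / weighted_expected


# Illustrative grades from two raters on an ordinal 0-5 scale.
algorithm = [0, 1, 2, 3, 4, 5, 2, 3]
pathologist = [0, 1, 2, 3, 5, 5, 1, 3]
print(quadratic_weighted_kappa(algorithm, pathologist, 6))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; near-miss disagreements between adjacent grades reduce the score far less than distant ones.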
Subjects
Neoplasm Grading , Prostatic Neoplasms/pathology , Algorithms , Biopsy , Cohort Studies , Humans , Male , Prostatic Neoplasms/diagnosis , Reproducibility of Results
ABSTRACT
Recent advances in artificial intelligence show tremendous promise to improve the accuracy, reproducibility, and availability of medical diagnostics across a number of medical subspecialities. This is especially true in the field of digital pathology, which has recently witnessed a surge in publications describing state-of-the-art performance for machine learning models across a wide range of diagnostic applications. Nonetheless, despite this promise, there remain significant gaps in translating applications for any of these technologies into actual clinical practice. In this review, we will first give a brief overview of the recent progress in applying AI to digitized pathology images, focusing on how these tools might be applied in clinical workflows in the near term to improve the accuracy and efficiency of pathologists. Then we define and describe in detail the various factors that need to be addressed in order to successfully close the "translation gap" for AI applications in digital pathology.
Subjects
Artificial Intelligence/trends , Diagnosis , Diagnostic Techniques and Procedures/trends , Machine Learning/trends , Humans
ABSTRACT
During the last decade, a dramatic rise in the development and application of artificial intelligence (AI) tools for use in pathology services has occurred. This trend is often expected to continue and reshape the field of pathology in the coming years. The deployment of computational pathology and applications of AI tools can be considered as a paradigm shift that will change pathology services, making them more efficient and capable of meeting the needs of this era of precision medicine. Despite the success of AI models, the translational process from discovery to clinical applications has been slow. The gap between self-contained research and clinical environment may be too wide and has been largely neglected. In this review, we cover the current and prospective applications of AI in pathology. We examine its applications in diagnosis and prognosis, and we offer insights for considerations that could improve clinical applicability of these tools. Then, we discuss its potential to improve workflow efficiency, and its benefits in pathologist education. Finally, we review the factors that could influence adoption in clinical practices and the associated regulatory processes.
Subjects
Artificial Intelligence , Pathology , Artificial Intelligence/trends , Humans , Pathology/methods , Pathology/trends
ABSTRACT
Both histologic subtypes and tumor mutation burden (TMB) represent important biomarkers in lung cancer, with implications for patient prognosis and treatment decisions. Typically, TMB is evaluated by comprehensive genomic profiling but this requires use of finite tissue specimens and costly, time-consuming laboratory processes. Histologic subtype classification represents an established component of lung adenocarcinoma histopathology, but can be challenging and is associated with substantial inter-pathologist variability. Here we developed a deep learning system to both classify histologic patterns in lung adenocarcinoma and predict TMB status using de-identified Hematoxylin and Eosin (H&E) stained whole slide images. We first trained a convolutional neural network to map histologic features across whole slide images of lung cancer resection specimens. On evaluation using an external data source, this model achieved patch-level area under the receiver operating characteristic curve (AUC) of 0.78-0.98 across nine histologic features. We then integrated the output of this model with clinico-demographic data to develop an interpretable model for TMB classification. The resulting end-to-end system was evaluated on 172 held out cases from TCGA, achieving an AUC of 0.71 (95% CI 0.63-0.80). The benefit of using histologic features in predicting TMB is highlighted by the significant improvement this approach offers over using the clinical features alone (AUC of 0.63 [95% CI 0.53-0.72], p = 0.002). Furthermore, we found that our histologic subtype-based approach achieved performance similar to that of a weakly supervised approach (AUC of 0.72 [95% CI 0.64-0.80]). Together these results underscore that incorporating histologic patterns in biomarker prediction for lung cancer provides informative signals, and that interpretable approaches utilizing these patterns perform comparably with less interpretable, weakly supervised approaches.
Subjects
Adenocarcinoma of Lung/genetics , Carcinoma, Non-Small-Cell Lung/genetics , Deep Learning , Lung Neoplasms/genetics , Mutation , Adenocarcinoma of Lung/pathology , Adult , Age Factors , Aged , Aged, 80 and over , Area Under Curve , Carcinoma, Non-Small-Cell Lung/pathology , Coloring Agents , Datasets as Topic , Eosine Yellowish-(YS) , Female , Hematoxylin , Humans , Lung Neoplasms/pathology , Male , Middle Aged , ROC Curve , Sex Factors , Smoking , Staining and Labeling
ABSTRACT
Deriving interpretable prognostic features from deep-learning-based prognostic histopathology models remains a challenge. In this study, we developed a deep learning system (DLS) for predicting disease-specific survival for stage II and III colorectal cancer using 3652 cases (27,300 slides). When evaluated on two validation datasets containing 1239 cases (9340 slides) and 738 cases (7140 slides), respectively, the DLS achieved a 5-year disease-specific survival AUC of 0.70 (95% CI: 0.66-0.73) and 0.69 (95% CI: 0.64-0.72), and added significant predictive value to a set of nine clinicopathologic features. To interpret the DLS, we explored the ability of different human-interpretable features to explain the variance in DLS scores. We observed that clinicopathologic features such as T-category, N-category, and grade explained a small fraction of the variance in DLS scores (R² = 18% in both validation sets). Next, we generated human-interpretable histologic features by clustering embeddings from a deep-learning-based image-similarity model and showed that they explained the majority of the variance (R² of 73-80%). Furthermore, the clustering-derived feature most strongly associated with high DLS scores was also highly prognostic in isolation. With a distinct visual appearance (poorly differentiated tumor cell clusters adjacent to adipose tissue), this feature was identified by annotators with 87.0-95.5% accuracy. Our approach can be used to explain predictions from a prognostic deep learning model and uncover potentially novel prognostic features that can be reliably identified by people for future validation studies.
ABSTRACT
Background: Gleason grading of prostate cancer is an important prognostic factor, but suffers from poor reproducibility, particularly among non-subspecialist pathologists. Although artificial intelligence (A.I.) tools have demonstrated Gleason grading on-par with expert pathologists, it remains an open question whether and to what extent A.I. grading translates to better prognostication. Methods: In this study, we developed a system to predict prostate cancer-specific mortality via A.I.-based Gleason grading and subsequently evaluated its ability to risk-stratify patients on an independent retrospective cohort of 2807 prostatectomy cases from a single European center with 5-25 years of follow-up (median: 13, interquartile range 9-17). Results: Here, we show that the A.I.'s risk scores produced a C-index of 0.84 (95% CI 0.80-0.87) for prostate cancer-specific mortality. Upon discretizing these risk scores into risk groups analogous to pathologist Grade Groups (GG), the A.I. has a C-index of 0.82 (95% CI 0.78-0.85). On the subset of cases with a GG provided in the original pathology report (n = 1517), the A.I.'s C-indices are 0.87 and 0.85 for continuous and discrete grading, respectively, compared to 0.79 (95% CI 0.71-0.86) for GG obtained from the reports. These represent improvements of 0.08 (95% CI 0.01-0.15) and 0.07 (95% CI 0.00-0.14), respectively. Conclusions: Our results suggest that A.I.-based Gleason grading can lead to effective risk stratification, and warrants further evaluation for improving disease management.
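The C-index used above measures how often, among comparable patient pairs, the patient who died earlier was assigned the higher risk. The sketch below implements a basic Harrell-style concordance index with right censoring, counting tied risk scores as half-concordant; it is an illustration with made-up follow-up data, not the evaluation code of the study. It also shows one reason why discretising continuous scores into grade-like groups can lower the C-index: tied risks earn only half credit.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index.

    times: follow-up time per patient.
    events: 1 if the event (death) was observed, 0 if censored.
    risks: predicted risk score per patient (higher = worse prognosis).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event strictly
            # before the last time patient j was known to be event-free.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up data: four patients, the third one censored at 6 years.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
continuous_risk = [0.9, 0.7, 0.5, 0.1]
grouped_risk = [3, 3, 1, 1]  # the same scores collapsed into two risk groups

print(concordance_index(times, events, continuous_risk))
print(concordance_index(times, events, grouped_risk))
```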
ABSTRACT
Background: Breast cancer management depends on biomarkers including estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 (ER/PR/HER2). Though existing scoring systems are widely used and well-validated, they can involve costly preparation and variable interpretation. Additionally, discordances between histology and expected biomarker findings can prompt repeat testing to address biological, interpretative, or technical reasons for unexpected results. Methods: We developed three independent deep learning systems (DLS) to directly predict ER/PR/HER2 status for both focal tissue regions (patches) and slides using hematoxylin-and-eosin-stained (H&E) images as input. Models were trained and evaluated using pathologist-annotated slides from three data sources. Areas under the receiver operating characteristic curve (AUCs) were calculated for test sets at both a patch-level (>135 million patches, 181 slides) and slide-level (n = 3274 slides, 1249 cases, 37 sites). Interpretability analyses were performed using Testing with Concept Activation Vectors (TCAV), saliency analysis, and pathologist review of clustered patches. Results: The patch-level AUCs are 0.939 (95% CI 0.936-0.941), 0.938 (0.936-0.940), and 0.808 (0.802-0.813) for ER/PR/HER2, respectively. At the slide level, AUCs are 0.86 (95% CI 0.84-0.87), 0.75 (0.73-0.77), and 0.60 (0.56-0.64) for ER/PR/HER2, respectively. Interpretability analyses show known biomarker-histomorphology associations including associations of low-grade and lobular histology with ER/PR positivity, and increased inflammatory infiltrates with triple-negative staining. Conclusions: This study presents rapid breast cancer biomarker estimation from routine H&E slides and builds on prior advances by prioritizing interpretability of computationally learned features in the context of existing pathological knowledge.
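The AUC values quoted above can be read as the probability that a randomly chosen biomarker-positive sample receives a higher model score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows, with ties counted as one half; the labels and scores are invented for illustration and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = biomarker positive, 0 = negative.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to chance-level ranking, which gives context for the spread between the patch-level and slide-level values reported for HER2.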
ABSTRACT
Identification of specific somatic gene alterations is crucial for insight into the development, progression, and clinical behavior of individual cancer types. The recently discovered recurrent ERG rearrangement in prostate cancer might represent a prostate cancer-specific alteration that has not been systematically assessed in tumors other than prostate cancer. The aim of this study was to assess whether the ERG rearrangement and the distinct deletion site between TMPRSS2 and ERG, both predominantly resulting in a TMPRSS2-ERG fusion, occur in tumors other than prostate cancer. We assessed 54 different tumor types (2942 samples in total) for their ERG rearrangement status by fluorescence in situ hybridization (FISH). To calibrate, we analyzed 285 prostate cancer samples for the ERG rearrangement frequency. Additionally, we interrogated a high-resolution single nucleotide polymorphism (SNP) data set across 3131 cancer specimens (26 tumor types) for copy number alterations. None of the 54 different tumor types assessed by FISH harbored an ERG rearrangement, whereas the prostate cancer samples revealed an ERG rearrangement in 49.5% of cases. Furthermore, within the 26 tumor types assessed for copy number alterations by SNP, the distinct deletion site between TMPRSS2 and ERG (21q22.2-3) was detectable exclusively in prostate cancer. Although Ewing's sarcoma and AML have known rearrangements rarely involving ERG, we hypothesize that the ERG rearrangement as well as the distinct deletion site on 21q22.2-3 between TMPRSS2 and ERG are prostate-cancer-specific genomic alterations. These observations provide further insight into the oncogenesis of prostate cancer and might be critical for the development of ERG rearrangement assessment as a clinical tool.
Subjects
Adenocarcinoma/genetics , Gene Rearrangement , Prostatic Neoplasms/genetics , Trans-Activators/genetics , DNA Mutational Analysis , Neoplasm DNA/analysis , Female , Gene Deletion , Humans , Fluorescence In Situ Hybridization , Male , Oncogene Fusion Proteins , Serine Endopeptidases , Tissue Array Analysis , Transcriptional Regulator ERG
ABSTRACT
Breast cancer is the most common cancer and the second leading cause of cancer-related death worldwide. The mainstay of breast cancer workup is histopathological diagnosis, which guides therapy and prognosis. However, emerging knowledge about the complex nature of cancer and the availability of tailored therapies have exposed opportunities for improvements in diagnostic precision. In parallel, advances in artificial intelligence (AI), along with the growing digitization of pathology slides for primary diagnosis, offer a promising approach to meeting the demand for more accurate detection, classification and prediction of behaviour of breast tumours. In this article, we cover the current and prospective uses of AI in digital pathology for breast cancer, review the basics of digital pathology and AI, and outline outstanding challenges in the field.
Subjects
Artificial Intelligence , Breast Neoplasms/diagnostic imaging , Computer-Assisted Image Interpretation/methods , Breast/diagnostic imaging , Female , Humans
ABSTRACT
Providing prognostic information at the time of cancer diagnosis has important implications for treatment and monitoring. Although cancer staging, histopathological assessment, molecular features, and clinical variables can provide useful prognostic insights, improving risk stratification remains an active research area. We developed a deep learning system (DLS) to predict disease-specific survival across 10 cancer types from The Cancer Genome Atlas (TCGA). We used a weakly-supervised approach without pixel-level annotations, and tested three different survival loss functions. The DLS was developed using 9,086 slides from 3,664 cases and evaluated using 3,009 slides from 1,216 cases. In multivariable Cox regression analysis of the combined cohort including all 10 cancers, the DLS was significantly associated with disease-specific survival (hazard ratio of 1.58, 95% CI 1.28-1.70, p<0.0001) after adjusting for cancer type, stage, age, and sex. In a per-cancer adjusted subanalysis, the DLS remained a significant predictor of survival in 5 of 10 cancer types. Compared to a baseline model including stage, age, and sex, the c-index of the model demonstrated an absolute 3.7% improvement (95% CI 1.0-6.5) in the combined cohort. Additionally, our models stratified patients within individual cancer stages, particularly stage II (p = 0.025) and stage III (p<0.001). By developing and evaluating prognostic models across multiple cancer types, this work represents one of the most comprehensive studies exploring the direct prediction of clinical outcomes using deep learning and histopathology images. Our analysis demonstrates the potential for this approach to provide significant prognostic information in multiple cancer types, and even within specific pathologic stages. However, given the relatively small number of cases and observed clinical events for a deep learning task of this type, we observed wide confidence intervals for model performance, thus highlighting that future work will benefit from larger datasets assembled for the purposes of survival modeling.