ABSTRACT
Tumors of the major and minor salivary glands histologically encompass a diverse and partly overlapping spectrum of neoplasms that are frequently challenging to diagnose. Despite recent advances in molecular testing and the identification of tumor-specific mutations or gene fusions, there is an unmet need to identify additional diagnostic biomarkers for entities lacking specific alterations. In this study, we collected a comprehensive cohort of 363 cases encompassing 20 different salivary gland tumor entities and explored the potential of DNA methylation to classify these tumors. We were able to show that most entities show specific epigenetic signatures and present a machine learning algorithm that achieved a mean balanced accuracy of 0.991. Of note, we showed that cribriform adenocarcinoma is epigenetically distinct from classical polymorphous adenocarcinoma, which could support risk stratification of these tumors. Myoepithelioma and pleomorphic adenoma form a uniform epigenetic class, supporting the theory of a single entity with a broad but continuous morphologic spectrum. Furthermore, we identified a histomorphologically heterogeneous but epigenetically distinct class that could represent a novel tumor entity. In conclusion, our study provides a comprehensive resource of the DNA methylation landscape of salivary gland tumors. Our data provide novel insight into disputed entities and show the potential of DNA methylation to identify new tumor classes. Furthermore, in the future, our machine learning classifier could support the histopathologic diagnosis of salivary gland tumors.
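The abstract above reports performance as balanced accuracy, which matters when some tumor entities are much rarer than others. As a minimal sketch of the metric only (the labels below are hypothetical placeholders, not data from the study), balanced accuracy can be computed as the mean of the per-class recalls:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Hypothetical toy labels: one common class and one rare class.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 8 + ["rare", "common"]

plain_acc = float((np.asarray(y_true) == np.asarray(y_pred)).mean())
bal_acc = balanced_accuracy(y_true, y_pred)
print(plain_acc, bal_acc)
```

In this toy case a single error in the rare class barely moves plain accuracy but halves that class's recall, which is why balanced accuracy is the stricter figure for imbalanced cohorts.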
ABSTRACT
INTRODUCTION: Molecular profiling of lung cancer is essential to identify genetic alterations that predict response to targeted therapy. While deep learning shows promise for predicting oncogenic mutations from whole tissue images, existing studies often face challenges such as limited sample sizes, a focus on earlier stage patients, and insufficient analysis of robustness and generalizability. METHODS: This retrospective study evaluates factors influencing mutation prediction accuracy using the large Heidelberg Lung Adenocarcinoma Cohort (HLCC), a cohort of 2356 late-stage FFPE samples. Validation is performed in the publicly available TCGA-LUAD cohort. RESULTS: Models trained on the larger HLCC cohort generalized well to the TCGA dataset for mutations in EGFR (AUC 0.76), STK11 (AUC 0.71) and TP53 (AUC 0.75), in line with the hypothesis that larger cohort sizes improve model robustness. Variation in performance due to pre-processing and modeling choices, such as mutation variant calling, affected EGFR prediction accuracy by up to 7%. DISCUSSION: Model explanations suggest that acinar and papillary growth patterns are critical for the detection of EGFR mutations, whereas solid growth patterns and large nuclei are indicative of TP53 mutations. These findings highlight the importance of specific morphological features in mutation detection and the potential of deep learning models to improve mutation prediction accuracy. CONCLUSION: Although deep learning models trained on larger cohorts show improved robustness and generalizability in predicting oncogenic mutations, they cannot replace comprehensive molecular profiling. However, they may support patient pre-selection for clinical trials and deepen insight into genotype-phenotype relationships.
Subjects
Adenocarcinoma of Lung, Lung Neoplasms, Mutation, Humans, Adenocarcinoma of Lung/genetics, Adenocarcinoma of Lung/pathology, Lung Neoplasms/genetics, Lung Neoplasms/pathology, Retrospective Studies, Female, Male, ErbB Receptors/genetics, Tumor Biomarkers/genetics, Middle Aged, Aged, Tumor Suppressor Protein p53/genetics, Deep Learning
ABSTRACT
Single-pulse electrical stimulation in the nervous system, often called cortico-cortical evoked potential (CCEP) measurement, is an important technique to understand how brain regions interact with one another. Voltages are measured from implanted electrodes in one brain area while stimulating another with brief current impulses separated by several seconds. Historically, researchers have tried to understand the significance of evoked polyphasic voltage deflections by visual inspection, but no general-purpose tool has emerged to understand their shapes or describe them mathematically. We describe and illustrate a new technique to parameterize brain stimulation data, where voltage response traces are projected into one another using a semi-normalized dot product. The number of timepoints from stimulation included in the dot product is varied to obtain a temporal profile of structural significance, and the peak of the profile uniquely identifies the duration of the response. Using linear kernel PCA, a canonical response shape is obtained over this duration, and then single-trial traces are parameterized as a projection of this canonical shape with a residual term. Such parameterization allows for dissimilar trace shapes from different brain areas to be directly compared by quantifying cross-projection magnitudes, response duration, canonical shape projection amplitudes, signal-to-noise ratios, explained variance, and statistical significance. Artifactual trials are automatically identified by outliers in sub-distributions of cross-projection magnitude, and rejected. This technique, which we call "Canonical Response Parameterization" (CRP), dramatically simplifies the study of CCEP shapes, and may also be applied in a wide range of other settings involving event-triggered data.
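The steps named in this abstract (semi-normalized cross-projections, a duration profile, a canonical shape from linear kernel PCA, and a per-trial amplitude plus residual) can be illustrated with a simplified sketch. This is not the authors' implementation: the function name `crp_sketch`, the use of the mean off-diagonal projection as the profile, the `min_len` parameter, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def crp_sketch(V, min_len=5):
    """V has shape (T, K): T timepoints after stimulation, K single trials."""
    T, K = V.shape
    off_diag = ~np.eye(K, dtype=bool)
    profile = np.full(T + 1, -np.inf)
    for t in range(min_len, T + 1):
        Vt = V[:t]
        # Semi-normalized dot product: unit-length trials projected
        # into the raw (un-normalized) trials.
        unit = Vt / np.linalg.norm(Vt, axis=0, keepdims=True)
        P = unit.T @ Vt
        # Assumption: summarize each duration by the mean cross-projection.
        profile[t] = P[off_diag].mean()
    tau = int(np.argmax(profile))          # estimated response duration
    Vr = V[:tau]
    # Linear kernel PCA: eigen-decompose the K x K trial-by-trial kernel.
    evals, evecs = np.linalg.eigh(Vr.T @ Vr)
    C = Vr @ evecs[:, -1]                  # leading component mapped to time
    C /= np.linalg.norm(C)
    if C @ Vr.mean(axis=1) < 0:            # fix the arbitrary sign
        C = -C
    alpha = C @ Vr                         # per-trial projection amplitude
    resid = Vr - np.outer(C, alpha)        # per-trial residual
    return tau, C, alpha, resid

# Synthetic example: a 60-sample response buried in unit-variance noise.
rng = np.random.default_rng(0)
T, K, true_len = 200, 20, 60
shape = np.zeros(T)
shape[:true_len] = 10.0 * np.sin(np.pi * np.arange(true_len) / true_len)
amps = 1.0 + 0.1 * rng.standard_normal(K)
V = np.outer(shape, amps) + rng.standard_normal((T, K))

tau, C, alpha, resid = crp_sketch(V)
print(tau, alpha.mean())
```

Because the projecting trial is normalized and the projected trial is not, extending the window past the end of the response adds noise to the normalization without adding signal to the dot product, so the profile falls off after the response ends.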
Subjects
Brain, Evoked Potentials, Evoked Potentials/physiology, Brain Mapping/methods, Implanted Electrodes, Electric Stimulation/methods
ABSTRACT
The molecular heterogeneity of cancer cells contributes to the often partial response to targeted therapies and relapse of disease due to the escape of resistant cell populations. While single-cell sequencing has started to improve our understanding of this heterogeneity, it offers a mostly descriptive view on cellular types and states. To obtain more functional insights, we propose scGeneRAI, an explainable deep learning approach that uses layer-wise relevance propagation (LRP) to infer gene regulatory networks from static single-cell RNA sequencing data for individual cells. We benchmark our method with synthetic data and apply it to single-cell RNA sequencing data of a cohort of human lung cancers. From the predicted single-cell networks our approach reveals characteristic network patterns for tumor cells and normal epithelial cells and identifies subnetworks that are observed only in (subgroups of) tumor cells of certain patients. While current state-of-the-art methods are limited by their ability to only predict average networks for cell populations, our approach facilitates the reconstruction of networks down to the level of single cells which can be utilized to characterize the heterogeneity of gene regulation within and across tumors.
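To make the role of layer-wise relevance propagation concrete, here is a minimal sketch of the epsilon rule on a toy two-layer ReLU network without biases. It is not the scGeneRAI architecture; the function name `lrp_epsilon`, the network sizes, and the random weights are assumptions for illustration. The idea is that a model predicts one gene's expression from the others, and the relevance assigned to each input gene for a single cell is read as a cell-specific interaction score.

```python
import numpy as np

def lrp_epsilon(x, W1, w2, eps=1e-9):
    """Relevance of each input for the scalar output of a tiny ReLU net."""
    s1 = W1 @ x                      # hidden pre-activations, shape (H,)
    a1 = np.maximum(s1, 0.0)         # ReLU activations
    out = float(w2 @ a1)             # scalar prediction
    # Output -> hidden: with one linear output neuron, each hidden unit
    # receives its own contribution to the output (these sum to `out`).
    R_hidden = w2 * a1
    # Hidden -> input (epsilon rule): split each hidden unit's relevance
    # in proportion to the input contributions Z[j, i] = W1[j, i] * x[i].
    stab = s1 + eps * np.where(s1 >= 0, 1.0, -1.0)
    Z = W1 * x
    R_input = (Z / stab[:, None]).T @ R_hidden
    return out, R_input

rng = np.random.default_rng(1)
x = rng.random(6)                    # hypothetical expression of 6 input genes
W1 = rng.standard_normal((4, 6))
w2 = rng.standard_normal(4)

out, R = lrp_epsilon(x, W1, w2)
print(out, R)
```

Without biases the relevances are conserved, so they sum (up to the stabilizer) to the prediction itself; each entry of `R` is then a signed share of the output attributed to one input gene for this one sample.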
Subjects
Deep Learning, Gene Regulatory Networks, Neoplasms, Single-Cell Gene Expression Analysis, Humans, Gene Expression Regulation, Neoplasms/genetics, Lung Neoplasms/genetics, Lung Neoplasms/pathology
ABSTRACT
AIM: Analysis of cerebrospinal fluid (CSF) is essential for diagnostic workup of patients with neurological diseases and includes differential cell typing. The current gold standard is based on microscopic examination by specialised technicians and neuropathologists, which is time-consuming, labour-intensive and subjective. METHODS: We, therefore, developed an image analysis approach based on expert annotations of 123,181 digitised CSF objects from 78 patients corresponding to 15 clinically relevant categories and trained a multiclass convolutional neural network (CNN). RESULTS: The CNN classified the 15 categories with high accuracy (mean AUC 97.3%). By using explainable artificial intelligence (XAI), we demonstrate that the CNN identified meaningful cellular substructures in CSF cells recapitulating human pattern recognition. Based on the evaluation of 511 cells selected from 12 different CSF samples, we validated the CNN by comparing it with seven board-certified neuropathologists blinded for clinical information. Inter-rater agreement between the CNN and the ground truth was non-inferior (Krippendorff's alpha 0.79) compared with the agreement of seven human raters and the ground truth (mean Krippendorff's alpha 0.72, range 0.56-0.81). The CNN assigned the correct diagnostic label (inflammatory, haemorrhagic or neoplastic) in 10 out of 11 clinical samples, compared with 7-11 out of 11 by human raters. CONCLUSIONS: Our approach provides the basis to overcome current limitations in automated cell classification for routine diagnostics and demonstrates how a visual explanation framework can connect machine decision-making with cell properties and thus provide a novel versatile and quantitative method for investigating CSF manifestations of various neurological diseases.
Subjects
Deep Learning, Humans, Artificial Intelligence, Computer Neural Networks, Computer-Assisted Image Processing/methods
ABSTRACT
Understanding the pathological properties of dysregulated protein networks in individual patients' tumors is the basis for precision therapy. Functional experiments are commonly used, but cover only parts of the oncogenic signaling networks, whereas methods that reconstruct networks from omics data usually only predict average network features across tumors. Here, we show that the explainable AI method layer-wise relevance propagation (LRP) can infer protein interaction networks for individual patients from proteomic profiling data. LRP reconstructs average and individual interaction networks with an AUC of 0.99 and 0.93, respectively, and outperforms state-of-the-art network prediction methods for individual tumors. Using data from The Cancer Proteome Atlas, we identify known and potentially novel oncogenic network features, among which some are cancer-type specific and show only minor variation among patients, while others are present across certain tumor types but differ among individual patients. Our approach may therefore support predictive diagnostics in precision oncology by inferring "patient-level" oncogenic mechanisms.
ABSTRACT
In head and neck squamous cell cancers (HNSCs) that present as metastases with an unknown primary (HNSC-CUPs), the identification of a primary tumor improves therapy options and increases patient survival. However, the currently available diagnostic methods are laborious and do not offer a sufficient detection rate. Predictive machine learning models based on DNA methylation profiles have recently emerged as a promising technique for tumor classification. We applied this technique to HNSC to develop a tool that can improve the diagnostic work-up for HNSC-CUPs. On a reference cohort of 405 primary HNSC samples, we developed four classifiers based on different machine learning models [random forest (RF), neural network (NN), elastic net penalized logistic regression (LOGREG), and support vector machine (SVM)] that predict the primary site of HNSC tumors from their DNA methylation profile. The classifiers achieved high classification accuracies (RF = 83%, NN = 88%, LOGREG = SVM = 89%) on an independent cohort of 64 HNSC metastases. Further, the NN, LOGREG, and SVM models significantly outperformed p16 status as a marker for an origin in the oropharynx. In conclusion, the DNA methylation profiles of HNSC metastases are characteristic for their primary sites, and the classifiers developed in this study, which are made available to the scientific community, can provide valuable information to guide the diagnostic work-up of HNSC-CUP. © 2021 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Subjects
DNA Methylation, Head and Neck Neoplasms, Head and Neck Neoplasms/genetics, Humans, Machine Learning, Computer Neural Networks, Squamous Cell Carcinoma of Head and Neck/genetics
ABSTRACT
The complexity of diagnostic (surgical) pathology has increased substantially over the last decades with respect to histomorphological and molecular profiling. Pathology has steadily expanded its role in tumor diagnostics and beyond from disease entity identification via prognosis estimation to precision therapy prediction. It is therefore not surprising that pathology is among the disciplines in medicine with high expectations in the application of artificial intelligence (AI) or machine learning approaches given their capabilities to analyze complex data in a quantitative and standardized manner to further enhance scope and precision of diagnostics. While an obvious application is the analysis of histological images, recent applications for the analysis of molecular profiling data from different sources and clinical data support the notion that AI will enhance both histopathology and molecular pathology in the future. At the same time, current literature should not be misunderstood in a way that pathologists will likely be replaced by AI applications in the foreseeable future. Although AI will transform pathology in the coming years, recent studies reporting AI algorithms to diagnose cancer or predict certain molecular properties deal with relatively simple diagnostic problems that fall short of the diagnostic complexity pathologists face in clinical routine. Here, we review the pertinent literature of AI methods and their applications to pathology, and put the current achievements and what can be expected in the future in the context of the requirements for research and routine diagnostics.
Subjects
Artificial Intelligence, Neoplasms, Humans, Machine Learning, Neoplasms/diagnosis, Neoplasms/genetics, Prognosis
ABSTRACT
Deep learning has recently gained popularity in digital pathology due to its high prediction quality. However, the medical domain requires explanation and insight for a better understanding beyond standard quantitative performance evaluation. Recently, many explanation methods have emerged. This work shows how heatmaps generated by these explanation methods make it possible to resolve common challenges encountered in deep learning-based digital histopathology analyses. We elaborate on biases which are typically inherent in histopathological image data. In the binary classification task of tumour tissue discrimination in publicly available haematoxylin-eosin-stained images of various tumour entities, we investigate three types of biases: (1) biases which affect the entire dataset, (2) biases which are by chance correlated with class labels and (3) sampling biases. While standard analyses focus on patch-level evaluation, we advocate pixel-wise heatmaps, which offer a more precise and versatile diagnostic instrument. This insight is shown to be helpful not only to detect but also to remove the effects of common hidden biases, which improves generalisation within and across datasets. For example, we could see a trend of improved area under the receiver operating characteristic (ROC) curve by 5% when reducing a labelling bias. Explanation techniques are thus demonstrated to be a helpful and highly relevant tool for the development and the deployment phases within the life cycle of real-world applications in digital pathology.
Subjects
Deep Learning, Computer-Assisted Image Interpretation, Neoplasms/diagnostic imaging, Neoplasms/pathology, Area Under Curve, Humans, Computer Neural Networks, ROC Curve
ABSTRACT
Head and neck squamous cell carcinoma (HNSC) patients are at risk of suffering from both pulmonary metastases and a second squamous cell carcinoma of the lung (LUSC). Differentiating pulmonary metastases from primary lung cancers is of high clinical importance, but not possible in most cases with current diagnostics. To address this, we performed DNA methylation profiling of primary tumors and trained three different machine learning methods to distinguish metastatic HNSC from primary LUSC. We developed an artificial neural network that correctly classified 96.4% of the cases in a validation cohort of 279 patients with HNSC and LUSC as well as normal lung controls, outperforming support vector machines (95.7%) and random forests (87.8%). Prediction accuracies of more than 99% were achieved for 92.1% (neural network), 90% (support vector machine), and 43% (random forest) of these cases by applying thresholds to the resulting probability scores and excluding samples with low confidence. As independent clinical validation of the approach, we analyzed a series of 51 patients with a history of HNSC and a second lung tumor, demonstrating the correct classifications based on clinicopathological properties. In summary, our approach may facilitate the reliable diagnostic differentiation of pulmonary metastases of HNSC from primary LUSC to guide therapeutic decisions.
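The trade-off described above, higher accuracy on fewer cases once low-confidence predictions are excluded, can be sketched as follows. The probability scores and labels are invented toy values, and `accuracy_at_threshold` is a hypothetical helper, not code from the study.

```python
import numpy as np

def accuracy_at_threshold(probs, labels, threshold):
    """Return (coverage, accuracy) after dropping low-confidence samples."""
    confidence = probs.max(axis=1)       # score of the predicted class
    predicted = probs.argmax(axis=1)
    keep = confidence >= threshold
    coverage = float(keep.mean())        # fraction of cases still classified
    if not keep.any():
        return coverage, float("nan")
    accuracy = float((predicted[keep] == labels[keep]).mean())
    return coverage, accuracy

# Toy class-probability scores for 5 samples and 3 classes (invented values).
probs = np.array([
    [0.95, 0.03, 0.02],
    [0.55, 0.40, 0.05],
    [0.10, 0.85, 0.05],
    [0.34, 0.33, 0.33],
    [0.02, 0.08, 0.90],
])
labels = np.array([0, 1, 1, 2, 2])

cov_all, acc_all = accuracy_at_threshold(probs, labels, 0.0)
cov_hi, acc_hi = accuracy_at_threshold(probs, labels, 0.8)
print(cov_all, acc_all, cov_hi, acc_hi)
```

Here the two misclassified samples are also the two least confident ones, so a threshold of 0.8 removes them: coverage drops to 60% of the cases while accuracy on the retained cases rises to 100%.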
Subjects
Squamous Cell Carcinoma/genetics, Squamous Cell Carcinoma/secondary, DNA Methylation/genetics, Head and Neck Neoplasms/genetics, Head and Neck Neoplasms/pathology, Lung Neoplasms/genetics, Lung Neoplasms/secondary, Machine Learning, Algorithms, Cohort Studies, Humans, Reproducibility of Results
ABSTRACT
BACKGROUND: Comprehensive mutational profiling data now available on all major cancers have led to proposals of novel molecular tumor classifications that modify or replace the established organ- and tissue-based tumor typing. The rationale behind such molecular reclassifications is that genetic alterations underlying cancer pathology predict response to therapy and may therefore offer a more precise view on cancer than histology. The use of individual actionable mutations to select cancers for treatment across histotypes is already being tested in the so-called basket trials with variable success rates. Here, we present a computational approach that facilitates the systematic analysis of the histological context dependency of mutational effects by integrating genomic and proteomic tumor profiles across cancers. METHODS: To determine effects of oncogenic mutations on protein profiles, we used the energy distance, which compares the Euclidean distances of protein profiles in tumors with an oncogenic mutation (inner distance) to that in tumors without the mutation (outer distance) and performed Monte Carlo simulations for the significance analysis. Finally, the proteins were ranked by their contribution to profile differences to identify proteins characteristic of oncogenic mutation effects across cancers. RESULTS: We apply our approach to four current proposals of molecular tumor classifications and major therapeutically relevant actionable genes. All 12 actionable genes evaluated show effects on the protein level in the corresponding tumor type and show additional mutation-related protein profiles in 21 tumor types. Moreover, our analysis identifies consistent cross-cancer effects for 4 genes (FGFR1, ERBB2, IDH1, KRAS/NRAS) in 14 tumor types. We further use cell line drug response data to validate our findings.
CONCLUSIONS: This computational approach can be used to identify mutational signatures that have protein-level effects and can therefore contribute to preclinical in silico tests of the efficacy of molecular classifications as well as the druggability of individual mutations. It thus supports the identification of novel targeted therapies effective across cancers and guides efficient basket trial designs.
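As an illustration of the METHODS step, the sketch below computes an energy distance between two groups of profiles and estimates its significance by Monte Carlo label permutation. It assumes the standard two-sample energy distance (twice the mean between-group distance minus the two mean within-group distances) and a permutation scheme; the abstract does not give the exact formula or simulation design, so both are assumptions, and the data are synthetic.

```python
import numpy as np

def mean_pairwise_dist(A, B):
    """Mean Euclidean distance over all pairs (a, b), a in A and b in B."""
    diff = A[:, None, :] - B[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).mean())

def energy_distance(X, Y):
    """Standard two-sample energy distance (assumed form, see lead-in)."""
    return (2.0 * mean_pairwise_dist(X, Y)
            - mean_pairwise_dist(X, X)
            - mean_pairwise_dist(Y, Y))

def permutation_p_value(X, Y, n_perm=500, seed=0):
    """Monte Carlo p-value obtained by shuffling group membership."""
    rng = np.random.default_rng(seed)
    observed = energy_distance(X, Y)
    pooled = np.vstack([X, Y])
    n = len(X)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if energy_distance(pooled[idx[:n]], pooled[idx[n:]]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic profiles: 30 "mutated" and 30 "wild-type" samples, 5 proteins,
# with the mutated group shifted by one standard deviation per protein.
rng = np.random.default_rng(42)
mutated = rng.standard_normal((30, 5)) + 1.0
wild_type = rng.standard_normal((30, 5))

observed, p_value = permutation_p_value(mutated, wild_type)
print(observed, p_value)
```

The permutation test asks how often a random split of the pooled samples produces a distance at least as large as the observed one, which avoids any distributional assumption about the protein profiles.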