ABSTRACT
Non-self epitopes, whether originating from foreign substances or somatic mutations, trigger immune responses when presented by major histocompatibility complex (MHC) molecules and recognized by T cells. Identification of immunogenically active neoepitopes has significant implications in cancer and virus medicine. However, current methods are mostly limited to predicting the physical binding of mutant peptides to MHCs. We previously developed a deep learning-based model, DeepNeo, to identify immunogenic neoepitopes by capturing the structural properties of peptide-MHC pairs with T cell reactivity. Here, we upgraded our DeepNeo model with up-to-date training data. The upgraded model (DeepNeo-v2) improved on evaluation metrics and showed a prediction score distribution that better fits known neoantigen behavior. Immunogenic neoantigen prediction can be conducted at https://deepneo.net.
Subjects
Neoplasm Antigens, Neoplasms, Humans, Neoplasm Antigens/metabolism, Neoplasms/genetics, Peptides/chemistry, Epitopes, Histocompatibility Antigens
ABSTRACT
Despite advances in predicting physical peptide-major histocompatibility complex I (pMHC I) binding, it remains challenging to identify functionally immunogenic neoepitopes, especially for MHC II. By using the results of >36,000 immunogenicity assays, we developed a method to identify pMHCs whose structural alignment facilitates T cell reaction. Our method predicted neoepitopes for MHC II and MHC I that were responsive to checkpoint blockade when applied to >1,200 samples of various tumor types. To investigate selection by spontaneous immunity at the single-epitope level, we analyzed the frequency spectrum of >25 million mutations in >9,000 treatment-naive tumors with >100 immune phenotypes. MHC II immunogenicity specifically lowered variant frequencies in tumors under high immune pressure, particularly those with high TCR clonality and MHC II expression. A similar trend was shown for MHC I neoepitopes, but only in particular tissue types. In summary, we report immune selection imposed by MHC II-restricted natural or therapeutic T cell reactivity.
Subjects
Neoplasms, Humans, Neoplasms/genetics, Neoplasms/therapy, Epitopes/genetics, T-Lymphocytes, Peptides/chemistry, Peptides/metabolism
ABSTRACT
Although there are many genetic loci in noncoding regions associated with vascular disease, studies on long noncoding RNAs (lncRNAs) discovered from human plaques that affect atherosclerosis have been highly limited. We aimed to identify and functionally validate a lncRNA using human atherosclerotic plaques. Human aortic samples were obtained from patients who underwent aortic surgery, and tissues were classified according to atherosclerotic plaques. RNA was extracted and analyzed for differentially expressed lncRNAs in plaques. Human aortic smooth muscle cells (HASMCs) were stimulated with oxidized low-density lipoprotein (oxLDL) to evaluate the effect of the identified lncRNA on the inflammatory transition of the cells. Among 380 RNAs differentially expressed between the plaque and control tissues, lncRNA HSPA7 was selected and confirmed to show upregulated expression upon oxLDL treatment. HSPA7 knockdown inhibited the migration of HASMCs and the secretion and expression of IL-1β and IL-6; however, HSPA7 knockdown recovered the oxLDL-induced reduction in the expression of contractile markers. Although miR-223 inhibition promoted the activity of NF-κB and the secretion of inflammatory proteins such as IL-1β and IL-6, HSPA7 knockdown diminished these effects. The effects of miR-223 inhibition and HSPA7 knockdown were also found in THP-1 cell-derived macrophages. The impact of HSPA7 on miR-223 was mediated in an AGO2-dependent manner. HSPA7 is differentially increased in human atheroma and promotes the inflammatory transition of vascular smooth muscle cells by sponging miR-223. For the first time, this study elucidated the molecular mechanism of action of HSPA7, a lncRNA of previously unknown function, in humans.
Subjects
Atherosclerosis/etiology, Atherosclerosis/pathology, HSP70 Heat-Shock Proteins/genetics, MicroRNAs/genetics, Smooth Muscle Myocytes/metabolism, Atherosclerotic Plaque/etiology, Long Noncoding RNA/genetics, Argonaute Proteins, Atherosclerosis/metabolism, Biomarkers, Disease Susceptibility, Gene Expression Regulation, Humans, Smooth Muscle Myocytes/pathology, Atherosclerotic Plaque/metabolism, Atherosclerotic Plaque/pathology, RNA Interference
ABSTRACT
BACKGROUND: One of the greatest challenges in cancer genomics is to distinguish driver mutations from passenger mutations. Whereas recurrence is a hallmark of driver mutations, it is difficult to observe recurring noncoding mutations owing to the limited number of whole-genome sequenced samples. Hence, a method is needed to predict potentially recurrent mutations. RESULTS: In this work, we developed a random forest classifier that predicts regulatory mutations that may recur based on the features of the mutations repeatedly appearing in a given cohort. With breast cancer as a model, we profiled 35 quantitative features describing genetic and epigenetic signals at the mutation site, transcription factors whose binding motif was disrupted by the mutation, and genes targeted by long-range chromatin interactions. A true set of mutations for machine learning was generated by interrogating publicly available pan-cancer genomes based on our statistical model of mutation recurrence. The performance of our random forest classifier was evaluated by cross-validation. The variable importance of each feature in the classification of mutations was investigated. Our statistical recurrence model for the random forest classifier showed an area under the curve (AUC) of ~0.78 in predicting recurrent mutations. Chromatin accessibility at the mutation sites, the distance from the mutations to known cancer risk loci, and the role of the target genes in the regulatory or protein interaction network were among the most important variables. CONCLUSIONS: Our method makes it possible to characterize recurrent regulatory mutations using a limited number of whole-genome samples and, based on that characterization, to predict potential driver mutations whose recurrence is not found in the given samples but is likely to be observed with additional samples.
Subjects
Breast Neoplasms/genetics, Genomics/methods, Mutation, Chromatin/genetics, Female, Genome, Humans, Statistical Models, Transcription Factors/genetics
ABSTRACT
Global network modeling of distal regulatory interactions is essential in understanding the overall architecture of gene expression programs. Here, we developed a Bayesian probabilistic model and computational method for global causal network construction with breast cancer as a model. Whereas physical regulator binding was well supported by gene expression causality in general, distal elements in intragenic regions or loci distant from the target gene exhibited particularly strong functional effects. Modeling the action of long-range enhancers was critical in recovering true biological interactions with increased coverage and specificity overall and unraveling regulatory complexity underlying tumor subclasses and drug responses in particular. Transcriptional cancer drivers and risk genes were discovered based on the network analysis of somatic and genetic cancer-related DNA variants. Notably, we observed that the risk genes were functionally downstream of the cancer drivers and were selectively susceptible to network perturbation by tumorigenic changes in their upstream drivers. Furthermore, cancer risk alleles tended to increase the susceptibility of the transcription of their associated genes. These findings suggest that transcriptional cancer drivers selectively induce a combinatorial misregulation of downstream risk genes, and that genetic risk factors, mostly residing in distal regulatory regions, increase transcriptional susceptibility to upstream cancer-driving somatic changes.