ABSTRACT
High-grade serous carcinoma has a poor prognosis, owing primarily to its early dissemination throughout the abdominal cavity. Genomic and proteomic approaches have provided snapshots of the proteogenomics of ovarian cancer [1,2], but a systematic examination of both the tumour and stromal compartments is critical in understanding ovarian cancer metastasis. Here we develop a label-free proteomic workflow to analyse as few as 5,000 formalin-fixed, paraffin-embedded cells microdissected from each compartment. The tumour proteome was stable during progression from in situ lesions to metastatic disease; however, the metastasis-associated stroma was characterized by a highly conserved proteomic signature, prominently including the methyltransferase nicotinamide N-methyltransferase (NNMT) and several of the proteins that it regulates. Stromal NNMT expression was necessary and sufficient for functional aspects of the cancer-associated fibroblast (CAF) phenotype, including the expression of CAF markers and the secretion of cytokines and oncogenic extracellular matrix. Stromal NNMT expression supported ovarian cancer migration, proliferation and in vivo growth and metastasis. Expression of NNMT in CAFs led to depletion of S-adenosyl methionine and reduction in histone methylation associated with widespread gene expression changes in the tumour stroma. This work supports the use of ultra-low-input proteomics to identify candidate drivers of disease phenotypes. NNMT is a central, metabolic regulator of CAF differentiation and cancer progression in the stroma that may be therapeutically targeted.
Subjects
Cancer-Associated Fibroblasts/metabolism , Nicotinamide N-Methyltransferase/metabolism , Proteomics , Cancer-Associated Fibroblasts/enzymology , Cell Line, Tumor , Cells, Cultured , DNA Methylation , Disease Progression , Female , Histones/chemistry , Histones/metabolism , Humans , Neoplasm Metastasis , Niacinamide/analogs & derivatives , Niacinamide/metabolism , Ovarian Neoplasms/metabolism , Ovarian Neoplasms/pathology , Phenotype , Prognosis , S-Adenosylhomocysteine/metabolism , S-Adenosylmethionine/metabolism
ABSTRACT
Understanding the mechanisms promoting chromosomal translocations of the rearranging receptor loci in leukemia and lymphoma remains incomplete. Here we show that leukemias induced by aberrant activation of β-catenin in thymocytes, which bear recurrent Tcra/Myc-Pvt1 translocations, depend on Tcf-1. The DNA double-strand breaks (DSBs) in the Tcra site of the translocation are Rag-generated, whereas the Myc-Pvt1 DSBs are not. Aberrantly activated β-catenin redirects Tcf-1 binding to novel DNA sites to alter chromatin accessibility and down-regulate genome-stability pathways. Impaired homologous recombination (HR) DNA repair and replication checkpoints lead to retention of DSBs that promote translocations and transformation of double-positive (DP) thymocytes. The resulting lymphomas, which resemble human T cell acute lymphoblastic leukemia (T-ALL), are sensitive to PARP inhibitors (PARPis). Our findings indicate that aberrant β-catenin signaling contributes to translocations in thymocytes by guiding Tcf-1 to promote the generation and retention of replication-induced DSBs, allowing their coexistence with Rag-generated DSBs. Thus, PARPis could offer therapeutic options in hematologic malignancies with active Wnt/β-catenin signaling.
Subjects
Cell Transformation, Neoplastic , Genomic Instability , Hepatocyte Nuclear Factor 1-alpha , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Thymocytes , Translocation, Genetic , beta Catenin , Animals , Cell Transformation, Neoplastic/genetics , DNA Breaks, Double-Stranded , Genomic Instability/genetics , Hepatocyte Nuclear Factor 1-alpha/genetics , Hepatocyte Nuclear Factor 1-alpha/metabolism , Mice , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/pathology , Proto-Oncogene Proteins c-myc/genetics , RNA, Long Noncoding/genetics , Thymocytes/pathology , Translocation, Genetic/genetics , beta Catenin/genetics , beta Catenin/metabolism
ABSTRACT
The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.
Subjects
COVID-19/diagnosis , COVID-19/transmission , Genetic Variation , Genome, Viral , Real-Time Polymerase Chain Reaction/methods , SARS-CoV-2/genetics , COVID-19/virology , Host-Pathogen Interactions , Humans , Polymorphism, Single Nucleotide
ABSTRACT
Multiple novel immunoglobulin-like transcripts (NILTs) have been identified from salmon, trout, and carp. NILTs typically encode activating or inhibitory transmembrane receptors with extracellular immunoglobulin (Ig) domains. Although predicted to provide immune recognition in ray-finned fish, we currently lack a definitive framework of NILT diversity, thereby limiting our predictions for their evolutionary origin and function. In order to better understand the diversity of NILTs and their possible roles in immune function, we identified five NILT loci in the Atlantic salmon (Salmo salar) genome, defined 86 NILT Ig domains within a 3-Mbp region of zebrafish (Danio rerio) chromosome 1, and described 41 NILT Ig domains as part of an alternative haplotype for this same genomic region. We then identified transcripts encoded by 43 different NILT genes which reflect an unprecedented diversity of Ig domain sequences and combinations for a family of non-recombining receptors within a single species. Zebrafish NILTs include a sole putative activating receptor but extensive inhibitory and secreted forms as well as membrane-bound forms with no known signaling motifs. These results reveal a higher level of genetic complexity, interindividual variation, and sequence diversity for NILTs than previously described, suggesting that this gene family likely plays multiple roles in host immunity.
Subjects
Receptors, Immunologic , Zebrafish , Animals , Zebrafish/genetics , Amino Acid Sequence , Receptors, Immunologic/genetics , Genome/genetics , Immunoglobulins/genetics , Phylogeny , Mammals/genetics
ABSTRACT
BACKGROUND: Non-ER nuclear receptor activity can alter estrogen receptor (ER) chromatin association and resultant ER-mediated transcription. Consistent with GR modulation of ER activity, high tumor glucocorticoid receptor (GR) expression correlates with improved relapse-free survival in ER+ breast cancer (BC) patients. METHODS: In vitro cell proliferation assays were used to assess ER-mediated BC cell proliferation following GR modulation. ER chromatin association following ER/GR co-liganding was measured using global ChIP sequencing and directed ChIP analysis of proliferative gene enhancers. RESULTS: We found that GR liganding with either a pure agonist or a selective GR modulator (SGRM) slowed estradiol (E2)-mediated proliferation in ER+ BC models. SGRMs that antagonized transcription of GR-unique genes both promoted GR chromatin association and inhibited ER chromatin localization at common DNA enhancer sites. Gene expression analysis revealed that ER and GR co-activation decreased proliferative gene activation (compared to ER activation alone), specifically reducing CCND1, CDK2, and CDK6 gene expression. We also found that ligand-dependent GR occupancy of common ER-bound enhancer regions suppressed both wild-type and mutant ER chromatin association and decreased corresponding gene expression. In vivo, treatment with structurally diverse SGRMs also reduced MCF-7 Y537S ER-expressing BC xenograft growth. CONCLUSION: These studies demonstrate that liganded GR can suppress ER chromatin occupancy at shared ER-regulated enhancers, including CCND1 (Cyclin D1), regardless of whether the ligand is a classic GR agonist or antagonist. Resulting GR-mediated suppression of ER+ BC proliferative gene expression and cell division suggests that SGRMs could decrease ER-driven gene expression.
Subjects
Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Chromatin/metabolism , Mutation , Receptors, Estrogen/genetics , Receptors, Estrogen/metabolism , Receptors, Glucocorticoid/metabolism , Animals , Cell Cycle , Cell Line, Tumor , Cell Proliferation , Disease Models, Animal , Enhancer Elements, Genetic , Female , Gene Expression Regulation, Neoplastic , Humans , Mice , Protein Binding , Transcription, Genetic , Xenograft Model Antitumor Assays
ABSTRACT
Melanoma metastases can be categorized by gene expression for the presence of a T-cell-inflamed tumor microenvironment, which correlates with clinical efficacy of immunotherapies. T cells frequently recognize mutational antigens corresponding to nonsynonymous somatic mutations (NSSMs), and in some cases shared differentiation or cancer-testis antigens. Therapies are being pursued to trigger immune infiltration into non-T-cell-inflamed tumors in the hope of rendering them immunotherapy responsive. However, whether those tumors express antigens capable of T-cell recognition has not been explored. To address this question, 266 melanomas from The Cancer Genome Atlas (TCGA) were categorized by the presence or absence of a T-cell-inflamed gene signature. These two subsets were interrogated for cancer-testis, differentiation, and somatic mutational antigens. No statistically significant differences were observed, including density of NSSMs. Focusing on hypothetical HLA-A2+ binding scores, 707 peptides were synthesized, corresponding to all identified candidate neoepitopes. No differences were observed in measured HLA-A2 binding between inflamed and noninflamed cohorts. Twenty peptides were randomly selected from each cohort to evaluate priming and recognition by human CD8+ T cells in vitro with 25% of peptides confirmed to be immunogenic in both. A similar gene expression profile applied to all solid tumors of TCGA revealed no association between T-cell signature and NSSMs. Our results indicate that lack of spontaneous immune infiltration in solid tumors is unlikely due to lack of antigens. Strategies that improve T-cell infiltration into tumors may therefore be able to facilitate clinical response to immunotherapy once antigens become recognized.
Subjects
Antigens, Neoplasm/immunology , Lymphocytes, Tumor-Infiltrating/physiology , Melanoma/immunology , Skin Neoplasms/immunology , T-Lymphocytes/physiology , Antigens, Neoplasm/metabolism , Gene Expression , HLA-A Antigens/genetics , HLA-A Antigens/metabolism , Humans , Melanoma/pathology , Skin Neoplasms/pathology , Tumor Microenvironment
ABSTRACT
Antigen processing and presentation genes found within the MHC are among the most highly polymorphic genes of vertebrate genomes, providing populations with diverse immune responses to a wide array of pathogens. Here, we describe transcriptome, exome, and whole-genome sequencing of clonal zebrafish, uncovering the most extensive diversity within the antigen processing and presentation genes of any species yet examined. Our CG2 clonal zebrafish assembly provides genomic context within a remarkably divergent haplotype of the core MHC region on chromosome 19 for six expressed genes not found in the zebrafish reference genome: mhc1uga, proteasome-β 9b (psmb9b), psmb8f, and previously unknown genes psmb13b, tap2d, and tap2e. We identify ancient lineages for Psmb13 within a proteasome branch previously thought to be monomorphic and provide evidence of substantial lineage diversity within each of three major trifurcations of catalytic-type proteasome subunits in vertebrates: Psmb5/Psmb8/Psmb11, Psmb6/Psmb9/Psmb12, and Psmb7/Psmb10/Psmb13. Strikingly, nearby tap2 and MHC class I genes also retain ancient sequence lineages, indicating that alternative lineages may have been preserved throughout the entire MHC pathway since early diversification of the adaptive immune system ~500 Mya. Furthermore, polymorphisms within the three MHC pathway steps (antigen cleavage, transport, and presentation) are each predicted to alter peptide specificity. Lastly, comparative analysis shows that antigen processing gene diversity is far more extensive than previously realized (with ancient coelacanth psmb8 lineages, shark psmb13, and tap2t and psmb10 outside the teleost MHC), implying distinct immune functions and conserved roles in shaping MHC pathway evolution throughout vertebrates.
Subjects
Biological Evolution , Cysteine Endopeptidases/genetics , Genome , Haplotypes , Histocompatibility Antigens Class I/genetics , Zebrafish Proteins/genetics , Zebrafish/genetics , Animals , Antigen Presentation , Cloning, Organism , Cysteine Endopeptidases/classification , Cysteine Endopeptidases/immunology , High-Throughput Nucleotide Sequencing , Histocompatibility Antigens Class I/classification , Histocompatibility Antigens Class I/immunology , Phylogeny , Proteasome Endopeptidase Complex/genetics , Proteasome Endopeptidase Complex/immunology , Protein Isoforms/classification , Protein Isoforms/genetics , Protein Isoforms/immunology , Transcriptome , Zebrafish/classification , Zebrafish/immunology , Zebrafish Proteins/classification , Zebrafish Proteins/immunology
ABSTRACT
Variation is essential to ecological and evolutionary dynamics, but genetic variation of quantitative traits may be concentrated in a limited number of dimensions, constraining ecoevolutionary dynamics. We describe high-dimension variation in natural accessions of the model alga, Chlamydomonas reinhardtii, and test the hypothesis that extensive fitness variation across 30 environments is constrained to a small number of axes. We used high-throughput phenotyping to investigate morphological, fitness, and genotype × environment (G × E) variation in 18 natural C. reinhardtii accessions in 30 environments. The organismal phenotypes of cell cycle, cell size, and phototactic behavior exhibited substantial genetic variation between lines, and we found up to 74-fold fitness variation across accessions and environments. Approximately 47% of the extensive G × E variation is accounted for by the first two principal components (PCs) of the G-matrix corresponding to covariation in metals response, nitrogen availability, or salt and nutrient response. The natural variation of C. reinhardtii accessions supports the hypothesis that, despite abundant genetic variation across single environments, the species' adaptive response should be constrained along few major axes of selection. These results highlight the utility of natural accessions for integrating ecoevolutionary and genetic research.
Subjects
Chlamydomonas reinhardtii/genetics , Genetic Fitness , Genetic Variation , Adaptation, Physiological/genetics , Chlamydomonas reinhardtii/physiology , Gene-Environment Interaction , Phenotype
ABSTRACT
The process of plant speciation often involves the evolution of divergent ecotypes in response to differences in soil water availability between habitats. While the same set of traits is frequently associated with xeric/mesic ecotype divergence, it is unknown whether those traits evolve independently or if they evolve in tandem as a result of genetic colocalization either by pleiotropy or genetic linkage. The self-fertilizing C4 grass species Panicum hallii includes two major ecotypes found in xeric (var. hallii) or mesic (var. filipes) habitats. We constructed the first linkage map for P. hallii by genotyping a reduced representation genomic library of an F2 population derived from an intercross of var. hallii and filipes. We then evaluated the genetic architecture of divergence between these ecotypes through quantitative trait locus (QTL) mapping. Overall, we mapped QTLs for nine morphological traits that are involved in the divergence between the ecotypes. QTLs for five key ecotype-differentiating traits all colocalized to the same region of linkage group five. Leaf physiological traits were less divergent between ecotypes, but we still mapped five physiological QTLs. We also discovered a two-locus Dobzhansky-Muller hybrid incompatibility. Our study suggests that ecotype-differentiating traits may evolve in tandem as a result of genetic colocalization.
Subjects
Ecotype , Genetic Variation , Panicum/genetics , Reproductive Isolation , Chromosome Mapping , Crosses, Genetic , Genetic Markers , Genetics, Population , Hybridization, Genetic , Phenotype , Plant Leaves/physiology , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , Synteny/genetics
ABSTRACT
Immune genes have evolved to maintain exceptional diversity, offering robust defense against pathogens. We performed genomic assembly to examine immune gene variation in zebrafish. Gene pathway analysis identified immune genes as significantly enriched among genes with evidence of positive selection. A large subset of genes was absent from analysis of coding sequences due to apparent lack of reads, prompting us to examine genes overlapping zero coverage regions (ZCRs), defined as 2 kb stretches without mapped reads. Immune genes were identified as highly enriched within ZCRs, including over 60% of major histocompatibility complex (MHC) genes and NOD-like receptor (NLR) genes, mediators of direct and indirect pathogen recognition. This variation was most highly concentrated throughout one arm of chromosome 4 carrying a large cluster of NLR genes, associated with large-scale structural variation covering more than half of the chromosome. Our genomic assemblies uncovered alternative haplotypes and distinct complements of immune genes among individual zebrafish, including the MHC Class II locus on chromosome 8 and the NLR gene cluster on chromosome 4. While previous studies have shown marked variation in NLR genes between vertebrate species, our study highlights extensive variation in NLR gene regions between individuals of the same species. Taken together, these findings provide evidence of immune gene variation on a scale previously unknown in other vertebrate species and raise questions about potential impact on immune function.
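The zero coverage region (ZCR) definition above lends itself to a short illustration. The following Python sketch assumes a per-base read-depth list for one chromosome and treats a ZCR as any run of at least 2,000 consecutive positions with zero depth; that reading of "2 kb stretches" is an assumption. The function names `zero_coverage_regions` and `overlaps_zcr` are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def zero_coverage_regions(depth: List[int], min_len: int = 2000) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of >= min_len consecutive zero-depth positions.

    Assumption: a ZCR is a run of at least min_len bases with no mapped reads.
    """
    regions = []
    run_start = None  # start index of the current zero-depth run, if any
    for pos, d in enumerate(depth):
        if d == 0:
            if run_start is None:
                run_start = pos
        else:
            if run_start is not None and pos - run_start >= min_len:
                regions.append((run_start, pos))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(depth) - run_start >= min_len:
        regions.append((run_start, len(depth)))
    return regions


def overlaps_zcr(gene_start: int, gene_end: int, regions: List[Tuple[int, int]]) -> bool:
    """True if the half-open gene interval shares at least one base with any ZCR."""
    return any(gene_start < end and start < gene_end for start, end in regions)


# Made-up depth profile: one 2,500-base gap (a ZCR) and one 500-base gap (too short).
depth = [5] * 100 + [0] * 2500 + [3] * 50 + [0] * 500 + [1] * 10
zcrs = zero_coverage_regions(depth)
```

Genes whose coordinates overlap any returned interval would then be the candidates for the enrichment analysis described in the abstract; real pipelines would typically derive depth from alignment files rather than from an in-memory list.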
Subjects
Genome , Zebrafish , Animals , Zebrafish/genetics , Genome/genetics , Haplotypes/genetics , Exons , Chromosomes/genetics
ABSTRACT
Veterans are at an increased risk for prostate cancer, a disease with extraordinary clinical and molecular heterogeneity, compared with the general population. However, little is known about the underlying molecular heterogeneity within the veteran population and its impact on patient management and treatment. Using clinical and targeted tumor sequencing data from the National Veterans Affairs health system, we conducted a retrospective cohort study on 45 patients with advanced prostate cancer in the Veterans Precision Oncology Data Commons (VPODC), most of whom had metastatic castration-resistant disease. We characterized the mutational burden in this cohort and conducted unsupervised clustering analysis to stratify patients by molecular alterations. Veterans with prostate cancer exhibited a mutational landscape broadly similar to prior studies, including KMT2A and NOTCH1 mutations associated with neuroendocrine prostate cancer phenotype, previously reported to be enriched in veterans. We also identified several potential novel mutations in PTEN, MSH6, VHL, SMO, and ABL1. Hierarchical clustering analysis revealed two subgroups containing therapeutically targetable molecular features with novel mutational signatures distinct from those reported in the Catalogue of Somatic Mutations in Cancer database. The clustering approach presented in this study can potentially be used to clinically stratify patients based on their distinct mutational profiles and identify actionable somatic mutations for precision oncology.
Subjects
Prostatic Neoplasms , Veterans , Male , Humans , Retrospective Studies , Precision Medicine , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Medical Oncology , Mutation
ABSTRACT
The Blood Profiling Atlas in Cancer (BLOODPAC) Consortium is a collaborative effort involving stakeholders from the public, industry, academia, and regulatory agencies focused on developing shared best practices on liquid biopsy. This report describes the results from the JFDI (Just Freaking Do It) study, a BLOODPAC initiative to develop standards on the use of contrived materials mimicking cell-free circulating tumor DNA, to comparatively evaluate clinical laboratory testing procedures. Nine independent laboratories tested the concordance, sensitivity, and specificity of commercially available contrived materials with known variant-allele frequencies (VAFs) ranging from 0.1% to 5.0%. Each participating laboratory utilized its own proprietary evaluation procedures. The results demonstrated high levels of concordance and sensitivity at VAFs of >0.1%, but reduced concordance and sensitivity at a VAF of 0.1%; these findings were similar to those from previous studies, suggesting that commercially available contrived materials can support the evaluation of testing procedures across multiple technologies. Such materials may enable more objective comparisons of results on materials formulated in-house at each center in multicenter trials. A unique goal of the collaborative effort was to develop a data resource, the BLOODPAC Data Commons, now available to the liquid-biopsy community for further study. This resource can be used to support independent evaluations of results, data extension through data integration and new studies, and retrospective evaluation of data collection.
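As an illustration of how sensitivity can be tabulated per variant-allele frequency level, the sketch below groups detection outcomes for known-present variants by their expected VAF. The function name `sensitivity_by_vaf` and the example calls are hypothetical and made up for illustration; they are not data or code from the JFDI study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sensitivity_by_vaf(calls: Iterable[Tuple[float, bool]]) -> Dict[float, float]:
    """Compute sensitivity (detected / expected) at each expected VAF level.

    Each call is (expected_vaf_percent, detected) for a variant known to be
    present in the contrived material.
    """
    detected = defaultdict(int)
    total = defaultdict(int)
    for vaf, hit in calls:
        total[vaf] += 1
        if hit:
            detected[vaf] += 1
    return {vaf: detected[vaf] / total[vaf] for vaf in total}


# Made-up example: four variants expected at 0.1% VAF, two at 1.0%, one at 5.0%.
example_calls = [
    (0.1, True), (0.1, False), (0.1, False), (0.1, True),
    (1.0, True), (1.0, True),
    (5.0, True),
]
result = sensitivity_by_vaf(example_calls)
```

Specificity and inter-laboratory concordance would need additional inputs (true-negative positions and per-laboratory calls), which this sketch deliberately leaves out.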
Subjects
Circulating Tumor DNA , Hematologic Neoplasms , Neoplasms , Humans , Retrospective Studies , Neoplasms/genetics , Liquid Biopsy/methods
ABSTRACT
Effective and timely antibiotic treatment depends on accurate and rapid in silico prediction of antimicrobial resistance (AMR). Existing statistical rule-based methods that predict Mycobacterium tuberculosis (MTB) drug resistance from bacterial genomic sequencing data often achieve varying results: high accuracy on some antibiotics but relatively low accuracy on others. Traditional machine learning (ML) approaches have been applied to classify drug resistance for MTB and have shown more stable performance. However, no study has applied a deep learning architecture such as a convolutional neural network (CNN) to a large and diverse cohort of MTB samples for AMR prediction. We developed 24 binary classifiers of MTB drug resistance status across eight anti-MTB drugs and three different ML algorithms (logistic regression, random forest and 1D CNN) using a training dataset of 10,575 MTB isolates collected from 16 countries across six continents, where an extended pan-genome reference was used for detecting genetic features. Our 1D CNN architecture was designed to integrate both sequential and non-sequential features. In terms of F1-scores, the 1D CNN models are our best classifiers and are also more accurate and stable than the state-of-the-art rule-based tool Mykrobe predictor (81.1 to 93.8%, 93.7 to 96.2%, 93.1 to 94.8%, 95.9 to 97.2% and 97.1 to 98.2% for ethambutol, rifampicin, pyrazinamide, isoniazid and ofloxacin respectively). We applied filter-based feature selection to find AMR-relevant features. All selected variant features are AMR-related ones in the CARD database, and 78.8% of them are also in the catalogue of MTB mutations recently identified by WHO as associated with drug resistance. To facilitate ML model development for AMR prediction, we packaged every step into an automated pipeline and shared the source code at https://github.com/KuangXY3/MTB-AMR-classification-CNN .
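Because the comparison above is reported in F1-scores, a minimal sketch of that metric for a single binary resistance classifier may help. It assumes labels encoded as 1 (resistant) and 0 (susceptible); the function name `f1_score` and the example labels are illustrative only and are not taken from the linked repository.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for six isolates (1 = resistant, 0 = susceptible).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = f1_score(y_true, y_pred)  # tp=3, fp=1, fn=1 -> precision = recall = 0.75
```

F1 is often preferred over plain accuracy when resistant isolates are a minority of the dataset, since it ignores true negatives.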
Subjects
Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Data Accuracy , Deep Learning , Drug Resistance, Multiple, Bacterial/genetics , Genome, Bacterial/drug effects , Mycobacterium tuberculosis/genetics , Tuberculosis, Multidrug-Resistant/drug therapy , Whole Genome Sequencing/methods , Cohort Studies , Humans , Microbial Sensitivity Tests , Mutation , Mycobacterium tuberculosis/isolation & purification , Phenotype , Phylogeny , Prognosis , Tuberculosis, Multidrug-Resistant/microbiology
ABSTRACT
We analyzed RNA sequencing data from nasal swabs used for SARS-CoV-2 testing. Of 317 PCR-negative samples, 13% contained over 100 reads aligned to multiple regions of the SARS-CoV-2 genome. Differential gene expression analysis compared host gene expression in potential false-negative samples (FN: PCR negative, sequencing positive) with that in subjects with multiple SARS-CoV-2 viral loads. The host transcriptional response in FN samples was distinct from that in true negative samples (PCR and sequencing negative) and similar to that in low viral load samples. Gene Ontology (GO) analysis showed that viral load-dependent changes in gene expression are functionally distinct; 23 common pathways include responses to viral infections and associated immune responses. GO analysis also revealed that FN samples had a high overlap with high viral load samples. Deconvolution of RNA-seq data showed similar cell content across viral loads. Hence, transcriptome analysis of nasal swabs provides an additional level of identifying SARS-CoV-2 infection.
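The false-negative criterion described above (PCR negative, yet more than 100 reads aligned to multiple regions of the viral genome) can be sketched as a simple rule. The function name, the region labels, and the reading of "multiple regions" as at least two regions with reads are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict


def is_potential_false_negative(
    pcr_positive: bool,
    reads_by_region: Dict[str, int],
    min_reads: int = 100,
    min_regions: int = 2,
) -> bool:
    """Flag a PCR-negative sample whose sequencing reads suggest viral presence.

    Assumptions: "over 100 reads" means total aligned reads > min_reads, and
    "multiple regions" means at least min_regions regions with one or more reads.
    """
    if pcr_positive:
        return False
    total_reads = sum(reads_by_region.values())
    covered_regions = sum(1 for n in reads_by_region.values() if n > 0)
    return total_reads > min_reads and covered_regions >= min_regions


# Made-up samples keyed by hypothetical genome region labels.
flag_a = is_potential_false_negative(False, {"ORF1ab": 80, "N": 45})  # 125 reads, 2 regions
flag_b = is_potential_false_negative(False, {"N": 150})               # only one region
flag_c = is_potential_false_negative(True, {"ORF1ab": 80, "N": 45})   # PCR positive
```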
ABSTRACT
OBJECTIVE: The objective was to develop and operate a cloud-based federated system for managing, analyzing, and sharing patient data for research purposes, while allowing each resource sharing patient data to operate their component based upon their own governance rules. The federated system is called the Biomedical Research Hub (BRH). MATERIALS AND METHODS: The BRH is a cloud-based federated system built over a core set of software services called framework services. BRH framework services include authentication and authorization, services for generating and assessing findable, accessible, interoperable, and reusable (FAIR) data, and services for importing and exporting bulk clinical data. The BRH includes data resources providing data operated by different entities and workspaces that can access and analyze data from one or more of the data resources in the BRH. RESULTS: The BRH contains multiple data commons that in aggregate provide access to over 6 PB of research data from over 400 000 research participants. DISCUSSION AND CONCLUSION: With the growing acceptance of using public cloud computing platforms for biomedical research, and the growing use of opaque persistent digital identifiers for datasets, data objects, and other entities, there is now a foundation for systems that federate data from multiple independently operated data resources that expose FAIR application programming interfaces, each using a separate data model. Applications can be built that access data from one or more of the data resources.
Subjects
Biomedical Research , Cloud Computing , Humans , Software
ABSTRACT
The COVID-19 pandemic has affected African American populations disproportionately with respect to prevalence and mortality. Expression profiles represent snapshots of combined genetic, socio-environmental (including socioeconomic and environmental factors), and physiological effects on the molecular phenotype. As such, they have potential to improve biological understanding of differences among populations, and provide therapeutic biomarkers and environmental mitigation strategies. Here, we undertook a large-scale assessment of patterns of gene expression between African Americans and European Americans, mining RNA-Seq data from 25 non-diseased and diseased (tumor) tissue-types. We observed the widespread enrichment of pathways implicated in COVID-19 and integral to inflammation and reactive oxygen stress. Chemokine CCL3L3 expression is up-regulated in African Americans. GSTM1, encoding a glutathione S-transferase that metabolizes reactive oxygen species and xenobiotics, is up-regulated. The little-studied F8A2 gene is up to 40-fold more highly expressed in African Americans; F8A2 encodes HAP40 protein, which mediates endosome movement, potentially altering the cellular response to SARS-CoV-2. African American expression signatures, superimposed on single cell-RNA reference data, reveal increased number or activity of esophageal glandular cells and lung ACE2-positive basal keratinocytes. Our findings establish basal prognostic signatures that can be used to refine approaches to minimize risk of severe infection and improve precision treatment of COVID-19 for African Americans. To enable dissection of causes of divergent molecular phenotypes, we advocate routine inclusion of metadata on genomic and socio-environmental factors for human RNA-sequencing studies.
Subjects
Black or African American/genetics , COVID-19/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , White People/genetics , COVID-19/epidemiology , COVID-19/virology , Chemokine CCL3/genetics , Gene Regulatory Networks , Glutathione Transferase/genetics , Humans , Neoplasms/classification , Neoplasms/ethnology , Nuclear Proteins/genetics , Pandemics , Prognosis , RNA-Seq/methods , SARS-CoV-2/isolation & purification , SARS-CoV-2/physiology , Socioeconomic Factors , United States/epidemiology
ABSTRACT
The goal of the National Cancer Institute's (NCI's) Genomic Data Commons (GDC) is to provide the cancer research community with a data repository of uniformly processed genomic and associated clinical data that enables data sharing and collaborative analysis in support of precision medicine. The initial GDC dataset includes genomic, epigenomic, proteomic, clinical and other data from the NCI TCGA and TARGET programs. Data production for the GDC started in June 2015 using an OpenStack-based private cloud. By June 2016, the GDC had analyzed more than 50,000 raw sequencing data inputs, as well as multiple other data types. Using the latest human genome reference build GRCh38, the GDC generated a variety of data types from aligned reads to somatic mutations, gene expression, miRNA expression, DNA methylation status, and copy number variation. In this paper, we describe the pipelines and workflows used to process and harmonize the data in the GDC. The generated data, as well as the original input files from TCGA and TARGET, are available for download and exploratory analysis at the GDC Data Portal and Legacy Archive ( https://gdc.cancer.gov/ ).
Subjects
Data Analysis , Databases, Genetic , Genomics , Base Sequence , DNA Copy Number Variations/genetics , DNA Methylation/genetics , Gene Expression Regulation , Genome, Human , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Molecular Sequence Annotation , Mutation/genetics , National Cancer Institute (U.S.) , RNA-Seq , Reproducibility of Results , United States , Viruses/genetics
ABSTRACT
BACKGROUND: Tumor-infiltrating CD8+ T cells and neoantigens are predictors of a favorable prognosis and response to immunotherapy with checkpoint inhibitors in many types of adult cancer, but little is known about their role in pediatric malignancies. Here, we analyzed the prognostic strength of T cell-inflamed gene expression and neoantigen load in high-risk neuroblastoma. We also compared transcriptional programs in T cell-inflamed and non-T cell-inflamed high-risk neuroblastomas to investigate possible mechanisms of immune exclusion. METHODS: A defined T cell-inflamed gene expression signature was used to categorize high-risk neuroblastomas in the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) program (n=123) and the Gabriella Miller Kids First (GMKF) program (n=48) into T cell-inflamed, non-T cell-inflamed, and intermediate groups. Associations between the T cell-inflamed and non-T cell-inflamed groups, MYCN amplification, and survival were analyzed by Cox proportional hazards models. Additional survival analysis was conducted after integrating neoantigen load predicted from somatic mutations. Pathways activated in non-T cell-inflamed relative to T cell-inflamed tumors were analyzed using causal network analysis. RESULTS: Patients with T cell-inflamed high-risk tumors showed improved overall survival compared with those with non-T cell-inflamed tumors (p<0.05), independent of MYCN amplification status, in both TARGET and GMKF cohorts. Higher neoantigen load was also associated with better event-free and overall survival (p<0.005) and was independent of the T cell-inflamed signature. Activation of MYCN, ASCL1, SOX11, and KMT2A transcriptional programs was inversely correlated with the T cell-inflamed signature in both cohorts.
CONCLUSIONS: Our results indicate that tumors from children with high-risk neuroblastoma harboring a strong T cell-inflamed signature have a more favorable clinical outcome, and that neoantigen load is a prognostic predictor independent of T cell inflammation. Strategies to target SOX11 and other signaling pathways associated with non-T cell-inflamed tumors should be pursued as potential immune-potentiating interventions.
Subjects
Immunotherapy/methods , Neuroblastoma/immunology , Tumor Microenvironment/immunology , Cohort Studies , Female , Humans , Male , Neuroblastoma/mortality , Prognosis , Risk Factors , Survival Analysis
ABSTRACT
BACKGROUND: Pandemic COVID-19, caused by severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2) infection, is facilitated by the ACE2 receptor and protease TMPRSS2. Modestly sized case series have described clinical factors associated with COVID-19, while ACE2 and TMPRSS2 expression analyses have been described in some cell types. Patients with cancer may have worse outcomes from COVID-19. METHODS: We performed an integrated study of ACE2 and TMPRSS2 gene expression across and within organ systems, by normal versus tumor, across several existing databases (The Cancer Genome Atlas, Census of Immune Single Cell Expression Atlas, The Human Cell Landscape, and more). We correlated gene expression with clinical factors (including but not limited to age, gender, race, body mass index, and smoking history), HLA genotype, immune gene expression patterns, cell subsets, and single-cell sequencing as well as commensal microbiome. RESULTS: Matched normal tissues generally display higher ACE2 and TMPRSS2 expression compared with cancer, with normal and tumor from digestive organs expressing the highest levels. No clinical factors were consistently identified to be significantly associated with gene expression levels, though outlier organ systems were observed for some factors. Similarly, no HLA genotypes were consistently associated with gene expression levels. Strong correlations were observed between ACE2 expression levels and multiple immune gene signatures including interferon-stimulated genes and the T cell-inflamed phenotype as well as inverse associations with angiogenesis and transforming growth factor-β signatures. ACE2 positively correlated with macrophage subsets across tumor types. TMPRSS2 was less associated with immune gene expression but was strongly associated with epithelial cell abundance. Single-cell sequencing analysis across nine independent studies demonstrated little to no ACE2 or TMPRSS2 expression in lymphocytes or macrophages.
ACE2 and TMPRSS2 gene expression was associated with commensal microbiota in matched normal tissues, particularly from colorectal cancers, with distinct bacterial populations showing strong associations. CONCLUSIONS: We performed a large-scale integration of ACE2 and TMPRSS2 gene expression across clinical, genetic, and microbiome domains. We identify novel associations with the microbiota and confirm host immunity associations with gene expression. We suggest caution in interpretation regarding genetic associations with ACE2 expression suggested from smaller case series.