ABSTRACT
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) causes coronavirus disease 2019 (COVID-19), a respiratory illness that can result in hospitalization or death. We used exome sequence data to investigate associations between rare genetic variants and seven COVID-19 outcomes in 586,157 individuals, including 20,952 with COVID-19. After accounting for multiple testing, we did not identify any clear associations with rare variants either exome wide or when specifically focusing on (1) 13 interferon pathway genes in which rare deleterious variants have been reported in individuals with severe COVID-19, (2) 281 genes located in susceptibility loci identified by the COVID-19 Host Genetics Initiative, or (3) 32 additional genes of immunologic relevance and/or therapeutic potential. Our analyses indicate there are no significant associations with rare protein-coding variants with detectable effect sizes at our current sample sizes. Analyses will be updated as additional data become available, and results are publicly available through the Regeneron Genetics Center COVID-19 Results Browser.
Subjects
COVID-19/diagnosis, COVID-19/genetics, Exome Sequencing, Exome/genetics, Genetic Predisposition to Disease, Hospitalization/statistics & numerical data, COVID-19/immunology, COVID-19/therapy, Female, Humans, Interferons/genetics, Male, Prognosis, SARS-CoV-2, Sample Size
ABSTRACT
BACKGROUND: Predictive biomarkers of immune checkpoint inhibitor (ICI) efficacy are currently lacking for non-small cell lung cancer (NSCLC). Here, we describe the results from the Anti-PD-1 Response Prediction DREAM Challenge, a crowdsourced initiative that enabled the assessment of predictive models by using data from two randomized controlled clinical trials (RCTs) of ICIs in first-line metastatic NSCLC. METHODS: Participants developed and trained models using public resources. These were evaluated with data from the CheckMate 026 trial (NCT02041533), according to the model-to-data paradigm to maintain patient confidentiality. The generalizability of the models with the best predictive performance was assessed using data from the CheckMate 227 trial (NCT02477826). Both trials were phase III RCTs with a chemotherapy control arm, which supported the differentiation between predictive and prognostic models. Isolated model containers were evaluated using a bespoke strategy that considered the challenges of handling transcriptome data from clinical trials. RESULTS: A total of 59 teams participated, with 417 models submitted. Multiple predictive models, as opposed to a prognostic model, were generated for predicting overall survival, progression-free survival, and progressive disease status with ICIs. Variables within the models submitted by participants included tumor mutational burden (TMB), programmed death ligand 1 (PD-L1) expression, and gene-expression-based signatures. The best-performing models showed improved predictive power over reference variables, including TMB or PD-L1. CONCLUSIONS: This DREAM Challenge is the first successful attempt to use protected phase III clinical data for a crowdsourced effort towards generating predictive models for ICI clinical outcomes and could serve as a blueprint for similar efforts in other tumor types and disease states, setting a benchmark for future studies aiming to identify biomarkers predictive of ICI efficacy. 
TRIAL REGISTRATION: CheckMate 026; NCT02041533, registered January 22, 2014. CheckMate 227; NCT02477826, registered June 23, 2015.
Subjects
Non-Small-Cell Lung Carcinoma, Lung Neoplasms, Humans, Non-Small-Cell Lung Carcinoma/drug therapy, Non-Small-Cell Lung Carcinoma/genetics, Immune Checkpoint Inhibitors/therapeutic use, Lung Neoplasms/pathology, B7-H1 Antigen, Tumor Biomarkers
ABSTRACT
SUMMARY: For heterogeneous tissues, measurements of gene expression through mRNA-Seq data are confounded by relative proportions of cell types involved. In this note, we introduce an efficient pipeline: DeconRNASeq, an R package for deconvolution of heterogeneous tissues based on mRNA-Seq data. It adopts a globally optimized non-negative decomposition algorithm through quadratic programming for estimating the mixing proportions of distinctive tissue types in next-generation sequencing data. We demonstrated the feasibility and validity of DeconRNASeq across a range of mixing levels and sources using mRNA-Seq data mixed in silico at known concentrations. We validated our computational approach for various benchmark data, with high correlation between our predicted cell proportions and the real fractions of tissues. Our study provides a rigorous, quantitative and high-resolution tool as a prerequisite to use mRNA-Seq data. The modularity of package design allows an easy deployment of custom analytical pipelines for data from other high-throughput platforms. AVAILABILITY: DeconRNASeq is written in R, and is freely available at http://bioconductor.org/packages. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
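The non-negative decomposition described above can be sketched as a small constrained least-squares problem. The following Python example is only an illustration under stated assumptions: it is not the DeconRNASeq API (the package itself is written in R), the function name `estimate_proportions` and the simulated signature matrix are hypothetical, and SciPy's SLSQP solver stands in for whatever quadratic programming routine the package uses. It estimates fractions that are non-negative and sum to one for an in silico mixture with known proportions.

```python
import numpy as np
from scipy.optimize import minimize


def estimate_proportions(signatures, mixture):
    """Estimate mixing fractions f that minimize ||signatures @ f - mixture||^2
    subject to f >= 0 and sum(f) == 1 (hypothetical helper, not the package API)."""
    signatures = np.asarray(signatures, dtype=float)
    mixture = np.asarray(mixture, dtype=float)
    n_types = signatures.shape[1]

    def objective(f):
        residual = signatures @ f - mixture
        return residual @ residual

    def gradient(f):
        return 2.0 * signatures.T @ (signatures @ f - mixture)

    result = minimize(
        objective,
        x0=np.full(n_types, 1.0 / n_types),   # start from equal fractions
        jac=gradient,
        bounds=[(0.0, 1.0)] * n_types,        # non-negativity
        constraints=[{"type": "eq", "fun": lambda f: f.sum() - 1.0}],
        method="SLSQP",
    )
    return result.x


# Simulated data (assumption): 50 genes, 3 tissue types, noise-free mixture.
rng = np.random.default_rng(0)
signatures = rng.uniform(0.0, 10.0, size=(50, 3))
true_fractions = np.array([0.6, 0.3, 0.1])
mixture = signatures @ true_fractions

estimated = estimate_proportions(signatures, mixture)
print(np.round(estimated, 3))
```

With noise-free simulated input the estimate should closely track the known fractions; real expression data would add noise and would typically need normalization before this step.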
Subjects
Gene Expression Profiling/methods, High-Throughput Nucleotide Sequencing/methods, RNA Sequence Analysis/methods, Software, Algorithms, Computer Simulation, Statistical Data Interpretation, Linear Models, Messenger RNA/metabolism
ABSTRACT
LMX1B encodes a homeodomain-containing transcription factor that is essential during development. Mutations in LMX1B cause nail-patella syndrome, characterized by dysplasia of the patellae, nails, and elbows and FSGS with specific ultrastructural lesions of the glomerular basement membrane (GBM). By linkage analysis and exome sequencing, we unexpectedly identified an LMX1B mutation segregating with disease in a pedigree of five patients with autosomal dominant FSGS but without either extrarenal features or ultrastructural abnormalities of the GBM suggestive of nail-patella-like renal disease. Subsequently, we screened 73 additional unrelated families with FSGS and found mutations involving the same amino acid (R246) in 2 families. An LMX1B in silico homology model suggested that the mutated residue plays an important role in strengthening the interaction between the LMX1B homeodomain and DNA; both identified mutations would be expected to diminish such interactions. In summary, these results suggest that isolated FSGS could result from mutations in genes that are also involved in syndromic forms of FSGS. This highlights the need to include these genes in all diagnostic approaches to FSGS that involve next-generation sequencing.
Subjects
Focal Segmental Glomerulosclerosis/genetics, LIM-Homeodomain Proteins/genetics, Nail-Patella Syndrome/genetics, Transcription Factors/genetics, Adolescent, Adult, Child, Female, Dominant Genes, Humans, Male, Middle Aged, Mutation, Pedigree, DNA Sequence Analysis, Young Adult
ABSTRACT
Signaling pathways are the fundamental grammar of cellular communication, yet few frameworks are available to analyze molecular imaging probes in the context of signaling pathways. Such a framework would aid in the design and selection of imaging probes for measuring specific signaling pathways and, vice versa, help illuminate which pathways are being assayed by a given probe. RAMP (Researching imaging Agents through Molecular Pathways) is a bioinformatics framework for connecting signaling pathways and imaging probes using a controlled vocabulary of the imaging targets. RAMP contains signaling pathway data from MetaCore, the Kyoto Encyclopedia of Genes and Genomes, and the Gene Ontology project; imaging probe data from the Molecular Imaging and Contrast Agent Database (MICAD); and tissue protein expression data from The Human Protein Atlas. The RAMP search tool is available at
Subjects
Computational Biology/methods, Contrast Media/chemistry, Contrast Media/metabolism, Molecular Imaging, Signal Transduction, Software, Factual Databases, Humans, Internet, Biological Models, Proteins/analysis, Proteins/metabolism
ABSTRACT
The UK Biobank Exome Sequencing Consortium (UKB-ESC) is a private-public partnership between the UK Biobank (UKB) and eight biopharmaceutical companies that will complete the sequencing of exomes for all ~500,000 UKB participants. Here, we describe the early results from ~200,000 UKB participants and the features of this project that enabled its success. The biopharmaceutical industry has increasingly used human genetics to improve success in drug discovery. Recognizing the need for large-scale human genetics data, as well as the unique value of the data access and contribution terms of the UKB, the UKB-ESC was formed. As a result, exome data from 200,643 UKB enrollees are now available. These data include ~10 million exonic variants, a rich resource of rare coding variation that is particularly valuable for drug discovery. The UKB-ESC precompetitive collaboration has further strengthened academic and industry ties and has provided teams with an opportunity to interact with and learn from the wider research community.
Subjects
Biological Specimen Banks, Drug Discovery, Exome Sequencing, Human Genetics, Research, Drug Discovery/methods, Genomics/methods, Humans, United Kingdom
ABSTRACT
BACKGROUND: Interstitial fibrosis and tubular atrophy (IF/TA) in renal transplants are the major morphological correlates of progressive graft deterioration. Early diagnosis of IF/TA is a prerequisite for timely therapeutic intervention in patients at risk. To evaluate events occurring before the overt onset of IF/TA, gene expression profiling of 3-month protocol biopsies was performed in a patient group (n = 8) who developed mild IF/TA [chronic allograft nephropathy (CAN) grade I, by the Banff scoring system] by the subsequent 6-month protocol biopsy ('progressors'), and in 12 patients without IF/TA at 6 months ('non-progressors'). METHODS: RNA was extracted, labelled and hybridized to human-specific genome-wide DNA microarrays. Normalized data were subjected to gene-centric and pathway-centric statistical methods. RESULTS: Compared to the non-progressors, the 3-month biopsies of the progressor group showed overexpression of several genes that are important in T- and B-cell activation and the immune response. Genes involved in pro-fibrotic processes were identified in the biopsies of the progressors, preceding the IF/TA observed at 6 months. Furthermore, several genes with transporter and metabolic functions were underrepresented in the 3-month biopsies of the progressors. CONCLUSION: Gene expression profiling of early protocol biopsies identified changes in the transcriptome of grafts, which may be important for the development of IF/TA. Such early detection of transcriptome changes can facilitate the identification of patients at risk, shifting the intervention time point well before the histological diagnosis of irreversible IF/TA.
Subjects
Atrophy/genetics, Fibrosis/genetics, Gene Expression Profiling, Graft Rejection/genetics, Kidney Transplantation, Kidney Tubules/metabolism, Adolescent, Adult, Aged, Aged 80 and over, Atrophy/metabolism, Atrophy/pathology, Biomarkers/metabolism, Biopsy, Child, Female, Fibrosis/metabolism, Fibrosis/pathology, Human Genome, Graft Rejection/metabolism, Humans, Immunoenzyme Techniques, Kidney Tubules/pathology, Male, Middle Aged, Oligonucleotide Array Sequence Analysis, Prognosis, Homologous Transplantation, Young Adult
ABSTRACT
PURPOSE: CheckMate 568 is an open-label phase II trial that evaluated the efficacy and safety of nivolumab plus low-dose ipilimumab as first-line treatment of advanced/metastatic non-small-cell lung cancer (NSCLC). We assessed the association of efficacy with programmed death ligand 1 (PD-L1) expression and tumor mutational burden (TMB). PATIENTS AND METHODS: Two hundred eighty-eight patients with previously untreated, recurrent stage IIIB/IV NSCLC received nivolumab 3 mg/kg every 2 weeks plus ipilimumab 1 mg/kg every 6 weeks. The primary end point was objective response rate (ORR) in patients with 1% or more and less than 1% tumor PD-L1 expression. Efficacy on the basis of TMB (FoundationOne CDx assay) was a secondary end point. RESULTS: Of treated patients with tumor available for testing, 252 patients (88%) of 288 were evaluable for PD-L1 expression and 98 patients (82%) of 120 for TMB. ORR was 30% overall and 41% and 15% in patients with 1% or greater and less than 1% tumor PD-L1 expression, respectively. ORR increased with higher TMB, plateauing at 10 or more mutations/megabase (mut/Mb). Regardless of PD-L1 expression, ORRs were higher in patients with TMB of 10 or more mut/Mb (n = 48: PD-L1, ≥ 1%, 48%; PD-L1, < 1%, 47%) versus TMB of fewer than 10 mut/Mb (n = 50: PD-L1, ≥ 1%, 18%; PD-L1, < 1%, 5%), and progression-free survival was longer in patients with TMB of 10 or more mut/Mb versus TMB of fewer than 10 mut/Mb (median, 7.1 v 2.6 months). Grade 3 to 4 treatment-related adverse events occurred in 29% of patients. CONCLUSION: Nivolumab plus low-dose ipilimumab was effective and tolerable as a first-line treatment of advanced/metastatic NSCLC. TMB of 10 or more mut/Mb was associated with improved response and prolonged progression-free survival in both tumor PD-L1 expression 1% or greater and less than 1% subgroups and was thus identified as a potentially relevant cutoff in the assessment of TMB as a biomarker for first-line nivolumab plus ipilimumab.
Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use, B7-H1 Antigen/biosynthesis, Non-Small-Cell Lung Carcinoma/drug therapy, Lung Neoplasms/drug therapy, Mutation, Adult, Aged, Aged 80 and over, B7-H1 Antigen/immunology, Tumor Biomarkers/biosynthesis, Tumor Biomarkers/genetics, Tumor Biomarkers/immunology, Non-Small-Cell Lung Carcinoma/genetics, Non-Small-Cell Lung Carcinoma/immunology, Female, Humans, Ipilimumab/administration & dosage, Lung Neoplasms/genetics, Lung Neoplasms/immunology, Male, Middle Aged, Local Neoplasm Recurrence/drug therapy, Neoplasm Staging, Nivolumab/administration & dosage, Treatment Outcome
ABSTRACT
MOTIVATION: We describe an extension of the pathway-based enrichment approach for analyzing microarray data via a robust test for transcriptional variance. The use of a variance test is intended to identify additional patterns of transcriptional regulation in which many genes in a pathway are up- and down-regulated. Such patterns may be indicative of the reciprocal regulation of pathway activators and inhibitors or of the differential regulation of separate biological sub-processes and should extend the number of detectable patterns of transcriptional modulation. RESULTS: We validated this new statistical approach on a microarray experiment that captures the temporal transcriptional profile of muscle differentiation in mouse C2C12 cells. Comparisons of the transcriptional state of myoblasts and differentiated myotubes via a robust variance test implicated several novel pathways in muscle cell differentiation previously overlooked by a standard enrichment analysis. Specifically, pathways involved in cell structure, calcium-mediated signaling and muscle-specific signaling were identified as differentially modulated based on their increased transcriptional variance. These biologically relevant results validate this approach and demonstrate the flexible nature of pathway-based methods of data analysis. AVAILABILITY: The software is available as Supplementary Material.
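The idea of testing a pathway for increased transcriptional variance, rather than a shift in mean, can be sketched as follows. The abstract does not name the robust test that was used, so this example substitutes the Brown-Forsythe test (Levene's test centered on the median) from SciPy as an assumption. The function name `pathway_variance_test` and the simulated log ratios are hypothetical.

```python
import numpy as np
from scipy.stats import levene


def pathway_variance_test(log_ratios, in_pathway):
    """Compare the spread of log expression ratios for pathway genes against
    all remaining genes using the Brown-Forsythe test (an assumed stand-in)."""
    log_ratios = np.asarray(log_ratios, dtype=float)
    in_pathway = np.asarray(in_pathway, dtype=bool)
    statistic, p_value = levene(
        log_ratios[in_pathway], log_ratios[~in_pathway], center="median"
    )
    return statistic, p_value


# Simulated data (assumption): pathway genes move both up and down, so their
# mean change is near zero but their variance is larger than the background.
rng = np.random.default_rng(1)
pathway = rng.normal(0.0, 1.5, size=40)
background = rng.normal(0.0, 0.3, size=2000)
log_ratios = np.concatenate([pathway, background])
in_pathway = np.arange(log_ratios.size) < pathway.size

statistic, p_value = pathway_variance_test(log_ratios, in_pathway)
print(f"statistic = {statistic:.1f}, p = {p_value:.2e}")
```

A mean-based enrichment test would tend to miss this simulated pathway because the up- and down-regulated genes cancel out, which is the pattern the variance test is intended to capture.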
Subjects
Developmental Gene Expression Regulation/physiology, Muscle Cells/cytology, Muscle Cells/metabolism, Myoblasts/cytology, Myoblasts/metabolism, Signal Transduction/physiology, Transcription Factors/metabolism, Animals, Cell Differentiation, Computer Simulation, Genetic Variation/genetics, Mice, Biological Models, Muscle Proteins/metabolism, Transcriptional Activation/physiology
ABSTRACT
Durable responses and encouraging survival have been demonstrated with immune checkpoint inhibitors in small-cell lung cancer (SCLC), but predictive markers are unknown. We used whole exome sequencing to evaluate the impact of tumor mutational burden on efficacy of nivolumab monotherapy or combined with ipilimumab in patients with SCLC from the nonrandomized or randomized cohorts of CheckMate 032. Patients received nivolumab (3 mg/kg every 2 weeks) or nivolumab plus ipilimumab (1 mg/kg plus 3 mg/kg every 3 weeks for four cycles, followed by nivolumab 3 mg/kg every 2 weeks). Efficacy of nivolumab ± ipilimumab was enhanced in patients with high tumor mutational burden. Nivolumab plus ipilimumab appeared to provide a greater clinical benefit than nivolumab monotherapy in the high tumor mutational burden tertile.
Subjects
Antineoplastic Combined Chemotherapy Protocols/administration & dosage, Ipilimumab/administration & dosage, Lung Neoplasms/drug therapy, Mutation, Nivolumab/administration & dosage, Small Cell Lung Carcinoma/drug therapy, Adult, Aged, Aged 80 and over, Antineoplastic Combined Chemotherapy Protocols/pharmacology, Drug Dose-Response Relationship, Drug Administration Schedule, Female, Humans, Ipilimumab/pharmacology, Lung Neoplasms/genetics, Male, Middle Aged, Nivolumab/pharmacology, Small Cell Lung Carcinoma/genetics, Treatment Outcome, Tumor Burden/drug effects, Exome Sequencing
ABSTRACT
BACKGROUND: Using a gene clustering strategy, we determined intracellular pathway relationships within skeletal myotubes in response to an acute heat stress stimulus. Following heat shock, the transcriptome was analyzed by microarray in a temporal fashion to characterize the dynamic relationship of signaling pathways. RESULTS: Bioinformatics analyses exposed coordination of functionally related gene sets, depicting mechanism-based responses to heat shock. Protein turnover-related pathways were significantly affected, including protein folding, pre-mRNA processing, mRNA splicing, proteolysis and proteasome-related pathways. Many responses were transient, tending to normalize within 24 hours. CONCLUSION: In summary, we show that the transcriptional response to acute cell stress is largely transient and proteasome-centric.
Subjects
Gene Expression Regulation, Heat Stress Disorders, Multigene Family, Animals, Cell Line, Gene Expression Profiling, Mice, Skeletal Muscle Fibers/physiology, Oligonucleotide Array Sequence Analysis, Proteome/analysis, Signal Transduction/physiology, Genetic Transcription
ABSTRACT
Therapeutic options for the treatment of an increasing variety of cancers have been expanded by the introduction of a new class of drugs, commonly referred to as checkpoint blocking agents, that target the host immune system to positively modulate anti-tumor immune response. Although efficacy of these agents has been linked to a pre-existing level of tumor immune infiltrate, it remains unclear why some patients exhibit deep and durable responses to these agents while others do not benefit. To examine the influence of tumor genetics on tumor immune state, we interrogated the relationship between somatic mutation and copy number alteration with infiltration levels of 7 immune cell types across 40 tumor cohorts in The Cancer Genome Atlas. Levels of cytotoxic T, regulatory T, total T, natural killer, and B cells, as well as monocytes and M2 macrophages, were estimated using a novel set of transcriptional signatures that were designed to resist interference from the cellular heterogeneity of tumors. Tumor mutational load and estimates of tumor purity were included in our association models to adjust for biases in multi-modal genomic data. Copy number alterations, mutations summarized at the gene level, and position-specific mutations were evaluated for association with tumor immune infiltration. We observed a strong relationship between copy number loss of a large region of chromosome 9p and decreased lymphocyte estimates in melanoma, pancreatic, and head/neck cancers. Mutations in the oncogenes PIK3CA, FGFR3, and RAS/RAF family members, as well as the tumor suppressor TP53, were linked to changes in immune infiltration, usually in restricted tumor types. Associations of specific WNT/beta-catenin pathway genetic changes with immune state were limited, but we noted a link between 9p loss and the expression of the WNT receptor FZD3, suggesting that there are interactions between 9p alteration and WNT pathways. 
Finally, two different cell death regulators, CASP8 and DIDO1, were often mutated in head/neck tumors that had higher lymphocyte infiltrates. In summary, our study supports the relevance of tumor genetics to questions of efficacy and resistance in checkpoint blockade therapies. It also highlights the need to assess genome-wide influences during exploration of any specific tumor pathway hypothesized to be relevant to therapeutic response. Some of the observed genetic links to immune state, like 9p loss, may influence response to cancer immune therapies. Others, like mutations in cell death pathways, may help guide combination therapeutic approaches.
Subjects
Genome-Wide Association Study, Neoplasms/genetics, Neoplasms/immunology, Tumor Biomarkers/genetics, CD8-Positive T-Lymphocytes/immunology, Human Chromosomes Pair 9/genetics, Gene Dosage, Head and Neck Neoplasms/genetics, Humans, Mutation/genetics, Neoplasm Proteins/genetics, RNA Sequence Analysis, Tumor Suppressor Protein p53/genetics
ABSTRACT
MOTIVATION: Identification and characterization of protein structure regularities can reveal the mechanisms governing protein structure, function and evolution. Here we focus on an intermediate level of regularity. We have developed automated methods to systematically construct a dictionary of supersecondary structures that can be used as 'protein parts' to describe fold-sized structures. RESULTS: The dictionary was constructed by aligning representative structures of all known folds, clustering similar substructures and selecting the most descriptive substructures in a minimum description length fashion. We show that the dictionary is compact and descriptive, capable of describing a substantial fraction of all known protein folds. We performed simulations using independent sets of training and testing folds. Dictionaries generated using the training set had high coverage over the folds in the testing set, suggesting that dictionary entries reflect general features of protein structures and should be capable of describing novel protein folds.
Subjects
Algorithms, Protein Databases, Protein Folding, Proteins/chemistry, Proteins/ultrastructure, Sequence Alignment/methods, Protein Sequence Analysis/methods, Amino Acid Sequence, Artificial Intelligence, Dictionaries as Topic, Molecular Sequence Data, Protein Secondary Structure
ABSTRACT
BACKGROUND: The sequencing of the human genome has enabled us to access a comprehensive list of genes (both experimental and predicted) for further analysis. While a majority of the approximately 30,000 known and predicted human coding genes are characterized and have been assigned at least one function, there remains a fair number of genes (about 12,000) for which no annotation has been made. The recent sequencing of other genomes has provided us with a huge amount of auxiliary sequence data which could help in the characterization of the human genes. Clustering these sequences into families is one of the first steps in performing comparative studies across several genomes. RESULTS: Here we report a novel clustering algorithm (CLUGEN) that has been used to cluster sequences of experimentally verified and predicted proteins from all sequenced genomes using a novel distance metric, which is a neural network score between a pair of protein sequences. This distance metric is based on the pairwise sequence similarity score and the similarity between their domain structures. The distance metric is the probability that a pair of protein sequences are of the same Interpro family/domain, which facilitates the modelling of transitive homology closure to detect remote homologues. The hierarchical average clustering method is applied with the new distance metric. CONCLUSION: Benchmarking studies of our algorithm versus those reported in the literature show that our algorithm provides clustering results with lower false positive and false negative rates. The clustering algorithm is applied to cluster several eukaryotic genomes and several dozen prokaryotic genomes.
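The clustering step described above, average-linkage hierarchical clustering over a probability-derived score, can be sketched with SciPy. This is not the CLUGEN implementation. The neural network is not reproduced, the pairwise same-family probabilities below are made-up illustrative values, and converting them to a distance as 1 minus the probability, as well as the cut height of 0.5, are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical pairwise probabilities that two proteins share a family
# (in the paper these would come from a trained neural network).
same_family_prob = np.array([
    [1.00, 0.95, 0.90, 0.05, 0.10],
    [0.95, 1.00, 0.85, 0.10, 0.05],
    [0.90, 0.85, 1.00, 0.05, 0.05],
    [0.05, 0.10, 0.05, 1.00, 0.92],
    [0.10, 0.05, 0.05, 0.92, 1.00],
])

# Assumed conversion: a high same-family probability means a small distance.
distance = 1.0 - same_family_prob
np.fill_diagonal(distance, 0.0)

# linkage expects a condensed (upper-triangle) distance vector.
condensed = squareform(distance, checks=True)
tree = linkage(condensed, method="average")

# Cut the tree at an assumed threshold to obtain flat family assignments.
labels = fcluster(tree, t=0.5, criterion="distance")
print(labels)
```

Average linkage merges two groups based on the mean of all pairwise distances between them, so a single weak pairwise score does not by itself prevent otherwise similar sequences from joining the same family.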
Subjects
Algorithms, Neural Networks (Computer), Sequence Alignment, Protein Sequence Analysis/methods, Benchmarking, Cluster Analysis, ROC Curve, Amino Acid Sequence Homology, Software Validation
ABSTRACT
Ciliopathies are a large group of clinically and genetically heterogeneous disorders caused by defects in primary cilia. Here we identified mutations in TRAF3IP1 (TNF Receptor-Associated Factor Interacting Protein 1) in eight patients from five families with nephronophthisis (NPH) and retinal degeneration, two of the most common manifestations of ciliopathies. TRAF3IP1 encodes IFT54, a subunit of the IFT-B complex required for ciliogenesis. The identified mutations result in mild ciliary defects in patients but also reveal an unexpected role of IFT54 as a negative regulator of microtubule stability via MAP4 (microtubule-associated protein 4). Microtubule defects are associated with altered epithelialization/polarity in renal cells and with pronephric cysts and microphthalmia in zebrafish embryos. Our findings highlight the regulation of cytoplasmic microtubule dynamics as a role of the IFT54 protein beyond the cilium, contributing to the development of NPH-related ciliopathies.
Subjects
Carrier Proteins/genetics, Cystic Kidney Diseases/genetics, Microtubule-Associated Proteins/genetics, Microtubule-Associated Proteins/metabolism, Microtubules/metabolism, Mutation, Retinal Degeneration/genetics, Zebrafish Proteins/genetics, Animals, Western Blotting, Carrier Proteins/metabolism, Cell Polarity/genetics, Circular Dichroism, Nonmammalian Embryo, Female, Fluorescent Antibody Technique, Gene Knockout Techniques, HEK293 Cells, High-Throughput Nucleotide Sequencing, Humans, Immunoprecipitation, Cystic Kidney Diseases/metabolism, Male, Microphthalmos/genetics, Pedigree, Retinal Degeneration/metabolism, Reverse Transcriptase Polymerase Chain Reaction, Zebrafish, Zebrafish Proteins/metabolism
ABSTRACT
To identify genes involved in hearing, 8494 expressed sequence tags (ESTs) were generated from a human fetal cochlear cDNA library in two distinct sequencing projects. Analysis of the first set of 4304 ESTs revealed clones representing 517 known human genes, 41 mammalian genes not previously detected in human tissues, 487 ESTs from other human tissues, and 541 cochlear-specific ESTs (http://hearing.bwh.harvard.edu). We now report results of a DNA sequence similarity (BLAST) analysis of an additional 4190 cochlear ESTs and a comparison to the first set. Among the 4190 new cochlear ESTs, 959 known human genes were identified; 594 were found only among the new ESTs and 365 were found among ESTs from both sequencing projects. COL1A2 was the most abundant transcript among both sets of ESTs, followed in order by COL3A1, SPARC, EEF1A1, and TPT1. An additional 22 human homologs of known nonhuman mammalian genes and 1595 clusters of ESTs, of which 333 are cochlear-specific, were identified among the new cochlear ESTs. Map positions were determined for 373 of the new cochlear ESTs and revealed 318 additional loci. Forty-nine of the mapped ESTs are located within the genetic interval of 23 deafness loci. Reanalysis of unassigned ESTs from the prior study revealed 338 additional known human genes. The total number of known human genes identified from 8494 cochlear ESTs is 1449 and is represented by 4040 ESTs. Among the known human genes are 14 deafness-associated genes, including GJB2 (connexin 26) and KVLQT1. The total number of nonhuman mammalian genes identified is 43 and is represented by 58 ESTs. The total number of ESTs without sequence similarity to known genes is 4055. Of these, 778 also do not have sequence similarity to any other ESTs, are categorized into 700 clusters, and may represent genes uniquely or preferentially expressed in the cochlea.
Identification of additional known genes, ESTs, and cochlear-specific ESTs provides new candidate genes for both syndromic and nonsyndromic deafness disorders.
Subjects
Auditory Pathways/physiology, Cochlea/physiology, Gene Expression, Genes, Chromosome Mapping, Connexin 26, Connexins, Factual Databases, Deafness/genetics, Fetus, Gene Library, Humans, Sequence Tagged Sites
ABSTRACT
BACKGROUND: Exact sample annotation in expression microarray datasets is essential for any type of pharmacogenomics research. RESULTS: Candidate markers were explored through the application of Hartigans' dip test statistics to a publicly available human whole-genome microarray dataset. The marker performance was tested on 188 serial samples from 53 donors and of variable tissue origin from five public microarray datasets. A qualified transcript marker panel consisting of three probe sets for human leukocyte antigens HLA-DQA1 (2 probe sets) and HLA-DRB4 identified sample donor identifier inconsistencies in six of the 188 test samples. About 3% of the test samples require root-cause analysis due to unresolvable inaccuracies. CONCLUSIONS: The transcript marker panel consisting of HLA-DQA1 and HLA-DRB4 represents a robust, tissue-independent composite marker to help control donor annotation concordance at the transcript level. Allele-selectivity of HLA genes renders them good candidates for "fingerprinting" with donor-specific expression patterns.
ABSTRACT
OBJECTIVE: Macrophage activation syndrome (MAS), a life-threatening complication of systemic juvenile idiopathic arthritis (JIA), resembles familial hemophagocytic lymphohistiocytosis (HLH), a constellation of autosomal-recessive immune disorders resulting from deficiency in cytolytic pathway proteins. We undertook this study to test our hypothesis that MAS predisposition in systemic JIA could be attributed to rare gene sequence variants affecting the cytolytic pathway. METHODS: Whole-exome sequencing was used in 14 patients with systemic JIA and MAS and in their parents to identify protein-altering single-nucleotide polymorphisms/indels in known HLH-associated genes. To discover new candidate genes, the entire whole-exome sequencing data were filtered to identify protein-altering, rare recessive homozygous, compound heterozygous, and de novo variants with the potential to affect the cytolytic pathway. RESULTS: Heterozygous protein-altering rare variants in the known genes (LYST, MUNC13-4, and STXBP2) were found in 5 of 14 patients with systemic JIA and MAS (35.7%). This was in contrast to only 4 variants in 4 of 29 patients with systemic JIA without MAS (13.8%). Homozygosity and compound heterozygosity analysis applied to the entire whole-exome sequencing data in systemic JIA/MAS revealed 3 recessive pairs in 3 genes and compound heterozygotes in 73 genes. We also identified 20 heterozygous rare protein-altering variants that occurred in at least 2 patients. Many of the identified genes encoded proteins with a role in actin and microtubule reorganization and vesicle-mediated transport. "Cellular assembly and organization" was the top cellular function category based on Ingenuity Pathways Analysis (P < 3.10 × 10^-5). CONCLUSION: Whole-exome sequencing performed in patients with systemic JIA and MAS identified rare protein-altering variants in known HLH-associated genes as well as in new candidate genes.
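The carrier frequencies reported above can be checked with a few lines of arithmetic. The sketch below recomputes the two percentages from the stated counts and, purely as an illustrative assumption, runs Fisher's exact test on the resulting 2×2 table. The abstract does not state which test, if any, was applied to this comparison, so the printed p-value should not be read as the authors' result.

```python
from scipy.stats import fisher_exact

# Counts taken from the abstract: carriers of rare variants in known
# HLH-associated genes among systemic JIA patients with and without MAS.
mas_carriers, mas_total = 5, 14
no_mas_carriers, no_mas_total = 4, 29

mas_pct = 100.0 * mas_carriers / mas_total
no_mas_pct = 100.0 * no_mas_carriers / no_mas_total

# 2x2 table: rows are MAS / no MAS, columns are carrier / non-carrier.
table = [
    [mas_carriers, mas_total - mas_carriers],
    [no_mas_carriers, no_mas_total - no_mas_carriers],
]

# Assumed choice of test; suited to small counts like these.
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

print(f"{mas_pct:.1f}% vs {no_mas_pct:.1f}%")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
```

With only 14 and 29 patients per group, an exact test is a natural choice over a chi-square approximation, but any conclusion drawn from it would remain limited by the small sample.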