Results 1 - 20 of 128
1.
Cell ; 184(10): 2633-2648.e19, 2021 05 13.
Article in English | MEDLINE | ID: mdl-33864768

ABSTRACT

Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.


Subjects
Disease/genetics; Multifactorial Inheritance/genetics; Population/genetics; RNA, Long Noncoding/genetics; Transcriptome; Coronary Artery Disease/genetics; Diabetes Mellitus, Type 1/genetics; Diabetes Mellitus, Type 2/genetics; Gene Expression Profiling; Genetic Variation; Humans; Inflammatory Bowel Diseases/genetics; Organ Specificity/genetics; Quantitative Trait Loci
2.
Cell ; 171(6): 1437-1452.e17, 2017 Nov 30.
Article in English | MEDLINE | ID: mdl-29195078

ABSTRACT

We previously piloted the concept of a Connectivity Map (CMap), whereby genes, drugs, and disease states are connected by virtue of common gene-expression signatures. Here, we report more than a 1,000-fold scale-up of the CMap as part of the NIH LINCS Consortium, made possible by a new, low-cost, high-throughput reduced representation expression profiling method that we term L1000. We show that L1000 is highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts. We further show that the expanded CMap can be used to discover mechanism of action of small molecules, functionally annotate genetic variants of disease genes, and inform clinical trials. The 1.3 million L1000 profiles described here, as well as tools for their analysis, are available at https://clue.io.


Subjects
Gene Expression Profiling/methods; Cell Line, Tumor; Drug Resistance, Neoplasm; Gene Expression Profiling/economics; Humans; Neoplasms/drug therapy; Organ Specificity; Pharmaceutical Preparations/metabolism; Sequence Analysis, RNA/economics; Sequence Analysis, RNA/methods; Small Molecule Libraries
3.
Nat Rev Genet ; 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38548833

ABSTRACT

Germline variation and somatic mutation are intricately connected and together shape human traits and disease risks. Germline variants are present from conception, but they vary between individuals and accumulate over generations. By contrast, somatic mutations accumulate throughout life in a mosaic manner within an individual due to intrinsic and extrinsic sources of mutations and selection pressures acting on cells. Recent advancements, such as improved detection methods and increased resources for association studies, have drastically expanded our ability to investigate germline and somatic genetic variation and compare underlying mutational processes. A better understanding of the similarities and differences in the types, rates and patterns of germline and somatic variants, as well as their interplay, will help elucidate the mechanisms underlying their distinct yet interlinked roles in human health and biology.

4.
Cell ; 153(3): 666-77, 2013 Apr 25.
Article in English | MEDLINE | ID: mdl-23622249

ABSTRACT

The analysis of exonic DNA from prostate cancers has identified recurrently mutated genes, but the spectrum of genome-wide alterations has not been profiled extensively in this disease. We sequenced the genomes of 57 prostate tumors and matched normal tissues to characterize somatic alterations and to study how they accumulate during oncogenesis and progression. By modeling the genesis of genomic rearrangements, we identified abundant DNA translocations and deletions that arise in a highly interdependent manner. This phenomenon, which we term "chromoplexy," frequently accounts for the dysregulation of prostate cancer genes and appears to disrupt multiple cancer genes coordinately. Our modeling suggests that chromoplexy may induce considerable genomic derangement over relatively few events in prostate cancer and other neoplasms, supporting a model of punctuated cancer evolution. By characterizing the clonal hierarchy of genomic lesions in prostate tumors, we charted a path of oncogenic events along which chromoplexy may drive prostate carcinogenesis.


Subjects
Chromosome Aberrations; Gene Expression Regulation, Neoplastic; Genome, Human; Prostatic Neoplasms/genetics; Adenocarcinoma/genetics; Adenocarcinoma/pathology; Cohort Studies; Genome-Wide Association Study; Humans; Male; Neuroendocrine Tumors/genetics; Neuroendocrine Tumors/pathology; Prostatic Neoplasms/pathology
5.
Nature ; 608(7922): 353-359, 2022 08.
Article in English | MEDLINE | ID: mdl-35922509

ABSTRACT

Regulation of transcript structure generates transcript diversity and plays an important role in human disease [1-7]. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure [8-16]. In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.


Subjects
Alleles; Gene Expression Profiling; Organ Specificity; RNA-Seq; Transcriptome; Alternative Splicing/genetics; Cell Line; Datasets as Topic; Genotype; Heterogeneous-Nuclear Ribonucleoproteins/deficiency; Heterogeneous-Nuclear Ribonucleoproteins/genetics; Humans; Organ Specificity/genetics; Polypyrimidine Tract-Binding Protein/deficiency; Polypyrimidine Tract-Binding Protein/genetics; Reproducibility of Results; Transcriptome/genetics
6.
Am J Hum Genet ; 111(3): 445-455, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38320554

ABSTRACT

Regulation of transcription and translation are mechanisms through which genetic variants affect complex traits. Expression quantitative trait locus (eQTL) studies have been more successful at identifying cis-eQTL (within 1 Mb of the transcription start site) than trans-eQTL. Here, we tested the cis component of gene expression for association with observed plasma protein levels to identify cis- and trans-acting genes that regulate protein levels. We used transcriptome prediction models from 49 Genotype-Tissue Expression (GTEx) Project tissues to predict the cis component of gene expression and tested the predicted expression of every gene in every tissue for association with the observed abundance of 3,622 plasma proteins measured in 3,301 individuals from the INTERVAL study. We tested significant results for replication in 971 individuals from the Trans-omics for Precision Medicine (TOPMed) Multi-Ethnic Study of Atherosclerosis (MESA). We found 1,168 and 1,210 cis- and trans-acting associations that replicated in TOPMed (FDR < 0.05) with a median expected true positive rate (π1) across tissues of 0.806 and 0.390, respectively. The target proteins of trans-acting genes were enriched for transcription factor binding sites and autoimmune diseases in the GWAS catalog. Furthermore, we found a higher correlation between predicted expression and protein levels of the same underlying gene (R = 0.17) than observed expression (R = 0.10, p = 7.50 × 10⁻¹¹). This indicates the cis-acting genetically regulated (heritable) component of gene expression is more consistent across tissues than total observed expression (genetics + environment) and is useful in uncovering the function of SNPs associated with complex traits.


Subjects
Proteome; Transcriptome; Humans; Transcriptome/genetics; Proteome/genetics; Multifactorial Inheritance; Quantitative Trait Loci/genetics; Genome-Wide Association Study; Polymorphism, Single Nucleotide/genetics
7.
Am J Hum Genet ; 111(1): 133-149, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38181730

ABSTRACT

Bulk-tissue molecular quantitative trait loci (QTLs) have been the starting point for interpreting disease-associated variants, and context-specific QTLs show particular relevance for disease. Here, we present the results of mapping interaction QTLs (iQTLs) for cell type, age, and other phenotypic variables in multi-omic, longitudinal data from the blood of individuals of diverse ancestries. By modeling the interaction between genotype and estimated cell-type proportions, we demonstrate that cell-type iQTLs could be considered as proxies for cell-type-specific QTL effects, particularly for the most abundant cell type in the tissue. The interpretation of age iQTLs, however, warrants caution because the moderation effect of age on the genotype and molecular phenotype association could be mediated by changes in cell-type composition. Finally, we show that cell-type iQTLs contribute to cell-type-specific enrichment of diseases that, in combination with additional functional data, could guide future functional studies. Overall, this study highlights the use of iQTLs to gain insights into the context specificity of regulatory effects.
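
The genotype-by-cell-type interaction model described in this abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes a plain ordinary-least-squares fit of expression on genotype, an estimated cell-type fraction, and their product, and all names, data, and effect sizes below are hypothetical. The coefficient on the product term plays the role of the interaction (iQTL) effect.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_interaction_model(genotype, cell_fraction, expression):
    """OLS fit of expression ~ 1 + g + c + g*c.

    Returns [intercept, genotype effect, cell-fraction effect, interaction effect].
    """
    rows = [[1.0, g, c, g * c] for g, c in zip(genotype, cell_fraction)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Simulated data with a hypothetical interaction effect of 0.8.
rng = random.Random(0)
genotype = [rng.choice([0, 1, 2]) for _ in range(1000)]
cell_fraction = [rng.uniform(0.2, 0.8) for _ in range(1000)]
expression = [
    0.5 + 0.1 * g + 0.3 * c + 0.8 * g * c + rng.gauss(0, 0.05)
    for g, c in zip(genotype, cell_fraction)
]
beta = fit_interaction_model(genotype, cell_fraction, expression)
print("estimated interaction effect:", round(beta[3], 3))
```

A real analysis would also include covariates and a significance test for the interaction term; the sketch only shows where the interaction coefficient comes from.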


Subjects
Gene Expression Regulation; Quantitative Trait Loci; Humans; Quantitative Trait Loci/genetics; Genotype; Phenotype
8.
Cell ; 150(2): 251-63, 2012 Jul 20.
Article in English | MEDLINE | ID: mdl-22817889

ABSTRACT

Despite recent insights into melanoma genetics, systematic surveys for driver mutations are challenged by an abundance of passenger mutations caused by carcinogenic UV light exposure. We developed a permutation-based framework to address this challenge, employing mutation data from intronic sequences to control for passenger mutational load on a per gene basis. Analysis of large-scale melanoma exome data by this approach discovered six novel melanoma genes (PPP6C, RAC1, SNX31, TACC1, STK19, and ARID2), three of which (RAC1, PPP6C, and STK19) harbored recurrent and potentially targetable mutations. Integration with chromosomal copy number data contextualized the landscape of driver mutations, providing oncogenic insights in BRAF- and NRAS-driven melanoma as well as those without known NRAS/BRAF mutations. The landscape also clarified a mutational basis for RB and p53 pathway deregulation in this malignancy. Finally, the spectrum of driver mutations provided unequivocal genomic evidence for a direct mutagenic role of UV light in melanoma pathogenesis.


Subjects
Genome-Wide Association Study; Melanoma/genetics; Mutagenesis; Ultraviolet Rays; Amino Acid Sequence; Cells, Cultured; Exome; Humans; Melanocytes/metabolism; Models, Molecular; Molecular Sequence Data; Proto-Oncogene Proteins B-raf/genetics; Sequence Alignment; rac1 GTP-Binding Protein/genetics
9.
Nature ; 590(7845): 290-299, 2021 02.
Article in English | MEDLINE | ID: mdl-33568819

ABSTRACT

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


Subjects
Genetic Variation/genetics; Genome, Human/genetics; Genomics; National Heart, Lung, and Blood Institute (U.S.); Precision Medicine; Cytochrome P-450 CYP2D6/genetics; Haplotypes/genetics; Heterozygote; Humans; INDEL Mutation; Loss of Function Mutation; Mutagenesis; Phenotype; Polymorphism, Single Nucleotide; Population Density; Precision Medicine/standards; Quality Control; Sample Size; United States; Whole Genome Sequencing/standards
10.
Nature ; 586(7831): 763-768, 2020 10.
Article in English | MEDLINE | ID: mdl-33057201

ABSTRACT

Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown [1]. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer [2-4] and coronary heart disease [5]; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP) [6]. Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.


Subjects
Clonal Hematopoiesis/genetics; Genetic Predisposition to Disease; Genome, Human/genetics; Whole Genome Sequencing; Adult; Africa/ethnology; Aged; Aged, 80 and over; Black People/genetics; Cell Self Renewal/genetics; DNA-Binding Proteins/genetics; Dioxygenases; Female; Germ-Line Mutation/genetics; Hematopoietic Stem Cells/cytology; Hematopoietic Stem Cells/metabolism; Humans; Intracellular Signaling Peptides and Proteins/genetics; Male; Middle Aged; National Heart, Lung, and Blood Institute (U.S.); Phenotype; Precision Medicine; Proto-Oncogene Proteins/genetics; Tripartite Motif Proteins/genetics; United States; alpha Karyopherins/genetics
11.
PLoS Genet ; 19(5): e1010517, 2023 05.
Article in English | MEDLINE | ID: mdl-37216410

ABSTRACT

Integrative approaches that simultaneously model multi-omics data have gained increasing popularity because they provide holistic system biology views of multiple or all components in a biological system of interest. Canonical correlation analysis (CCA) is a correlation-based integrative method designed to extract latent features shared between multiple assays by finding the linear combinations of features (referred to as canonical variables, CVs) within each assay that achieve maximal across-assay correlation. Although widely acknowledged as a powerful approach for multi-omics data, CCA has not been systematically applied to multi-omics data in large cohort studies, which has only recently become available. Here, we adapted sparse multiple CCA (SMCCA), a widely-used derivative of CCA, to proteomics and methylomics data from the Multi-Ethnic Study of Atherosclerosis (MESA) and Jackson Heart Study (JHS). To tackle challenges encountered when applying SMCCA to MESA and JHS, our adaptations include the incorporation of the Gram-Schmidt (GS) algorithm with SMCCA to improve orthogonality among CVs, and the development of Sparse Supervised Multiple CCA (SSMCCA) to allow supervised integration analysis for more than two assays. Effective application of SMCCA to the two real datasets reveals important findings. Applying our SMCCA-GS to MESA and JHS, we identified strong associations between blood cell counts and protein abundance, suggesting that adjustment of blood cell composition should be considered in protein-based association studies. Importantly, CVs obtained from two independent cohorts also demonstrate transferability across the cohorts. For example, proteomic CVs learned from JHS, when transferred to MESA, explain similar amounts of blood cell count phenotypic variance in MESA, explaining 39.0% to 50.0% of the variation in JHS and 38.9% to 49.1% in MESA. Similar transferability was observed for other omics-CV-trait pairs. This suggests that biologically meaningful and cohort-agnostic variation is captured by CVs. We anticipate that applying our SMCCA-GS and SSMCCA on various cohorts would help identify cohort-agnostic biologically meaningful relationships between multi-omics data and phenotypic traits.
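
As a rough illustration of the Gram-Schmidt step named in this abstract, the sketch below orthonormalizes a list of vectors using the modified Gram-Schmidt procedure. It is a generic textbook version and makes no claim about how SMCCA-GS applies it internally; the function names and the example vectors are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Modified Gram-Schmidt: orthonormal basis for the span of `vectors`.

    Vectors that are (numerically) linear combinations of earlier ones are dropped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Remove the component of w along the already accepted direction q.
            proj = dot(w, q)
            w = [wi - proj * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


# Three correlated (non-orthogonal) hypothetical "canonical variable" vectors.
cvs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
orthonormal_cvs = gram_schmidt(cvs)
for q in orthonormal_cvs:
    print([round(x, 4) for x in q])
```

After this step every pair of returned vectors has an inner product of approximately zero, which is the orthogonality property the abstract refers to.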


Subjects
Canonical Correlation Analysis; Proteomics; Humans; Proteomics/methods; Multiomics; Cohort Studies
12.
Am J Hum Genet ; 109(7): 1286-1297, 2022 07 07.
Article in English | MEDLINE | ID: mdl-35716666

ABSTRACT

Despite the growing number of genome-wide association studies (GWASs), it remains unclear to what extent gene-by-gene and gene-by-environment interactions influence complex traits in humans. The magnitude of genetic interactions in complex traits has been difficult to quantify because GWASs are generally underpowered to detect individual interactions of small effect. Here, we develop a method to test for genetic interactions that aggregates information across all trait-associated loci. Specifically, we test whether SNPs in regions of European ancestry shared between European American and admixed African American individuals have the same causal effect sizes. We hypothesize that in African Americans, the presence of genetic interactions will drive the causal effect sizes of SNPs in regions of European ancestry to be more similar to those of SNPs in regions of African ancestry. We apply our method to two traits: gene expression in 296 African Americans and 482 European Americans in the Multi-Ethnic Study of Atherosclerosis (MESA) and low-density lipoprotein cholesterol (LDL-C) in 74K African Americans and 296K European Americans in the Million Veteran Program (MVP). We find significant evidence for genetic interactions in our analysis of gene expression; for LDL-C, we observe a similar point estimate, although this is not significant, most likely due to lower statistical power. These results suggest that gene-by-gene or gene-by-environment interactions modify the effect sizes of causal variants in human complex traits.


Subjects
Genome-Wide Association Study; Multifactorial Inheritance; Cholesterol, LDL; Gene Expression; Humans; Multifactorial Inheritance/genetics; Polymorphism, Single Nucleotide/genetics; White People/genetics
13.
PLoS Genet ; 18(1): e1009719, 2022 01.
Article in English | MEDLINE | ID: mdl-35100260

ABSTRACT

Tens of thousands of genetic variants associated with gene expression (cis-eQTLs) have been discovered in the human population. These eQTLs are active in various tissues and contexts, but the molecular mechanisms of eQTL variability are poorly understood, hindering our understanding of genetic regulation across biological contexts. Since many eQTLs are believed to act by altering transcription factor (TF) binding affinity, we hypothesized that analyzing eQTL effect size as a function of TF level may allow discovery of mechanisms of eQTL variability. Using GTEx Consortium eQTL data from 49 tissues, we analyzed the interaction between eQTL effect size and TF level across tissues and across individuals within specific tissues and generated a list of 10,098 TF-eQTL interactions across 2,136 genes that are supported by at least two lines of evidence. These TF-eQTLs were enriched for various TF binding measures, supporting with orthogonal evidence that these eQTLs are regulated by the implicated TFs. We also found that our TF-eQTLs tend to overlap genes with gene-by-environment regulatory effects and to colocalize with GWAS loci, implying that our approach can help to elucidate mechanisms of context-specificity and trait associations. Finally, we highlight an interesting example of IKZF1 TF regulation of an APBB1IP gene eQTL that colocalizes with a GWAS signal for blood cell traits. Together, our findings provide candidate TF mechanisms for a large number of eQTLs and offer a generalizable approach for researchers to discover TF regulators of genetic variant effects in additional QTL datasets.


Subjects
Quantitative Trait Loci; Transcription Factors/physiology; Alleles; Binding Sites; Gene Knockdown Techniques; Gene-Environment Interaction; Genome-Wide Association Study; Humans; Interferon Regulatory Factor-1/genetics; Models, Genetic; Phenotype; Transcription Factors/metabolism
15.
Int J Obes (Lond) ; 47(2): 109-116, 2023 02.
Article in English | MEDLINE | ID: mdl-36463326

ABSTRACT

BACKGROUND/OBJECTIVES: Obesity, defined as excessive fat accumulation that represents a health risk, is increasing in adults and children, reaching global epidemic proportions. Body mass index (BMI) correlates with body fat and future health risk, yet differs in prediction by fat distribution, across populations and by age. Nonetheless, few genetic studies of BMI have been conducted in ancestrally diverse populations. Gene expression association with BMI was assessed in the Multi-Ethnic Study of Atherosclerosis (MESA) in four self-identified race and ethnicity (SIRE) groups to identify genes associated with obesity. SUBJECTS/METHODS: RNA-sequencing was performed on 1096 MESA participants (37.8% white, 24.3% Hispanic, 28.4% African American, and 9.5% Chinese American) and linear models were used to assess the association of expression from each gene for its effect on BMI, adjusting for age, sex, sequencing center, study site, five expression and four genetic principal components in each self-identified race group. Sample-size-weighted meta-analysis was performed to identify genes with BMI-associated expression across ancestry groups. RESULTS: Within individual SIRE groups, there were zero to three genes whose expression is significantly (p < 1.97 × 10⁻⁶) associated with BMI. Across all groups, 45 genes were identified by meta-analysis whose expression was significantly associated with BMI, explaining 29.7% of BMI variation. The 45 genes are expressed in a variety of tissues and cell types and are enriched for obesity-related processes including erythrocyte function, oxygen binding and transport, and JAK-STAT signaling. CONCLUSIONS: We have identified genes whose expression is significantly associated with obesity in a multi-ethnic cohort. We have identified novel genes associated with BMI as well as confirmed previously identified genes from earlier genetic analyses. These novel genes and their biological pathways represent new targets for understanding the biology of obesity as well as new therapeutic intervention to reduce obesity and improve global public health.
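
The sample-size-weighted meta-analysis mentioned in this abstract is commonly done by combining per-group z-scores with weights proportional to the square root of each group's sample size (Stouffer's weighted method). The sketch below assumes that scheme; the abstract does not state the exact formula used, the z-scores are invented for illustration, and the group sizes are only approximations derived from the reported percentages of 1096 participants.

```python
import math


def sample_size_weighted_meta(z_scores, sample_sizes):
    """Combine per-group z-scores using weights w_i = sqrt(n_i).

    Returns the combined z-score and its two-sided normal p-value.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_meta = math.erfc(abs(z_meta) / math.sqrt(2))
    return z_meta, p_meta


# Hypothetical z-scores for one gene in four groups; sizes are approximate.
group_sizes = [414, 266, 311, 104]
group_z = [2.1, 1.8, 2.4, 0.9]
z_meta, p_meta = sample_size_weighted_meta(group_z, group_sizes)
print(f"combined z = {z_meta:.3f}, two-sided p = {p_meta:.2e}")
```

Because larger groups get larger weights, a modest but consistent signal across groups can reach significance in the combined test even when no single group does, which matches the pattern reported in the results.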


Subjects
Body Mass Index; Gene Expression; Obesity; Adult; Child; Humans; Atherosclerosis; Obesity/epidemiology; Obesity/genetics
16.
Brief Bioinform ; 22(6), 2021 11 05.
Article in English | MEDLINE | ID: mdl-34015820

ABSTRACT

Large datasets of hundreds to thousands of individuals measuring RNA-seq in observational studies are becoming available. Many popular software packages for analysis of RNA-seq data were constructed to study differences in expression signatures in an experimental design with well-defined conditions (exposures). In contrast, observational studies may have varying levels of confounding transcript-exposure associations; further, exposure measures may vary from discrete (exposed, yes/no) to continuous (levels of exposure), with non-normal distributions of exposure. We compare popular software for gene expression (DESeq2, edgeR and limma) as well as linear regression-based analyses for studying the association of continuous exposures with RNA-seq. We developed a computation pipeline that includes transformation, filtering and generation of empirical null distribution of association P-values, and we apply the pipeline to compute empirical P-values with multiple testing correction. We employ a resampling approach that allows for assessment of false positive detection across methods, power comparison and the computation of quantile empirical P-values. The results suggest that linear regression methods are substantially faster with better control of false detections than other methods, even with the resampling method to compute empirical P-values. We provide the proposed pipeline with fast algorithms in an R package Olivia, and implemented it to study the associations of measures of sleep disordered breathing with RNA-seq in peripheral blood mononuclear cells in participants from the Multi-Ethnic Study of Atherosclerosis.
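
The idea of an empirical P-value computed from a resampled null distribution can be sketched as follows. This is a generic permutation test using a Pearson correlation as the association statistic, written in Python for illustration; it is an assumption-laden stand-in and does not reproduce the algorithms of the Olivia R package, and all function names and data are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def empirical_p_value(observed, null_stats):
    """Share of null statistics at least as extreme as the observed one.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    exceed = sum(1 for s in null_stats if abs(s) >= abs(observed))
    return (1 + exceed) / (1 + len(null_stats))


def permutation_test(exposure, expression, n_perm=999, seed=0):
    """Build a null distribution by shuffling the exposure values."""
    rng = random.Random(seed)
    observed = pearson(exposure, expression)
    shuffled = list(exposure)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_stats.append(pearson(shuffled, expression))
    return observed, empirical_p_value(observed, null_stats)


# Hypothetical continuous exposure and a transcript that tracks it linearly.
exposure = [float(i) for i in range(30)]
expression = [2.0 * e + 1.0 for e in exposure]
obs_r, emp_p = permutation_test(exposure, expression)
print("observed r:", round(obs_r, 3), "empirical p:", emp_p)
```

With 999 permutations the smallest attainable empirical P-value is 1/1000, so the number of resamples bounds the resolution of the test.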


Subjects
Benchmarking/methods; RNA-Seq; Sequence Analysis, RNA; Software; Algorithms; Atherosclerosis/epidemiology; Atherosclerosis/etiology; Atherosclerosis/metabolism; Computer Simulation; Disease Susceptibility; Genetic Predisposition to Disease; High-Throughput Nucleotide Sequencing; Humans; Mutation; Phenotype; Risk Assessment; Risk Factors; Web Browser
17.
Respir Res ; 24(1): 30, 2023 Jan 25.
Article in English | MEDLINE | ID: mdl-36698131

ABSTRACT

BACKGROUND: Chronic obstructive pulmonary disease (COPD) varies significantly in symptomatic and physiologic presentation. Identifying disease subtypes from molecular data, collected from easily accessible blood samples, can help stratify patients and guide disease management and treatment. METHODS: Blood gene expression measured by RNA-sequencing in the COPDGene Study was analyzed using a network perturbation analysis method. Each COPD sample was compared against a learned reference gene network to determine the part that is deregulated. Gene deregulation values were used to cluster the disease samples. RESULTS: The discovery set included 617 former smokers from COPDGene. Four distinct gene network subtypes are identified with significant differences in symptoms, exercise capacity and mortality. These clusters do not necessarily correspond with the levels of lung function impairment and are independently validated in two external cohorts: 769 former smokers from COPDGene and 431 former smokers in the Multi-Ethnic Study of Atherosclerosis (MESA). Additionally, we identify several genes that are significantly deregulated across these subtypes, including DSP and GSTM1, which have been previously associated with COPD through genome-wide association study (GWAS). CONCLUSIONS: The identified subtypes differ in mortality and in their clinical and functional characteristics, underlining the need for multi-dimensional assessment potentially supplemented by selected markers of gene expression. The subtypes were consistent across cohorts and could be used for new patient stratification and disease prognosis.


Subjects
Gene Regulatory Networks; Pulmonary Disease, Chronic Obstructive; Humans; Gene Regulatory Networks/genetics; Smokers; Genome-Wide Association Study/methods; Pulmonary Disease, Chronic Obstructive/diagnosis; Pulmonary Disease, Chronic Obstructive/genetics; Prognosis
18.
Nature ; 550(7675): 244-248, 2017 10 11.
Article in English | MEDLINE | ID: mdl-29022598

ABSTRACT

X chromosome inactivation (XCI) silences transcription from one of the two X chromosomes in female mammalian cells to balance expression dosage between XX females and XY males. XCI is, however, incomplete in humans: up to one-third of X-chromosomal genes are expressed from both the active and inactive X chromosomes (Xa and Xi, respectively) in female cells, with the degree of 'escape' from inactivation varying between genes and individuals. The extent to which XCI is shared between cells and tissues remains poorly characterized, as does the degree to which incomplete XCI manifests as detectable sex differences in gene expression and phenotypic traits. Here we describe a systematic survey of XCI, integrating over 5,500 transcriptomes from 449 individuals spanning 29 tissues from GTEx (v6p release) and 940 single-cell transcriptomes, combined with genomic sequence data. We show that XCI at 683 X-chromosomal genes is generally uniform across human tissues, but identify examples of heterogeneity between tissues, individuals and cells. We show that incomplete XCI affects at least 23% of X-chromosomal genes, identify seven genes that escape XCI with support from multiple lines of evidence and demonstrate that escape from XCI results in sex biases in gene expression, establishing incomplete XCI as a mechanism that is likely to introduce phenotypic diversity. Overall, this updated catalogue of XCI across human tissues helps to increase our understanding of the extent and impact of the incompleteness in the maintenance of XCI.


Subjects
Organ Specificity/genetics; Single-Cell Analysis; X Chromosome Inactivation/genetics; Chromosomes, Human, X/genetics; Female; Genes, X-Linked/genetics; Genome, Human/genetics; Genomics; Humans; Male; Phenotype; Sequence Analysis, RNA; Transcriptome/genetics
19.
Bioinformatics ; 37(18): 3048-3050, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33677499

ABSTRACT

SUMMARY: Post-sequencing quality control is a crucial component of RNA sequencing (RNA-seq) data generation and analysis, as sample quality can be affected by sample storage, extraction and sequencing protocols. RNA-seq is increasingly applied to cohorts ranging from hundreds to tens of thousands of samples in size, but existing tools do not readily scale to these sizes, and were not designed for a wide range of sample types and qualities. Here, we describe RNA-SeQC 2, an efficient reimplementation of RNA-SeQC (DeLuca et al., 2012) that adds multiple metrics designed to characterize sample quality across a wide range of RNA-seq protocols. AVAILABILITY AND IMPLEMENTATION: The command-line tool, documentation and C++ source code are available at the GitHub repository https://github.com/getzlab/rnaseqc. Code and data for reproducing the figures in this paper are available at https://github.com/getzlab/rnaseqc2-paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA; Software; Humans; RNA-Seq; Sequence Analysis, RNA/methods; Quality Control