Results 1 - 20 of 41
1.
Nat Genet ; 53(1): 120-126, 2021 01.
Article in English | MEDLINE | ID: mdl-33414550

ABSTRACT

Low-coverage whole-genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined because current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performance across different coverages and human populations. GLIMPSE achieves imputation of a genome for less than US$1 in computational cost, considerably outperforming other methods and improving imputation accuracy over the full allele frequency range. As a proof of concept, we show that 1× coverage enables effective gene expression association studies and outperforms dense SNP arrays in rare variant burden tests. Overall, this study illustrates the promising potential of low-coverage imputation and suggests a paradigm shift in the design of future genomic studies.


Subjects
Sequence Analysis, DNA; Genome, Human; Genotype; Humans; Likelihood Functions; Polymorphism, Single Nucleotide/genetics; Reference Standards
2.
PLoS Genet ; 16(11): e1009049, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33196638

ABSTRACT

Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. In the last 10 years reference panels have increased in size by more than 100-fold. Increasing reference panel size improves accuracy of markers with low minor allele frequencies but poses ever-increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. This method continues to refine the observation made in the IMPUTE2 method, that accuracy is optimized via use of a custom subset of haplotypes when imputing each individual. It achieves fast, accurate, and memory-efficient imputation by selecting haplotypes using the Positional Burrows-Wheeler Transform (PBWT). By using the PBWT data structure at genotyped markers, IMPUTE5 identifies locally best matching haplotypes and long identical-by-state segments. The method then uses the selected haplotypes as conditioning states within the IMPUTE model. Using the HRC reference panel, which has ∼65,000 haplotypes, we show that IMPUTE5 is up to 30x faster than MINIMAC4 and up to 3x faster than BEAGLE5.1, and uses less memory than both these methods. Using simulated reference panels we show that IMPUTE5 scales sub-linearly with reference panel size. For example, keeping the number of imputed markers constant, increasing the reference panel size from 10,000 to 1 million haplotypes requires less than twice the computation time. As the reference panel increases in size IMPUTE5 is able to utilize a smaller number of reference haplotypes, thus reducing computational cost.
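
The PBWT-based selection described in this abstract can be illustrated with a minimal sketch. It is not IMPUTE5's implementation: it only builds the positional prefix orderings of the general PBWT construction for binary haplotypes and reads off the haplotypes adjacent to a target in the ordering at a given site. The function names, the toy panel, and the simplification of treating the target as a member of the panel are all assumptions made for illustration.

```python
def pbwt_prefix_orders(haplotypes):
    """Return one ordering of haplotype indices per site.

    After site k, indices are sorted by their reversed prefix ending at k,
    so haplotypes that match over a long stretch ending at k sit next to
    each other in the ordering.
    """
    order = list(range(len(haplotypes)))
    orders = []
    for k in range(len(haplotypes[0])):
        # Stable partition: haplotypes carrying 0 at site k keep their
        # relative order and come first, followed by those carrying 1.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


def neighbours(order, target, width):
    """Hypothetical helper: up to `width` haplotypes on each side of `target`."""
    pos = order.index(target)
    lo = max(0, pos - width)
    return [h for h in order[lo:pos + width + 1] if h != target]


# Toy panel (illustrative data only): haplotype 3 is identical to haplotype 0.
panel = [
    [0, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
]
orders = pbwt_prefix_orders(panel)
candidates = neighbours(orders[-1], target=3, width=1)
```

In this toy panel the candidates for haplotype 3 at the last site are haplotype 0 (identical throughout) and haplotype 1 (identical over the last three sites), which is the kind of locally best-matching subset the abstract refers to.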

3.
medRxiv ; 2020 Jul 29.
Article in English | MEDLINE | ID: mdl-32766602

ABSTRACT

During COVID-19 and other viral pandemics, rapid generation of host and pathogen genomic data is critical to tracking infection and informing therapies. There is an urgent need for efficient approaches to this data generation at scale. We have developed a scalable, high-throughput approach to generate high-fidelity, low-pass whole-genome and HLA sequencing, viral genomes, and a representation of the human transcriptome from single nasopharyngeal swabs of COVID-19 patients.

4.
Nat Commun ; 10(1): 5436, 2019 11 28.
Article in English | MEDLINE | ID: mdl-31780650

ABSTRACT

The number of human genomes being genotyped or sequenced increases exponentially and efficient haplotype estimation methods able to handle this amount of data are now required. Here we present a method, SHAPEIT4, which substantially improves upon other methods to process large genotype and high coverage sequencing datasets. It notably exhibits sub-linear running times with sample size, provides highly accurate haplotypes and allows integrating external phasing information such as large reference panels of haplotypes, collections of pre-phased variants and long sequencing reads. We provide SHAPEIT4 in an open source format and demonstrate its performance in terms of accuracy and running times on two gold standard datasets: the UK Biobank data and the Genome In A Bottle.


Subjects
Data Interpretation, Statistical; Haplotypes; Software; Biological Specimen Banks; Datasets as Topic; Genotype; High-Throughput Nucleotide Sequencing; Humans; Polymorphism, Single Nucleotide; Sample Size; Sequence Analysis, DNA
5.
PLoS Genet ; 15(4): e1008091, 2019 04.
Article in English | MEDLINE | ID: mdl-31009447

ABSTRACT

The HLA (Human Leukocyte Antigens) genes are well-documented targets of balancing selection, and variation at these loci is associated with many disease phenotypes. Variation in expression levels also influences disease susceptibility and resistance, but little information exists about the regulation and population-level patterns of expression. This results from the difficulty in mapping short reads originating from these highly polymorphic loci, and in accounting for the existence of several paralogues. We developed a computational pipeline to accurately estimate expression for HLA genes based on RNA-seq, improving both locus-level and allele-level estimates. First, reads are aligned to all known HLA sequences in order to infer HLA genotypes, then quantification of expression is carried out using a personalized index. We use simulations to show that expression estimates obtained in this way are not biased due to divergence from the reference genome. We applied our pipeline to the GEUVADIS dataset, and compared the quantifications to those obtained with the reference transcriptome. Although the personalized pipeline recovers more reads, we found that using the reference transcriptome produces estimates similar to the personalized pipeline (r ≥ 0.87) with the exception of HLA-DQA1. We describe the impact of the HLA-personalized approach on downstream analyses for nine classical HLA loci (HLA-A, HLA-C, HLA-B, HLA-DRA, HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DPA1, HLA-DPB1). Although the influence of the HLA-personalized approach is modest for eQTL mapping, the p-values and the causality of the eQTLs obtained are better than when the reference transcriptome is used. We investigate how the eQTLs we identified explain variation in expression among lineages of HLA alleles. Finally, we discuss possible causes underlying differences between expression estimates obtained using RNA-seq, antibody-based approaches and qPCR.


Subjects
Chromosome Mapping; Gene Expression; HLA Antigens/genetics; Quantitative Trait Loci; Alleles; Computational Biology/methods; Gene Frequency; Genotype; Haplotypes; Humans; Transcriptome
6.
Nat Neurosci ; 22(3): 353-361, 2019 03.
Article in English | MEDLINE | ID: mdl-30692689

ABSTRACT

There is mounting evidence that seemingly diverse psychiatric disorders share genetic etiology, but the biological substrates mediating this overlap are not well characterized. Here we leverage the unique Integrative Psychiatric Research Consortium (iPSYCH) study, a nationally representative cohort ascertained through clinical psychiatric diagnoses indicated in Danish national health registers. We confirm previous reports of individual and cross-disorder single-nucleotide polymorphism heritability for major psychiatric disorders and perform a cross-disorder genome-wide association study. We identify four novel genome-wide significant loci encompassing variants predicted to regulate genes expressed in radial glia and interneurons in the developing neocortex during mid-gestation. This epoch is supported by partitioning cross-disorder single-nucleotide polymorphism heritability, which is enriched at regulatory chromatin active during fetal neurodevelopment. These findings suggest that dysregulation of genes that direct neurodevelopment by common genetic variants may result in general liability for many later psychiatric outcomes.


Subjects
Brain/embryology; Gene Expression Regulation; Genetic Predisposition to Disease; Mental Disorders/genetics; Brain/metabolism; Cohort Studies; Female; Genetic Loci; Genome-Wide Association Study; Humans; Male; Polymorphism, Single Nucleotide; Risk Factors
7.
Nature ; 562(7726): 203-209, 2018 10.
Article in English | MEDLINE | ID: mdl-30305743

ABSTRACT

The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.


Subjects
Databases, Factual; Genomics; Phenotype; Adult; Aged; Alleles; Biomarkers/blood; Biomarkers/urine; Body Height/genetics; Brain/diagnostic imaging; Cohort Studies; Continental Population Groups/genetics; Databases, Genetic; Electronic Health Records; Family; Female; Genome-Wide Association Study; Haplotypes/genetics; Humans; Life Style; Major Histocompatibility Complex/genetics; Male; Middle Aged; Quality Control; United Kingdom
8.
Nat Commun ; 8(1): 1358, 2017 11 07.
Article in English | MEDLINE | ID: mdl-29116076

ABSTRACT

The identification of genetic variants affecting gene expression, namely expression quantitative trait loci (eQTLs), has contributed to the understanding of mechanisms underlying human traits and diseases. The majority of these variants map in non-coding regulatory regions of the genome and their identification remains challenging. Here, we use natural genetic variation and CAGE transcriptomes from 154 EBV-transformed lymphoblastoid cell lines, derived from unrelated individuals, to map 5376 and 110 regulatory variants associated with promoter usage (puQTLs) and enhancer activity (eaQTLs), respectively. We characterize five categories of genes associated with puQTLs, distinguishing single from multi-promoter genes. Among multi-promoter genes, we find puQTL effects either specific to a single promoter or to multiple promoters with variable effect orientations. Regulatory variants associated with opposite effects on different mRNA isoforms suggest compensatory mechanisms occurring between alternative promoters. Our analyses identify differential promoter usage and modulation of enhancer activity as molecular mechanisms underlying eQTLs related to regulatory elements.


Subjects
Enhancer Elements, Genetic/genetics; Genetic Variation; Promoter Regions, Genetic; Cell Line, Transformed; Gene Expression Regulation; Genome-Wide Association Study; Herpesvirus 4, Human/pathogenicity; Humans; Quantitative Trait Loci; Transcriptome
9.
Nat Genet ; 49(12): 1747-1751, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29058714

ABSTRACT

Genetic association mapping produces statistical links between phenotypes and genomic regions, but identifying causal variants remains difficult. Whole-genome sequencing (WGS) can help by providing complete knowledge of all genetic variants, but it is financially prohibitive for well-powered GWAS studies. We performed mapping of expression quantitative trait loci (eQTLs) with WGS and RNA-seq, and found that lead eQTL variants called with WGS were more likely to be causal. Through simulations, we derived properties of causal variants and used them to develop a method for identifying likely causal SNPs. We estimated that 25-70% of causal variants were located in open-chromatin regions, depending on the tissue and experiment. Finally, we identified a set of high-confidence causal variants and showed that these were more enriched in GWAS associations than other eQTLs. Of those, we found 65 associations with GWAS traits and provide examples in which genes implicated by expression are functionally validated as being relevant for complex traits.


Subjects
Gene Expression Profiling/methods; Genetic Variation; Genome, Human/genetics; Genome-Wide Association Study/methods; High-Throughput Nucleotide Sequencing/methods; Chromosome Mapping; Genetic Predisposition to Disease/genetics; Genotype; Humans; Polymorphism, Single Nucleotide; Quantitative Trait Loci/genetics; Reproducibility of Results
10.
Nat Genet ; 49(12): 1676-1683, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29058715

ABSTRACT

How to interpret the biological causes underlying the predisposing markers identified through genome-wide association studies (GWAS) remains an open question. One direct and powerful way to assess the genetic causality behind GWAS is through analysis of expression quantitative trait loci (eQTLs). Here we describe a new approach to estimate the tissues behind the genetic causality of a variety of GWAS traits, using the cis-eQTLs in 44 tissues from the Genotype-Tissue Expression (GTEx) Consortium. We have adapted the regulatory trait concordance (RTC) score to measure the probability of eQTLs being active in multiple tissues and to calculate the probability that a GWAS-associated variant and an eQTL tag the same functional effect. By normalizing the GWAS-eQTL probabilities by the tissue-sharing estimates for eQTLs, we generate relative tissue-causality profiles for GWAS traits. Our approach not only implicates the gene likely mediating individual GWAS signals, but also highlights tissues where the genetic causality for an individual trait is likely manifested.


Subjects
Gene Expression Profiling; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study; Quantitative Trait Loci/genetics; Genetic Association Studies; Genotype; Humans; Organ Specificity/genetics; Phenotype; Polymorphism, Single Nucleotide
11.
Nat Commun ; 8: 15452, 2017 05 18.
Article in English | MEDLINE | ID: mdl-28516912

ABSTRACT

Population scale studies combining genetic information with molecular phenotypes (for example, gene expression) have become a standard to dissect the effects of genetic variants onto organismal phenotypes. These kinds of data sets require powerful, fast and versatile methods able to discover molecular Quantitative Trait Loci (molQTL). Here we propose such a solution, QTLtools, a modular framework that contains multiple new and well-established methods to prepare the data, to discover proximal and distal molQTLs and, finally, to integrate them with GWAS variants and functional annotations of the genome. We demonstrate its utility by performing a complete expression QTL study in a few easy-to-perform steps. QTLtools is open source and available at https://qtltools.github.io/qtltools/.


Subjects
Algorithms; Chromosome Mapping/methods; Genome, Human; High-Throughput Nucleotide Sequencing/statistics & numerical data; Quantitative Trait Loci; Genome-Wide Association Study; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide; Quality Control
12.
Bioinformatics ; 33(12): 1895-1897, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28186259

ABSTRACT

Motivation: Large genomic datasets combining genotype and sequence data, such as for expression quantitative trait loci (eQTL) detection, require perfect matching between both data types. Results: We describe here MBV (Match BAM to VCF), a method to quickly solve sample mislabeling and detect cross-sample contamination and PCR amplification bias. Availability and Implementation: MBV is implemented in C++ as an independent component of the QTLtools software package; the binary and source codes are freely available at https://qtltools.github.io/qtltools/. Contact: olivier.delaneau@unige.ch or emmanouil.dermitzakis@unige.ch. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Genotyping Techniques/methods; Quantitative Trait Loci; Sequence Analysis, DNA/methods; Software; Bias; Genomics/methods; Genomics/standards; Genotyping Techniques/standards; Humans; Sequence Analysis, DNA/standards
13.
Cell ; 167(5): 1415-1429.e19, 2016 11 17.
Article in English | MEDLINE | ID: mdl-27863252

ABSTRACT

Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.


Subjects
Genetic Variation; Genome-Wide Association Study; Hematopoietic Stem Cells/metabolism; Immune System Diseases/genetics; Alleles; Cell Differentiation; European Continental Ancestry Group/genetics; Genetic Predisposition to Disease; Hematopoietic Stem Cells/pathology; Humans; Immune System Diseases/pathology; Polymorphism, Single Nucleotide; Quantitative Trait Loci
14.
Nat Genet ; 48(10): 1279-83, 2016 10.
Article in English | MEDLINE | ID: mdl-27548312

ABSTRACT

We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.


Subjects
Genotype; Haplotypes; Polymorphism, Single Nucleotide; Alleles; Genetic Techniques; Genome-Wide Association Study; Humans; Internet; Reference Values
15.
Nat Genet ; 48(7): 817-20, 2016 07.
Article in English | MEDLINE | ID: mdl-27270105

ABSTRACT

The UK Biobank (UKB) has recently released genotypes on 152,328 individuals together with extensive phenotypic and lifestyle information. We present a new phasing method, SHAPEIT3, that can handle such biobank-scale data sets and results in switch error rates as low as ∼0.3%. The method exhibits O(N log N) scaling with sample size N, enabling fast and accurate phasing of even larger cohorts.


Subjects
Algorithms; Biological Specimen Banks; Computational Biology/methods; Genetics, Population; Haplotypes/genetics; Cohort Studies; Datasets as Topic; European Continental Ancestry Group; Genome, Human; Genomics; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA/methods; United Kingdom
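
The switch error rate quoted in the SHAPEIT3 abstract above is a standard measure of phasing accuracy. The sketch below shows one common way to compute it for a single individual, assuming the true phase is known (for example from family data): walk along the heterozygous sites and count how often the inferred phase flips relative to the truth between consecutive sites. The function name and the input layout (one pair of alleles per site) are assumptions for illustration and are not taken from the SHAPEIT3 software.

```python
def switch_error_rate(truth, inferred):
    """Fraction of consecutive heterozygous site pairs with a phase switch.

    `truth` and `inferred` are equal-length lists of (allele_a, allele_b)
    pairs, one pair per site, for the same individual.
    """
    phases = []
    for (t0, t1), (i0, i1) in zip(truth, inferred):
        if t0 == t1:
            continue  # homozygous sites carry no phase information
        if sorted((i0, i1)) != sorted((t0, t1)):
            raise ValueError("genotypes disagree at a heterozygous site")
        # True when the inferred haplotype order matches the truth here.
        phases.append(i0 == t0)
    if len(phases) < 2:
        return 0.0  # no pair of heterozygous sites to compare
    switches = sum(prev != cur for prev, cur in zip(phases, phases[1:]))
    return switches / (len(phases) - 1)


# Illustrative data: the inferred phase flips once, between the 2nd and 3rd
# heterozygous sites, and stays flipped afterwards.
truth = [(0, 1), (1, 0), (0, 0), (0, 1), (1, 0)]
inferred = [(0, 1), (1, 0), (0, 0), (1, 0), (0, 1)]
rate = switch_error_rate(truth, inferred)
```

A globally flipped haplotype pair gives a rate of zero under this definition, because the labelling of the two haplotypes is arbitrary and only changes of phase between neighbouring sites are counted.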
16.
Bioinformatics ; 32(13): 1974-80, 2016 07 01.
Article in English | MEDLINE | ID: mdl-27153703

ABSTRACT

MOTIVATION: There is growing recognition that estimating haplotypes from high coverage sequencing of single samples in clinical settings is an important problem. At the same time very large datasets consisting of tens and hundreds of thousands of high-coverage sequenced samples will soon be available. We describe a method that takes advantage of these huge human genetic variation resources and rare variant sharing patterns to estimate haplotypes on single sequenced samples. Sharing rare variants between two individuals is more likely to arise from a recent common ancestor and, hence, also more likely to indicate similar shared haplotypes over a substantial flanking region of sequence. RESULTS: Our method exploits this idea to select a small set of highly informative copying states within a Hidden Markov Model (HMM) phasing algorithm. Using rare variants in this way allows us to avoid iterative MCMC methods to infer haplotypes. Compared to other approaches that do not explicitly use rare variants we obtain significant gains in phasing accuracy, less variation over phasing runs and improvements in speed. For example, using a reference panel of 7420 haplotypes from the UK10K project, we are able to reduce switch error rates by up to 50% when phasing samples sequenced at high-coverage. In addition, a single step rephasing of the UK10K panel, using rare variant information, has a downstream impact on phasing performance. These results represent a proof of concept that rare variant sharing patterns can be utilized to phase large high-coverage sequencing studies such as the 100,000 Genomes Project dataset. AVAILABILITY AND IMPLEMENTATION: A webserver that includes an implementation of this new method and allows phasing of high-coverage clinical samples is available at https://phasingserver.stats.ox.ac.uk/ CONTACT: marchini@stats.ox.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods; Genetic Variation; Haplotypes; Algorithms; Alleles; Genotype; Humans
17.
Bioinformatics ; 32(10): 1479-85, 2016 05 15.
Article in English | MEDLINE | ID: mdl-26708335

ABSTRACT

MOTIVATION: In order to discover quantitative trait loci, multi-dimensional genomic datasets combining DNA-seq and ChIP-/RNA-seq require methods that rapidly correlate tens of thousands of molecular phenotypes with millions of genetic variants while appropriately controlling for multiple testing. RESULTS: We have developed FastQTL, a method that implements a popular cis-QTL mapping strategy in a user- and cluster-friendly tool. FastQTL also proposes an efficient permutation procedure to control for multiple testing. The outcome of permutations is modeled using beta distributions trained from a few permutations and from which adjusted P-values can be estimated at any level of significance with little computational cost. The Geuvadis & GTEx pilot datasets can now be easily analyzed an order of magnitude faster than previous approaches. AVAILABILITY AND IMPLEMENTATION: Source code, binaries and comprehensive documentation of FastQTL are freely available to download at http://fastqtl.sourceforge.net/ CONTACT: emmanouil.dermitzakis@unige.ch or olivier.delaneau@unige.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Quantitative Trait Loci; Genomics; Phenotype; Software; Statistical Distributions
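
The permutation scheme summarised in the FastQTL abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not FastQTL's code: it assumes that the quantity recorded per permutation is the smallest nominal P-value across the tested variants, and it fits the beta distribution by simple moment matching, which is a choice made here for brevity. Both function names are hypothetical.

```python
from statistics import fmean, pvariance


def empirical_pvalue(observed, null_values):
    """Direct permutation P-value: resolution is limited to 1 / (n + 1)."""
    hits = sum(1 for v in null_values if v <= observed)
    return (1 + hits) / (1 + len(null_values))


def fit_beta_by_moments(null_values):
    """Moment-matched (shape1, shape2) of a beta distribution.

    Assumption: `null_values` are the per-permutation minimum nominal
    P-values, each strictly between 0 and 1.
    """
    mean = fmean(null_values)
    var = pvariance(null_values, mean)
    if not 0 < mean < 1 or not 0 < var < mean * (1 - mean):
        raise ValueError("moments are not compatible with a beta distribution")
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


# Illustrative null values only; a real run would use on the order of
# a hundred or more permutations.
shape1, shape2 = fit_beta_by_moments([0.2, 0.4])
```

With the two shape parameters in hand, the adjusted P-value for an observed minimum P-value would be the fitted beta cumulative distribution function evaluated at that value, computed with a statistics library. This is what allows estimates well below the 1 / (n + 1) floor of the direct permutation P-value.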
18.
Lancet Respir Med ; 3(10): 769-81, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26423011

ABSTRACT

BACKGROUND: Understanding the genetic basis of airflow obstruction and smoking behaviour is key to determining the pathophysiology of chronic obstructive pulmonary disease (COPD). We used UK Biobank data to study the genetic causes of smoking behaviour and lung health. METHODS: We sampled individuals of European ancestry from UK Biobank, from the middle and extremes of the forced expiratory volume in 1 s (FEV1) distribution among heavy smokers (mean 35 pack-years) and never smokers. We developed a custom array for UK Biobank to provide optimum genome-wide coverage of common and low-frequency variants, dense coverage of genomic regions already implicated in lung health and disease, and to assay rare coding variants relevant to the UK population. We investigated whether there were shared genetic causes between different phenotypes defined by extremes of FEV1. We also looked for novel variants associated with extremes of FEV1 and smoking behaviour and assessed regions of the genome that had already shown evidence for a role in lung health and disease. We set genome-wide significance at p < 5 × 10⁻⁸. FINDINGS: UK Biobank participants were recruited from March 15, 2006, to July 7, 2010. Sample selection for the UK BiLEVE study started on Nov 22, 2012, and was completed on Dec 20, 2012. We selected 50,008 unique samples: 10,002 individuals with low FEV1, 10,000 with average FEV1, and 5,002 with high FEV1 from each of the heavy smoker and never smoker groups. We noted a substantial sharing of genetic causes of low FEV1 between heavy smokers and never smokers (p = 2.29 × 10⁻¹⁶) and between individuals with and without doctor-diagnosed asthma (p = 6.06 × 10⁻¹¹). We discovered six novel genome-wide significant signals of association with extremes of FEV1, including signals at four novel loci (KANSL1, TSEN54, TET2, and RBM19/TBX5) and independent signals at two previously reported loci (NPNT and HLA-DQB1/HLA-DQA2). These variants also showed association with COPD, including in individuals with no history of smoking. The number of copies of a 150 kb region containing the 5' end of KANSL1, a gene that is important for epigenetic gene regulation, was associated with extremes of FEV1. We also discovered five new genome-wide significant signals for smoking behaviour, including a variant in NCAM1 (chromosome 11) and a variant on chromosome 2 (between TEX41 and PABPC1P2) that has a trans effect on expression of NCAM1 in brain tissue. INTERPRETATION: By sampling from the extremes of the lung function distribution in UK Biobank, we identified novel genetic causes of lung function and smoking behaviour. These results provide new insight into the specific mechanisms underlying airflow obstruction, COPD, and tobacco addiction, and show substantial shared genetic architecture underlying airflow obstruction across individuals, irrespective of smoking behaviour and other airway disease. FUNDING: Medical Research Council.


Subjects
Lung/physiopathology; Pulmonary Disease, Chronic Obstructive/genetics; Smoking/genetics; Adolescent; Adult; Aged; Aged, 80 and over; Biological Specimen Banks; Case-Control Studies; Female; Forced Expiratory Volume/genetics; Genetic Association Studies; Humans; Male; Middle Aged; Polymorphism, Single Nucleotide; Risk Factors; United Kingdom; Young Adult
19.
PLoS One ; 10(9): e0136989, 2015.
Article in English | MEDLINE | ID: mdl-26367535

ABSTRACT

BACKGROUND: Many genome-wide association studies have been performed on progression towards the acquired immune deficiency syndrome (AIDS) and they mainly identified associations within the HLA loci. In this study, we demonstrate that the integration of biological information, namely gene expression data, can enhance the sensitivity of genetic studies to unravel new genetic associations relevant to AIDS. METHODS: We collated the biological information compiled from three databases of expression quantitative trait loci (eQTLs) involved in cells of the immune system. We derived a list of single nucleotide polymorphisms (SNPs) that are functional in that they correlate with differential expression of genes in at least two of the databases. We tested the association of those SNPs with AIDS progression in two cohorts, GRIV and ACS. Tests on permuted phenotypes of the GRIV and ACS cohorts or on randomised sets of equivalent SNPs allowed us to assess the statistical robustness of this method and to estimate the true positive rate. RESULTS: Eight genes were identified with high confidence (p = 0.001, rate of true positives 75%). Some of those genes had previously been linked with HIV infection. Notably, ENTPD4 belongs to the same family as CD39, whose expression has already been associated with AIDS progression; while DNAJB12 is part of the HSP90 pathway, which is involved in the control of HIV latency. Our study also drew our attention to lesser-known functions such as mitochondrial ribosomal proteins and a zinc finger protein, ZFP57, which could be central to the effectiveness of HIV infection. Interestingly, for six out of those eight genes, down-regulation is associated with non-progression, which makes them appealing targets to develop drugs against HIV.


Subjects
Acquired Immunodeficiency Syndrome/genetics; Gene Expression Profiling/methods; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Transcriptome; Cohort Studies; DNA-Binding Proteins/genetics; Databases, Genetic; Disease Progression; Gene Expression Regulation; Genetic Association Studies; Genetic Predisposition to Disease; HSP40 Heat-Shock Proteins/genetics; Humans; Pyrophosphatases/genetics; Random Allocation; Repressor Proteins; Transcription Factors/genetics
20.
Cell ; 162(5): 1039-50, 2015 Aug 27.
Article in English | MEDLINE | ID: mdl-26300124

ABSTRACT

Chromatin state variation at gene regulatory elements is abundant across individuals, yet we understand little about the genetic basis of this variability. Here, we profiled several histone modifications, the transcription factor (TF) PU.1, RNA polymerase II, and gene expression in lymphoblastoid cell lines from 47 whole-genome sequenced individuals. We observed that distinct cis-regulatory elements exhibit coordinated chromatin variation across individuals in the form of variable chromatin modules (VCMs) at sub-Mb scale. VCMs were associated with thousands of genes and preferentially cluster within chromosomal contact domains. We mapped strong proximal and weak, yet more ubiquitous, distal-acting chromatin quantitative trait loci (cQTL) that frequently explain this variation. cQTLs were associated with molecular activity at clusters of cis-regulatory elements and mapped preferentially within TF-bound regions. We propose that local, sequence-independent chromatin variation emerges as a result of genetic perturbations in cooperative interactions between cis-regulatory elements that are located within the same genomic domain.


Subjects
Chromatin/chemistry; Gene Expression Regulation; Genetic Variation; Genome, Human; Chromatin/metabolism; Chromosomes, Human/chemistry; Genetics, Population; Humans; Quantitative Trait Loci; Regulatory Sequences, Nucleic Acid; Transcription Factors/metabolism