Results 1 - 20 of 23
1.
Cell ; 184(8): 2068-2083.e11, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33861964

ABSTRACT

Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.


Subjects
Ethnicity/genetics; Population Health; Databases, Genetic; Electronic Health Records; Genomics; Humans; Self Report
2.
Am J Hum Genet ; 108(2): 219-239, 2021 02 04.
Article in English | MEDLINE | ID: mdl-33440170

ABSTRACT

We present a full-likelihood method to infer polygenic adaptation from DNA sequence variation and GWAS summary statistics to quantify recent transient directional selection acting on a complex trait. Through simulations of polygenic trait architecture evolution and GWASs, we show the method substantially improves power over current methods. We examine the robustness of the method under stratification, uncertainty and bias in marginal effects, uncertainty in the causal SNPs, allelic heterogeneity, negative selection, and low GWAS sample size. The method can quantify selection acting on correlated traits, controlling for pleiotropy even among traits with strong genetic correlation (|rg| = 80%) while retaining high power to attribute selection to the causal trait. When the causal trait is excluded from analysis, selection is attributed to its closest proxy. We discuss limitations of the method, cautioning against strongly causal interpretations of the results, and the possibility of undetectable gene-by-environment (GxE) interactions. We apply the method to 56 human polygenic traits, revealing signals of directional selection on pigmentation, life history, glycated hemoglobin (HbA1c), and other traits. We also conduct joint testing of 137 pairs of genetically correlated traits, revealing widespread correlated response acting on these traits (2.6-fold enrichment, p = 1.5 × 10⁻⁷). Signs of selection on some traits previously reported as adaptive (e.g., educational attainment and hair color) are largely attributable to correlated response (p = 2.9 × 10⁻⁶ and 1.7 × 10⁻⁴, respectively). Lastly, our joint test shows antagonistic selection has increased type 2 diabetes risk and decreased HbA1c (p = 1.5 × 10⁻⁵).


Subjects
Genome, Human; Multifactorial Inheritance; Selection, Genetic; Computer Simulation; Diabetes Mellitus, Type 2/genetics; Evolution, Molecular; Gene-Environment Interaction; Genetic Heterogeneity; Genetic Pleiotropy; Genome-Wide Association Study; Glycated Hemoglobin/genetics; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide; Sample Size
3.
Am J Hum Genet ; 100(1): 31-39, 2017 Jan 05.
Article in English | MEDLINE | ID: mdl-28017371

ABSTRACT

Mixed models have become the tool of choice for genetic association studies; however, standard mixed model methods may be poorly calibrated or underpowered under family sampling bias and/or case-control ascertainment. Previously, we introduced a liability threshold-based mixed model association statistic (LTMLM) to address case-control ascertainment in unrelated samples. Here, we consider family-biased case-control ascertainment, where case and control subjects are ascertained non-randomly with respect to family relatedness. Previous work has shown that this type of ascertainment can severely bias heritability estimates; we show here that it also impacts mixed model association statistics. We introduce a family-based association statistic (LT-Fam) that is robust to this problem. Similar to LTMLM, LT-Fam is computed from posterior mean liabilities (PML) under a liability threshold model; however, LT-Fam uses published narrow-sense heritability estimates to avoid the problem of biased heritability estimation, enabling correct calibration. In simulations with family-biased case-control ascertainment, LT-Fam was correctly calibrated (average χ² = 1.00-1.02 for null SNPs), whereas the Armitage trend test (ATT), standard mixed model association (MLM), and case-control retrospective association test (CARAT) were mis-calibrated (e.g., average χ² = 0.50-1.22 for MLM, 0.89-2.65 for CARAT). LT-Fam also attained higher power than other methods in some settings. In 1,259 type 2 diabetes-affected case subjects and 5,765 control subjects from the CARe cohort, downsampled to induce family-biased ascertainment, LT-Fam was correctly calibrated whereas ATT, MLM, and CARAT were again mis-calibrated. Our results highlight the importance of modeling family sampling bias in case-control datasets with related samples.


Subjects
Family; Genetic Association Studies/methods; Models, Genetic; Bias; Calibration; Diabetes Mellitus, Type 2/genetics; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics; Retrospective Studies
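
The calibration figures quoted in the abstract of record 3 (an average χ² close to 1 for null SNPs) can be illustrated with a small sketch. It assumes 1-degree-of-freedom test statistics; the function name and the toy values are hypothetical, and the genomic-control ratio is a related summary that this abstract does not itself report.

```python
from statistics import NormalDist, mean, median

# Median of a 1-df chi-square distribution (about 0.4549), obtained from the
# standard normal quantile: if Z ~ N(0, 1), then Z**2 is chi-square with 1 df.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def calibration_summary(chi2_stats):
    """Return (mean chi-square, genomic-control lambda) for 1-df statistics.

    For well-calibrated null statistics both values should be close to 1.
    """
    if not chi2_stats:
        raise ValueError("need at least one statistic")
    return mean(chi2_stats), median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical toy statistics, not data from the paper.
avg, lam = calibration_summary([0.5, 1.0, 1.5])
print(f"average chi2 = {avg:.2f}, lambda_GC = {lam:.2f}")
```

With only a handful of statistics the two summaries can disagree widely; they become informative over the many null SNPs of a genome-wide scan.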
5.
Genome Res ; 26(7): 863-73, 2016 07.
Article in English | MEDLINE | ID: mdl-27197206

ABSTRACT

The role of rare alleles in complex phenotypes has been hotly debated, but most rare variant association tests (RVATs) do not account for the evolutionary forces that affect genetic architecture. Here, we use simulation and numerical algorithms to show that explosive population growth, as experienced by human populations, can dramatically increase the impact of very rare alleles on trait variance. We then assess the ability of RVATs to detect causal loci using simulations and human RNA-seq data. Surprisingly, we find that statistical performance is worst for phenotypes in which genetic variance is due mainly to rare alleles, and explosive population growth decreases power. Although many studies have attempted to identify causal rare variants, few have reported novel associations. This has sometimes been interpreted to mean that rare variants make negligible contributions to complex trait heritability. Our work shows that RVATs are not robust to realistic human evolutionary forces, so general conclusions about the impact of rare variants on complex traits may be premature.


Subjects
Evolution, Molecular; Models, Genetic; Alleles; Chromosomes, Human/genetics; Genetic Variation; Genetics, Medical; Humans; Phenotype; Population Growth; White People/genetics
6.
Am J Hum Genet ; 96(5): 720-30, 2015 May 07.
Article in English | MEDLINE | ID: mdl-25892111

ABSTRACT

We introduce a liability-threshold mixed linear model (LTMLM) association statistic for case-control studies and show that it has a well-controlled false-positive rate and more power than existing mixed-model methods for diseases with low prevalence. Existing mixed-model methods suffer a loss in power under case-control ascertainment, but no solution has been proposed. Here, we solve this problem by using a χ² score statistic computed from posterior mean liabilities (PMLs) under the liability-threshold model. Each individual's PML is conditional not only on that individual's case-control status but also on every individual's case-control status and the genetic relationship matrix (GRM) obtained from the data. The PMLs are estimated with a multivariate Gibbs sampler; the liability-scale phenotypic covariance matrix is based on the GRM, and a heritability parameter is estimated via Haseman-Elston regression on case-control phenotypes and then transformed to the liability scale. In simulations of unrelated individuals, the LTMLM statistic was correctly calibrated and achieved higher power than existing mixed-model methods for diseases with low prevalence, and the magnitude of the improvement depended on sample size and severity of case-control ascertainment. In a Wellcome Trust Case Control Consortium 2 multiple sclerosis dataset with >10,000 samples, LTMLM was correctly calibrated and attained a 4.3% improvement (p = 0.005) in χ² statistics over existing mixed-model methods at 75 known associated SNPs, consistent with simulations. Larger increases in power are expected at larger sample sizes. In conclusion, case-control studies of diseases with low prevalence can achieve power higher than that in existing mixed-model methods.


Subjects
Genetic Association Studies; Models, Genetic; Models, Theoretical; Case-Control Studies; Chromosome Mapping; Computer Simulation; Humans; Multiple Sclerosis/genetics; Multiple Sclerosis/pathology; Phenotype; Polymorphism, Single Nucleotide; Sample Size
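
The abstract of record 6 mentions that a heritability estimate from case-control phenotypes is "transformed to the liability scale". A minimal sketch of one widely used transformation for ascertained case-control data follows. Whether LTMLM applies exactly this formula is an assumption, and the function and parameter names are hypothetical.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed, prevalence, case_fraction):
    """Convert observed-scale heritability to the liability scale.

    prevalence    -- population prevalence K of the disease
    case_fraction -- proportion P of cases in the ascertained sample

    Uses h2_liab = h2_obs * [K(1-K)]^2 / [P(1-P) * z^2], where z is the
    standard normal density at the liability threshold for prevalence K.
    """
    if not (0 < prevalence < 1 and 0 < case_fraction < 1):
        raise ValueError("prevalence and case_fraction must be in (0, 1)")
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - prevalence)  # liability threshold t
    z = std_normal.pdf(threshold)                   # normal density at t
    k, p = prevalence, case_fraction
    return h2_observed * (k * (1 - k)) ** 2 / (p * (1 - p) * z ** 2)


# Hypothetical inputs: 1% prevalence, balanced case-control sample.
print(liability_scale_h2(0.2, prevalence=0.01, case_fraction=0.5))
```

The correction depends on both the population prevalence and the degree of case oversampling, which is why the abstract ties the size of the power gain to the severity of case-control ascertainment.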
7.
Genome Res ; 25(7): 927-36, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25953952

ABSTRACT

Genomic imprinting is an important regulatory mechanism that silences one of the parental copies of a gene. To systematically characterize this phenomenon, we analyze tissue specificity of imprinting from allelic expression data in 1582 primary tissue samples from 178 individuals from the Genotype-Tissue Expression (GTEx) project. We characterize imprinting in 42 genes, including both novel and previously identified genes. Tissue specificity of imprinting is widespread, and gender-specific effects are revealed in a small number of genes in muscle with stronger imprinting in males. IGF2 shows maternal expression in the brain instead of the canonical paternal expression elsewhere. Imprinting appears to have only a subtle impact on tissue-specific expression levels, with genes lacking a systematic expression difference between tissues with imprinted and biallelic expression. In summary, our systematic characterization of imprinting in adult tissues highlights variation in imprinting between genes, individuals, and tissues.


Subjects
Genomic Imprinting; Genomics; Adult; Alleles; Cluster Analysis; DNA Methylation; Databases, Nucleic Acid; Female; Gene Expression Regulation; Genetic Variation; Genotype; Humans; Male; Organ Specificity/genetics; Polymorphism, Single Nucleotide; Reproducibility of Results; Sex Factors
8.
Bioinformatics ; 31(15): 2497-504, 2015 Aug 01.
Article in English | MEDLINE | ID: mdl-25819081

ABSTRACT

MOTIVATION: RNA sequencing enables allele-specific expression (ASE) studies that complement standard genotype expression studies for common variants and, importantly, also allow measuring the regulatory impact of rare variants. The Genotype-Tissue Expression (GTEx) project is collecting RNA-seq data on multiple tissues of the same set of individuals, and novel methods are required for the analysis of these data. RESULTS: We present a statistical method to compare different patterns of ASE across tissues and to classify genetic variants according to their impact on the tissue-wide expression profile. We focus on strong ASE effects that we expect to see for protein-truncating variants, but our method can also be adjusted for other types of ASE effects. We illustrate the method with a real data example on a tissue-wide expression profile of a variant causal for lipoid proteinosis, and with a simulation study to assess our method more generally.


Subjects
Extracellular Matrix Proteins/metabolism; High-Throughput Nucleotide Sequencing/methods; Lipoid Proteinosis of Urbach and Wiethe/metabolism; Polymorphism, Single Nucleotide/genetics; RNA/analysis; Alleles; Extracellular Matrix Proteins/genetics; Humans; Lipoid Proteinosis of Urbach and Wiethe/genetics; Organ Specificity; Protein Isoforms; RNA/genetics
9.
Nat Rev Genet ; 11(7): 459-63, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20548291

ABSTRACT

Genome-wide association (GWA) studies are an effective approach for identifying genetic variants associated with disease risk. GWA studies can be confounded by population stratification (systematic ancestry differences between cases and controls), which has previously been addressed by methods that infer genetic ancestry. Those methods perform well in data sets in which population structure is the only kind of structure present but are inadequate in data sets that also contain family structure or cryptic relatedness. Here, we review recent progress on methods that correct for stratification while accounting for these additional complexities.


Subjects
Genome-Wide Association Study/methods; Models, Genetic; Computer Simulation; Humans; Polymorphism, Single Nucleotide
11.
Nat Genet ; 55(6): 952-963, 2023 06.
Article in English | MEDLINE | ID: mdl-37231098

ABSTRACT

We explored ancestry-related differences in the genetic architecture of whole-blood gene expression using whole-genome and RNA sequencing data from 2,733 African Americans, Puerto Ricans and Mexican Americans. We found that heritability of gene expression significantly increased with greater proportions of African genetic ancestry and decreased with higher proportions of Indigenous American ancestry, reflecting the relationship between heterozygosity and genetic variance. Among heritable protein-coding genes, the prevalence of ancestry-specific expression quantitative trait loci (anc-eQTLs) was 30% in African ancestry and 8% for Indigenous American ancestry segments. Most anc-eQTLs (89%) were driven by population differences in allele frequency. Transcriptome-wide association analyses of multi-ancestry summary statistics for 28 traits identified 79% more gene-trait associations using transcriptome prediction models trained in our admixed population than models trained using data from the Genotype-Tissue Expression project. Our study highlights the importance of measuring gene expression across large and ancestrally diverse populations for enabling new discoveries and reducing disparities.


Subjects
Black or African American; Hispanic or Latino; Mexican Americans; Humans; Black or African American/genetics; Genome-Wide Association Study; Hispanic or Latino/genetics; Mexican Americans/genetics; Phenotype; Polymorphism, Single Nucleotide; Transcriptome
12.
Science ; 378(6621): 754-761, 2022 11 18.
Article in English | MEDLINE | ID: mdl-36395242

ABSTRACT

The observation of genetic correlations between disparate human traits has been interpreted as evidence of widespread pleiotropy. Here, we introduce cross-trait assortative mating (xAM) as an alternative explanation. We observe that xAM affects many phenotypes and that phenotypic cross-mate correlation estimates are strongly associated with genetic correlation estimates (R² = 74%). We demonstrate that existing xAM plausibly accounts for substantial fractions of genetic correlation estimates and that previously reported genetic correlation estimates between some pairs of psychiatric disorders are congruent with xAM alone. Finally, we provide evidence for a history of xAM at the genetic level using cross-trait even/odd chromosome polygenic score correlations. Together, our results demonstrate that previous reports have likely overestimated the true genetic similarity between many phenotypes.


Subjects
Genome-Wide Association Study; Multifactorial Inheritance; Humans; Cell Communication; Phenotype
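
The "cross-trait even/odd chromosome polygenic score correlations" named in the abstract of record 12 can be sketched as follows. The underlying idea is that scores built from different chromosomes should be roughly uncorrelated under random mating, so a non-zero cross-chromosome correlation points to assortative mating. The data layout, function names, and the choice of which trait takes the odd chromosomes are assumptions made for illustration, not the authors' pipeline.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_cross_trait_corr(trait_a, trait_b):
    """Correlate trait A's odd-chromosome score with trait B's even-chromosome score.

    Each argument holds one dict per individual, mapping a chromosome number
    to that chromosome's contribution to the individual's polygenic score.
    """
    odd_a = [sum(v for c, v in person.items() if c % 2 == 1) for person in trait_a]
    even_b = [sum(v for c, v in person.items() if c % 2 == 0) for person in trait_b]
    return pearson(odd_a, even_b)


# Hypothetical per-chromosome contributions for three individuals.
trait_a = [{1: 1.0, 2: 9.0}, {1: 2.0, 2: -3.0}, {1: 3.0, 2: 0.5}]
trait_b = [{1: 7.0, 2: 2.0}, {1: 0.0, 2: 4.0}, {1: -1.0, 2: 6.0}]
print(odd_even_cross_trait_corr(trait_a, trait_b))
```

Splitting by chromosome parity keeps the two scores on physically unlinked parts of the genome, so linkage disequilibrium between nearby variants cannot produce the correlation.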
13.
Nat Commun ; 13(1): 1632, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35347136

ABSTRACT

To identify genetic determinants of airway dysfunction, we performed a transcriptome-wide association study for asthma by combining RNA-seq data from the nasal airway epithelium of 681 children, with UK Biobank genetic association data. Our airway analysis identified 95 asthma genes, 58 of which were not identified by transcriptome-wide association analyses using other asthma-relevant tissues. Among these genes were MUC5AC, an airway mucin, and FOXA3, a transcriptional driver of mucus metaplasia. Muco-ciliary epithelial cultures from genotyped donors revealed that the MUC5AC risk variant increases MUC5AC protein secretion and mucus secretory cell frequency. Airway transcriptome-wide association analyses for mucus production and chronic cough also identified MUC5AC. These cis-expression variants were associated with trans effects on expression; the MUC5AC variant was associated with upregulation of non-inflammatory mucus secretory network genes, while the FOXA3 variant was associated with upregulation of type-2 inflammation-induced mucus-metaplasia pathway genes. Our results reveal genetic mechanisms of airway mucus pathobiology.


Subjects
Asthma; Transcriptome; Asthma/genetics; Asthma/metabolism; Child; Epithelium/metabolism; Humans; Metaplasia/metabolism; Mucin 5AC/genetics; Mucin 5AC/metabolism; Mucus/metabolism
14.
Front Genet ; 12: 673167, 2021.
Article in English | MEDLINE | ID: mdl-34108994

ABSTRACT

Genome-wide association studies (GWAS) are primarily conducted in single-ancestry settings. The low transferability of results has limited our understanding of human genetic architecture across a range of complex traits. In contrast to homogeneous populations, admixed populations provide an opportunity to capture genetic architecture contributed from multiple source populations and thus improve statistical power. Here, we provide a mechanistic simulation framework to investigate the statistical power and transferability of GWAS under directional polygenic selection or varying divergence. We focus on a two-way admixed population and show that GWAS in admixed populations can be enriched for power in discovery by up to 2-fold compared to the ancestral populations under similar sample size. Moreover, higher accuracy of cross-population polygenic score estimates is also observed if variants and weights are trained in the admixed group rather than in the ancestral groups. Common variant associations are also more likely to replicate if first discovered in the admixed group and then transferred to an ancestral population, than the other way around (across 50 iterations with 1,000 causal SNPs, training on 10,000 individuals, testing on 1,000 in each population, p = 3.78e-6, 6.19e-101, ∼0 for FST = 0.2, 0.5, 0.8, respectively). While some of these FST values may appear extreme, we demonstrate that they are found across the entire phenome in the GWAS catalog. This framework demonstrates that investigation of admixed populations harbors significant advantages over GWAS in single-ancestry cohorts for uncovering the genetic architecture of traits and will improve downstream applications such as personalized medicine across diverse populations.
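
The FST values quoted in the abstract of record 14 (0.2, 0.5, 0.8) measure allele-frequency divergence between populations. As a minimal sketch, Wright's FST at a single biallelic locus can be computed from two allele frequencies, assuming two equally weighted populations. This illustrates the quantity only; it is not the simulation framework described in the abstract, and the function name is hypothetical.

```python
def fst_two_populations(p1, p2):
    """Wright's F_ST at one biallelic locus for two equally weighted populations.

    F_ST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity at the
    pooled allele frequency and H_S is the mean within-population value.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must be in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        raise ValueError("locus is monomorphic in the pooled population")
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in the two source populations.
print(fst_two_populations(0.1, 0.9))
```

Identical frequencies give a value of 0, and the value approaches 1 as the populations approach fixation for different alleles.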

15.
Cell Rep ; 31(1): 107489, 2020 04 07.
Article in English | MEDLINE | ID: mdl-32268104

ABSTRACT

Gene expression levels vary across developmental stage, cell type, and region in the brain. Genomic variants also contribute to the variation in expression, and some neuropsychiatric disorder loci may exert their effects through this mechanism. To investigate these relationships, we present BrainVar, a unique resource of paired whole-genome and bulk tissue RNA sequencing from the dorsolateral prefrontal cortex of 176 individuals across prenatal and postnatal development. Here we identify common variants that alter gene expression (expression quantitative trait loci [eQTLs]) constantly across development or predominantly during prenatal or postnatal stages. Both "constant" and "temporal-predominant" eQTLs are enriched for loci associated with neuropsychiatric traits and disorders and colocalize with specific variants. Expression levels of more than 12,000 genes rise or fall in a concerted late-fetal transition, with the transitional genes enriched for cell-type-specific genes and neuropsychiatric risk loci, underscoring the importance of cataloging developmental trajectories in understanding cortical physiology and pathology.


Subjects
Brain/embryology; Computational Biology/methods; Prefrontal Cortex/metabolism; Base Sequence/genetics; Brain/growth & development; Brain/metabolism; Databases, Genetic; Genetic Predisposition to Disease/genetics; Genetic Variation/genetics; Genome-Wide Association Study/methods; Genomics/methods; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Sequence Analysis, RNA/methods; Transcriptome/genetics; Exome Sequencing/methods; Whole Genome Sequencing/methods
16.
Genetics ; 178(3): 1709-23, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18385116

ABSTRACT

Genomewide association mapping in model organisms such as inbred mouse strains is a promising approach for the identification of risk factors related to human diseases. However, genetic association studies in inbred model organisms are confronted by the problem of complex population structure among strains. This induces inflated false positive rates, which cannot be corrected using standard approaches applied in human association studies such as genomic control or structured association. Recent studies demonstrated that mixed models successfully correct for the genetic relatedness in association mapping in maize and Arabidopsis panel data sets. However, the currently available mixed-model methods suffer from computational inefficiency. In this article, we propose a new method, efficient mixed-model association (EMMA), which corrects for population structure and genetic relatedness in model organism association mapping. Our method takes advantage of the specific nature of the optimization problem in applying mixed models for association mapping, which allows us to substantially increase the computational speed and reliability of the results. We applied EMMA to in silico whole-genome association mapping of inbred mouse strains involving hundreds of thousands of SNPs, in addition to Arabidopsis and maize data sets. We also performed extensive simulation studies to estimate the statistical power of EMMA under various SNP effects, varying degrees of population structure, and differing numbers of multiple measurements per strain. Despite the limited power of inbred mouse association mapping due to the limited number of available inbred strains, we are able to identify significantly associated SNPs, which fall into known QTL or genes identified through previous studies while avoiding an inflation of false positives. An R package implementation and webserver of our EMMA method are publicly available.


Subjects
Arabidopsis/genetics; Chromosome Mapping/methods; Models, Biological; Zea mays/genetics; Animals; Body Weight/genetics; Flowers/genetics; Genome/genetics; Inbreeding; Mice; Mice, Inbred Strains; Models, Genetic; Organ Size/genetics; Phenotype; Polymorphism, Single Nucleotide/genetics; Population Dynamics; Quantitative Trait, Heritable; Saccharin/metabolism; Software
17.
Evol Lett ; 3(1): 69-79, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30788143

ABSTRACT

Selection and mutation shape the genetic variation underlying human traits, but the specific evolutionary mechanisms driving complex trait variation are largely unknown. We developed a statistical method that uses polarized genome-wide association study (GWAS) summary statistics from a single population to detect signals of mutational bias and selection. We found evidence for nonneutral signals on variation underlying several traits (body mass index [BMI], schizophrenia, Crohn's disease, educational attainment, and height). We then used simulations that incorporate simultaneous negative and positive selection to show that these signals are consistent with mutational bias and shifts in the fitness-phenotype relationship, but not stabilizing selection or mutational bias alone. We additionally replicate two of our top three signals (BMI and educational attainment) in an external cohort, and show that population stratification may have confounded GWAS summary statistics for height in the GIANT cohort. Our results provide a flexible and powerful framework for evolutionary analysis of complex phenotypes in humans and other species, and offer insights into the evolutionary mechanisms driving variation in human polygenic traits.

19.
Nat Genet ; 46(2): 100-6, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24473328

ABSTRACT

Mixed linear models are emerging as a method of choice for conducting genetic association studies in humans and other organisms. The advantages of the mixed-linear-model association (MLMA) method include the prevention of false positive associations due to population or relatedness structure and an increase in power obtained through the application of a correction that is specific to this structure. An underappreciated point is that MLMA can also increase power in studies without sample structure by implicitly conditioning on associated loci other than the candidate locus. Numerous variations on the standard MLMA approach have recently been published, with a focus on reducing computational cost. These advances provide researchers applying MLMA methods with many options to choose from, but we caution that MLMA methods are still subject to potential pitfalls. Here we describe and quantify the advantages and pitfalls of MLMA methods as a function of study design and provide recommendations for the application of these methods in practical settings.


Subjects
Genetic Association Studies/methods; Linear Models; Research Design; Colitis, Ulcerative/genetics; Computer Simulation; Genetic Association Studies/statistics & numerical data; Humans; Multiple Sclerosis/genetics
20.
J Comput Biol ; 17(3): 547-60, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20377463

ABSTRACT

Genome-wide association studies have proven to be a highly successful method for identification of genetic loci for complex phenotypes in both humans and model organisms. These large-scale studies rely on the collection of hundreds of thousands of single nucleotide polymorphisms (SNPs) across the genome. Standard high-throughput genotyping technologies capture only a fraction of the total genetic variation. Recent efforts have shown that it is possible to "impute" with high accuracy the genotypes of SNPs that are not collected in the study, provided that they are present in a reference data set which contains both SNPs collected in the study as well as other SNPs. Here we introduce a novel HMM-based technique to solve the imputation problem that addresses several shortcomings of existing methods. First, our method is adaptive, which lets it estimate population genetic parameters from the data and be applied to model organisms that have very different evolutionary histories. Compared to previous methods, our method is up to ten times more accurate on model organisms such as mouse. Second, our algorithm scales in memory usage in the number of collected markers as opposed to the number of known SNPs. This issue is very relevant due to the size of the reference data sets currently being generated. We compare our method over mouse and human data sets to existing methods, and show that it has either comparable or better performance and much lower memory usage. The method is available for download at http://genetics.cs.ucla.edu/eminim.


Subjects
Algorithms; Haplotypes/genetics; Animals; Case-Control Studies; Diploidy; Humans; Markov Chains; Mice; Mice, Inbred Strains; Models, Genetic; Polymorphism, Single Nucleotide/genetics