1.
Nat Genet ; 55(6): 952-963, 2023 06.
Article in English | MEDLINE | ID: mdl-37231098

ABSTRACT

We explored ancestry-related differences in the genetic architecture of whole-blood gene expression using whole-genome and RNA sequencing data from 2,733 African Americans, Puerto Ricans and Mexican Americans. We found that heritability of gene expression significantly increased with greater proportions of African genetic ancestry and decreased with higher proportions of Indigenous American ancestry, reflecting the relationship between heterozygosity and genetic variance. Among heritable protein-coding genes, the prevalence of ancestry-specific expression quantitative trait loci (anc-eQTLs) was 30% in African ancestry and 8% for Indigenous American ancestry segments. Most anc-eQTLs (89%) were driven by population differences in allele frequency. Transcriptome-wide association analyses of multi-ancestry summary statistics for 28 traits identified 79% more gene-trait associations using transcriptome prediction models trained in our admixed population than models trained using data from the Genotype-Tissue Expression project. Our study highlights the importance of measuring gene expression across large and ancestrally diverse populations for enabling new discoveries and reducing disparities.


Subject(s)
Black or African American , Hispanic or Latino , Mexican Americans , Humans , Black or African American/genetics , Genome-Wide Association Study , Hispanic or Latino/genetics , Mexican Americans/genetics , Phenotype , Polymorphism, Single Nucleotide , Transcriptome
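
As a rough illustration of the heterozygosity argument in the abstract above: under a standard additive model with Hardy-Weinberg proportions (an assumption made here, not a detail taken from the paper), a biallelic variant with allele frequency p and per-allele effect beta contributes 2p(1-p)·beta² to genetic variance, so its contribution tracks expected heterozygosity. The function name is hypothetical.

```python
def additive_variance(p: float, beta: float) -> float:
    """Variance contributed by one biallelic variant under an additive model.

    Assumes Hardy-Weinberg proportions; p is the allele frequency and beta
    the per-allele effect. Illustrative only, not the study's estimator.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    heterozygosity = 2.0 * p * (1.0 - p)  # expected heterozygosity at this site
    return heterozygosity * beta ** 2
```

For a fixed effect size, the contribution peaks at p = 0.5 and shrinks toward zero as the variant approaches fixation or loss, which is one way to read the reported link between heterozygosity and heritability.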
2.
Science ; 378(6621): 754-761, 2022 11 18.
Article in English | MEDLINE | ID: mdl-36395242

ABSTRACT

The observation of genetic correlations between disparate human traits has been interpreted as evidence of widespread pleiotropy. Here, we introduce cross-trait assortative mating (xAM) as an alternative explanation. We observe that xAM affects many phenotypes and that phenotypic cross-mate correlation estimates are strongly associated with genetic correlation estimates (R² = 74%). We demonstrate that existing xAM plausibly accounts for substantial fractions of genetic correlation estimates and that previously reported genetic correlation estimates between some pairs of psychiatric disorders are congruent with xAM alone. Finally, we provide evidence for a history of xAM at the genetic level using cross-trait even/odd chromosome polygenic score correlations. Together, our results demonstrate that previous reports have likely overestimated the true genetic similarity between many phenotypes.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Cell Communication , Phenotype
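
The cross-trait even/odd chromosome polygenic score correlation mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the inputs are hypothetical per-chromosome polygenic scores (one row per individual, column index 0 holding chromosome 1), and the function names are invented for this sketch. The idea is that scores built from different chromosomes segregate independently under random mating, so a nonzero correlation between trait A's odd-chromosome score and trait B's even-chromosome score is suggestive of assortative mating (or of residual population structure).

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def even_odd_cross_trait_correlation(scores_a, scores_b):
    """Correlate trait A's odd-chromosome score with trait B's even-chromosome score.

    scores_a[i][c - 1] is individual i's polygenic score for trait A built
    from chromosome c only (hypothetical input layout); likewise scores_b.
    """
    # Column indices 0, 2, 4, ... correspond to chromosomes 1, 3, 5, ...
    odd_a = [sum(row[0::2]) for row in scores_a]
    # Column indices 1, 3, 5, ... correspond to chromosomes 2, 4, 6, ...
    even_b = [sum(row[1::2]) for row in scores_b]
    return pearson(odd_a, even_b)
```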
4.
Nat Commun ; 13(1): 1632, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35347136

ABSTRACT

To identify genetic determinants of airway dysfunction, we performed a transcriptome-wide association study for asthma by combining RNA-seq data from the nasal airway epithelium of 681 children, with UK Biobank genetic association data. Our airway analysis identified 95 asthma genes, 58 of which were not identified by transcriptome-wide association analyses using other asthma-relevant tissues. Among these genes were MUC5AC, an airway mucin, and FOXA3, a transcriptional driver of mucus metaplasia. Muco-ciliary epithelial cultures from genotyped donors revealed that the MUC5AC risk variant increases MUC5AC protein secretion and mucus secretory cell frequency. Airway transcriptome-wide association analyses for mucus production and chronic cough also identified MUC5AC. These cis-expression variants were associated with trans effects on expression; the MUC5AC variant was associated with upregulation of non-inflammatory mucus secretory network genes, while the FOXA3 variant was associated with upregulation of type-2 inflammation-induced mucus-metaplasia pathway genes. Our results reveal genetic mechanisms of airway mucus pathobiology.


Subject(s)
Asthma , Transcriptome , Asthma/genetics , Asthma/metabolism , Child , Epithelium/metabolism , Humans , Metaplasia/metabolism , Mucin 5AC/genetics , Mucin 5AC/metabolism , Mucus/metabolism
5.
Front Genet ; 12: 673167, 2021.
Article in English | MEDLINE | ID: mdl-34108994

ABSTRACT

Genome-wide association studies (GWAS) are primarily conducted in single-ancestry settings. The low transferability of results has limited our understanding of human genetic architecture across a range of complex traits. In contrast to homogeneous populations, admixed populations provide an opportunity to capture genetic architecture contributed from multiple source populations and thus improve statistical power. Here, we provide a mechanistic simulation framework to investigate the statistical power and transferability of GWAS under directional polygenic selection or varying divergence. We focus on a two-way admixed population and show that GWAS in admixed populations can be enriched for power in discovery by up to 2-fold compared to the ancestral populations under similar sample size. Moreover, higher accuracy of cross-population polygenic score estimates is also observed if variants and weights are trained in the admixed group rather than in the ancestral groups. Common variant associations are also more likely to replicate if first discovered in the admixed group and then transferred to an ancestral population, than the other way around (across 50 iterations with 1,000 causal SNPs, training on 10,000 individuals, testing on 1,000 in each population, p = 3.78e-6, 6.19e-101, ∼0 for F_ST = 0.2, 0.5, 0.8, respectively). While some of these F_ST values may appear extreme, we demonstrate that they are found across the entire phenome in the GWAS catalog. This framework demonstrates that investigation of admixed populations harbors significant advantages over GWAS in single-ancestry cohorts for uncovering the genetic architecture of traits and will improve downstream applications such as personalized medicine across diverse populations.
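
To give a feel for the fixation index values quoted in the abstract above, here is a minimal sketch of Wright's single-locus definition for two equally sized populations, (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the pooled population and H_S the mean within-population heterozygosity. The paper's simulation framework may define or estimate divergence differently; the function name and the equal-size assumption are choices made for this illustration.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Single-locus fixation index for two equally sized populations.

    p1 and p2 are the frequencies of the same allele in each population.
    Uses (H_T - H_S) / H_T with expected heterozygosity 2p(1-p).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # heterozygosity of the pooled population
    if h_total == 0.0:
        return 0.0  # allele fixed or absent everywhere: no differentiation to measure
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total
```

Identical frequencies give a value of zero, while alternately fixed alleles (p1 = 0, p2 = 1) give the maximum of one.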

6.
Cell ; 184(8): 2068-2083.e11, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33861964

ABSTRACT

Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.


Subject(s)
Ethnicity/genetics , Population Health , Databases, Genetic , Electronic Health Records , Genomics , Humans , Self Report
7.
Am J Hum Genet ; 108(2): 219-239, 2021 02 04.
Article in English | MEDLINE | ID: mdl-33440170

ABSTRACT

We present a full-likelihood method to infer polygenic adaptation from DNA sequence variation and GWAS summary statistics to quantify recent transient directional selection acting on a complex trait. Through simulations of polygenic trait architecture evolution and GWASs, we show the method substantially improves power over current methods. We examine the robustness of the method under stratification, uncertainty and bias in marginal effects, uncertainty in the causal SNPs, allelic heterogeneity, negative selection, and low GWAS sample size. The method can quantify selection acting on correlated traits, controlling for pleiotropy even among traits with strong genetic correlation (|r_g| = 80%) while retaining high power to attribute selection to the causal trait. When the causal trait is excluded from analysis, selection is attributed to its closest proxy. We discuss limitations of the method, cautioning against strongly causal interpretations of the results, and the possibility of undetectable gene-by-environment (GxE) interactions. We apply the method to 56 human polygenic traits, revealing signals of directional selection on pigmentation, life history, glycated hemoglobin (HbA1c), and other traits. We also conduct joint testing of 137 pairs of genetically correlated traits, revealing widespread correlated response acting on these traits (2.6-fold enrichment, p = 1.5 × 10⁻⁷). Signs of selection on some traits previously reported as adaptive (e.g., educational attainment and hair color) are largely attributable to correlated response (p = 2.9 × 10⁻⁶ and 1.7 × 10⁻⁴, respectively). Lastly, our joint test shows antagonistic selection has increased type 2 diabetes risk and decreased HbA1c (p = 1.5 × 10⁻⁵).


Subject(s)
Genome, Human , Multifactorial Inheritance , Selection, Genetic , Computer Simulation , Diabetes Mellitus, Type 2/genetics , Evolution, Molecular , Gene-Environment Interaction , Genetic Heterogeneity , Genetic Pleiotropy , Genome-Wide Association Study , Glycated Hemoglobin/genetics , Humans , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide , Sample Size
9.
Cell Rep ; 31(1): 107489, 2020 04 07.
Article in English | MEDLINE | ID: mdl-32268104

ABSTRACT

Gene expression levels vary across developmental stage, cell type, and region in the brain. Genomic variants also contribute to the variation in expression, and some neuropsychiatric disorder loci may exert their effects through this mechanism. To investigate these relationships, we present BrainVar, a unique resource of paired whole-genome and bulk tissue RNA sequencing from the dorsolateral prefrontal cortex of 176 individuals across prenatal and postnatal development. Here we identify common variants that alter gene expression (expression quantitative trait loci [eQTLs]) constantly across development or predominantly during prenatal or postnatal stages. Both "constant" and "temporal-predominant" eQTLs are enriched for loci associated with neuropsychiatric traits and disorders and colocalize with specific variants. Expression levels of more than 12,000 genes rise or fall in a concerted late-fetal transition, with the transitional genes enriched for cell-type-specific genes and neuropsychiatric risk loci, underscoring the importance of cataloging developmental trajectories in understanding cortical physiology and pathology.


Subject(s)
Brain/embryology , Computational Biology/methods , Prefrontal Cortex/metabolism , Base Sequence/genetics , Brain/growth & development , Brain/metabolism , Databases, Genetic , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome-Wide Association Study/methods , Genomics/methods , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Sequence Analysis, RNA/methods , Transcriptome/genetics , Exome Sequencing/methods , Whole Genome Sequencing/methods
10.
Evol Lett ; 3(1): 69-79, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30788143

ABSTRACT

Selection and mutation shape the genetic variation underlying human traits, but the specific evolutionary mechanisms driving complex trait variation are largely unknown. We developed a statistical method that uses polarized genome-wide association study (GWAS) summary statistics from a single population to detect signals of mutational bias and selection. We found evidence for nonneutral signals on variation underlying several traits (body mass index [BMI], schizophrenia, Crohn's disease, educational attainment, and height). We then used simulations that incorporate simultaneous negative and positive selection to show that these signals are consistent with mutational bias and shifts in the fitness-phenotype relationship, but not stabilizing selection or mutational bias alone. We additionally replicate two of our top three signals (BMI and educational attainment) in an external cohort, and show that population stratification may have confounded GWAS summary statistics for height in the GIANT cohort. Our results provide a flexible and powerful framework for evolutionary analysis of complex phenotypes in humans and other species, and offer insights into the evolutionary mechanisms driving variation in human polygenic traits.

11.
Am J Hum Genet ; 100(1): 31-39, 2017 Jan 05.
Article in English | MEDLINE | ID: mdl-28017371

ABSTRACT

Mixed models have become the tool of choice for genetic association studies; however, standard mixed model methods may be poorly calibrated or underpowered under family sampling bias and/or case-control ascertainment. Previously, we introduced a liability threshold-based mixed model association statistic (LTMLM) to address case-control ascertainment in unrelated samples. Here, we consider family-biased case-control ascertainment, where case and control subjects are ascertained non-randomly with respect to family relatedness. Previous work has shown that this type of ascertainment can severely bias heritability estimates; we show here that it also impacts mixed model association statistics. We introduce a family-based association statistic (LT-Fam) that is robust to this problem. Similar to LTMLM, LT-Fam is computed from posterior mean liabilities (PML) under a liability threshold model; however, LT-Fam uses published narrow-sense heritability estimates to avoid the problem of biased heritability estimation, enabling correct calibration. In simulations with family-biased case-control ascertainment, LT-Fam was correctly calibrated (average χ² = 1.00-1.02 for null SNPs), whereas the Armitage trend test (ATT), standard mixed model association (MLM), and case-control retrospective association test (CARAT) were mis-calibrated (e.g., average χ² = 0.50-1.22 for MLM, 0.89-2.65 for CARAT). LT-Fam also attained higher power than other methods in some settings. In 1,259 type 2 diabetes-affected case subjects and 5,765 control subjects from the CARe cohort, downsampled to induce family-biased ascertainment, LT-Fam was correctly calibrated whereas ATT, MLM, and CARAT were again mis-calibrated. Our results highlight the importance of modeling family sampling bias in case-control datasets with related samples.


Subject(s)
Family , Genetic Association Studies/methods , Models, Genetic , Bias , Calibration , Diabetes Mellitus, Type 2/genetics , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Retrospective Studies
12.
Genome Res ; 26(7): 863-73, 2016 07.
Article in English | MEDLINE | ID: mdl-27197206

ABSTRACT

The role of rare alleles in complex phenotypes has been hotly debated, but most rare variant association tests (RVATs) do not account for the evolutionary forces that affect genetic architecture. Here, we use simulation and numerical algorithms to show that explosive population growth, as experienced by human populations, can dramatically increase the impact of very rare alleles on trait variance. We then assess the ability of RVATs to detect causal loci using simulations and human RNA-seq data. Surprisingly, we find that statistical performance is worst for phenotypes in which genetic variance is due mainly to rare alleles, and explosive population growth decreases power. Although many studies have attempted to identify causal rare variants, few have reported novel associations. This has sometimes been interpreted to mean that rare variants make negligible contributions to complex trait heritability. Our work shows that RVATs are not robust to realistic human evolutionary forces, so general conclusions about the impact of rare variants on complex traits may be premature.


Subject(s)
Evolution, Molecular , Models, Genetic , Alleles , Chromosomes, Human/genetics , Genetic Variation , Genetics, Medical , Humans , Phenotype , Population Growth , White People/genetics
13.
Genome Res ; 25(7): 927-36, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25953952

ABSTRACT

Genomic imprinting is an important regulatory mechanism that silences one of the parental copies of a gene. To systematically characterize this phenomenon, we analyze tissue specificity of imprinting from allelic expression data in 1582 primary tissue samples from 178 individuals from the Genotype-Tissue Expression (GTEx) project. We characterize imprinting in 42 genes, including both novel and previously identified genes. Tissue specificity of imprinting is widespread, and gender-specific effects are revealed in a small number of genes in muscle with stronger imprinting in males. IGF2 shows maternal expression in the brain instead of the canonical paternal expression elsewhere. Imprinting appears to have only a subtle impact on tissue-specific expression levels, with genes lacking a systematic expression difference between tissues with imprinted and biallelic expression. In summary, our systematic characterization of imprinting in adult tissues highlights variation in imprinting between genes, individuals, and tissues.


Subject(s)
Genomic Imprinting , Genomics , Adult , Alleles , Cluster Analysis , DNA Methylation , Databases, Nucleic Acid , Female , Gene Expression Regulation , Genetic Variation , Genotype , Humans , Male , Organ Specificity/genetics , Polymorphism, Single Nucleotide , Reproducibility of Results , Sex Factors
14.
Am J Hum Genet ; 96(5): 720-30, 2015 May 07.
Article in English | MEDLINE | ID: mdl-25892111

ABSTRACT

We introduce a liability-threshold mixed linear model (LTMLM) association statistic for case-control studies and show that it has a well-controlled false-positive rate and more power than existing mixed-model methods for diseases with low prevalence. Existing mixed-model methods suffer a loss in power under case-control ascertainment, but no solution has been proposed. Here, we solve this problem by using a χ² score statistic computed from posterior mean liabilities (PMLs) under the liability-threshold model. Each individual's PML is conditional not only on that individual's case-control status but also on every individual's case-control status and the genetic relationship matrix (GRM) obtained from the data. The PMLs are estimated with a multivariate Gibbs sampler; the liability-scale phenotypic covariance matrix is based on the GRM, and a heritability parameter is estimated via Haseman-Elston regression on case-control phenotypes and then transformed to the liability scale. In simulations of unrelated individuals, the LTMLM statistic was correctly calibrated and achieved higher power than existing mixed-model methods for diseases with low prevalence, and the magnitude of the improvement depended on sample size and severity of case-control ascertainment. In a Wellcome Trust Case Control Consortium 2 multiple sclerosis dataset with >10,000 samples, LTMLM was correctly calibrated and attained a 4.3% improvement (p = 0.005) in χ² statistics over existing mixed-model methods at 75 known associated SNPs, consistent with simulations. Larger increases in power are expected at larger sample sizes. In conclusion, case-control studies of diseases with low prevalence can achieve power higher than that in existing mixed-model methods.


Subject(s)
Genetic Association Studies , Models, Genetic , Models, Theoretical , Case-Control Studies , Chromosome Mapping , Computer Simulation , Humans , Multiple Sclerosis/genetics , Multiple Sclerosis/pathology , Phenotype , Polymorphism, Single Nucleotide , Sample Size
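
The abstract above mentions transforming a heritability estimate from the observed case-control scale to the liability scale. A minimal sketch of one commonly used transformation follows; the abstract does not state the exact formula the authors applied, so this is an assumption, and the function and parameter names are hypothetical. With population prevalence K, sample case fraction P, and z the standard normal density at the liability threshold, the liability-scale value is the observed-scale value multiplied by K²(1-K)² / (z² P(1-P)).

```python
from statistics import NormalDist

def liability_scale_h2(h2_observed: float, prevalence: float, case_fraction: float) -> float:
    """Convert observed-scale heritability to the liability scale.

    prevalence    -- disease prevalence K in the population
    case_fraction -- proportion P of cases in the (ascertained) sample
    Illustrative sketch of a commonly used conversion, not the paper's code.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)  # liability threshold for disease
    z = std_normal.pdf(threshold)                     # normal density at the threshold
    k_term = (prevalence * (1.0 - prevalence)) ** 2
    p_term = case_fraction * (1.0 - case_fraction)
    return h2_observed * k_term / (z ** 2 * p_term)
```

When the sample is not ascertained (case fraction equal to prevalence), the factor reduces to K(1-K)/z², so oversampling cases for a rare disease changes the conversion substantially.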
15.
Bioinformatics ; 31(15): 2497-504, 2015 Aug 01.
Article in English | MEDLINE | ID: mdl-25819081

ABSTRACT

MOTIVATION: RNA sequencing enables allele-specific expression (ASE) studies that complement standard genotype expression studies for common variants and, importantly, also allow measuring the regulatory impact of rare variants. The Genotype-Tissue Expression (GTEx) project is collecting RNA-seq data on multiple tissues of the same set of individuals, and novel methods are required for the analysis of these data. RESULTS: We present a statistical method to compare different patterns of ASE across tissues and to classify genetic variants according to their impact on the tissue-wide expression profile. We focus on strong ASE effects that we expect to see for protein-truncating variants, but our method can also be adjusted for other types of ASE effects. We illustrate the method with a real data example on a tissue-wide expression profile of a variant causal for lipoid proteinosis, and with a simulation study to assess our method more generally.


Subject(s)
Extracellular Matrix Proteins/metabolism , High-Throughput Nucleotide Sequencing/methods , Lipoid Proteinosis of Urbach and Wiethe/metabolism , Polymorphism, Single Nucleotide/genetics , RNA/analysis , Alleles , Extracellular Matrix Proteins/genetics , Humans , Lipoid Proteinosis of Urbach and Wiethe/genetics , Organ Specificity , Protein Isoforms , RNA/genetics
16.
Nat Genet ; 46(2): 100-6, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24473328

ABSTRACT

Mixed linear models are emerging as a method of choice for conducting genetic association studies in humans and other organisms. The advantages of the mixed-linear-model association (MLMA) method include the prevention of false positive associations due to population or relatedness structure and an increase in power obtained through the application of a correction that is specific to this structure. An underappreciated point is that MLMA can also increase power in studies without sample structure by implicitly conditioning on associated loci other than the candidate locus. Numerous variations on the standard MLMA approach have recently been published, with a focus on reducing computational cost. These advances provide researchers applying MLMA methods with many options to choose from, but we caution that MLMA methods are still subject to potential pitfalls. Here we describe and quantify the advantages and pitfalls of MLMA methods as a function of study design and provide recommendations for the application of these methods in practical settings.


Subject(s)
Genetic Association Studies/methods , Linear Models , Research Design , Colitis, Ulcerative/genetics , Computer Simulation , Genetic Association Studies/statistics & numerical data , Humans , Multiple Sclerosis/genetics
18.
Nat Rev Genet ; 11(7): 459-63, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20548291

ABSTRACT

Genome-wide association (GWA) studies are an effective approach for identifying genetic variants associated with disease risk. GWA studies can be confounded by population stratification (systematic ancestry differences between cases and controls), which has previously been addressed by methods that infer genetic ancestry. Those methods perform well in data sets in which population structure is the only kind of structure present but are inadequate in data sets that also contain family structure or cryptic relatedness. Here, we review recent progress on methods that correct for stratification while accounting for these additional complexities.


Subject(s)
Genome-Wide Association Study/methods , Models, Genetic , Computer Simulation , Humans , Polymorphism, Single Nucleotide
19.
J Comput Biol ; 17(3): 547-60, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20377463

ABSTRACT

Genome-wide association studies have proven to be a highly successful method for identification of genetic loci for complex phenotypes in both humans and model organisms. These large scale studies rely on the collection of hundreds of thousands of single nucleotide polymorphisms (SNPs) across the genome. Standard high-throughput genotyping technologies capture only a fraction of the total genetic variation. Recent efforts have shown that it is possible to "impute" with high accuracy the genotypes of SNPs that are not collected in the study provided that they are present in a reference data set which contains both SNPs collected in the study as well as other SNPs. We here introduce a novel HMM based technique to solve the imputation problem that addresses several shortcomings of existing methods. First, our method is adaptive, which lets it estimate population genetic parameters from the data and be applied to model organisms that have very different evolutionary histories. Compared to previous methods, our method is up to ten times more accurate on model organisms such as mouse. Second, the memory usage of our algorithm scales with the number of collected markers rather than the number of known SNPs. This issue is very relevant due to the size of the reference data sets currently being generated. We compare our method over mouse and human data sets to existing methods, and show that each has either comparable or better performance and much lower memory usage. The method is available for download at http://genetics.cs.ucla.edu/eminim.


Subject(s)
Algorithms , Haplotypes/genetics , Animals , Case-Control Studies , Diploidy , Humans , Markov Chains , Mice , Mice, Inbred Strains , Models, Genetic , Polymorphism, Single Nucleotide/genetics
20.
Nat Genet ; 42(4): 348-54, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20208533

ABSTRACT

Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.


Subject(s)
Genome-Wide Association Study , Models, Statistical , Population Groups/genetics , Humans , Models, Genetic , Polymorphism, Single Nucleotide , Principal Component Analysis , Quantitative Trait Loci , Software
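
Genomic control, one of the baselines named in the abstract above, can be sketched in a few lines. This is a minimal illustration of the standard idea and not EMMAX or the authors' code: the inflation factor is the median of the observed 1-degree-of-freedom χ² association statistics divided by the median expected under the null (about 0.455), and every statistic is then divided by that factor. The function names are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained as the squared 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2

def genomic_inflation(chi2_stats):
    """Inflation factor: observed median statistic over the null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN

def genomic_control(chi2_stats):
    """Rescale association statistics by the inflation factor.

    Following a common convention, factors below 1 are not applied,
    so statistics are never inflated by the correction.
    """
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [s / lam for s in chi2_stats]
```

Because a single scaling factor is applied to all markers, this correction cannot distinguish markers that are strongly affected by sample structure from those that are not, which is one reason the abstract compares it against a variance component approach.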