1.
Science ; 376(6589): eabf1970, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35389781

ABSTRACT

Systemic lupus erythematosus (SLE) is a heterogeneous autoimmune disease. Knowledge of circulating immune cell types and states associated with SLE remains incomplete. We profiled more than 1.2 million peripheral blood mononuclear cells (162 cases, 99 controls) with multiplexed single-cell RNA sequencing (mux-seq). Cases exhibited elevated expression of type 1 interferon-stimulated genes (ISGs) in monocytes, reduction of naïve CD4+ T cells that correlated with monocyte ISG expression, and expansion of repertoire-restricted cytotoxic GZMH+ CD8+ T cells. Cell type-specific expression features predicted case-control status and stratified patients into two molecular subtypes. We integrated dense genotyping data to map cell type-specific cis-expression quantitative trait loci and to link SLE-associated variants to cell type-specific expression. These results demonstrate mux-seq as a systematic approach to characterize cellular composition, identify transcriptional signatures, and annotate genetic variants associated with SLE.


Subjects
Interferon Type I; Lupus Erythematosus, Systemic; CD8-Positive T-Lymphocytes/metabolism; Case-Control Studies; Humans; Interferon Type I/metabolism; Leukocytes, Mononuclear; Lupus Erythematosus, Systemic/genetics; RNA-Seq; Transcription, Genetic
2.
Nat Commun ; 13(1): 1632, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35347136

ABSTRACT

To identify genetic determinants of airway dysfunction, we performed a transcriptome-wide association study for asthma by combining RNA-seq data from the nasal airway epithelium of 681 children, with UK Biobank genetic association data. Our airway analysis identified 95 asthma genes, 58 of which were not identified by transcriptome-wide association analyses using other asthma-relevant tissues. Among these genes were MUC5AC, an airway mucin, and FOXA3, a transcriptional driver of mucus metaplasia. Muco-ciliary epithelial cultures from genotyped donors revealed that the MUC5AC risk variant increases MUC5AC protein secretion and mucus secretory cell frequency. Airway transcriptome-wide association analyses for mucus production and chronic cough also identified MUC5AC. These cis-expression variants were associated with trans effects on expression; the MUC5AC variant was associated with upregulation of non-inflammatory mucus secretory network genes, while the FOXA3 variant was associated with upregulation of type-2 inflammation-induced mucus-metaplasia pathway genes. Our results reveal genetic mechanisms of airway mucus pathobiology.


Subjects
Asthma; Transcriptome; Asthma/genetics; Asthma/metabolism; Child; Epithelium/metabolism; Humans; Metaplasia/metabolism; Mucin 5AC/genetics; Mucin 5AC/metabolism; Mucus/metabolism
3.
Genome Med ; 14(1): 7, 2022 01 19.
Article in English | MEDLINE | ID: mdl-35042540

ABSTRACT

BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a complex, late-onset, neurodegenerative disease with a genetic contribution to disease liability. Genome-wide association studies (GWAS) have identified ten risk loci to date, including the TNIP1/GPX3 locus on chromosome five. Given that association analysis data alone cannot determine the most plausible risk gene for this locus, we undertook a comprehensive suite of in silico, in vivo and in vitro studies to address this. METHODS: The Functional Mapping and Annotation (FUMA) pipeline and five tools (conditional and joint analysis (GCTA-COJO), Stratified Linkage Disequilibrium Score Regression (S-LDSC), Polygenic Priority Scoring (PoPS), Summary-based Mendelian Randomisation (SMR-HEIDI) and transcriptome-wide association study (TWAS) analyses) were used to perform bioinformatic integration of GWAS data (N_cases = 20,806, N_controls = 59,804) with 'omics reference datasets including the blood (eQTLgen consortium N = 31,684) and brain (N = 2581). This was followed up by specific expression studies in ALS case-control cohorts (microarray N_total = 942, protein N_total = 300) and gene knockdown (KD) studies of human neuronal iPSC cells and zebrafish-morpholinos (MO). RESULTS: SMR analyses implicated both TNIP1 and GPX3 (p < 1.15 × 10⁻⁶), but there was no simple SNP/expression relationship. Integrating multiple datasets using PoPS supported GPX3 but not TNIP1. In vivo expression analyses from blood in ALS cases identified that lower GPX3 expression correlated with a more progressed disease (ALS functional rating score, p = 5.5 × 10⁻³, adjusted R² = 0.042, B_effect = 27.4 ± 13.3 ng/ml/ALSFRS unit) with microarray and protein data suggesting lower expression with risk allele (recessive model p = 0.06, p = 0.02 respectively). Validation in vivo indicated gpx3 KD caused significant motor deficits in zebrafish-MO (mean difference vs. control ± 95% CI: swim distance = 112 ± 28 mm, time = 1.29 ± 0.59 s, speed = 32.0 ± 2.53 mm/s; p < 0.0001 for all), which were rescued with gpx3 expression, with no phenotype identified with tnip1 KD or gpx3 overexpression. CONCLUSIONS: These results support GPX3 as a lead ALS risk gene in this locus, with more data needed to confirm/reject a role for TNIP1. This has implications for understanding disease mechanisms (GPX3 acts in the same pathway as SOD1, a well-established ALS-associated gene) and identifying new therapeutic approaches. Few previous examples of in-depth investigations of risk loci in ALS exist and a similar approach could be applied to investigate future expected GWAS findings.


Subjects
Amyotrophic Lateral Sclerosis; Neurodegenerative Diseases; Amyotrophic Lateral Sclerosis/genetics; Animals; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Humans; Polymorphism, Single Nucleotide; Zebrafish/genetics
4.
G3 (Bethesda) ; 12(1)2022 01 04.
Article in English | MEDLINE | ID: mdl-34849835

ABSTRACT

AU-rich elements (AREs) are 3' UTR cis-regulatory elements that regulate the stability of mRNAs. Consensus ARE motifs have been determined, but little is known about how differences in 3' UTR sequences that conform to these motifs affect their function. Here, we use functional annotation of sequences from 3' UTRs (fast-UTR), a massively parallel reporter assay (MPRA), to investigate the effects of 41,288 3' UTR sequence fragments from 4653 transcripts on gene expression and mRNA stability in Jurkat and Beas2B cells. Our analyses demonstrate that the length of an ARE and its registration (the first and last nucleotides of the repeating ARE motif) have significant effects on gene expression and stability. Based on this finding, we propose improved ARE classification and concomitant methods to categorize and predict the effect of AREs on gene expression and stability. Finally, to investigate the advantages of our general experimental design we examine other motifs including constitutive decay elements (CDEs), where we show that the length of the CDE stem-loop has a significant impact on steady-state expression and mRNA stability. We conclude that fast-UTR, in conjunction with our analytical approach, can produce improved yet simple sequence-based rules for predicting the activity of human 3' UTRs.


Subjects
Gene Expression Regulation; RNA Stability; 3' Untranslated Regions; Humans; RNA, Messenger/genetics; RNA, Messenger/metabolism; Regulatory Sequences, Nucleic Acid
5.
Genome Med ; 13(1): 179, 2021 11 08.
Article in English | MEDLINE | ID: mdl-34749793

ABSTRACT

BACKGROUND: Hundreds of thousands of cancer patients have had targeted (panel) tumor sequencing to identify clinically meaningful mutations. In addition to improving patient outcomes, this activity has led to significant discoveries in basic and translational domains. However, the targeted nature of clinical tumor sequencing has a limited scope, especially for germline genetics. In this work, we assess the utility of discarded, off-target reads from tumor-only panel sequencing for the recovery of genome-wide germline genotypes through imputation. METHODS: We developed a framework for inference of germline variants from tumor panel sequencing, including imputation, quality control, inference of genetic ancestry, germline polygenic risk scores, and HLA alleles. We benchmarked our framework on 833 individuals with tumor sequencing and matched germline SNP array data. We then applied our approach to a prospectively collected panel sequencing cohort of 25,889 tumors. RESULTS: We demonstrate high to moderate accuracy of each inferred feature relative to direct germline SNP array genotyping: individual common variants were imputed with a mean accuracy (correlation) of 0.86, genetic ancestry was inferred with a correlation of > 0.98, polygenic risk scores were inferred with a correlation of > 0.90, and individual HLA alleles were inferred with a correlation of > 0.80. We demonstrate a minimal influence on the accuracy of somatic copy number alterations and other tumor features. We showcase the feasibility and utility of our framework by analyzing 25,889 tumors and identifying the relationships between genetic ancestry, polygenic risk, and tumor characteristics that could not be studied with conventional on-target tumor data. CONCLUSIONS: We conclude that targeted tumor sequencing can be leveraged to build rich germline research cohorts from existing data and make our analysis pipeline publicly available to facilitate this effort.


Subjects
Genetic Predisposition to Disease/genetics; Germ Cells; Neoplasms/genetics; Sequence Analysis, DNA; Alleles; Computational Biology; DNA Copy Number Variations; Gene Frequency; Genome-Wide Association Study; Genotype; Genotyping Techniques; High-Throughput Nucleotide Sequencing; Humans; Mutation; Polymorphism, Single Nucleotide
6.
Front Genet ; 12: 673167, 2021.
Article in English | MEDLINE | ID: mdl-34108994

ABSTRACT

Genome-wide association studies (GWAS) are primarily conducted in single-ancestry settings. The low transferability of results has limited our understanding of human genetic architecture across a range of complex traits. In contrast to homogeneous populations, admixed populations provide an opportunity to capture genetic architecture contributed from multiple source populations and thus improve statistical power. Here, we provide a mechanistic simulation framework to investigate the statistical power and transferability of GWAS under directional polygenic selection or varying divergence. We focus on a two-way admixed population and show that GWAS in admixed populations can be enriched for power in discovery by up to 2-fold compared to the ancestral populations under similar sample size. Moreover, higher accuracy of cross-population polygenic score estimates is also observed if variants and weights are trained in the admixed group rather than in the ancestral groups. Common variant associations are also more likely to replicate if first discovered in the admixed group and then transferred to an ancestral population, than the other way around (across 50 iterations with 1,000 causal SNPs, training on 10,000 individuals, testing on 1,000 in each population, p = 3.78e-6, 6.19e-101, ∼0 for FST = 0.2, 0.5, 0.8, respectively). While some of these FST values may appear extreme, we demonstrate that they are found across the entire phenome in the GWAS catalog. This framework demonstrates that investigation of admixed populations harbors significant advantages over GWAS in single-ancestry cohorts for uncovering the genetic architecture of traits and will improve downstream applications such as personalized medicine across diverse populations.
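
The abstract above parameterizes population divergence by FST. As a rough illustration of what values such as 0.2, 0.5 or 0.8 mean, the sketch below computes Wright's FST at a single biallelic site for two equally weighted populations. It is a textbook definition only, not the simulation framework of the paper; the function name and the example allele frequencies are hypothetical.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Wright's F_ST at one biallelic site for two equally weighted populations.

    p1, p2 are the frequencies of the same allele in each population.
    F_ST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled population and H_S is the mean within-population heterozygosity.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # The site is monomorphic in the pooled sample: no differentiation to measure.
        return 0.0
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies, chosen only for illustration.
    print(fst_biallelic(0.1, 0.9))
    print(fst_biallelic(0.5, 0.5))
```

Identical frequencies give an FST of zero, and alleles fixed for opposite states in the two populations give an FST of one.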

7.
Nat Commun ; 12(1): 2717, 2021 05 11.
Article in English | MEDLINE | ID: mdl-33976150

ABSTRACT

Circulating cell-free DNA (cfDNA) in the bloodstream originates from dying cells and is a promising noninvasive biomarker for cell death. Here, we propose an algorithm, CelFiE, to accurately estimate the relative abundances of cell types and tissues contributing to cfDNA from epigenetic cfDNA sequencing. In contrast to previous work, CelFiE accommodates low coverage data, does not require CpG site curation, and estimates contributions from multiple unknown cell types that are not available in external reference data. In simulations, CelFiE accurately estimates known and unknown cell type proportions from low coverage and noisy cfDNA mixtures, including from cell types composing less than 1% of the total mixture. When used in two clinically-relevant situations, CelFiE correctly estimates a large placenta component in pregnant women, and an elevated skeletal muscle component in amyotrophic lateral sclerosis (ALS) patients, consistent with the occurrence of muscle wasting typical in these patients. Together, these results show how CelFiE could be a useful tool for biomarker discovery and monitoring the progression of degenerative disease.


Subjects
Algorithms; Amyotrophic Lateral Sclerosis/genetics; Cell-Free Nucleic Acids/genetics; DNA Methylation; Epigenesis, Genetic; Adult; Amyotrophic Lateral Sclerosis/blood; Amyotrophic Lateral Sclerosis/immunology; Amyotrophic Lateral Sclerosis/pathology; B-Lymphocytes/immunology; B-Lymphocytes/metabolism; Biomarkers/blood; Case-Control Studies; Cell-Free Nucleic Acids/blood; Cell-Free Nucleic Acids/classification; Female; Humans; Macrophages/immunology; Macrophages/metabolism; Male; Monocytes/immunology; Monocytes/metabolism; Muscle, Skeletal/immunology; Muscle, Skeletal/metabolism; Muscle, Skeletal/pathology; Neutrophils/immunology; Neutrophils/metabolism; Organ Specificity; Pregnancy; Pregnancy Trimesters/blood; Pregnancy Trimesters/genetics; T-Lymphocytes/immunology; T-Lymphocytes/metabolism
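
The record above concerns estimating cell-type proportions from cfDNA methylation. The underlying mixture idea can be sketched in a much simpler setting than the one the abstract describes: two cell types with fully known reference methylation levels, where the observed methylation at each site is a weighted average of the two references. The code below fits the mixing fraction by ordinary least squares. It is not CelFiE, which by the abstract's account also handles low coverage and unknown cell types; the function name and all numbers are hypothetical.

```python
def estimate_two_type_fraction(observed, ref_a, ref_b):
    """Least-squares estimate of the fraction of cell type A in a two-type mixture.

    Model (an assumption of this sketch): observed[i] = f * ref_a[i] + (1 - f) * ref_b[i].
    All inputs are per-site methylation levels in [0, 1] and have equal length.
    """
    diffs = [a - b for a, b in zip(ref_a, ref_b)]
    denom = sum(d * d for d in diffs)
    if denom == 0.0:
        raise ValueError("reference profiles are identical; the fraction is not identifiable")
    numer = sum((m - b) * d for m, b, d in zip(observed, ref_b, diffs))
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, numer / denom))


if __name__ == "__main__":
    # Hypothetical reference methylation levels at three CpG sites.
    ref_a = [0.9, 0.1, 0.5]
    ref_b = [0.2, 0.8, 0.5]
    true_fraction = 0.3
    observed = [true_fraction * a + (1 - true_fraction) * b for a, b in zip(ref_a, ref_b)]
    print(estimate_two_type_fraction(observed, ref_a, ref_b))
```

Sites where the two references agree (the third site here) contribute nothing to the estimate, which is why informative sites matter in any reference-based deconvolution.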
8.
Proc Natl Acad Sci U S A ; 118(15)2021 04 13.
Article in English | MEDLINE | ID: mdl-33833052

ABSTRACT

Interactions between genetic variants (epistasis) are pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work, we develop a model for structured polygenic epistasis, called coordinated epistasis (CE), and prove that several recent theories of genetic architecture fall under the formal umbrella of CE. Unlike standard epistasis models that assume epistasis and main effects are independent, CE captures systematic correlations between epistasis and main effects that result from pathway-level epistasis, on balance skewing the penetrance of genetic effects. To test for the existence of CE, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CE in 18 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue-trait pairs. Overall, CE is a dimension of genetic architecture that can capture structured, systemic forms of epistasis in complex human traits.


Subjects
Epistasis, Genetic; Models, Genetic; Multifactorial Inheritance/genetics; Evolution, Molecular; Genetic Predisposition to Disease; Humans; Quantitative Trait, Heritable
9.
Cell ; 184(8): 2068-2083.e11, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33861964

ABSTRACT

Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.


Subjects
/genetics; Population Health; Databases, Genetic; Electronic Health Records; Genomics; Humans; Self Report
11.
Am J Hum Genet ; 108(2): 219-239, 2021 02 04.
Article in English | MEDLINE | ID: mdl-33440170

ABSTRACT

We present a full-likelihood method to infer polygenic adaptation from DNA sequence variation and GWAS summary statistics to quantify recent transient directional selection acting on a complex trait. Through simulations of polygenic trait architecture evolution and GWASs, we show the method substantially improves power over current methods. We examine the robustness of the method under stratification, uncertainty and bias in marginal effects, uncertainty in the causal SNPs, allelic heterogeneity, negative selection, and low GWAS sample size. The method can quantify selection acting on correlated traits, controlling for pleiotropy even among traits with strong genetic correlation (|r_g| = 80%) while retaining high power to attribute selection to the causal trait. When the causal trait is excluded from analysis, selection is attributed to its closest proxy. We discuss limitations of the method, cautioning against strongly causal interpretations of the results, and the possibility of undetectable gene-by-environment (GxE) interactions. We apply the method to 56 human polygenic traits, revealing signals of directional selection on pigmentation, life history, glycated hemoglobin (HbA1c), and other traits. We also conduct joint testing of 137 pairs of genetically correlated traits, revealing widespread correlated response acting on these traits (2.6-fold enrichment, p = 1.5 × 10⁻⁷). Signs of selection on some traits previously reported as adaptive (e.g., educational attainment and hair color) are largely attributable to correlated response (p = 2.9 × 10⁻⁶ and 1.7 × 10⁻⁴, respectively). Lastly, our joint test shows antagonistic selection has increased type 2 diabetes risk and decreased HbA1c (p = 1.5 × 10⁻⁵).


Subjects
Genome, Human; Multifactorial Inheritance; Selection, Genetic; Computer Simulation; Diabetes Mellitus, Type 2/genetics; Evolution, Molecular; Gene-Environment Interaction; Genetic Heterogeneity; Genetic Pleiotropy; Genome-Wide Association Study; Glycated Hemoglobin A/genetics; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide; Sample Size
13.
PLoS Genet ; 16(10): e1009165, 2020 10.
Article in English | MEDLINE | ID: mdl-33104702

ABSTRACT

BACKGROUND: The majority of quantitative genetic models used to map complex traits assume that alleles have similar effects across all individuals. Significant evidence suggests, however, that epistatic interactions modulate the impact of many alleles. Nevertheless, identifying epistatic interactions remains computationally and statistically challenging. In this work, we address some of these challenges by developing a statistical test for polygenic epistasis that determines whether the effect of an allele is altered by the global genetic ancestry proportion from distinct progenitors. RESULTS: We applied our method to data from mice and yeast. For the mice, we observed 49 significant genotype-by-ancestry interaction associations across 14 phenotypes as well as over 1,400 Bonferroni-corrected genotype-by-ancestry interaction associations for mouse gene expression data. For the yeast, we observed 92 significant genotype-by-ancestry interactions across 38 phenotypes. Given this evidence of epistasis, we test for and observe evidence of rapid selection pressure on ancestry-specific polymorphisms within one of the cohorts, consistent with epistatic selection. CONCLUSIONS: Unlike our prior work in human populations, we observe widespread evidence of ancestry-modified SNP effects, perhaps reflecting the greater divergence present in crosses using mice and yeast.


Subjects
Epistasis, Genetic; Evolution, Molecular; Multifactorial Inheritance/genetics; Selection, Genetic/genetics; Alleles; Animals; Genotype; Humans; Mice; Models, Genetic; Phenotype; Quantitative Trait Loci/genetics; Saccharomyces cerevisiae/genetics
15.
Annu Rev Genomics Hum Genet ; 21: 413-435, 2020 08 31.
Article in English | MEDLINE | ID: mdl-32873077

ABSTRACT

Disease classification, or nosology, was historically driven by careful examination of clinical features of patients. As technologies to measure and understand human phenotypes advanced, so too did classifications of disease, and the advent of genetic data has led to a surge in genetic subtyping in the past decades. Although the fundamental process of refining disease definitions and subtypes is shared across diverse fields, each field is driven by its own goals and technological expertise, leading to inconsistent and conflicting definitions of disease subtypes. Here, we review several classical and recent subtypes and subtyping approaches and provide concrete definitions to delineate subtypes. In particular, we focus on subtypes with distinct causal disease biology, which are of primary interest to scientists, and subtypes with pragmatic medical benefits, which are of primary interest to physicians. We propose genetic heterogeneity as a gold standard for establishing biologically distinct subtypes of complex polygenic disease. We focus especially on methods to find and validate genetic subtypes, emphasizing common pitfalls and how to avoid them.


Subjects
Biomarkers/analysis; Genetic Diseases, Inborn/genetics; Genetic Predisposition to Disease; Multifactorial Inheritance; Mutation; Neoplasms/genetics; Gene Expression Regulation, Neoplastic; Genetic Association Studies; Genetic Diseases, Inborn/classification; Genetic Diseases, Inborn/pathology; Humans; Neoplasms/classification; Neoplasms/pathology
16.
PLoS Genet ; 16(8): e1008927, 2020 08.
Article in English | MEDLINE | ID: mdl-32797036

ABSTRACT

The genetic control of gene expression is a core component of human physiology. For the past several years, transcriptome-wide association studies have leveraged large datasets of linked genotype and RNA sequencing information to create a powerful gene-based test of association that has been used in dozens of studies. While numerous discoveries have been made, the populations in the training data are overwhelmingly of European descent, and little is known about the generalizability of these models to other populations. Here, we test for cross-population generalizability of gene expression prediction models using a dataset of African American individuals with RNA-Seq data in whole blood. We find that the default models trained in large datasets such as GTEx and DGN fare poorly in African Americans, with a notable reduction in prediction accuracy when compared to European Americans. We replicate these limitations in cross-population generalizability using the five populations in the GEUVADIS dataset. Via realistic simulations of both populations and gene expression, we show that accurate cross-population generalizability of transcriptome prediction only arises when eQTL architecture is substantially shared across populations. In contrast, models with non-identical eQTLs showed patterns similar to real-world data. Therefore, generating RNA-Seq data in diverse populations is a critical step towards multi-ethnic utility of gene expression prediction.


Subjects
African Americans/genetics; Genome-Wide Association Study/methods; Models, Genetic; Transcriptome; Gene Expression Profiling/methods; Gene Expression Profiling/standards; Genome-Wide Association Study/standards; Humans; Quantitative Trait Loci; RNA-Seq/methods; RNA-Seq/standards; Reference Standards
17.
Genome Biol ; 21(1): 211, 2020 08 24.
Article in English | MEDLINE | ID: mdl-32831138

ABSTRACT

The observation that disease-associated genetic variants typically reside outside of exons has inspired widespread investigation into the genetic basis of transcriptional regulation. While associations between the mRNA abundance of a gene and its proximal SNPs (cis-eQTLs) are now readily identified, identification of high-quality distal associations (trans-eQTLs) has been limited by a heavy multiple testing burden and susceptibility to false-positive signals. To address these issues, we develop GBAT, a powerful gene-based pipeline that allows robust detection of high-quality trans-gene regulation signals.


Subjects
Gene Expression Regulation; Genetic Testing/methods; Genome-Wide Association Study; Gene Expression Profiling; Gene Regulatory Networks; Genotype; Humans; Polymorphism, Single Nucleotide; RNA, Messenger
18.
Nat Commun ; 11(1): 3126, 2020 06 19.
Article in English | MEDLINE | ID: mdl-32561710

ABSTRACT

Profiling immunoglobulin (Ig) receptor repertoires with specialized assays can be cost-ineffective and time-consuming. Here we report ImReP, a computational method for rapid and accurate profiling of the Ig repertoire, including the complementarity-determining region 3 (CDR3), using regular RNA sequencing data such as those from 8,555 samples across 53 tissue types from 544 individuals in the Genotype-Tissue Expression (GTEx v6) project. Using ImReP and GTEx v6 data, we generate a collection of 3.6 million Ig sequences, termed the atlas of immunoglobulin repertoires (TAIR), across a broad range of tissue types that often do not have reported Ig repertoire information. Moreover, the flow of Ig clonotypes and inter-tissue repertoire similarities across immune-related tissues are also evaluated. In summary, TAIR is one of the largest collections of CDR3 sequences and tissue types, and should serve as an important resource for studying immunological diseases.


Subjects
Complementarity Determining Regions/genetics; Computational Biology/methods; RNA-Seq; Datasets as Topic; Feasibility Studies; Humans; Receptors, Antigen, B-Cell/genetics
19.
Cell Rep ; 31(1): 107489, 2020 04 07.
Article in English | MEDLINE | ID: mdl-32268104

ABSTRACT

Gene expression levels vary across developmental stage, cell type, and region in the brain. Genomic variants also contribute to the variation in expression, and some neuropsychiatric disorder loci may exert their effects through this mechanism. To investigate these relationships, we present BrainVar, a unique resource of paired whole-genome and bulk tissue RNA sequencing from the dorsolateral prefrontal cortex of 176 individuals across prenatal and postnatal development. Here we identify common variants that alter gene expression (expression quantitative trait loci [eQTLs]) constantly across development or predominantly during prenatal or postnatal stages. Both "constant" and "temporal-predominant" eQTLs are enriched for loci associated with neuropsychiatric traits and disorders and colocalize with specific variants. Expression levels of more than 12,000 genes rise or fall in a concerted late-fetal transition, with the transitional genes enriched for cell-type-specific genes and neuropsychiatric risk loci, underscoring the importance of cataloging developmental trajectories in understanding cortical physiology and pathology.


Subjects
Brain/embryology; Computational Biology/methods; Prefrontal Cortex/metabolism; Base Sequence/genetics; Brain/growth & development; Brain/metabolism; Databases, Genetic; Genetic Predisposition to Disease/genetics; Genetic Variation/genetics; Genome-Wide Association Study/methods; Genomics/methods; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Sequence Analysis, RNA/methods; Transcriptome/genetics; Whole Exome Sequencing/methods; Whole Genome Sequencing/methods
20.
J Comput Biol ; 27(4): 599-612, 2020 04.
Article in English | MEDLINE | ID: mdl-32077750

ABSTRACT

Large-scale cohorts with combined genetic and phenotypic data, coupled with methodological advances, have produced increasingly accurate genetic predictors of complex human phenotypes called polygenic risk scores (PRSs). In addition to the potential translational impacts of identifying at-risk individuals, PRS are being utilized for a growing list of scientific applications, including causal inference, identifying pleiotropy and genetic correlation, and powerful gene-based and mixed-model association tests. Existing PRS approaches rely on external large-scale genetic cohorts that have also measured the phenotype of interest. They further require matching on ancestry and genotyping platform or imputation quality. In this work, we present a novel reference-free method to produce a PRS that does not rely on an external cohort. We show that naive implementations of reference-free PRS either result in substantial overfitting or prohibitive increases in computational time. We show that our algorithm avoids both of these issues and can produce informative in-sample PRSs over a single cohort without overfitting. We then demonstrate several novel applications of reference-free PRSs, including detection of pleiotropy across 246 metabolic traits and efficient mixed-model association testing.


Subjects
Genetic Predisposition to Disease; Genome-Wide Association Study/statistics & numerical data; Multifactorial Inheritance/genetics; Humans; Linear Models; Phenotype; Risk Factors
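
For readers unfamiliar with the term used in the record above, a polygenic risk score is, in its most basic form, a weighted sum of an individual's allele dosages, with one weight (effect size) per variant. The sketch below shows only that generic definition. It does not implement the reference-free method of the abstract; the function name, variant IDs, and weights are hypothetical.

```python
def polygenic_score(weights, dosages):
    """Weighted sum of allele dosages over the variants that have a weight.

    weights: dict mapping variant ID -> per-allele effect size.
    dosages: dict mapping variant ID -> effect-allele count (0, 1 or 2, or an
             imputed value in between). Variants missing from `dosages` are
             skipped, which is a simplifying assumption of this sketch.
    """
    return sum(beta * dosages[snp] for snp, beta in weights.items() if snp in dosages)


if __name__ == "__main__":
    # Hypothetical variant IDs and effect sizes, for illustration only.
    weights = {"snp_1": 0.2, "snp_2": -0.1, "snp_3": 0.05}
    person = {"snp_1": 2, "snp_2": 1}  # snp_3 was not genotyped
    print(polygenic_score(weights, person))
```

In practice the weights come from an association analysis, and where they are estimated (an external cohort or the same sample) is exactly the question the abstract addresses.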