Results 1 - 20 of 233
1.
Cell ; 185(16): 3041-3055.e25, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35917817

ABSTRACT

Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.


Subjects
DNA Copy Number Variations, Human Genome, DNA Copy Number Variations/genetics, Gene Dosage, Haploinsufficiency/genetics, Humans
2.
Nature ; 625(7993): 92-100, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.


Subjects
Human Genome, Genomics, Genetic Models, Mutation, Humans, Access to Information, Genetic Databases, Datasets as Topic, Gene Frequency, Human Genome/genetics, Mutation/genetics, Genetic Selection
3.
Nature ; 614(7948): 492-499, 2023 02.
Article in English | MEDLINE | ID: mdl-36755099

ABSTRACT

Both common and rare genetic variants influence complex traits and common diseases. Genome-wide association studies have identified thousands of common-variant associations, and more recently, large-scale exome sequencing studies have identified rare-variant associations in hundreds of genes [1-3]. However, rare-variant genetic architecture is not well characterized, and the relationship between common-variant and rare-variant architecture is unclear [4]. Here we quantify the heritability explained by the gene-wise burden of rare coding variants across 22 common traits and diseases in 394,783 UK Biobank exomes [5]. Rare coding variants (allele frequency < 1 × 10⁻³) explain 1.3% (s.e. = 0.03%) of phenotypic variance on average, much less than common variants, and most burden heritability is explained by ultrarare loss-of-function variants (allele frequency < 1 × 10⁻⁵). Common and rare variants implicate the same cell types, with similar enrichments, and they have pleiotropic effects on the same pairs of traits, with similar genetic correlations. They partially colocalize at individual genes and loci, but not to the same extent: burden heritability is strongly concentrated in significant genes, while common-variant heritability is more polygenic, and burden heritability is also more strongly concentrated in constrained genes. Finally, we find that burden heritability for schizophrenia and bipolar disorder [6,7] is approximately 2%. Our results indicate that rare coding variants will implicate a tractable number of large-effect genes, that common and rare associations are mechanistically convergent, and that rare coding variants will contribute only modestly to missing heritability and population risk stratification.


Subjects
Exome, Gene Frequency, Genetic Variation, Multifactorial Inheritance, Humans, Exome/genetics, Genetic Variation/genetics, Genome-Wide Association Study, Multifactorial Inheritance/genetics, Risk Factors, United Kingdom, Genetic Loci/genetics, Schizophrenia/genetics, Bipolar Disorder/genetics
4.
Nature ; 620(7975): 839-848, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37587338

ABSTRACT

Mitochondrial DNA (mtDNA) is a maternally inherited, high-copy-number genome required for oxidative phosphorylation [1]. Heteroplasmy refers to the presence of a mixture of mtDNA alleles in an individual and has been associated with disease and ageing. Mechanisms underlying common variation in human heteroplasmy, and the influence of the nuclear genome on this variation, remain insufficiently explored. Here we quantify mtDNA copy number (mtCN) and heteroplasmy using blood-derived whole-genome sequences from 274,832 individuals and perform genome-wide association studies to identify associated nuclear loci. Following blood cell composition correction, we find that mtCN declines linearly with age and is associated with variants at 92 nuclear loci. We observe that nearly everyone harbours heteroplasmic mtDNA variants obeying two principles: (1) heteroplasmic single nucleotide variants tend to arise somatically and accumulate sharply after the age of 70 years, whereas (2) heteroplasmic indels are maternally inherited as mixtures with relative levels associated with 42 nuclear loci involved in mtDNA replication, maintenance and novel pathways. These loci may act by conferring a replicative advantage to certain mtDNA alleles. As an illustrative example, we identify a length variant carried by more than 50% of humans at position chrM:302 within a G-quadruplex previously proposed to mediate mtDNA transcription/replication switching [2,3]. We find that this variant exerts cis-acting genetic control over mtDNA abundance and is itself associated in-trans with nuclear loci encoding machinery for this regulatory switch. Our study suggests that common variation in the nuclear genome can shape variation in mtCN and heteroplasmy dynamics across the human population.


Subjects
Cell Nucleus, DNA Copy Number Variations, Mitochondrial DNA, Heteroplasmy, Mitochondria, Aged, Humans, DNA Copy Number Variations/genetics, Mitochondrial DNA/genetics, Genome-Wide Association Study, Heteroplasmy/genetics, Mitochondria/genetics, Cell Nucleus/genetics, Alleles, Single Nucleotide Polymorphism, INDEL Mutation, G-Quadruplexes
5.
Genome Res ; 34(5): 796-809, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38749656

ABSTRACT

Underrepresented populations are often excluded from genomic studies owing in part to a lack of resources supporting their analyses. The 1000 Genomes Project (1kGP) and Human Genome Diversity Project (HGDP), which have recently been sequenced to high coverage, are valuable genomic resources because of the global diversity they capture and their open data sharing policies. Here, we harmonized a high-quality set of 4094 whole genomes from 80 populations in the HGDP and 1kGP with data from the Genome Aggregation Database (gnomAD) and identified over 153 million high-quality SNVs, indels, and SVs. We performed a detailed ancestry analysis of this cohort, characterizing population structure and patterns of admixture across populations, analyzing site frequency spectra, and measuring variant counts at global and subcontinental levels. We also show substantial added value from this data set compared with the prior versions of the component resources, typically combined via liftOver and variant intersection; for example, we catalog millions of new genetic variants, mostly rare, compared with previous releases. In addition to unrestricted individual-level public release, we provide detailed tutorials for conducting many of the most common quality-control steps and analyses with these data in a scalable cloud-computing environment and publicly release this new phased joint callset for use as a haplotype resource in phasing and imputation pipelines. This jointly called reference panel will serve as a key resource to support research of diverse ancestry populations.


Subjects
Genetic Databases, Human Genome, Humans, Human Genome Project, High-Throughput Nucleotide Sequencing/methods, Genetic Variation, Genomics/methods
6.
Am J Hum Genet ; 110(12): 2068-2076, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38000370

ABSTRACT

DNA sample contamination is a major issue in clinical and research applications of whole-genome and -exome sequencing. Even modest levels of contamination can substantially affect the overall quality of variant calls and lead to widespread genotyping errors. Currently, popular tools for estimating the contamination level use short-read data (BAM/CRAM files), which are expensive to store and manipulate and often not retained or shared widely. We propose a metric to estimate DNA sample contamination from variant-level whole-genome and -exome sequence data called CHARR, contamination from homozygous alternate reference reads, which leverages the infiltration of reference reads within homozygous alternate variant calls. CHARR uses a small proportion of variant-level genotype information and thus can be computed from single-sample gVCFs or callsets in VCF or BCF formats, as well as efficiently stored variant calls in Hail VariantDataset format. Our results demonstrate that CHARR accurately recapitulates results from existing tools with substantially reduced costs, improving the accuracy and efficiency of downstream analyses of ultra-large whole-genome and exome sequencing datasets.


Subjects
DNA, Trout, Humans, Animals, DNA Sequence Analysis/methods, Genotype, Homozygote, High-Throughput Nucleotide Sequencing/methods, Software
7.
Genome Res ; 33(6): 999-1005, 2023 06.
Article in English | MEDLINE | ID: mdl-37253541

ABSTRACT

Large-scale high-throughput sequencing data sets have been transformative for informing clinical variant interpretation and for use as reference panels for statistical and population genetic efforts. Although such resources are often treated as ground truth, we find that in widely used reference data sets such as the Genome Aggregation Database (gnomAD), some variants pass gold-standard filters, yet are systematically different in their genotype calls across genotype discovery approaches. The inclusion of such discordant sites in study designs involving multiple genotype discovery strategies could bias results and lead to false-positive hits in association studies owing to technological artifacts rather than a true relationship to the phenotype. Here, we describe this phenomenon of discordant genotype calls across genotype discovery approaches, characterize the error mode of wrong calls, provide a list of discordant sites identified in gnomAD that should be treated with caution in analyses, and present a metric and machine learning classifier trained on gnomAD data to identify likely discordant variants in other data sets. We find that different genotype discovery approaches have different sets of variants at which this problem occurs, but there are characteristic variant features that can be used to predict discordant behavior. Discordant sites are largely shared across ancestry groups, although different populations are powered for the discovery of different variants. We find that the most common error mode is that of a variant being heterozygous for one approach and homozygous for the other, with heterozygous in the genomes and homozygous reference in the exomes making up the majority of miscalls.


Subjects
Exome, Population Genetics, Genotype, Heterozygote, Phenotype, Single Nucleotide Polymorphism
8.
Nature ; 583(7814): 83-89, 2020 07.
Article in English | MEDLINE | ID: mdl-32460305

ABSTRACT

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline [1] to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.


Subjects
Genetic Variation, Human Genome/genetics, Whole Genome Sequencing, Alleles, Case-Control Studies, Genetic Epigenesis, Female, Gene Dosage/genetics, Population Genetics, High-Throughput Nucleotide Sequencing, Humans, Male, Molecular Sequence Annotation, Quantitative Trait Loci, Racial Groups/genetics, Software
9.
Nature ; 586(7831): 769-775, 2020 10.
Article in English | MEDLINE | ID: mdl-33057200

ABSTRACT

Myeloproliferative neoplasms (MPNs) are blood cancers that are characterized by the excessive production of mature myeloid cells and arise from the acquisition of somatic driver mutations in haematopoietic stem cells (HSCs). Epidemiological studies indicate a substantial heritable component of MPNs that is among the highest known for cancers [1]. However, only a limited number of genetic risk loci have been identified, and the underlying biological mechanisms that lead to the acquisition of MPNs remain unclear. Here, by conducting a large-scale genome-wide association study (3,797 cases and 1,152,977 controls), we identify 17 MPN risk loci (P < 5.0 × 10⁻⁸), 7 of which have not been previously reported. We find that there is a shared genetic architecture between MPN risk and several haematopoietic traits from distinct lineages; that there is an enrichment for MPN risk variants within accessible chromatin of HSCs; and that increased MPN risk is associated with longer telomere length in leukocytes and other clonal haematopoietic states, collectively suggesting that MPN risk is associated with the function and self-renewal of HSCs. We use gene mapping to identify modulators of HSC biology linked to MPN risk, and show through targeted variant-to-function assays that CHEK2 and GFI1B have roles in altering the function of HSCs to confer disease risk. Overall, our results reveal a previously unappreciated mechanism for inherited MPN risk through the modulation of HSC function.


Subjects
Genetic Predisposition to Disease/genetics, Hematopoietic Stem Cells/pathology, Myeloproliferative Disorders/genetics, Myeloproliferative Disorders/pathology, Neoplasms/genetics, Neoplasms/pathology, Cell Lineage/genetics, Cell Self Renewal, Checkpoint Kinase 2/genetics, Female, Humans, Leukocytes/pathology, Male, Proto-Oncogene Proteins/genetics, Repressor Proteins/genetics, Risk, Telomere Homeostasis
10.
Nature ; 581(7809): 444-451, 2020 05.
Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.


Subjects
Disease/genetics, Genetic Variation, Medical Genetics/standards, Population Genetics/standards, Human Genome/genetics, Female, Genetic Testing, Genotyping Techniques, Humans, Male, Middle Aged, Mutation, Single Nucleotide Polymorphism/genetics, Racial Groups/genetics, Reference Standards, Genetic Selection, Whole Genome Sequencing
11.
Am J Hum Genet ; 109(12): 2110-2125, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36400022

ABSTRACT

The use of population descriptors such as race, ethnicity, and ancestry in science, medicine, and public health has a long, complicated, and at times dark history, particularly for genetics, given the field's perceived importance for understanding between-group differences. The historical and potential harms that come with irresponsible use of these categories suggest a clear need for definitive guidance about when and how they can be used appropriately. However, while many prior authors have provided such guidance, no established consensus exists, and the extant literature has not been examined for implied consensus and sources of disagreement. Here, we present the results of a scoping review of published normative recommendations regarding the use of population categories, particularly in genetics research. Following PRISMA guidelines, we extracted recommendations from n = 121 articles matching inclusion criteria. Articles were published consistently throughout the time period examined and in a broad range of journals, demonstrating an ongoing and interdisciplinary perceived need for guidance. Examined recommendations fall under one of eight themes identified during analysis. Seven are characterized by broad agreement across articles; one, "appropriate definitions of population categories and contexts for use," revealed substantial fundamental disagreement among articles. Additionally, while many articles focus on the inappropriate use of race, none fundamentally problematize ancestry. This work can be a resource to researchers looking for normative guidance on the use of population descriptors and can orient authors of future guidelines to this complex field, thereby contributing to the development of more effective future guidelines for genetics research.


Subjects
Ethnicity, Problem Behavior, Humans, Asian People, Consensus, Ethnicity/genetics, Research Personnel
12.
Am J Hum Genet ; 109(9): 1667-1679, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36055213

ABSTRACT

African populations are the most diverse in the world yet are sorely underrepresented in medical genetics research. Here, we examine the structure of African populations using genetic and comprehensive multi-generational ethnolinguistic data from the Neuropsychiatric Genetics of African Populations-Psychosis study (NeuroGAP-Psychosis) consisting of 900 individuals from Ethiopia, Kenya, South Africa, and Uganda. We find that self-reported language classifications meaningfully tag underlying genetic variation that would be missed with consideration of geography alone, highlighting the importance of culture in shaping genetic diversity. Leveraging our uniquely rich multi-generational ethnolinguistic metadata, we track language transmission through the pedigree, observing the disappearance of several languages in our cohort as well as notable shifts in frequency over three generations. We find suggestive evidence for the rate of language transmission in matrilineal groups having been higher than that for patrilineal ones. We highlight both the diversity of variation within Africa as well as how within-Africa variation can be informative for broader variant interpretation; many variants that are rare elsewhere are common in parts of Africa. The work presented here improves the understanding of the spectrum of genetic variation in African populations and highlights the enormous and complex genetic and ethnolinguistic diversity across Africa.


Subjects
Genetic Variation, Population Genetics, Southern Africa, Black People/genetics, Genetic Structures, Genetic Variation/genetics, Humans
15.
Am J Hum Genet ; 108(4): 656-668, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33770507

ABSTRACT

Genetic studies in underrepresented populations identify disproportionate numbers of novel associations. However, most genetic studies use genotyping arrays and sequenced reference panels that best capture variation most common in European ancestry populations. To compare data generation strategies best suited for underrepresented populations, we sequenced the whole genomes of 91 individuals to high coverage as part of the Neuropsychiatric Genetics of African Population-Psychosis (NeuroGAP-Psychosis) study with participants from Ethiopia, Kenya, South Africa, and Uganda. We used a downsampling approach to evaluate the quality of two cost-effective data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole-genome sequencing data. We show that low-coverage sequencing at a depth of ≥4× captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1×) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation; 4× sequencing detects 45% of singletons and 95% of common variants identified in high-coverage African whole genomes. Low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, effectively identify novel variation particularly in underrepresented populations, and present opportunities to enhance variant discovery at a cost similar to traditional approaches.


Subjects
DNA Mutational Analysis/economics, DNA Mutational Analysis/standards, Genetic Variation/genetics, Population Genetics/economics, Africa, DNA Mutational Analysis/methods, Population Genetics/methods, Human Genome/genetics, Genome-Wide Association Study, Health Equity, Humans, Microbiota, Whole Genome Sequencing/economics, Whole Genome Sequencing/standards
16.
N Engl J Med ; 385(1): 78-86, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34192436

ABSTRACT

Companies have recently begun to sell a new service to patients considering in vitro fertilization: embryo selection based on polygenic scores (ESPS). These scores represent individualized predictions of health and other outcomes derived from genomewide association studies in adults to partially predict these outcomes. This article includes a discussion of many factors that lower the predictive power of polygenic scores in the context of embryo selection and quantifies these effects for a variety of clinical and nonclinical traits. Also discussed are potential unintended consequences of ESPS (including selecting for adverse traits, altering population demographics, exacerbating inequalities in society, and devaluing certain traits). Recommendations for the responsible communication about ESPS by practitioners are provided, and a call for a society-wide conversation about this technology is made. (Funded by the National Institute on Aging and others.).


Subjects
Mammalian Embryo, In Vitro Fertilization, Genetic Testing, Genetic Variation, Multifactorial Inheritance/genetics, Phenotype, Preimplantation Diagnosis, Educational Status, Gene-Environment Interaction, Genome-Wide Association Study, Humans, Predictive Value of Tests
17.
Hum Mol Genet ; 30(16): 1521-1534, 2021 07 28.
Article in English | MEDLINE | ID: mdl-33987664

ABSTRACT

It is important to study the genetics of complex traits in diverse populations. Here, we introduce covariate-adjusted linkage disequilibrium (LD) score regression (cov-LDSC), a method to estimate SNP-heritability ($h_g^2$) and its enrichment in homogenous and admixed populations with summary statistics and in-sample LD estimates. In-sample LD can be estimated from a subset of the genome-wide association studies samples, allowing our method to be applied efficiently to very large cohorts. In simulations, we show that unadjusted LDSC underestimates $h_g^2$ by 10-60% in admixed populations; in contrast, cov-LDSC is robustly accurate. We apply cov-LDSC to genotyping data from 8124 individuals, mostly of admixed ancestry, from the Slim Initiative in Genomic Medicine for the Americas study, and to approximately 161 000 Latino-ancestry individuals, 47 000 African American-ancestry individuals and 135 000 European-ancestry individuals, as classified by 23andMe. We estimate $h_g^2$ and detect heritability enrichment in three quantitative and five dichotomous phenotypes, making this, to our knowledge, the most comprehensive heritability-based analysis of admixed individuals to date. Most traits have high concordance of $h_g^2$ and consistent tissue-specific heritability enrichment among different populations. However, for age at menarche, we observe population-specific heritability estimates of $h_g^2$. We observe consistent patterns of tissue-specific heritability enrichment across populations; for example, in the limbic system for BMI, the per-standardized-annotation effect size $\tau^*$ is 0.16 ± 0.04, 0.28 ± 0.11 and 0.18 ± 0.03 in the Latino-, African American- and European-ancestry populations, respectively. Our approach is a powerful way to analyze genetic data for complex traits from admixed populations.


Subjects
Population Genetics, Genome-Wide Association Study/statistics & numerical data, Linkage Disequilibrium/genetics, Multifactorial Inheritance/genetics, Genotyping Techniques/statistics & numerical data, Humans, Phenotype, Single Nucleotide Polymorphism/genetics, Heritable Quantitative Trait
18.
Am J Hum Genet ; 107(1): 46-59, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32470373

ABSTRACT

In complex trait genetics, the ability to predict phenotype from genotype is the ultimate measure of our understanding of genetic architecture underlying the heritability of a trait. A complete understanding of the genetic basis of a trait should allow for predictive methods with accuracies approaching the trait's heritability. The highly polygenic nature of quantitative traits and most common phenotypes has motivated the development of statistical strategies focused on combining myriad individually non-significant genetic effects. Now that predictive accuracies are improving, there is a growing interest in the practical utility of such methods for predicting risk of common diseases responsive to early therapeutic intervention. However, existing methods require individual-level genotypes or depend on accurately specifying the genetic architecture underlying each disease to be predicted. Here, we propose a polygenic risk prediction method that does not require explicitly modeling any underlying genetic architecture. We start with summary statistics in the form of SNP effect sizes from a large GWAS cohort. We then remove the correlation structure across summary statistics arising due to linkage disequilibrium and apply a piecewise linear interpolation on conditional mean effects. In both simulated and real datasets, this new non-parametric shrinkage (NPS) method can reliably allow for linkage disequilibrium in summary statistics of 5 million dense genome-wide markers and consistently improves prediction accuracy. We show that NPS improves the identification of groups at high risk for breast cancer, type 2 diabetes, inflammatory bowel disease, and coronary heart disease, all of which have available early intervention or prevention treatments.


Subjects
Multifactorial Inheritance/genetics, Aged, Cohort Studies, Type 2 Diabetes Mellitus/genetics, Female, Genome-Wide Association Study/methods, Genotype, Humans, Linkage Disequilibrium/genetics, Male, Middle Aged, Genetic Models, Phenotype, Single Nucleotide Polymorphism/genetics, Quantitative Trait Loci/genetics
19.
Annu Rev Neurosci ; 38: 47-68, 2015 Jul 08.
Article in English | MEDLINE | ID: mdl-25840007

ABSTRACT

Next-generation sequencing, which allows genome-wide detection of rare and de novo mutations, is transforming neuropsychiatric disease genetics through identifying on an unprecedented scale genes and protein-coding mutations that confer risk. Although understanding how regulatory variants influence risk remains a challenge, we are likely transitioning into a phase of neuropsychiatric disease genetics in which the rate-limiting step may no longer be gene discovery. Instead, the future will concentrate more on the biological and clinical translation of the torrent of specific risk mutations identified through next-generation sequencing. Here, we review the recent progress that resulted specifically from exome sequencing and emphasize the need for rigorous statistical evaluation of the expanding data sets, as well as expanded functional analysis of implicated proteins and mutations. Then, we introduce some of the expected opportunities and challenges investigators face when moving beyond the exome. Finally, we briefly highlight the challenge of deriving translational benefit from the progress in genetics.


Subjects
Exome/genetics, Genetic Predisposition to Disease/genetics, Mental Disorders/genetics, Nervous System Diseases/genetics, Genome-Wide Association Study/methods, High-Throughput Nucleotide Sequencing/methods, Humans, Mutation
20.
Perspect Biol Med ; 66(2): 225-248, 2023.
Article in English | MEDLINE | ID: mdl-37755714

ABSTRACT

A wide range of research uses patterns of genetic variation to infer genetic similarity between individuals, typically referred to as genetic ancestry. This research includes inference of human demographic history, understanding the genetic architecture of traits, and predicting disease risk. Researchers are not just structuring an intellectual inquiry when using genetic ancestry, they are also creating analytical frameworks with broader societal ramifications. This essay presents an ethics framework in the spirit of virtue ethics for these researchers: rather than focus on rule following, the framework is designed to build researchers' capacities to react to the ethical dimensions of their work. The authors identify one overarching principle of intellectual freedom and responsibility, noting that freedom in all its guises comes with responsibility, and they identify and define four principles that collectively uphold researchers' intellectual responsibility: truthfulness, justice and fairness, anti-racism, and public beneficence. Researchers should bring their practices into alignment with these principles, and to aid this, the authors name three common ways research practices infringe these principles, suggest a step-by-step process for aligning research choices with the principles, provide rules of thumb for achieving alignment, and give a worked case. The essay concludes by identifying support needed by researchers to act in accord with the proposed framework.
