Results 1 - 7 of 7
1.
Nature ; 625(7993): 92-100, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders (refs. 1-4), but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.


Subjects
Human Genome , Genomics , Genetic Models , Mutation , Humans , Access to Information , Genetic Databases , Datasets as Topic , Gene Frequency , Human Genome/genetics , Mutation/genetics , Genetic Selection
2.
Genome Res ; 34(5): 796-809, 2024 06 25.
Article in English | MEDLINE | ID: mdl-38749656

ABSTRACT

Underrepresented populations are often excluded from genomic studies owing in part to a lack of resources supporting their analyses. The 1000 Genomes Project (1kGP) and Human Genome Diversity Project (HGDP), which have recently been sequenced to high coverage, are valuable genomic resources because of the global diversity they capture and their open data sharing policies. Here, we harmonized a high-quality set of 4094 whole genomes from 80 populations in the HGDP and 1kGP with data from the Genome Aggregation Database (gnomAD) and identified over 153 million high-quality SNVs, indels, and SVs. We performed a detailed ancestry analysis of this cohort, characterizing population structure and patterns of admixture across populations, analyzing site frequency spectra, and measuring variant counts at global and subcontinental levels. We also show substantial added value from this data set compared with the prior versions of the component resources, typically combined via liftOver and variant intersection; for example, we catalog millions of new genetic variants, mostly rare, compared with previous releases. In addition to unrestricted individual-level public release, we provide detailed tutorials for conducting many of the most common quality-control steps and analyses with these data in a scalable cloud-computing environment and publicly release this new phased joint callset for use as a haplotype resource in phasing and imputation pipelines. This jointly called reference panel will serve as a key resource to support research of diverse ancestry populations.


Subjects
Genetic Databases , Human Genome , Humans , Human Genome Project , High-Throughput Nucleotide Sequencing/methods , Genetic Variation , Genomics/methods
3.
Am J Hum Genet ; 109(9): 1667-1679, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36055213

ABSTRACT

African populations are the most diverse in the world yet are sorely underrepresented in medical genetics research. Here, we examine the structure of African populations using genetic and comprehensive multi-generational ethnolinguistic data from the Neuropsychiatric Genetics of African Populations-Psychosis study (NeuroGAP-Psychosis) consisting of 900 individuals from Ethiopia, Kenya, South Africa, and Uganda. We find that self-reported language classifications meaningfully tag underlying genetic variation that would be missed with consideration of geography alone, highlighting the importance of culture in shaping genetic diversity. Leveraging our uniquely rich multi-generational ethnolinguistic metadata, we track language transmission through the pedigree, observing the disappearance of several languages in our cohort as well as notable shifts in frequency over three generations. We find suggestive evidence for the rate of language transmission in matrilineal groups having been higher than that for patrilineal ones. We highlight both the diversity of variation within Africa and how within-Africa variation can be informative for broader variant interpretation; many variants that are rare elsewhere are common in parts of Africa. The work presented here improves the understanding of the spectrum of genetic variation in African populations and highlights the enormous and complex genetic and ethnolinguistic diversity across Africa.


Subjects
Genetic Variation , Population Genetics , Southern Africa , Black People/genetics , Genetic Structures , Genetic Variation/genetics , Humans
5.
Psychol Med ; : 1-9, 2024 Sep 16.
Article in English | MEDLINE | ID: mdl-39282852

ABSTRACT

BACKGROUND: Major depressive disorder (MDD) is the leading cause of disability globally, with moderate heritability and well-established socio-environmental risk factors. Genetic studies have been mostly restricted to European settings, with polygenic scores (PGS) demonstrating low portability across diverse global populations. METHODS: This study examines genetic architecture, polygenic prediction, and socio-environmental correlates of MDD in a family-based sample of 10,032 individuals from Nepal with array genotyping data. We used genome-based restricted maximum likelihood to estimate heritability, applied S-LDXR to estimate the cross-ancestry genetic correlation between Nepalese and European samples, and modeled PGS trained on a GWAS meta-analysis of European and East Asian ancestry samples. RESULTS: We estimated the narrow-sense heritability of lifetime MDD in Nepal to be 0.26 (95% CI 0.18-0.34, p = 8.5 × 10^-6). Our analysis was underpowered to estimate the cross-ancestry genetic correlation (rg = 0.26, 95% CI -0.29 to 0.81). MDD risk was associated with higher age (beta = 0.071, 95% CI 0.06-0.08), female sex (beta = 0.160, 95% CI 0.15-0.17), and childhood exposure to potentially traumatic events (beta = 0.050, 95% CI 0.03-0.07), while neither the depression PGS (beta = 0.004, 95% CI -0.004 to 0.01) nor its interaction with childhood trauma (beta = 0.007, 95% CI -0.01 to 0.03) was strongly associated with MDD. CONCLUSIONS: Estimates of lifetime MDD heritability in this Nepalese sample were similar to previous European ancestry samples, but PGS trained on European data did not predict MDD in this sample. This may be due to differences in ancestry-linked causal variants, differences in depression phenotyping between the training and target data, or setting-specific environmental factors that modulate genetic effects. Additional research among under-represented global populations will ensure equitable translation of genomic findings.

6.
bioRxiv ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-36747613

ABSTRACT

Underrepresented populations are often excluded from genomic studies due in part to a lack of resources supporting their analyses. The 1000 Genomes Project (1kGP) and Human Genome Diversity Project (HGDP), which have recently been sequenced to high coverage, are valuable genomic resources because of the global diversity they capture and their open data sharing policies. Here, we harmonized a high quality set of 4,094 whole genomes from HGDP and 1kGP with data from the Genome Aggregation Database (gnomAD) and identified over 153 million high-quality SNVs, indels, and SVs. We performed a detailed ancestry analysis of this cohort, characterizing population structure and patterns of admixture across populations, analyzing site frequency spectra, and measuring variant counts at global and subcontinental levels. We also demonstrate substantial added value from this dataset compared to the prior versions of the component resources, typically combined via liftover and variant intersection; for example, we catalog millions of new genetic variants, mostly rare, compared to previous releases. In addition to unrestricted individual-level public release, we provide detailed tutorials for conducting many of the most common quality control steps and analyses with these data in a scalable cloud-computing environment and publicly release this new phased joint callset for use as a haplotype resource in phasing and imputation pipelines. This jointly called reference panel will serve as a key resource to support research of diverse ancestry populations.

7.
Microbiol Resour Announc ; 9(30)2020 Jul 23.
Article in English | MEDLINE | ID: mdl-32703830

ABSTRACT

Synechococcus bacteria are unicellular cyanobacteria that contribute significantly to global marine primary production. We report the nearly complete genome sequence of Synechococcus sp. strain MIT S9220, which lacks the nitrate utilization genes present in most marine Synechococcus genomes. Assembly also produced the complete genome sequence of a cyanophage present in the MIT S9220 culture.
