ABSTRACT
Primary open-angle glaucoma (POAG), the leading cause of irreversible blindness worldwide, disproportionately affects individuals of African ancestry. We conducted a genome-wide association study (GWAS) for POAG in 11,275 individuals of African ancestry (6,003 cases; 5,272 controls). We detected 46 risk loci associated with POAG at genome-wide significance. Replication and post-GWAS analyses, including functionally informed fine-mapping, multiple trait co-localization, and in silico validation, implicated two previously undescribed variants (rs1666698 mapping to DBF4P2; rs34957764 mapping to ROCK1P1) and one previously associated variant (rs11824032 mapping to ARHGEF12) as likely causal. For individuals of African ancestry, a polygenic risk score (PRS) for POAG from our mega-analysis (African ancestry individuals) outperformed a PRS from summary statistics of a much larger GWAS derived from European ancestry individuals. This study quantifies the genetic architecture similarities and differences between African and non-African ancestry populations for this blinding disease.
Subjects
Genome-Wide Association Study, Open-Angle Glaucoma, Humans, Genetic Predisposition to Disease, Open-Angle Glaucoma/genetics, Black People/genetics, Single Nucleotide Polymorphism/genetics
ABSTRACT
Codon-dependent translation underlies genetics and phylogenetic inferences, but its origins pose two challenges. Prevailing narratives cannot account for the fact that aminoacyl-tRNA synthetases (aaRSs), which translate the genetic code, must collectively enforce the rules used to assemble themselves. Nor can they explain how specific assignments arose from rudimentary differentiation between ancestral aaRSs and corresponding transfer RNAs (tRNAs). Experimental deconstruction of the two aaRS superfamilies created new experimental tools with which to analyze the emergence of the code. Amino acid and tRNA substrate recognition are linked to phase transfer free energies of amino acids and arise largely from aaRS class-specific differences in secondary structure. Sensitivity to protein folding rules endowed ancestral aaRS-tRNA pairs with the feedback necessary to rapidly compare alternative genetic codes and coding sequences. These and other experimental data suggest that the aaRS bidirectional genetic ancestry stabilized the differentiation and interdependence required to initiate and elaborate the genetic coding table.
Subjects
Aminoacyl-tRNA Synthetases/genetics, Aminoacyl-tRNA Synthetases/metabolism, Molecular Evolution, Genetic Code, Genetic Selection, Amino Acids/metabolism, Aminoacyl-tRNA Synthetases/chemistry, Catalysis, Genotype, Phenotype, Phylogeny, Protein Biosynthesis, Protein Folding, Protein Secondary Structure, Transfer RNA/genetics, Thermodynamics
ABSTRACT
Human cell lines (CLs) are key assets for biomedicine but lack ancestral diversity. Here, we explore why genetic diversity among cell-based models is essential for making preclinical research more equitable and widely translatable. We lay out practical actions that can be taken to improve inclusivity in study design.
Subjects
Biomedical Research/ethics, Black or African American/genetics, Cell Line, Precision Medicine/ethics, White People/genetics, Genetic Variation, Humans, Pharmacogenomic Testing
ABSTRACT
Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.
Subjects
Ethnicity/genetics, Population Health, Genetic Databases, Electronic Health Records, Genomics, Humans, Self Report
ABSTRACT
Past human genetic diversity and migration between southern China and Southeast Asia have not been well characterized, in part due to poor preservation of ancient DNA in hot and humid regions. We sequenced 31 ancient genomes from southern China (Guangxi and Fujian), including two ~12,000- to 10,000-year-old individuals representing the oldest humans sequenced from southern China. We discovered a deeply diverged East Asian ancestry in the Guangxi region that persisted until at least 6,000 years ago. We found that ~9,000- to 6,000-year-old Guangxi populations were a mixture of local ancestry, southern ancestry previously sampled in Fujian, and deep Asian ancestry related to Southeast Asian Hòabìnhian hunter-gatherers, showing broad admixture in the region predating the appearance of farming. Historical Guangxi populations dating to ~1,500 to 500 years ago are closely related to Tai-Kadai and Hmong-Mien speakers. Our results show heavy interactions among three distinct ancestries at the crossroads of East and Southeast Asia.
Subjects
Population Genetics, Southeast Asia, East Asia, Geography, Humans
ABSTRACT
We report genome-wide DNA data for 73 individuals from five archaeological sites across the Bronze and Iron Age Southern Levant. These individuals, who share the "Canaanite" material culture, can be modeled as descending from two sources: (1) earlier local Neolithic populations and (2) populations related to the Chalcolithic Zagros or the Bronze Age Caucasus. The non-local contribution increased over time, as evinced by three outliers who can be modeled as descendants of recent migrants. We show evidence that different "Canaanite" groups genetically resemble each other more than they resemble other populations. We find that Levant-related modern populations typically have substantial ancestry coming from populations related to the Chalcolithic Zagros and the Bronze Age Southern Levant. These groups also harbor ancestry from sources we cannot fully model with the available data, highlighting the critical role of post-Bronze-Age migrations into the region over the past 3,000 years.
Subjects
Ancient DNA/analysis, Ethnicity/genetics, Gene Flow/genetics, Archaeology/methods, Mitochondrial DNA/genetics, Ethnicity/history, Gene Flow/physiology, Genetic Variation/genetics, Population Genetics/methods, Human Genome/genetics, Genomics/methods, Haplotypes, Ancient History, Human Migration/history, Humans, Mediterranean Region, Middle East, DNA Sequence Analysis
ABSTRACT
Genome-wide association studies (GWASs) have focused primarily on populations of European descent, but it is essential that diverse populations become better represented. Increasing diversity among study participants will advance our understanding of genetic architecture in all populations and ensure that genetic research is broadly applicable. To facilitate and promote research in multi-ancestry and admixed cohorts, we outline key methodological considerations and highlight opportunities, challenges, solutions, and areas in need of development. Despite the perception that analyzing genetic data from diverse populations is difficult, it is scientifically and ethically imperative, and there is an expanding analytical toolbox to do it well.
Subjects
Genome-Wide Association Study/methods, Genotyping Techniques/methods, Human Genetics/methods, Data Accuracy, Genetic Variation, Population Genetics/methods, Population Genetics/standards, Genome-Wide Association Study/standards, Genotyping Techniques/standards, Human Genetics/standards, Humans, Pedigree
ABSTRACT
Genome sequences are known for two archaic hominins, Neanderthals and Denisovans, which interbred with anatomically modern humans as they dispersed out of Africa. We identified high-confidence archaic haplotypes in 161 new genomes spanning 14 island groups in Island Southeast Asia and New Guinea and found large stretches of DNA that are inconsistent with a single introgressing Denisovan origin. Instead, modern Papuans carry hundreds of gene variants from two deeply divergent Denisovan lineages that separated over 350 thousand years ago. Spatial and temporal structure among these lineages suggests that introgression from one of these Denisovan groups predominantly took place east of the Wallace Line and continued until near the end of the Pleistocene. A third Denisovan lineage occurs in modern East Asians. This regional mosaic suggests considerable complexity in archaic contact, with modern humans interbreeding with multiple Denisovan groups that were geographically isolated from each other over deep evolutionary time.
Subjects
Genetic Introgression/genetics, Haplotypes/genetics, Hominidae/genetics, Animals, Asian People/genetics, Biological Evolution, Gene Flow, Genetic Variation/genetics, Human Genome/genetics, Humans, Indonesia, Neanderthals/genetics, Oceania
ABSTRACT
Lennon et al. recently proposed a clinical polygenic score (PGS) pipeline as part of the Electronic Medical Records and Genomics (eMERGE) network initiative. In this spotlight article we discuss the broader context for the use of PGS in preventive medicine and highlight key limitations and challenges facing their inclusion in prediction models.
Subjects
Multifactorial Inheritance, Multifactorial Inheritance/genetics, Humans, Genomics, Genetic Predisposition to Disease, Genome-Wide Association Study, Electronic Health Records, Preventive Medicine
ABSTRACT
We present shaPRS, a method that leverages widespread pleiotropy between traits or shared genetic effects across ancestries, to improve the accuracy of polygenic scores. The method uses genome-wide summary statistics from two diseases or ancestries to improve the genetic effect estimate and standard error at SNPs where there is homogeneity of effect between the two datasets. When there is significant evidence of heterogeneity, the genetic effect from the disease or population closest to the target population is maintained. We show via simulation and a series of real-world examples that shaPRS substantially enhances the accuracy of polygenic risk scores (PRSs) for complex diseases and greatly improves PRS performance across ancestries. shaPRS is a PRS pre-processing method that is agnostic to the actual PRS generation method, and as a result, it can be integrated into existing PRS generation pipelines and continue to be applied as more performant PRS methods are developed over time.
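The idea of borrowing strength only where two datasets agree can be illustrated with a deliberately simplified sketch. It is not the shaPRS implementation, whose weighting scheme the abstract does not detail. It assumes two independent sets of summary statistics, a two-study Cochran's Q heterogeneity test, and a hard significance cutoff; the function name `blend_effect` and the `alpha` threshold are hypothetical.

```python
import math

def blend_effect(beta_target, se_target, beta_other, se_other, alpha=0.01):
    """Return (beta, se) for one SNP.

    Assumption: the two estimates come from independent samples.
    If a heterogeneity test rejects at `alpha`, keep the target estimate;
    otherwise combine both with inverse-variance weights.
    """
    # Two-study Cochran's Q statistic, chi-square with 1 degree of freedom.
    q = (beta_target - beta_other) ** 2 / (se_target ** 2 + se_other ** 2)
    # Survival function of a 1-df chi-square via the complementary error function.
    p_het = math.erfc(math.sqrt(q / 2.0))
    if p_het < alpha:
        return beta_target, se_target
    w_t = 1.0 / se_target ** 2
    w_o = 1.0 / se_other ** 2
    beta = (w_t * beta_target + w_o * beta_other) / (w_t + w_o)
    se = math.sqrt(1.0 / (w_t + w_o))
    return beta, se

# Similar effects: the estimates are pooled and the standard error shrinks.
homogeneous = blend_effect(0.10, 0.05, 0.12, 0.05)
# Very different effects: the target-population estimate is kept unchanged.
heterogeneous = blend_effect(0.10, 0.05, 0.60, 0.05)
```

A hard cutoff is used here only to keep the example short; a smooth weighting between the two estimates would avoid the discontinuity at the threshold.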
Subjects
Genetic Predisposition to Disease, Genome-Wide Association Study, Multifactorial Inheritance, Single Nucleotide Polymorphism, Multifactorial Inheritance/genetics, Humans, Genetic Models, Computer Simulation, Genetic Pleiotropy, Phenotype
ABSTRACT
Large-scale, multi-ethnic whole-genome sequencing (WGS) studies, such as the National Human Genome Research Institute Genome Sequencing Program's Centers for Common Disease Genomics (CCDG), play an important role in increasing diversity for genetic research. Before performing association analyses, assessing Hardy-Weinberg equilibrium (HWE) is a crucial step in quality control procedures to remove low quality variants and ensure valid downstream analyses. Diverse WGS studies contain ancestrally heterogeneous samples; however, commonly used HWE methods assume that the samples are homogeneous. Therefore, directly applying these to the whole dataset can yield statistically invalid results. To account for this heterogeneity, HWE can be tested on subsets of samples that have genetically homogeneous ancestries and the results aggregated at each variant. To facilitate valid HWE subset testing, we developed a semi-supervised learning approach that predicts homogeneous ancestries based on the genotype. This method provides a convenient tool for estimating HWE in the presence of population structure and missing self-reported race and ethnicities in diverse WGS studies. In addition, assessing HWE within the homogeneous ancestries provides reliable HWE estimates that will directly benefit downstream analyses, including association analyses in WGS studies. We applied our proposed method on the CCDG dataset, predicting homogeneous genetic ancestry groups for 60,545 multi-ethnic WGS samples to assess HWE within each group.
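Why pooled HWE testing fails under population structure can be shown with a small sketch. It assumes a standard one-degree-of-freedom chi-square HWE test on genotype counts at a single biallelic variant; the group labels and counts are made up for illustration, and the ancestry-prediction step described in the abstract is not reproduced here.

```python
import math

def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) HWE p-value from genotype counts at one biallelic variant."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic variant: the test is not informative
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Survival function of a 1-df chi-square via the complementary error function.
    return math.erfc(math.sqrt(chi2 / 2.0))

def hwe_by_group(counts_by_group):
    """Test HWE separately within each (assumed homogeneous) ancestry group."""
    return {group: hwe_chisq_pvalue(*counts) for group, counts in counts_by_group.items()}

# Hypothetical counts: each group is in HWE, but allele frequencies differ (0.9 vs 0.1).
groups = {"group_1": (810, 180, 10), "group_2": (10, 180, 810)}
per_group = hwe_by_group(groups)
pooled = hwe_chisq_pvalue(*(sum(c) for c in zip(*groups.values())))
```

In this toy example both within-group p-values are close to 1, while the pooled test strongly rejects HWE because mixing the groups produces a deficit of heterozygotes. How the per-group results are aggregated at each variant is left open, since the abstract does not specify it.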
Subjects
Supervised Machine Learning, Whole Genome Sequencing, Humans, Whole Genome Sequencing/methods, Human Genome, Population Genetics/methods, Ethnicity/genetics, Genome-Wide Association Study/methods, Single Nucleotide Polymorphism, Genotype
ABSTRACT
Tumor mutational burden (TMB), the total number of somatic mutations in the tumor, and copy number burden (CNB), the corresponding measure of aneuploidy, are established fundamental somatic features and emerging biomarkers for immunotherapy. However, the genetic and non-genetic influences on TMB/CNB and, critically, the manner by which they influence patient outcomes remain poorly understood. Here, we present a large germline-somatic study of TMB/CNB with >23,000 individuals across 17 cancer types, of which 12,000 also have extensive clinical, treatment, and overall survival (OS) measurements available. We report dozens of clinical associations with TMB/CNB, observing older age and male sex to have a strong effect on TMB and weaker impact on CNB. We additionally identified significant germline influences on TMB/CNB, including fine-scale European ancestry and germline polygenic risk scores (PRSs) for smoking, tanning, white blood cell counts, and educational attainment. We quantify the causal effect of exposures on somatic mutational processes using Mendelian randomization. Many of the identified features associated with TMB/CNB were additionally associated with OS for individuals treated at a single tertiary cancer center. For individuals receiving immunotherapy, we observed a complex relationship between PRSs for educational attainment, self-reported college attainment, TMB, and survival, suggesting that the influence of this biomarker may be substantially modified by socioeconomic status. While the accumulation of somatic alterations is a stochastic process, our work demonstrates that it can be shaped by host characteristics including germline genetics.
Subjects
Neoplasms, Humans, Male, Mutation/genetics, Neoplasms/genetics, Neoplasms/pathology, Immunotherapy, Tumor Biomarkers/genetics, Germ Cells/pathology
ABSTRACT
Genome analysis of individuals affected by retinitis pigmentosa (RP) identified two rare nucleotide substitutions at the same genomic location on chromosome 11 (g.61392563 [GRCh38]), 69 base pairs upstream of the start codon of the ciliopathy gene TMEM216 (c.-69G>A, c.-69G>T [GenBank: NM_001173991.3]), in individuals of South Asian and African ancestry, respectively. Genotypes included 71 homozygotes and 3 mixed heterozygotes in trans with a predicted loss-of-function allele. Haplotype analysis showed single-nucleotide variants (SNVs) common across families, suggesting ancestral alleles within the two distinct ethnic populations. Clinical phenotype analysis of 62 available individuals from 49 families indicated a similar clinical presentation with night blindness in the first decade and progressive peripheral field loss thereafter. No evident systemic ciliopathy features were noted. Functional characterization of these variants by luciferase reporter gene assay showed reduced promoter activity. Nanopore sequencing confirmed the lower transcription of the TMEM216 c.-69G>T allele in blood-derived RNA from a heterozygous carrier, and reduced expression was further recapitulated by qPCR, using both leukocyte-derived RNA of c.-69G>T homozygotes and total RNA from genome-edited hTERT-RPE1 cells carrying homozygous TMEM216 c.-69G>A. In conclusion, these variants explain a significant proportion of unsolved cases, specifically in individuals of African ancestry, suggesting that reduced TMEM216 expression might lead to abnormal ciliogenesis and photoreceptor degeneration.
Subjects
Pedigree, Single Nucleotide Polymorphism, Retinitis Pigmentosa, Adult, Child, Preschool Child, Female, Humans, Male, Young Adult, Alleles, Haplotypes, Heterozygote, Homozygote, Membrane Proteins/genetics, Phenotype, Retinitis Pigmentosa/genetics, Retinitis Pigmentosa/pathology
ABSTRACT
Human humoral immune responses to SARS-CoV-2 vaccines exhibit substantial inter-individual variability and have been linked to vaccine efficacy. To elucidate the underlying mechanism behind this variability, we conducted a genome-wide association study (GWAS) on the anti-spike IgG serostatus of UK Biobank participants who were previously uninfected by SARS-CoV-2 and had received either the first dose (n = 54,066) or the second dose (n = 46,232) of COVID-19 vaccines. Our analysis revealed significant genome-wide associations between the IgG antibody serostatus following the initial vaccine and human leukocyte antigen (HLA) class II alleles. Specifically, the HLA-DRB1*13:02 allele (MAF = 4.0%, OR = 0.75, p = 2.34e-16) demonstrated the most statistically significant protective effect against IgG seronegativity. This protective effect was driven by an alteration from arginine (Arg) to glutamic acid (Glu) at position 71 on HLA-DRβ1 (p = 1.88e-25), leading to a change in the electrostatic potential of pocket 4 of the peptide binding groove. Notably, the impact of HLA alleles on IgG responses was cell type specific, and we observed a shared genetic predisposition between IgG status and susceptibility/severity of COVID-19. These results were replicated within independent cohorts where IgG serostatus was assayed by two different antibody serology tests. Our findings provide insights into the biological mechanism underlying individual variation in responses to COVID-19 vaccines and highlight the need to consider the influence of constitutive genetics when designing vaccination strategies for optimizing protection and control of infectious disease across diverse populations.
Subjects
COVID-19, Immunoglobulin G, Humans, Antibody Formation/genetics, COVID-19 Vaccines, Genome-Wide Association Study, COVID-19/genetics, COVID-19/prevention & control, SARS-CoV-2, Vaccination
ABSTRACT
The Merovingian period (5th to 8th cc AD) was a time of demographic, socioeconomic, cultural, and political realignment in Western Europe. Here, we report the whole-genome shotgun sequence data of 30 human skeletal remains from a coastal Late Merovingian site of Koksijde (675 to 750 AD), alongside 18 remains from two Early to Late Medieval sites in present-day Flanders, Belgium. We find two distinct ancestries, one shared with Early Medieval England and the Netherlands, and the other, a minor component, likely reflecting continental Gaulish ancestry. Kinship analyses identified no large pedigrees characteristic of elite burials, revealing instead a high modularity of distant relationships among individuals of the main ancestry group. In contrast, individuals with >90% Gaulish ancestry had no kinship links among sampled individuals. Evidence for population structure and major differences in the extent of Gaulish ancestry in the main group, including in a mother-daughter pair, suggests ongoing admixture in the community at the time of their burial. The isotopic and genetic evidence combined supports a model by which the burials, representing an established coastal nonelite community, had incorporated migrants from inland populations. The main group of burials at Koksijde shows an abundance of >5 cM long shared allelic intervals with the High Medieval site nearby, implying long-term continuity and suggesting that, as in Britain, the Early Medieval ancestry shifts left a significant and long-lasting impact on the genetic makeup of the Flemish population. We find substantial allele frequency differences between the two ancestry groups in pigmentation and diet-associated variants, including those linked with lactase persistence, likely reflecting ancestry change rather than local adaptation.
Subjects
Pedigree, Humans, Medieval History, Belgium, Burial/history, Population Genetics/methods, Female, Male, Ancient DNA/analysis, England, Human Migration, Archaeology, Netherlands, Human Genome
ABSTRACT
The genome of an individual from an admixed population consists of segments originating from different ancestral populations. Most existing ancestry inference approaches focus on calling these segments for the extant individual. In this paper, we present a general ancestry inference approach for inferring recent ancestors from an extant genome. Given the genome of an individual from a recently admixed population, our method can estimate the proportions of the genomes of the recent ancestors of this individual that originated from each ancestral population. The key step of our method is the inference of ancestors (called founders) right after the formation of an admixed population. The inferred founders can then be used to infer the ancestry of recent ancestors of an extant individual. Our method is implemented in a computer program called PedMix2. To the best of our knowledge, there is no existing method that can practically infer ancestors beyond grandparents from an extant individual's genome. Results on both simulated and real data show that PedMix2 performs well in ancestry inference.
Subjects
Population Genetics, Grandparents, Humans, Software, Human Genome/genetics
ABSTRACT
BACKGROUND: Expansion of genome-wide association studies across population groups is needed to improve our understanding of shared and unique genetic contributions to breast cancer. We performed association and replication studies guided by a priori linkage findings from African ancestry (AA) relative pairs. METHODS: We performed fixed-effect inverse-variance weighted meta-analysis under three significant AA breast cancer linkage peaks (3q26-27, 12q22-23, and 16q21-22) in 9,241 AA cases and 10,193 AA controls. We examined associations with overall breast cancer as well as estrogen receptor (ER)-positive and ER-negative subtypes (193,132 SNPs). We replicated associations in the African-ancestry Breast Cancer Genetic Consortium (AABCG). RESULTS: In AA women, we identified two associations on chr12q for overall breast cancer (rs1420647, OR = 1.15, p = 2.50×10⁻⁶; rs12322371, OR = 1.14, p = 3.15×10⁻⁶), and one for ER-negative breast cancer (rs77006600, OR = 1.67, p = 3.51×10⁻⁶). On chr3, we identified two associations with ER-negative disease (rs184090918, OR = 3.70, p = 1.23×10⁻⁵; rs76959804, OR = 3.57, p = 1.77×10⁻⁵) and on chr16q we identified an association with ER-negative disease (rs34147411, OR = 1.62, p = 8.82×10⁻⁶). In the replication study, the chr3 associations were significant and effect sizes were larger (rs184090918, OR: 6.66, 95% CI: 1.43, 31.01; rs76959804, OR: 5.24, 95% CI: 1.70, 16.16). CONCLUSION: The two chr3 SNPs are upstream of open chromatin ENSR00000710716, a regulatory feature that is actively regulated in mammary tissues, providing evidence that variants in this chr3 region may have a regulatory role in our target organ. Our study provides support for breast cancer variant discovery using prioritization based on linkage evidence.
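The fixed-effect inverse-variance weighted meta-analysis named in the METHODS can be sketched in a few lines. This is the textbook estimator, not the authors' pipeline; it assumes per-study log odds ratios and standard errors for one SNP, and the function name and input values are hypothetical.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Pool per-study effect estimates (e.g. log odds ratios) for one SNP.

    Each study is weighted by 1 / SE^2. Returns the pooled effect, its
    standard error, the z-score, and a two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return beta, se, z, p

# Hypothetical log odds ratios from two studies with equal precision.
beta, se, z, p = ivw_fixed_effect([0.1, 0.3], [0.1, 0.1])
pooled_or = math.exp(beta)  # back-transform to the odds-ratio scale
```

With equal standard errors the pooled estimate is the simple average of the two inputs, and the pooled standard error is smaller than either study's by a factor of the square root of two.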
Subjects
Black People, Breast Neoplasms, Genetic Predisposition to Disease, Female, Humans, Black People/genetics, Breast Neoplasms/genetics, Genome-Wide Association Study, Single Nucleotide Polymorphism
ABSTRACT
Early or late pubertal onset can lead to disease in adulthood, including cancer, obesity, type 2 diabetes, metabolic disorders, bone fractures, and psychopathologies. Thus, knowing the age at which puberty is attained is crucial as it can serve as a risk factor for future diseases. Pubertal development is divided into five stages of sexual maturation in boys and girls according to the standardized Tanner scale. We performed genome-wide association studies (GWAS) on the "Growth and Obesity Chilean Cohort Study" cohort composed of admixed children with mainly European and Native American ancestry. Using joint models that integrate time-to-event data with longitudinal trajectories of body mass index (BMI), we identified genetic variants associated with phenotypic transitions between pairs of Tanner stages. We identified 42 novel significant associations, most of them in boys. The GWAS on the Tanner 3→4 transition in boys captured an association peak around the growth-related genes LARS2 and LIMD1, the former of which causes ovarian dysfunction when mutated. The associated variants are expression and splicing quantitative trait loci regulating gene expression and alternative splicing in multiple tissues. Further, higher individual Native American genetic ancestry proportions predicted a significantly earlier puberty onset in boys but not in girls. Finally, the joint models identified a longitudinal BMI parameter significantly associated with several Tanner stage transitions, confirming the association of BMI with pubertal timing.
Subjects
Body Mass Index, Genome-Wide Association Study, Puberty, Humans, Male, Puberty/genetics, Female, Chile, Child, Adolescent, Single Nucleotide Polymorphism/genetics, Quantitative Trait Loci, Sexual Maturation/genetics, Cohort Studies, Obesity/genetics
ABSTRACT
Genetic ancestry is an important biological determinant of cancer health disparities that is not captured by self-identified race and ethnicity (SIRE). Belleau et al. recently developed a systematic computational approach to infer genetic ancestry from cancer-derived molecular data from different genomic and transcriptomic profiling assays, creating opportunities to interrogate population-scale data sets.
Subjects
Ethnicity, Neoplasms, Humans, Neoplasms/genetics, Genome, Genomics
ABSTRACT
Genetic data contain a record of our evolutionary history. The availability of large-scale datasets of human populations from various geographic areas and timescales, coupled with advances in the computational methods to analyze these data, has transformed our ability to use genetic data to learn about our evolutionary past. Here, we review some of the widely used statistical methods to explore and characterize population relationships and history using genomic data. We describe the intuition behind commonly used approaches, their interpretation, and important limitations. For illustration, we apply some of these techniques to genome-wide autosomal data from 929 individuals representing 53 worldwide populations that are part of the Human Genome Diversity Project. Finally, we discuss the new frontiers in genomic methods to learn about population history. In sum, this review highlights the power (and limitations) of DNA to infer features of human evolutionary history, complementing the knowledge gleaned from other disciplines, such as archaeology, anthropology, and linguistics.