ABSTRACT
During human evolution, the knee adapted to the biomechanical demands of bipedalism by altering chondrocyte developmental programs. This adaptive process was likely not without deleterious consequences to health. Today, osteoarthritis occurs in 250 million people, with risk variants enriched in non-coding sequences near chondrocyte genes, loci that likely became optimized during knee evolution. We explore this relationship by epigenetically profiling joint chondrocytes, revealing ancient selection and recent constraint and drift on knee regulatory elements, which also overlap osteoarthritis variants that contribute to disease heritability by tending to modify constrained functional sequence. We propose a model whereby genetic violations to regulatory constraint, tolerated during knee development, lead to adult pathology. In support, we discover a causal enhancer variant (rs6060369) present in billions of people at a risk locus (GDF5-UQCC1), showing how it impacts mouse knee-shape and osteoarthritis. Overall, our methods link an evolutionarily novel aspect of human anatomy to its pathogenesis.
Subject(s)
Chondrocytes/physiology , Knee Joint/physiology , Osteoarthritis/genetics , Animals , Biological Evolution , Chondrocytes/metabolism , Evolution, Molecular , Genetic Predisposition to Disease/genetics , Growth Differentiation Factor 5/genetics , Growth Differentiation Factor 5/metabolism , HEK293 Cells , Humans , Knee/physiology , Mice , NIH 3T3 Cells , Regulatory Sequences, Nucleic Acid/genetics , Risk Factors
ABSTRACT
Gene regulation in the human genome is controlled by distal enhancers that activate specific nearby promoters [1]. A proposed model for this specificity is that promoters have sequence-encoded preferences for certain enhancers, for example, mediated by interacting sets of transcription factors or cofactors [2]. This 'biochemical compatibility' model has been supported by observations at individual human promoters and by genome-wide measurements in Drosophila [3-9]. However, the degree to which human enhancers and promoters are intrinsically compatible has not yet been systematically measured, and how their activities combine to control RNA expression remains unclear. Here we designed a high-throughput reporter assay called enhancer × promoter self-transcribing active regulatory region sequencing (ExP STARR-seq) and applied it to examine the combinatorial compatibilities of 1,000 enhancer and 1,000 promoter sequences in human K562 cells. We identify simple rules for enhancer-promoter compatibility, whereby most enhancers activate all promoters by similar amounts, and intrinsic enhancer and promoter activities multiplicatively combine to determine RNA output (R² = 0.82). In addition, two classes of enhancers and promoters show subtle preferential effects. Promoters of housekeeping genes contain built-in activating motifs for factors such as GABPA and YY1, which decrease the responsiveness of promoters to distal enhancers. Promoters of variably expressed genes lack these motifs and show stronger responsiveness to enhancers. Together, this systematic assessment of enhancer-promoter compatibility suggests a multiplicative model tuned by enhancer and promoter class to control gene transcription in the human genome.
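The multiplicative rule described in this abstract can be illustrated with a small sketch. Under the assumption that reporter output for enhancer i and promoter j is roughly E_i × P_j, the model becomes additive in log space and can be fitted from row and column means of a complete enhancer × promoter matrix. The function name `fit_multiplicative` and the toy values are hypothetical and are not taken from the study; this is an illustration of the idea, not the authors' analysis pipeline.

```python
import math


def fit_multiplicative(expr):
    """Fit expr[i][j] ~ E[i] * P[j] on a complete matrix of positive values.

    In log space the model is additive (log E[i] + log P[j]), so the
    least-squares fit is row mean + column mean - grand mean.
    Returns (row_effects, col_effects, r_squared), all in log space.
    """
    logs = [[math.log(v) for v in row] for row in expr]
    n_enh, n_pro = len(logs), len(logs[0])
    grand = sum(sum(row) for row in logs) / (n_enh * n_pro)
    row_mean = [sum(row) / n_pro for row in logs]
    col_mean = [sum(logs[i][j] for i in range(n_enh)) / n_enh
                for j in range(n_pro)]

    ss_res = 0.0  # variation left unexplained by the additive log model
    ss_tot = 0.0  # total variation around the grand mean
    for i in range(n_enh):
        for j in range(n_pro):
            pred = row_mean[i] + col_mean[j] - grand
            ss_res += (logs[i][j] - pred) ** 2
            ss_tot += (logs[i][j] - grand) ** 2

    row_effects = [m - grand for m in row_mean]
    col_effects = [m - grand for m in col_mean]
    return row_effects, col_effects, 1.0 - ss_res / ss_tot


# Toy example (made-up numbers): 3 enhancers x 3 promoters.
enhancer_strength = (1.0, 2.0, 4.0)
promoter_strength = (0.5, 1.0, 3.0)
toy = [[e * p for p in promoter_strength] for e in enhancer_strength]
toy[0][0] *= 1.5  # one pair deviates from the purely multiplicative rule
_, _, r2 = fit_multiplicative(toy)
print(f"R^2 of the multiplicative fit (log space): {r2:.3f}")
```

An R² close to 1 in such a fit means that intrinsic enhancer and promoter strengths alone explain most of the output; class-specific preferences of the kind the abstract reports would show up as structured residuals.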
Subject(s)
Enhancer Elements, Genetic , Promoter Regions, Genetic , Enhancer Elements, Genetic/genetics , Humans , Promoter Regions, Genetic/genetics , RNA/biosynthesis , RNA/genetics , Transcription Factors/metabolism
ABSTRACT
Although some variation introgressed from Neanderthals has undergone selective sweeps, little is known about its functional significance. We used a Massively Parallel Reporter Assay (MPRA) to assay 5,353 high-frequency introgressed variants for their ability to modulate gene expression within 170 bp of endogenous sequence. We identified 2,548 variants in active putative cis-regulatory elements (CREs) and 292 expression-modulating variants (emVars). These emVars are predicted to alter the binding motifs of important immune transcription factors, are enriched for associations with neutrophil and white blood cell count, and are associated with the expression of genes that function in innate immune pathways including inflammatory response and antiviral defense. We combined the MPRA data with other data sets to identify strong candidates to be driver variants of positive selection, including an emVar that may contribute to protection against severe COVID-19 response. We endogenously deleted two CREs containing expression-modulating variants linked to immune function, rs11624425 and rs80317430, identifying their primary genic targets as ELMSAN1, and PAN2 and STAT2, respectively, three genes differentially expressed during influenza infection. Overall, we present the first database of experimentally identified expression-modulating Neanderthal-introgressed alleles contributing to potential immune response in modern humans.
Subject(s)
Genetic Variation , Genome, Human , Immunity, Innate/genetics , Neanderthals , Animals , Gene Expression , Humans , Inflammation , Neanderthals/genetics
ABSTRACT
High-coverage whole-genome sequence studies have so far focused on a limited number of geographically restricted populations, or been targeted at specific diseases, such as cancer. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history and refuelled the debate on the mutation rate in humans. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 high-coverage human genomes from 148 populations worldwide, including 379 new genomes from 125 populations, which we group into diversity and selection sets. We analyse this dataset to refine estimates of continent-wide patterns of heterozygosity, long- and short-distance gene flow, archaic admixture, and changes in effective population size through time as well as for signals of positive or balancing selection. We find a genetic signature in present-day Papuans that suggests that at least 2% of their genome originates from an early and largely extinct expansion of anatomically modern humans (AMHs) out of Africa. Together with evidence from the western Asian fossil record, and admixture between AMHs and Neanderthals predating the main Eurasian expansion, our results contribute to the mounting evidence for the presence of AMHs out of Africa earlier than 75,000 years ago.
Subject(s)
Genome, Human/genetics , Genomics , Human Migration/history , Racial Groups/genetics , Africa/ethnology , Animals , Asia , Datasets as Topic , Estonia , Europe , Fossils , Gene Flow , Genetics, Population , Heterozygote , History, Ancient , Humans , Native Hawaiian or Other Pacific Islander/genetics , Neanderthals/genetics , New Guinea , Population Dynamics
ABSTRACT
Variation in pelvic morphology has a complex genetic basis, and its patterning and specification are governed by conserved developmental pathways. Whether the mechanisms underlying the differentiation and specification of the pelvis also produce the morphological covariation on which natural selection may act is still an open question in evolutionary developmental biology. We use high-resolution quantitative trait locus (QTL) mapping in the F34 generation of an advanced intercross experiment (LG,SM-G34) to characterize the genetic architecture of the mouse pelvis. We test the prediction that genomic features linked to developmental patterning and differentiation of the hind limb and pelvis and the regulation of chondrogenesis are overrepresented in QTL. We find 31 single QTL trait associations at the genome- or chromosome-wise significance level coalescing to 27 pleiotropic loci. We recover further QTL at a more relaxed significance threshold replicating locations found in a previous experiment in an earlier generation of the same population. QTL were more likely than chance to harbor Pitx1 and Sox9 Class II chromatin immunoprecipitation-seq features active during development of skeletal features. There was weak or no support for the enrichment of seven more categories of developmental features drawn from the literature. Our results suggest that genotypic variation is channeled through a subset of developmental processes involved in the generation of phenotypic variation in the pelvis. This finding indicates that the evolvability of complex traits may be subject to biases not evident from patterns of covariance among morphological features or developmental patterning when either is considered in isolation.
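One generic way to ask whether QTL intervals harbor a class of genomic features "more often than chance" is a permutation test: relocate the intervals at random, recount the overlapping features, and compare with the observed count. The sketch below assumes a single linear chromosome, independent uniform placement of intervals, and invented coordinates. The function names are hypothetical, and the abstract does not state which enrichment procedure the authors used, so this should be read only as an illustration of the general idea.

```python
import random


def count_hits(intervals, features):
    """Count feature positions that fall inside at least one [start, end) interval."""
    return sum(any(start <= pos < end for start, end in intervals)
               for pos in features)


def enrichment_p(qtl, features, genome_len, n_perm=999, seed=0):
    """One-sided permutation p-value for feature enrichment within QTL intervals.

    Each permutation relocates every interval uniformly at random while
    keeping its length (assumption: one linear chromosome, no masking).
    """
    rng = random.Random(seed)
    observed = count_hits(qtl, features)
    as_extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in qtl:
            length = end - start
            new_start = rng.randrange(0, genome_len - length + 1)
            shuffled.append((new_start, new_start + length))
        if count_hits(shuffled, features) >= observed:
            as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (1 + as_extreme) / (1 + n_perm)


# Toy example (made-up coordinates): two 10 kb intervals on a 1 Mb chromosome.
toy_qtl = [(100_000, 110_000), (500_000, 510_000)]
toy_features = ([100_500 + 500 * k for k in range(10)]
                + [500_500 + 500 * k for k in range(10)]
                + [250_000, 750_000, 900_000])
obs, p = enrichment_p(toy_qtl, toy_features, genome_len=1_000_000)
print(f"features inside QTL: {obs}, permutation p-value: {p:.3f}")
```

A real analysis would additionally need to respect chromosome boundaries, marker density, and unmappable regions when building the null distribution.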
Subject(s)
Paired Box Transcription Factors/metabolism , Pelvis/growth & development , SOX9 Transcription Factor/metabolism , Animals , Biological Evolution , Gene Expression Regulation, Developmental , Genomics , Genotype , Mice , Paired Box Transcription Factors/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , SOX9 Transcription Factor/genetics
ABSTRACT
Recent studies have reported evidence suggesting that portions of contemporary human genomes introgressed from archaic hominin populations went to high frequencies due to positive selection. However, no study to date has specifically addressed the postintrogression population dynamics of these putative cases of adaptive introgression. Here, for the first time, we specifically define cases of immediate adaptive introgression (iAI) in which archaic haplotypes rose to high frequencies in humans as a result of a selective sweep that occurred shortly after the introgression event. We define these cases as distinct from instances of selection on standing introgressed variation (SI), in which an introgressed haplotype initially segregated neutrally and subsequently underwent positive selection. Using a geographically diverse data set, we report novel cases of selection on introgressed variation in living humans and shortlist among these cases those whose selective sweeps are more consistent with having been the product of iAI rather than SI. Many of these novel inferred iAI haplotypes have potential biological relevance, including three that contain immune-related genes in West Siberians, South Asians, and West Eurasians. Overall, our results suggest that iAI may not represent the full picture of positive selection on archaically introgressed haplotypes in humans and that more work needs to be done to analyze the role of SI in the archaic introgression landscape of living humans.
ABSTRACT
Enhancers are key drivers of gene regulation thought to act via 3D physical interactions with the promoters of their target genes. However, genome-wide depletions of architectural proteins such as cohesin result in only limited changes in gene expression, despite a loss of contact domains and loops. Consequently, the role of cohesin and 3D contacts in enhancer function remains debated. Here, we developed CRISPRi of regulatory elements upon degron operation (CRUDO), a novel approach to measure how changes in contact frequency impact enhancer effects on target genes by perturbing enhancers with CRISPRi and measuring gene expression in the presence or absence of cohesin. We systematically perturbed all 1,039 candidate enhancers near five cohesin-dependent genes and identified 34 enhancer-gene regulatory interactions. Of 26 regulatory interactions with sufficient statistical power to evaluate cohesin dependence, 18 show cohesin-dependent effects. A decrease in enhancer-promoter contact frequency upon removal of cohesin is frequently accompanied by a decrease in the regulatory effect of the enhancer on gene expression, consistent with a contact-based model for enhancer function. However, changes in contact frequency and regulatory effects on gene expression vary as a function of distance, with distal enhancers (e.g., >50 kb) experiencing much larger changes than proximal ones (e.g., <50 kb). Because most enhancers are located close to their target genes, these observations can explain how only a small subset of genes (those with strong distal enhancers) are sensitive to cohesin. Together, our results illuminate how 3D contacts, influenced by both cohesin and genomic distance, tune enhancer effects on gene expression.
ABSTRACT
Individuals infected with the SARS-CoV-2 virus present with a wide variety of symptoms ranging from asymptomatic to severe and even lethal outcomes. Past research has revealed a genetic haplotype on chromosome 3 that entered the human population via introgression from Neanderthals as the strongest genetic risk factor for the severe response to COVID-19. However, the specific variants along this introgressed haplotype that contribute to this risk and the biological mechanisms that are involved remain unclear. Here, we assess the variants present on the risk haplotype for their likelihood of driving the genetic predisposition to severe COVID-19 outcomes. We do this by first exploring their impact on the regulation of genes involved in COVID-19 infection using a variety of population genetics and functional genomics tools. We then perform a locus-specific massively parallel reporter assay to individually assess the regulatory potential of each allele on the haplotype in a multipotent immune-related cell line. We ultimately reduce the set of over 600 linked genetic variants to identify four introgressed alleles that are strong functional candidates for driving the association between this locus and severe COVID-19. Using reporter assays in the presence/absence of SARS-CoV-2, we find evidence that these variants respond to viral infection. These variants likely drive the locus' impact on severity by modulating the regulation of two critical chemokine receptor genes: CCR1 and CCR5. These alleles are ideal targets for future functional investigations into the interaction between host genomics and COVID-19 outcomes.
Subject(s)
COVID-19 , Neanderthals , Virus Diseases , Humans , Animals , COVID-19/genetics , Neanderthals/genetics , SARS-CoV-2/genetics , Genetics, Population
ABSTRACT
Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
ABSTRACT
BACKGROUND: Transposable elements are biologically important components of eukaryote genomes. In particular, non-LTR retrotransposons (N-LTRrs) played a key role in shaping the human genome throughout evolution. In this study, we compared retrotransposon insertions differentially present in the genomes of Anatomically Modern Humans, Neanderthals, Denisovans and Chimpanzees, in order to assess the possible impact of retrotransposition in the differentiation of the human lineage. RESULTS: We first identified species-specific N-LTRrs and established their distribution in present day human populations. These analyses shortlisted a group of N-LTRr insertions that were found exclusively in Anatomically Modern Humans. These insertions are associated with an increase in the number of transcriptional/splicing variants of those genes they inserted in. The analysis of the functionality of genes containing human-specific N-LTRr insertions reflects changes that occurred during human evolution. In particular, the expression of genes containing the most recent N-LTRr insertions is enriched in the brain, especially in undifferentiated neurons, and these genes associate in networks related to neuron maturation and migration. Additionally, we identified candidate N-LTRr insertions that have likely produced new functional variants exclusive to modern humans, whose genomic loci show traces of positive selection. CONCLUSIONS: Our results strongly suggest that N-LTRr impacted our differentiation as a species, most likely inducing an increase in neural complexity, and have been a constant source of genomic variability all throughout the evolution of the human lineage.
ABSTRACT
GWAS have identified hundreds of height-associated loci. However, determining causal mechanisms is challenging, especially since height-relevant tissues (e.g. growth plates) are difficult to study. To uncover mechanisms by which height GWAS variants function, we performed epigenetic profiling of murine femoral growth plates. The profiled open chromatin regions recapitulate known chondrocyte and skeletal biology, are enriched at height GWAS loci, particularly near differentially expressed growth plate genes, and enriched for binding motifs of transcription factors with roles in chondrocyte biology. At specific loci, our analyses identified compelling mechanisms for GWAS variants. For example, at CHSY1, we identified a candidate causal variant (rs9920291) overlapping an open chromatin region. Reporter assays demonstrated that rs9920291 shows allelic regulatory activity, and CRISPR/Cas9 targeting of human chondrocytes demonstrates that the region regulates CHSY1 expression. Thus, integrating biologically relevant epigenetic information (here, from growth plates) with genetic association results can identify biological mechanisms important for human growth.
Subject(s)
Body Height , Chondrocytes/physiology , Epigenesis, Genetic , Genetic Variation , Growth Plate/cytology , Animals , Chromatin/metabolism , Genetic Loci , Humans , Mice
ABSTRACT
Prolonged human interactions and artificial selection have influenced the genotypic and phenotypic diversity among dog breeds. Because humans and dogs occupy diverse habitats, ecological contexts have likely contributed to breed-specific positive selection. Prior to the advent of modern dog-feeding practices, there was likely substantial variation in dietary landscapes among disparate dog breeds. As such, we investigated one type of genetic variant, copy number variation, in three metabolic genes: glucokinase regulatory protein (GCKR), phytanoyl-CoA 2-hydroxylase (PHYH), and pancreatic α-amylase 2B (AMY2B). These genes code for proteins that are responsible for metabolizing dietary products that originate from distinctly different food types: sugar, meat, and starch, respectively. After surveying copy number variation among dogs with diverse dietary histories, we found no correlation between diet and positive selection in either GCKR or PHYH. Although it has been previously demonstrated that dogs experienced a copy number increase in AMY2B relative to wolves during or after the dog domestication process, we demonstrate that positive selection continued to act on amylase copy number in dog breeds that consumed starch-rich diets in time periods after domestication. Furthermore, we found that introgression with wolves is not responsible for deterioration of positive selection on AMY2B among diverse dog breeds. Together, this supports the hypothesis that the amylase copy number expansion is found universally in dogs.