ABSTRACT
Understanding variation in chromatin contact patterns across diverse humans is critical for interpreting noncoding variants and their effects on gene expression and phenotypes. However, experimental determination of chromatin contact patterns across large samples is prohibitively expensive. To overcome this challenge, we develop and validate a machine learning method to quantify the variation in 3D chromatin contacts at 2 kilobase resolution from genome sequence alone. We apply this approach to thousands of human genomes from the 1000 Genomes Project and the inferred hominin ancestral genome. While patterns of 3D contact divergence genome wide are qualitatively similar to patterns of sequence divergence, we find substantial differences in 3D divergence and sequence divergence in local 1 megabase genomic windows. In particular, we identify 392 windows with significantly greater 3D divergence than expected from sequence. Moreover, for 31% of genomic windows, a single individual has a rare divergent 3D contact map pattern. Using in silico mutagenesis, we find that most single nucleotide sequence changes do not result in changes to 3D chromatin contacts. However, in windows with substantial 3D divergence just one or a few variants can lead to divergent 3D chromatin contacts without the individuals carrying those variants having high sequence divergence. In summary, inferring 3D chromatin contact maps across human populations reveals variable contact patterns. We anticipate that these genetically diverse maps of 3D chromatin contact will provide a reference for future work on the function and evolution of 3D chromatin contact variation across human populations.
Subjects
Chromatin, Human Genome, Machine Learning, Humans, Chromatin/genetics, Genetic Variation
ABSTRACT
The contribution of mosaicism to diagnosed genetic disease and presumed de novo variants (DNV) is underinvestigated. We determined the contribution of mosaic genetic disease (MGD) and diagnosed parental mosaicism (PM) in parents of offspring with reported DNV (in the same variant) in (1) the Undiagnosed Diseases Network (UDN) (N = 1946) and (2) the electronic health records (EHR) of 12,472 individuals who underwent genetic testing at an academic medical center. In the UDN, we found that 4.51% of diagnosed probands had MGD, and 2.86% of parents of those with DNV exhibited PM. In the EHR, we found that 6.03% and 2.99% of diagnosed probands had MGD detected on chromosomal microarray and exome/genome sequencing, respectively. We found that 2.34% of those with a presumed pathogenic DNV had a parent with PM for the variant. We detected mosaicism (regardless of pathogenicity) in 4.49% of genetic tests performed. We found a broad phenotypic spectrum of MGD with previously unknown phenotypic phenomena. MGD is highly heterogeneous and provides a significant contribution to genetic diseases. Further work is required to improve the diagnosis of MGD and investigate how PM contributes to DNV risk.
Subjects
Genetic Variation, Mosaicism, Humans, Genetic Testing, Exome, Parents
ABSTRACT
Protein function can be impacted by changes in protein structure stability, but determining which change has impact is complex. Stability can be affected by a large change in the tertiary (3D) structure of the protein or due to free-energy changes caused by single amino acid substitutions. Changes in the DNA sequence can have minor or major impact on protein stability, which can lead to disease. Inherited retinal degenerations are generally caused by single mutations which are mostly located in protein-coding regions, while age-related macular degeneration (AMD) is a complex disorder that can be influenced by some genetic variants impacting proteins involved in the disease, although not all AMD risk variants lead to amino acid changes. Here, we review ways that proteins may be affected, the identification and understanding of these changes, and how to identify causal changes that can be targeted to develop treatments to alleviate retinal degenerative disease.
Subjects
Macular Degeneration, Retinal Degeneration, Humans, Retinal Degeneration/genetics, Retina, Macular Degeneration/genetics, Mutation, Proteins/chemistry, Protein Stability
ABSTRACT
Polar bear (Ursus maritimus) and brown bear (Ursus arctos) are recently diverged species that inhabit vastly differing habitats. Thus, analysis of the polar bear and brown bear genomes represents a unique opportunity to investigate the evolutionary mechanisms and genetic underpinnings of rapid ecological adaptation in mammals. Copy number (CN) differences in genomic regions between closely related species can underlie adaptive phenotypes and this form of genetic variation has not been explored in the context of polar bear evolution. Here, we analyzed the CN profiles of 17 polar bears, 9 brown bears, and 2 black bears (Ursus americanus). We identified an average of 318 genes per individual that showed evidence of CN variation (CNV). Nearly 200 genes displayed species-specific CN differences between polar bear and brown bear species. Principal component analysis of gene CN provides strong evidence that CNV evolved rapidly in the polar bear lineage and mainly resulted in CN loss. Olfactory receptors composed 47% of CN differentiated genes, with the majority of these genes being at lower CN in the polar bear. Additionally, we found significantly fewer copies of several genes involved in fatty acid metabolism as well as AMY1B, the salivary amylase-encoding gene in the polar bear. These results suggest that natural selection shaped patterns of CNV in response to the transition from an omnivorous to primarily carnivorous diet during polar bear evolution. Our analyses of CNV shed light on the genomic underpinnings of ecological adaptation during polar bear evolution.
Subjects
Biological Evolution, Diet/veterinary, Gene Dosage, Ursidae/genetics, Physiological Adaptation/genetics, Animals, Ecology, Gene Dosage/genetics, Metagenomics
ABSTRACT
BACKGROUND: Neural circuits are initially assembled during development when neurons synapse with potential partners and later refined as appropriate connections stabilize into mature synapses while inappropriate contacts are eliminated. Disruptions to this synaptogenic process impair connectivity optimization and can cause neurodevelopmental disorders. Intellectual disability (ID) and autism spectrum disorder (ASD) are often characterized by synaptic overgrowth, with the maintenance of immature or inappropriate synapses. Such synaptogenic defects can occur through mutation of a single gene, such as fragile X mental retardation protein (FMRP) loss causing the neurodevelopmental disorder fragile X syndrome (FXS). FXS represents the leading heritable cause of ID and ASD, but many other genes that play roles in ID and ASD have yet to be identified. RESULTS: In a Drosophila FXS disease model, one dfmr1^50M null mutant stock exhibits previously unreported axonal overgrowths at developmental and mature stages in the giant fiber (GF) escape circuit. These excess axon projections contain both chemical and electrical synapse markers, indicating mixed synaptic connections. Extensive analyses show these supernumerary synapses connect known GF circuit neurons, rather than new, inappropriate partners, indicating hyperconnectivity within the circuit. Despite the striking similarities to well-characterized FXS synaptic defects, this new GF circuit hyperconnectivity phenotype is driven by genetic background mutations in this dfmr1^50M stock. Similar GF circuit synaptic overgrowth is not observed in independent dfmr1 null alleles. Bulked segregant analysis (BSA) was combined with whole genome sequencing (WGS) to identify the quantitative trait loci (QTL) linked to neural circuit hyperconnectivity. The results reveal 8 QTL associated with inappropriate synapse formation and maintenance in the dfmr1^50M mutant background.
CONCLUSIONS: Synaptogenesis is a complex, precisely orchestrated neurodevelopmental process with a large cohort of gene products coordinating the connectivity, synaptic strength, and excitatory/inhibitory balance between neuronal partners. This work identifies a number of genetic regions that contain mutations disrupting proper synaptogenesis within a particularly well-mapped neural circuit. These QTL regions contain potential new genes involved in synapse formation and refinement. Given the similarity of the synaptic overgrowth phenotype to known ID and ASD inherited conditions, identifying these genes should increase our understanding of these devastating neurodevelopmental disease states.
Subjects
Drosophila melanogaster/genetics, Fragile X Syndrome/genetics, Mutation, Neurons/physiology, Synapses/metabolism, Animals, Genetically Modified Animals/genetics, Animal Disease Models, Drosophila Proteins/metabolism, Genetic Background
ABSTRACT
Interactions between plants and herbivorous insects have been models for theories of specialization and co-evolution for over a century. Phytochemicals govern many aspects of these interactions and have fostered the evolution of adaptations by insects to tolerate or even specialize on plant defensive chemistry. While genomic approaches are providing new insights into the genes and mechanisms insect specialists employ to tolerate plant secondary metabolites, open questions remain about the evolution and conservation of insect counterdefences, how insects respond to the diversity of defences mounted by their host plants, and the costs and benefits of resistance and tolerance to plant defences in natural ecological communities. Using a milkweed-specialist aphid (Aphis nerii) model, we test the effects of host plant species with increased toxicity, likely driven primarily by increased secondary metabolites, on aphid life history traits and whole-body gene expression. We show that more toxic plant species have a negative effect on aphid development and lifetime fecundity. When feeding on more toxic host plants with higher levels of secondary metabolites, aphids regulate a narrow, targeted set of genes, including those involved in canonical detoxification processes (e.g., cytochrome P450s, hydrolases, UDP-glucuronosyltransferases and ABC transporters). These results indicate that A. nerii marshal a variety of metabolic detoxification mechanisms to circumvent milkweed toxicity and facilitate host plant specialization, yet, despite these detoxification mechanisms, aphids experience reduced fitness when feeding on more toxic host plants. Disentangling how specialist insects respond to challenging host plants is a pivotal step in understanding the evolution of specialized diet breadths.
Subjects
Aphids/physiology, Asclepias/chemistry, Genetic Fitness, Transcriptome, Animals, Aphids/genetics, Fertility, Gene Expression Regulation, Herbivory, Metabolic Inactivation, Secondary Metabolism
ABSTRACT
Olfactory-driven behaviors are central to the lifecycle of the malaria vector mosquito Anopheles gambiae and are initiated by peripheral signaling in the antenna and other olfactory tissues. To continue gaining insight into the relationship between gene expression and olfaction, we have performed cohort comparisons of antennal transcript abundances at five time points after a blood meal, a key event in both reproduction and disease transmission cycles. We found that more than 5,000 transcripts displayed significant abundance differences, many of which were correlated by cluster analysis. Within the chemosensory gene families, we observed a general reduction in the level of chemosensory gene transcripts, although a subset of odorant receptors (AgOrs) was modestly enhanced in post-blood-fed samples. Integration of AgOr transcript abundance data with previously characterized AgOr excitatory odorant response profiles revealed potential changes in antennal odorant receptivity that coincided with the shift from host-seeking to oviposition behaviors in blood-fed female mosquitoes. Behavioral testing of ovipositing females to odorants highlighted by this synthetic analysis identified two unique, unitary oviposition cues for An. gambiae, 2-propylphenol and 4-methylcyclohexanol. We posit that modest, yet cumulative, alterations of AgOr transcript levels modulate peripheral odor coding resulting in biologically relevant behavioral effects. Moreover, these results demonstrate that highly quantitative, RNAseq transcript abundance data can be successfully integrated with functional data to generate testable hypotheses.
Subjects
Anopheles/physiology, Arthropod Antennae/metabolism, Odorants, Odorant Receptors/metabolism, Transcriptome, Animals, Anopheles/metabolism, Blood, Cluster Analysis, Computational Biology, Female, Gene Expression Regulation, RNA/metabolism, Odorant Receptors/genetics, DNA Sequence Analysis, Genetic Transcription
ABSTRACT
Social and brood parasitisms are nonconsumptive forms of parasitism involving the exploitation of the colonies or nests of a host. Such parasites are often related to their hosts and may evolve in various ecological contexts, causing evolutionary constraints and opportunities for both parasites and their hosts. In extreme cases, patterns of diversification between social parasites and their hosts can be coupled, such that diversity of one is correlated with or even shapes the diversity of the other. Aphids in the genus Tamalia induce galls on North American manzanita (Arctostaphylos) and related shrubs (Arbutoideae) and are parasitized by nongalling social parasites or inquilines in the same genus. We used RNA sequencing to identify and generate new gene sequences for Tamalia and performed maximum-likelihood, Bayesian and phylogeographic analyses to reconstruct the origins and patterns of diversity and host-associated differentiation in the genus. Our results indicate that the Tamalia inquilines are monophyletic and closely related to their gall-forming hosts on Arctostaphylos, supporting a previously proposed scenario for origins of these parasitic aphids. Unexpectedly, population structure and host-plant-associated differentiation were greater in the non-gall-inducing parasites than in their gall-inducing hosts. RNA-seq indicated contrasting patterns of gene expression between host aphids and parasites, and perhaps functional differences in host-plant relationships. Our results suggest a mode of speciation in which host plants drive within-guild diversification in insect hosts and their parasites. Shared host plants may be sufficient to promote the ecological diversification of a network of phytophagous insects and their parasites, as exemplified by Tamalia aphids.
Subjects
Aphids/genetics, Arctostaphylos/parasitology, Host-Parasite Interactions, Phylogeny, Animals, Arizona, Bayes Theorem, California, Genetic Variation, Likelihood Functions, Nevada, Parasites/genetics, Phylogeography, Plant Tumors/parasitology, RNA Sequence Analysis
ABSTRACT
In insects, odor cues are discriminated through a divergent family of odorant receptors (ORs). A functional OR complex consists of both a conventional odorant-binding OR and a nonconventional coreceptor (Orco) that is highly conserved across insect taxa. Recent reports have characterized insect ORs as ion channels, but the precise mechanism of signaling remains unclear. We report the identification and characterization of an Orco family agonist, VUAA1, using the Anopheles gambiae coreceptor (AgOrco) and other orthologues. These studies reveal that the Orco family can form functional ion channels in the absence of an odor-binding OR, and in addition, demonstrate a first-in-class agonist to further research in insect OR signaling. In light of the extraordinary conservation and widespread expression of the Orco family, VUAA1 represents a powerful new family of compounds that can be used to disrupt the destructive behaviors of nuisance insects, agricultural pests, and disease vectors alike.
Subjects
Ion Channels/agonists, Odorant Receptors/agonists, Signal Transduction, Thioglycolates/pharmacology, Triazoles/pharmacology, Animals, Anopheles, Insects/physiology, Ion Channels/physiology, Thioglycolates/isolation & purification, Triazoles/isolation & purification
ABSTRACT
Effective diagnosis and treatment of rare genetic disorders requires the interpretation of a patient's genetic variants of unknown significance (VUSs). Today, clinical decision-making is primarily guided by gene-phenotype association databases and DNA-based scoring methods. Our web-accessible variant analysis pipeline, VUStruct, supplements these established approaches by deeply analyzing the downstream molecular impact of variation in context of 3D protein structure. VUStruct's growing impact is fueled by the co-proliferation of protein 3D structural models, gene sequencing, compute power, and artificial intelligence. Contextualizing VUSs in protein 3D structural models also illuminates longitudinal genomics studies and biochemical bench research focused on VUS, and we created VUStruct for clinicians and researchers alike. We now introduce VUStruct to the broad scientific community as a mature, web-facing, extensible, High Performance Computing (HPC) software pipeline. VUStruct maps missense variants onto automatically selected protein structures and launches a broad range of analyses. These include energy-based assessments of protein folding and stability, pathogenicity prediction through spatial clustering analysis, and machine learning (ML) predictors of binding surface disruptions and nearby post-translational modification sites. The pipeline also considers the entire input set of VUS and identifies genes potentially involved in digenic disease. VUStruct's utility in clinical rare disease genome interpretation has been demonstrated through its analysis of over 175 Undiagnosed Disease Network (UDN) Patient cases. VUStruct-leveraged hypotheses have often informed clinicians in their consideration of additional patient testing, and we report here details from two cases where VUStruct was key to their solution. 
We also note successes with academic research collaborators, for whom VUStruct has informed research directions in both computational genomics and wet lab studies.
ABSTRACT
Autosomal dominant congenital disorder of glycosylation (CDG) type Iw (OMIM# 619714) is caused by a heterozygous mutation in the STT3A gene. Most CDGs have an autosomal recessive (AR) mode of inheritance, but several cases with an autosomal dominant (AD) form of an AR CDG have been recently identified. This report describes a 17-year-old male who was referred to the Undiagnosed Diseases Network (UDN) with a history of macrocephaly, failure to thrive, short stature, epilepsy, autism, attention-deficit/hyperactivity disorder, mild developmental delay, intermittent hypotonia, dysmorphic features, and mildly enlarged aortic root. Trio exome sequencing was negative. His biochemical workup included normal plasma amino acids, ammonia, acylcarnitine profile and urine organic and amino acids. His UDN genome sequencing (GS) identified a previously unreported de novo STT3A variant (c.1631A > G: p.Asn544Ser). This variant removes a glycosylation site and was predicted to be destabilizing by structural biology modeling. The patient was formally diagnosed by the UDN Metabolomics Core as having an abnormal transferrin profile indicative of CDG type Iw through metabolomic profiling. We report here an affected male with phenotypic, molecular, and metabolic findings consistent with CDG type Iw due to a heterozygous STT3A variant. This case highlights the importance of further testing of individuals with the phenotypic and metabolic findings of an AR disorder who are heterozygous for a single disease-causing allele and can be shown to have a new AD form of the disorder that represents clinical heterogeneity.
ABSTRACT
Fungal pathogens exhibit extensive strain heterogeneity, including variation in virulence. Whether closely related non-pathogenic species also exhibit strain heterogeneity remains unknown. Here, we comprehensively characterized the pathogenic potentials (i.e., the ability to cause morbidity and mortality) of 16 diverse strains of Aspergillus fischeri, a non-pathogenic close relative of the major pathogen Aspergillus fumigatus. In vitro immune response assays and in vivo virulence assays using a mouse model of pulmonary aspergillosis showed that A. fischeri strains varied widely in their pathogenic potential. Furthermore, pangenome analyses suggest that A. fischeri genomic and phenotypic diversity is even greater. Genomic, transcriptomic, and metabolic profiling identified several pathways and secondary metabolites associated with variation in virulence. Notably, strain virulence was associated with the simultaneous presence of the secondary metabolites hexadehydroastechrome and gliotoxin. We submit that examining the pathogenic potentials of non-pathogenic close relatives is key for understanding the origins of fungal pathogenicity.
Subjects
Aspergillus, Animals, Virulence, Aspergillus/pathogenicity, Aspergillus/genetics, Aspergillus/metabolism, Mice, Gliotoxin/metabolism, Animal Disease Models, Pulmonary Aspergillosis/microbiology, Female, Fungal Genome
ABSTRACT
Fungal pathogens exhibit extensive strain heterogeneity, including variation in virulence. Whether closely related non-pathogenic species also exhibit strain heterogeneity remains unknown. Here, we comprehensively characterized the pathogenic potentials (i.e., the ability to cause morbidity and mortality) of 16 diverse strains of Aspergillus fischeri, a non-pathogenic close relative of the major pathogen Aspergillus fumigatus. In vitro immune response assays and in vivo virulence assays using a mouse model of pulmonary aspergillosis showed that A. fischeri strains varied widely in their pathogenic potential. Furthermore, pangenome analyses suggest that A. fischeri genomic and phenotypic diversity is even greater. Genomic, transcriptomic, and metabolomic profiling identified several pathways and secondary metabolites associated with variation in virulence. Notably, strain virulence was associated with the simultaneous presence of the secondary metabolites hexadehydroastechrome and gliotoxin. We submit that examining the pathogenic potentials of non-pathogenic close relatives is key for understanding the origins of fungal pathogenicity.
ABSTRACT
Aspergillus fumigatus causes aspergillosis and relies on asexual spores (conidia) for initiating host infection. There is scarce information about A. fumigatus proteins involved in fungal evasion and host immunity modulation. Here we analysed the conidial surface proteome of A. fumigatus, two closely related non-pathogenic species, Aspergillus fischeri and Aspergillus oerlinghausenensis, as well as pathogenic Aspergillus lentulus, to identify such proteins. After identifying 62 proteins exclusively detected on the A. fumigatus conidial surface, we assessed null mutants for 42 genes encoding these proteins. Deletion of 33 of these genes altered susceptibility to macrophages and epithelial cells, as well as cytokine production. Notably, a gene that encodes a putative glycosylasparaginase, modulating levels of the host proinflammatory cytokine IL-1β, is important for infection in an immunocompetent murine model of fungal disease. These results suggest that A. fumigatus conidial surface proteins are important for evasion and modulation of the immune response at the onset of fungal infection.
Subjects
Aspergillosis, Aspergillus fumigatus, Fungal Proteins, Immune Evasion, Proteome, Fungal Spores, Aspergillus fumigatus/immunology, Aspergillus fumigatus/genetics, Animals, Fungal Spores/immunology, Mice, Proteome/genetics, Fungal Proteins/genetics, Fungal Proteins/metabolism, Fungal Proteins/immunology, Aspergillosis/immunology, Aspergillosis/microbiology, Humans, Host-Pathogen Interactions/immunology, Host-Pathogen Interactions/genetics, Macrophages/immunology, Macrophages/microbiology, Macrophages/metabolism, Cytokines/metabolism, Membrane Proteins/genetics, Membrane Proteins/metabolism, Membrane Proteins/immunology, Animal Disease Models, Epithelial Cells/microbiology, Epithelial Cells/immunology, Epithelial Cells/metabolism, Female
ABSTRACT
Cryptic fungal pathogens pose disease management challenges due to their morphological resemblance to known pathogens. Here, we investigated the genomes and phenotypes of 53 globally distributed isolates of Aspergillus section Nidulantes fungi and found that 30 clinical isolates, including four isolated from COVID-19 patients, were A. latus, a cryptic pathogen that originated via allodiploid hybridization. Notably, all A. latus isolates were misidentified. A. latus hybrids likely originated via a single hybridization event during the Miocene and harbor substantial genetic diversity. Transcriptome profiling of a clinical isolate revealed that both parental subgenomes are actively expressed and respond to environmental stimuli. Characterizing infection-relevant traits, such as drug resistance and growth under oxidative stress, revealed distinct phenotypic profiles among A. latus hybrids compared to parental and closely related species. Moreover, we identified four features that could aid A. latus taxonomic identification. Together, these findings deepen our understanding of the origin of cryptic pathogens.
Subjects
Aspergillus, COVID-19, Genetic Variation, Fungal Genome, Phylogeny, Humans, Fungal Genome/genetics, Aspergillus/genetics, Aspergillus/isolation & purification, COVID-19/virology, COVID-19/epidemiology, SARS-CoV-2/genetics, SARS-CoV-2/isolation & purification, Genetic Hybridization, Phenotype, Molecular Evolution, Gene Expression Profiling/methods
ABSTRACT
BACKGROUND: Two sibling members of the Anopheles gambiae species complex display notable differences in female blood meal preferences. An. gambiae s.s. has a well-documented preference for feeding upon human hosts, whereas An. quadriannulatus feeds on vertebrate/mammalian hosts, with only opportunistic feeding upon humans. Because mosquito host-seeking behaviors are largely driven by the sensory modality of olfaction, we hypothesized that hallmarks of these divergent host seeking phenotypes will be in evidence within the transcriptome profiles of the antennae, the mosquito's principal chemosensory appendage. RESULTS: To test this hypothesis, we have sequenced antennal mRNA of non-bloodfed females from each species and observed a number of distinct quantitative and qualitative differences in their chemosensory gene repertoires. In both species, these gene families show higher rates of sequence polymorphisms than the overall rates in their respective transcriptomes, with potentially important divergences between the two species. Moreover, quantitative differences in odorant receptor transcript abundances have been used to model potential distinctions in volatile odor receptivity between the two sibling species of anophelines. CONCLUSION: This analysis suggests that the anthropophagic behavior of An. gambiae s.s. reflects the differential distribution of olfactory receptors in the antenna, likely resulting from a co-option and refinement of molecular components common to both species. This study improves our understanding of the molecular evolution of chemoreceptors in closely related anophelines and suggests possible mechanisms that underlie the behavioral distinctions in host seeking that, in part, account for the differential vectorial capacity of these mosquitoes.
Subjects
Anopheles/genetics, Arthropod Antennae/metabolism, Genome, Odorant Receptors/genetics, Transcriptome, Animals, Molecular Evolution, Female, Humans, Molecular Sequence Annotation, Odorant Receptors/metabolism, RNA Sequence Analysis
ABSTRACT
When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely, Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultraviolet radiation and increased variation in seasonal dynamics. However, whether these groups differed substantially in circadian biology and whether archaic introgression adaptively contributed to human chronotypes remain unknown. Here, we traced the evolution of chronotype based on genomes from archaic hominins and present-day humans. First, we inferred differences in circadian gene sequences, splicing, and regulation between archaic hominins and modern humans. We identified 28 circadian genes containing variants with potential to alter splicing in archaics (e.g., CLOCK, PER2, RORB, and RORC) and 16 circadian genes likely divergently regulated between present-day humans and archaic hominins, including RORA. These differences suggest the potential for introgression to modify circadian gene expression. Testing this hypothesis, we found that introgressed variants are enriched among expression quantitative trait loci for circadian genes. Supporting the functional relevance of these regulatory effects, we found that many introgressed alleles have associations with chronotype. Strikingly, the strongest introgressed effects on chronotype increase morningness, consistent with adaptations to high latitude in other species. Finally, we identified several circadian loci with evidence of adaptive introgression or latitudinal clines in allele frequency. These findings identify differences in circadian gene regulation between modern humans and archaic hominins and support the contribution of introgression via coordinated effects on variation in human chronotype.
Subjects
Hominidae, Neanderthals, Animals, Humans, Ultraviolet Rays, Human Genome, Hominidae/genetics, Neanderthals/genetics, Gene Frequency
ABSTRACT
Introduction: When the ancestors of modern Eurasians migrated out of Africa and interbred with Eurasian archaic hominins, namely Neanderthals and Denisovans, DNA of archaic ancestry integrated into the genomes of anatomically modern humans. This process potentially accelerated adaptation to Eurasian environmental factors, including reduced ultra-violet radiation and increased variation in seasonal dynamics. However, whether these groups differed substantially in circadian biology, and whether archaic introgression adaptively contributed to human chronotypes remains unknown. Results: Here we traced the evolution of chronotype based on genomes from archaic hominins and present-day humans. First, we inferred differences in circadian gene sequences, splicing, and regulation between archaic hominins and modern humans. We identified 28 circadian genes containing variants with potential to alter splicing in archaics (e.g., CLOCK, PER2, RORB, RORC), and 16 circadian genes likely divergently regulated between present-day humans and archaic hominins, including RORA. These differences suggest the potential for introgression to modify circadian gene expression. Testing this hypothesis, we found that introgressed variants are enriched among eQTLs for circadian genes. Supporting the functional relevance of these regulatory effects, we found that many introgressed alleles have associations with chronotype. Strikingly, the strongest introgressed effects on chronotype increase morningness, consistent with adaptations to high latitude in other species. Finally, we identified several circadian loci with evidence of adaptive introgression or latitudinal clines in allele frequency. Conclusions: These findings identify differences in circadian gene regulation between modern humans and archaic hominins and support the contribution of introgression via coordinated effects on variation in human chronotype.
ABSTRACT
Understanding variation in chromatin contact patterns across human populations is critical for interpreting non-coding variants and their ultimate effects on gene expression and phenotypes. However, experimental determination of chromatin contacts at a population scale is prohibitively expensive. To overcome this challenge, we develop and validate a machine learning method to quantify the diversity of 3D chromatin contacts at 2 kilobase resolution from genome sequence alone. We then apply this approach to thousands of diverse modern humans and the inferred human-archaic hominin ancestral genome. While patterns of 3D contact divergence genome-wide are qualitatively similar to patterns of sequence divergence, we find that 3D divergence in local 1-megabase genomic windows does not follow sequence divergence. In particular, we identify 392 windows with significantly greater 3D divergence than expected from sequence. Moreover, 26% of genomic windows have rare 3D contact variation observed in a small number of individuals. Using in silico mutagenesis, we find that most sequence changes do not result in changes to 3D chromatin contacts. However, in windows with substantial 3D divergence, just one or a few variants can lead to divergent 3D chromatin contacts without the individuals carrying those variants having high sequence divergence. In summary, inferring 3D chromatin contact maps across human populations reveals diverse contact patterns. We anticipate that these genetically diverse maps of 3D chromatin contact will provide a reference for future work on the function and evolution of 3D chromatin contact variation across human populations.
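The in silico mutagenesis idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_predict_contacts` is a hypothetical stand-in for a sequence-to-contact model (not the authors' model), the 4-base bin size is a toy value, and "3D divergence" is taken to be 1 minus the Pearson correlation between two flattened contact maps, a metric the abstract does not define.

```python
import math

BASES = "ACGT"
BIN_SIZE = 4  # toy value; the abstract's model works at 2 kilobase resolution


def toy_predict_contacts(seq):
    """Hypothetical stand-in for a sequence-to-contact model (NOT the authors' model).

    Returns a flattened upper-triangle contact vector in which the contact
    between two bins grows with their GC counts and decays with distance.
    """
    bins = [seq[i:i + BIN_SIZE] for i in range(0, len(seq), BIN_SIZE)]
    gc = [sum(base in "GC" for base in b) for b in bins]
    return [gc[i] * gc[j] / (1 + j - i)
            for i in range(len(bins)) for j in range(i + 1, len(bins))]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def divergence(map_a, map_b):
    """Assumed divergence metric: 1 - Pearson correlation of the two maps."""
    return 1.0 - pearson(map_a, map_b)


def in_silico_mutagenesis(seq, predict=toy_predict_contacts):
    """Score every single-base substitution by its predicted 3D divergence."""
    ref_map = predict(seq)
    effects = []
    for pos, ref_base in enumerate(seq):
        for alt in BASES:
            if alt == ref_base:
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects.append((pos, ref_base, alt, divergence(ref_map, predict(mutant))))
    return effects


effects = in_silico_mutagenesis("ACGTGGCCAATT")
# Substitutions whose predicted map differs from the reference map.
changed = [e for e in effects if e[3] > 1e-9]
```

In this toy model only substitutions that change a bin's GC count alter the predicted map, which mirrors the qualitative point that many single-base changes leave the contacts untouched while a few do not. A real analysis would replace `toy_predict_contacts` with a trained model.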
ABSTRACT
Purpose: Genetic variants in complement genes are associated with age-related macular degeneration (AMD). However, many rare variants identified in these genes are of unknown significance, and their impact on protein function and structure remains unknown. We set out to address this issue by developing an analytical pipeline that evaluates the spatial placement of these variants and their impact on protein structure, and by applying it to the International AMD Genomics Consortium (IAMDGC) dataset (16,144 AMD cases, 17,832 controls). Methods: The IAMDGC dataset was imputed using the Haplotype Reference Consortium (HRC), yielding over 30% more imputed variants than the original 1000 Genomes imputation. Variants were extracted for the CFH, CFI, CFB, C9, and C3 genes and filtered for missense variants in solved protein structures. We evaluated these variants by their placement in the three-dimensional structure of the protein (i.e., spatial proximity in the protein) and by their AMD association. We applied several pipelines to a) calculate spatial proximity to known AMD variants versus gnomAD variants, b) assess a variant's likelihood of causing protein destabilization via the predicted free energy change (ddG) calculated using Rosetta, and c) perform whole-gene-based tests for statistical association. Gene-based testing using seqMeta was performed using a) all variants, b) variants near known AMD variants, or c) variants with |ddG| > 2. Further, we applied a structural kernel adaptation of SKAT testing (POKEMON) to confirm the association of spatial distributions of missense variants with AMD. Finally, we used logistic regression on known AMD variants in CFI to identify variants leading to a >50% reduction in protein expression in known AMD patient carriers of CFI variants compared to wild type (as determined by in vitro experiments), to determine the pipeline's robustness in identifying AMD-relevant variants.
These results were compared to functional impact scores, i.e., CADD values > 10, which indicate whether a variant may have a large functional impact genome-wide, to determine whether our metrics have better discriminative power than existing variant assessment methods. Once our pipeline had been validated, we performed a priori selection of variants using this pipeline and tested AMD patient cell lines from the EUGENDA cohort that carried the selected variants (n=34). We investigated complement pathway protein expression in vitro, looking at multiple components of the complement factor pathway in patient carriers of bioinformatically identified variants. Results: Multiple variants with |ddG| > 2 were found in each complement gene investigated. Gene-based tests using known and novel missense variants identified significant associations of the C3, C9, CFB, and CFH genes with AMD risk after controlling for age and sex (P = 3.22×10⁻⁵, 7.58×10⁻⁶, 2.1×10⁻³, and 1.2×10⁻³¹, respectively). ddG filtering and SKAT-O tests indicate that missense variants predicted to destabilize the protein, in both CFI and CFH, are associated with AMD (CFH: P = 0.05; CFI: P = 0.01; significance threshold of 0.05). Our structural kernel approach identified spatial associations with AMD risk within the protein structures for C3, C9, CFB, CFH, and CFI at a nominal p-value of 0.05. Both ddG and CADD scores were predictive of reduced CFI protein expression, with ROC curve analyses indicating that ddG is the better predictor (AUCs of 0.76 and 0.69, respectively). A priori in vitro analysis of variants in all complement factor genes, using ELISA testing of proteins in the complement factor pathway, indicated that several variants identified by the bioinformatics programs PathProx and POKEMON in our pipeline caused a significant change in complement protein expression (P=0.04) in actual patient carriers of those variants; these variants were previously unknown to contribute to AMD pathogenesis.
Conclusion: We demonstrate for the first time that missense variants in complement genes cluster together spatially and are associated with AMD case/control status. Using this method, we can identify CFI and CFH variants of previously unknown significance that are predicted to destabilize the proteins. These variants, both in and outside spatial clusters, can predict in vitro-tested CFI protein expression changes, and we hypothesize the same is true for CFH. A priori identification of variants that impact gene expression allows reclassification of variants previously classified as variants of unknown significance (VUS). Further investigation is needed to validate the models for additional variants and to apply them to all AMD-associated genes.
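The ROC comparison of ddG and CADD reported in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the scores and labels below are hypothetical toy values, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation, i.e. the probability that a randomly chosen positive variant scores higher than a randomly chosen negative one.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = variant with reduced
# CFI protein expression, 0 = variant without reduced expression.
reduced_expression = [1, 1, 1, 0, 0, 0]
abs_ddg = [3.1, 2.4, 0.8, 1.0, 0.3, 0.5]    # hypothetical |ddG| values
cadd = [22.0, 9.0, 12.0, 15.0, 4.0, 8.0]    # hypothetical CADD scores

auc_ddg = roc_auc(abs_ddg, reduced_expression)
auc_cadd = roc_auc(cadd, reduced_expression)
print(f"AUC |ddG|: {auc_ddg:.2f}, AUC CADD: {auc_cadd:.2f}")
```

With real data, the same function would be applied to each score over the same set of variants, and the score with the higher AUC is the better discriminator of reduced expression.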