ABSTRACT
In population genetics, the emergence of large-scale genomic data for various species and populations has provided new opportunities to understand the evolutionary forces that drive genetic diversity using statistical inference. However, the era of population genomics presents new challenges in analysing the massive amounts of genomes and variants. Deep learning has demonstrated state-of-the-art performance for numerous applications involving large-scale data. Recently, deep learning approaches have gained popularity in population genetics; facilitated by the advent of massive genomic data sets, powerful computational hardware and complex deep learning architectures, they have been used to identify population structure, infer demographic history and investigate natural selection. Here, we introduce common deep learning architectures and provide comprehensive guidelines for implementing deep learning models for population genetic inference. We also discuss current challenges and future directions for applying deep learning in population genetics, focusing on efficiency, robustness and interpretability.
Subjects
Deep Learning , Genomics , Population Genetics , Genome , Biological Evolution
ABSTRACT
Noncoding DNA is central to our understanding of human gene regulation and complex diseases [1,2], and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome [3-9]. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA [10], the relatively short timescales separating primate species [11], and the previously limited availability of whole-genome sequences [12]. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validated their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.
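As an illustration of what calling elements "at a 5% false discovery rate" can involve, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of p-values. The abstract does not state which FDR procedure was used, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the example p-values are all assumptions made for illustration only.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if it is rejected at the given FDR.

    Illustrative only: the abstract does not say which FDR procedure was used.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= fdr * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per candidate element.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(example_p, fdr=0.05)
```

In this toy input only the two smallest p-values fall under their rank-scaled thresholds, so only those two elements would be called.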
Subjects
Conserved Sequence , Molecular Evolution , Genome , Primates , Animals , Female , Humans , Pregnancy , Conserved Sequence/genetics , Deoxyribonuclease I/metabolism , DNA/genetics , DNA/metabolism , Genome/genetics , Mammals/classification , Mammals/genetics , Placenta , Primates/classification , Primates/genetics , Nucleic Acid Regulatory Sequences/genetics , Reproducibility of Results , Transcription Factors/metabolism , Proteins/genetics , Gene Expression Regulation/genetics
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
The phylogenetic relationships between hominins of the Early Pleistocene epoch in Eurasia, such as Homo antecessor, and hominins that appear later in the fossil record during the Middle Pleistocene epoch, such as Homo sapiens, are highly debated [1-5]. For the oldest remains, the molecular study of these relationships is hindered by the degradation of ancient DNA. However, recent research has demonstrated that the analysis of ancient proteins can address this challenge [6-8]. Here we present the dental enamel proteomes of H. antecessor from Atapuerca (Spain) [9,10] and Homo erectus from Dmanisi (Georgia) [1], two key fossil assemblages that have a central role in models of Pleistocene hominin morphology, dispersal and divergence. We provide evidence that H. antecessor is a close sister lineage to subsequent Middle and Late Pleistocene hominins, including modern humans, Neanderthals and Denisovans. This placement implies that the modern-like face of H. antecessor (that is, similar to that of modern humans) may have a considerably deep ancestry in the genus Homo, and that the cranial morphology of Neanderthals represents a derived form. By recovering AMELY-specific peptide sequences, we also conclude that the H. antecessor molar fragment from Atapuerca that we analysed belonged to a male individual. Finally, these H. antecessor and H. erectus fossils preserve evidence of enamel proteome phosphorylation and proteolytic digestion that occurred in vivo during tooth formation. Our results provide important insights into the evolutionary relationships between H. antecessor and other hominin groups, and pave the way for future studies using enamel proteomes to investigate hominin biology across the existence of the genus Homo.
Subjects
Dental Enamel/chemistry , Dental Enamel/metabolism , Fossils , Hominidae , Proteome/analysis , Proteome/metabolism , Amino Acid Sequence , Animals , Georgia (Republic) , Humans , Male , Molar/chemistry , Molar/metabolism , Neanderthals , Phosphoproteins/analysis , Phosphoproteins/chemistry , Phosphoproteins/metabolism , Phosphorylation , Phylogeny , Proteome/chemistry , Spain
ABSTRACT
Sea turtles represent an ancient lineage of marine vertebrates that evolved from terrestrial ancestors over 100 Mya. The genomic basis of the unique physiological and ecological traits enabling these species to thrive in diverse marine habitats remains largely unknown. Additionally, many populations have drastically declined due to anthropogenic activities over the past two centuries, and their recovery is a high global conservation priority. We generated and analyzed high-quality reference genomes for the leatherback (Dermochelys coriacea) and green (Chelonia mydas) turtles, representing the two extant sea turtle families. These genomes are highly syntenic and homologous, but localized regions of noncollinearity were associated with higher copy numbers of immune, zinc-finger, and olfactory receptor (OR) genes in green turtles, with ORs related to waterborne odorants greatly expanded in green turtles. Our findings suggest that divergent evolution of these key gene families may underlie immunological and sensory adaptations assisting navigation, occupancy of neritic versus pelagic environments, and diet specialization. Reduced collinearity was especially prevalent in microchromosomes, with greater gene content, heterozygosity, and genetic distances between species, supporting their critical role in vertebrate evolutionary adaptation. Finally, diversity and demographic histories starkly contrasted between species, indicating that leatherback turtles have had a low yet stable effective population size, exhibit extremely low diversity compared with other reptiles, and harbor a higher genetic load compared with green turtles, reinforcing concern over their persistence under future climate scenarios. These genomes provide invaluable resources for advancing our understanding of evolution and conservation best practices in an imperiled vertebrate lineage.
Subjects
Turtles , Animals , Ecosystem , Population Dynamics
ABSTRACT
Gigantopithecus blacki was a giant hominid that inhabited densely forested environments of Southeast Asia during the Pleistocene epoch [1]. Its evolutionary relationships to other great ape species, and the divergence of these species during the Middle and Late Miocene epoch (16-5.3 million years ago), remain unclear [2,3]. Hypotheses regarding the relationships between Gigantopithecus and extinct and extant hominids are wide ranging but difficult to substantiate because of its highly derived dentognathic morphology, the absence of cranial and post-cranial remains [1,3-6], and the lack of independent molecular validation. We retrieved dental enamel proteome sequences from a 1.9-million-year-old G. blacki molar found in Chuifeng Cave, China [7,8]. The thermal age of these protein sequences is approximately five times greater than that of any previously published mammalian proteome or genome. We demonstrate that Gigantopithecus is a sister clade to orangutans (genus Pongo) with a common ancestor about 12-10 million years ago, implying that the divergence of Gigantopithecus from Pongo forms part of the Miocene radiation of great apes. In addition, we hypothesize that the expression of alpha-2-HS-glycoprotein, which has not been previously observed in enamel proteomes, had a role in the biomineralization of the thick enamel crowns that characterize the large molars in Gigantopithecus [9,10]. The survival of an Early Pleistocene dental enamel proteome in the subtropics further expands the scope of palaeoproteomic analysis into geographical areas and time periods previously considered incompatible with the preservation of substantial amounts of genetic information.
Subjects
Hominidae/genetics , Proteome , Amino Acid Sequence , Animals , Bayes Theorem , Humans , Phylogeny , Time Factors
ABSTRACT
We used pathogen genomics to test orangutan specimens from a museum in Bonn, Germany, to identify the origin of the animals and the circumstances of their death. We found monkeypox virus genomes in the samples and determined that they represent cases from a 1965 outbreak at Rotterdam Zoo in Rotterdam, the Netherlands.
Subjects
Monkeypox virus , Museums , Animals , Genomics , Disease Outbreaks , Germany/epidemiology
ABSTRACT
S* is a widely used statistic for detecting archaic admixture from population genetic data. Previous studies used freezing-archer to apply S*, but freezing-archer is directly applicable only to the specific case of Neanderthal and Denisovan introgression in Papuans. Here, we implemented sstar for more general use. In simulation-based comparisons with several tools for detecting introgressed fragments, including SPrime, SkovHMM, and ArchaicSeeker2.0, our results suggest that sstar is robust to differences in demographic models, including ghost introgression and two-source introgression. We believe sstar will be a useful tool for detecting introgressed fragments in various scenarios and in non-human species.
Subjects
Human Genome , Neanderthals , Humans , Animals , Neanderthals/genetics , Population Genetics
ABSTRACT
It has been shown that Neanderthals contributed genetically to modern humans outside Africa 47,000-65,000 years ago. Here we analyse the genomes of a Neanderthal and a Denisovan from the Altai Mountains in Siberia together with the sequences of chromosome 21 of two Neanderthals from Spain and Croatia. We find that a population that diverged early from other modern humans in Africa contributed genetically to the ancestors of Neanderthals from the Altai Mountains roughly 100,000 years ago. By contrast, we do not detect such a genetic contribution in the Denisovan or the two European Neanderthals. We conclude that in addition to later interbreeding events, the ancestors of Neanderthals from the Altai Mountains and early modern humans met and interbred, possibly in the Near East, many thousands of years earlier than previously thought.
Subjects
Gene Flow/genetics , Neanderthals/genetics , Altitude , Animals , Bayes Theorem , Human Chromosome Pair 21/genetics , Croatia/ethnology , Human Genome/genetics , Genomics , Haplotypes/genetics , Heterozygote , Humans , Genetic Hybridization/genetics , Phylogeny , Population Density , Siberia , Spain/ethnology , Time Factors
ABSTRACT
BACKGROUND: Numerous Ebola virus outbreaks have occurred in Equatorial Africa over the past decades. Besides human fatalities, gorillas and chimpanzees have also succumbed to the fatal virus. The 2004 outbreak at the Odzala-Kokoua National Park (Republic of Congo) alone caused a severe decline in the resident western lowland gorilla (Gorilla gorilla gorilla) population, with a 95% mortality rate. Here, we explore the immediate genetic impact of the Ebola outbreak in the western lowland gorilla population. RESULTS: Associations with survivorship were evaluated by utilizing DNA obtained from fecal samples from 16 gorilla individuals declared missing after the outbreak (non-survivors) and 15 individuals observed before and after the epidemic (survivors). We used a target enrichment approach to capture the sequences of 123 genes previously associated with immunology and Ebola virus resistance and additionally analyzed the gut microbiome, which could influence survival after infection. Our results indicate no changes in the population genetic diversity before and after the Ebola outbreak, and no significant differences in microbial community composition between survivors and non-survivors. However, despite the low power for an association analysis, we do detect six nominally significant missense mutations in four genes that might be candidate variants associated with an increased chance of survival. CONCLUSION: This study offers the first insight into the genetics of a wild great ape population before and after an Ebola outbreak using target capture experiments from fecal samples, and presents a list of candidate loci that may have facilitated their survival.
Subjects
Gastrointestinal Microbiome , Ebola Virus Disease , Animals , Disease Outbreaks , Gorilla gorilla/genetics , Ebola Virus Disease/epidemiology , Ebola Virus Disease/veterinary , Humans , Pan troglodytes
ABSTRACT
Admixture, the genetic exchange between differentiated populations, appears to be common in the history of species, but has not yet been comparatively studied across mammals. This limits the understanding of its mechanisms and potential role in mammalian evolution. The authors summarize the current knowledge on admixture in non-human primates and suggest that it is important to establish a comparative framework for this phenomenon in humans. Genetic observations in domesticated mammals and their wild counterparts are discussed, and a brief global overview on other clades is presented. Based on this, some of the consequences of gene flow, including incompatibilities and their genomic footprint, as well as adaptive introgression, are discussed, and suggestions for a functional genomics approach are made. It is proposed that the field is moving beyond descriptive observations in single species to a comprehensive analysis of admixture and its impact. Admixture is becoming an integral part of mammalian evolution.
Subjects
Gene Flow/genetics , Animals , Population Genetics , Genomics/methods , Humans , Primates/genetics
ABSTRACT
We present a high-quality genome sequence of a Neanderthal woman from Siberia. We show that her parents were related at the level of half-siblings and that mating among close relatives was common among her recent ancestors. We also sequenced the genome of a Neanderthal from the Caucasus to low coverage. An analysis of the relationships and population history of available archaic genomes and 25 present-day human genomes shows that several gene flow events occurred among Neanderthals, Denisovans and early modern humans, possibly including gene flow into Denisovans from an unknown archaic group. Thus, interbreeding, albeit of low magnitude, occurred among many hominin groups in the Late Pleistocene. In addition, the high-quality Neanderthal genome allows us to establish a definitive list of substitutions that became fixed in modern humans after their separation from the ancestors of Neanderthals and Denisovans.
Subjects
Fossils , Genome/genetics , Neanderthals/genetics , Africa , Animals , Caves , DNA Copy Number Variations/genetics , Female , Gene Flow/genetics , Gene Frequency , Heterozygote , Humans , Inbreeding , Genetic Models , Neanderthals/classification , Phylogeny , Population Density , Siberia/ethnology , Toe Phalanges/anatomy & histology
ABSTRACT
We present the DNA sequence of 17,367 protein-coding genes in two Neandertals from Spain and Croatia and analyze them together with the genome sequence recently determined from a Neandertal from southern Siberia. Comparisons with present-day humans from Africa, Europe, and Asia reveal that genetic diversity among Neandertals was remarkably low, and that they carried a higher proportion of amino acid-changing (nonsynonymous) alleles inferred to alter protein structure or function than present-day humans. Thus, Neandertals across Eurasia had a smaller long-term effective population than present-day humans. We also identify amino acid substitutions in Neandertals and present-day humans that may underlie phenotypic differences between the two groups. We find that genes involved in skeletal morphology have changed more in the lineage leading to Neandertals than in the ancestral lineage common to archaic and modern humans, whereas genes involved in behavior and pigmentation have changed more on the modern human lineage.
Subjects
Exome , Genetic Variation , Neanderthals/genetics , Amino Acid Substitution , Animals , Croatia , DNA/genetics , Gene Frequency , Humans , Paleontology , Phylogeny , Single Nucleotide Polymorphism , Siberia , Spain
ABSTRACT
We introduce a new method to detect ancient selective sweeps centered on a candidate site. We explored different patterns produced by sweeps around a fixed beneficial mutation, and found that a particularly informative statistic measures the consistency between majority haplotypes near the mutation and genotypic data from a closely related population. We incorporated this statistic into an approximate Bayesian computation (ABC) method that tests for sweeps at a candidate site. We applied this method to simulated data and show that it has some power to detect sweeps that occurred more than 10,000 generations in the past. We also applied it to 1,000 Genomes and Complete Genomics data combined with high-coverage Denisovan and Neanderthal genomes to test for sweeps in modern humans since the separation from the Neanderthal-Denisovan ancestor. We tested sites at which humans are fixed for the derived (i.e., nonchimpanzee) allele, whereas the Neanderthal and Denisovan genomes are homozygous for the ancestral allele. We observe only weak differences in statistics indicative of selection between functional categories. When we compare patterns of scaled diversity or use our ABC approach, we fail to find a significant difference in signals of classic selective sweeps between regions surrounding nonsynonymous and synonymous changes, but we detect a slight enrichment for reduced scaled diversity around splice site changes. We also present a list of candidate sites that show high probability of having undergone a classic sweep in the modern human lineage since the split from Neanderthals and Denisovans.
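For readers unfamiliar with approximate Bayesian computation, the following is a minimal sketch of the generic rejection-sampling form of ABC: draw a parameter from a prior, simulate a summary statistic, and keep the draw only if the simulated statistic is close to the observed one. It is not the authors' method. The toy simulator, the uniform prior, the tolerance, and all names (`simulate_summary`, `abc_rejection`) are assumptions chosen for illustration, and the haplotype-consistency statistic described in the abstract is not implemented here.

```python
import random


def simulate_summary(theta, n_sites, rng):
    """Toy simulator (hypothetical): fraction of n_sites where a Bernoulli(theta) trial succeeds."""
    return sum(rng.random() < theta for _ in range(n_sites)) / n_sites


def abc_rejection(observed, n_sites, n_draws, tolerance, seed=0):
    """Generic ABC rejection sampler; returns the accepted parameter draws."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw from an assumed uniform(0, 1) prior
        simulated = simulate_summary(theta, n_sites, rng)
        # Keep the draw only when the simulated summary is close to the observed one.
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical observed summary statistic of 0.8 over 200 sites.
posterior = abc_rejection(observed=0.8, n_sites=200, n_draws=5000, tolerance=0.02)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted draws approximate the posterior distribution of the parameter. A smaller tolerance gives a closer approximation at the cost of accepting fewer draws.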
Subjects
Molecular Evolution , Genetic Models , Animals , Bayes Theorem , Human Genome , Humans , Neanderthals/genetics , Single Nucleotide Polymorphism , Genetic Selection
ABSTRACT
RUNX2, a gene involved in skeletal development, has previously been shown to be potentially affected by positive selection during recent human evolution. Here we have used antibody-based proteomics to characterize potential differences in expression patterns of RUNX2 interacting partners during primate evolution. Tissue microarrays consisting of a large set of normal tissues from human and macaque were used for protein profiling of 50 RUNX2 partners with immunohistochemistry. Eleven proteins (AR, CREBBP, EP300, FGF2, HDAC3, JUN, PRKD3, RUNX1, SATB2, TCF3, and YAP1) showed differences in expression between humans and macaques. These proteins were further profiled in tissues from chimpanzee, gorilla, and orangutan, and the corresponding genes were analyzed with regard to genomic features. Moreover, protein expression data were compared with previously obtained RNA sequencing data from six different organs. One gene (TCF3) showed significant expression differences between human and macaque at both the protein and RNA level, with higher expression in a subset of germ cells in human testis compared with macaque. In conclusion, normal tissues from macaque and human showed differences in expression of some RUNX2 partners that could be mapped to various defined cell types. The applied strategy appears advantageous to characterize the consequences of altered genes selected during evolution.
Subjects
Core Binding Factor Alpha 1 Subunit/metabolism , Molecular Evolution , Gene Expression Regulation/genetics , Primates/genetics , Proteins/metabolism , Proteomics/methods , Genetic Selection , Animals , Base Sequence , Core Binding Factor Alpha 1 Subunit/genetics , Humans , Immunohistochemistry , Microarray Analysis/methods , Molecular Sequence Data , Proteins/genetics , Sequence Alignment , RNA Sequence Analysis , Species Specificity
ABSTRACT
Establishing the genetic and geographic structure of populations is fundamental, both to understand their evolutionary past and preserve their future. Nevertheless, the patterns of genetic population structure are unknown for most endangered species. This is the case for bonobos (Pan paniscus), which, together with chimpanzees (Pan troglodytes), are humans' closest living relatives. Chimpanzees live across equatorial Africa and are classified into four subspecies [1], with some genetic population substructure even within subspecies. Conversely, bonobos live exclusively in the Democratic Republic of Congo and are considered a homogeneous group with low genetic diversity [2], despite some population structure inferred from mtDNA. Nevertheless, mtDNA aside, their genetic structure remains unknown, hampering our understanding of the species and conservation efforts. Mapping bonobo genetic diversity in space is, however, challenging because, being endangered, only non-invasive sampling is possible for wild individuals. Here, we jointly analyze the exomes and mtDNA from 20 wild-born bonobos, the whole genomes of 10 captive bonobos, and the mtDNA of 136 wild individuals. We identify three genetically distinct bonobo groups of inferred Central, Western, and Far-Western geographic origin within the bonobo range. We estimate the split time between the central and western populations to be ~145,000 years ago and genetic differentiation to be in the order of that of the closest chimpanzee subspecies. Furthermore, our estimated long-term Ne for Far-West (~3,000) is among the lowest estimated for any great ape lineage. Our results highlight the need to attend to the bonobo substructure, both in terms of research and conservation.
ABSTRACT
Archaic admixture has had a substantial impact on human evolution with multiple events across different clades, including from extinct hominins such as Neanderthals and Denisovans into modern humans. In great apes, archaic admixture has been identified in chimpanzees and bonobos but the possibility of such events has not been explored in other species. Here, we address this question using high-coverage whole-genome sequences from all four extant gorilla subspecies, including six newly sequenced eastern gorillas from previously unsampled geographic regions. Using approximate Bayesian computation with neural networks to model the demographic history of gorillas, we find a signature of admixture from an archaic 'ghost' lineage into the common ancestor of eastern gorillas but not western gorillas. We infer that up to 3% of the genome of these individuals is introgressed from an archaic lineage that diverged more than 3 million years ago from the common ancestor of all extant gorillas. This introgression event took place before the split of mountain and eastern lowland gorillas, probably more than 40 thousand years ago and may have influenced perception of bitter taste in eastern gorillas. When comparing the introgression landscapes of gorillas, humans and bonobos, we find a consistent depletion of introgressed fragments on the X chromosome across these species. However, depletion in protein-coding content is not detectable in eastern gorillas, possibly as a consequence of stronger genetic drift in this species.
Subjects
Hominidae , Neanderthals , Animals , Humans , Gorilla gorilla/genetics , Pan paniscus/genetics , Bayes Theorem , Hominidae/genetics , Pan troglodytes , Neanderthals/genetics
ABSTRACT
Baboons (genus Papio) are a morphologically and behaviorally diverse clade of catarrhine monkeys that have experienced hybridization between phenotypically and genetically distinct phylogenetic species. We used high coverage whole genome sequences from 225 wild baboons representing 19 geographic localities to investigate population genomics and inter-species gene flow. Our analyses provide an expanded picture of evolutionary reticulation among species and reveal novel patterns of population structure within and among species, including differential admixture among conspecific populations. We describe the first example of a baboon population with a genetic composition that is derived from three distinct lineages. The results reveal processes, both ancient and recent, that produced the observed mismatch between phylogenetic relationships based on matrilineal, patrilineal, and biparental inheritance. We also identified several candidate genes that may contribute to species-specific phenotypes. One-Sentence Summary: Genomic data for 225 baboons reveal novel sites of inter-species gene flow and local effects due to differences in admixture.
ABSTRACT
Baboons (genus Papio) are a morphologically and behaviorally diverse clade of catarrhine monkeys that have experienced hybridization between phenotypically and genetically distinct phylogenetic species. We used high-coverage whole-genome sequences from 225 wild baboons representing 19 geographic localities to investigate population genomics and interspecies gene flow. Our analyses provide an expanded picture of evolutionary reticulation among species and reveal patterns of population structure within and among species, including differential admixture among conspecific populations. We describe the first example of a baboon population with a genetic composition that is derived from three distinct lineages. The results reveal processes, both ancient and recent, that produced the observed mismatch between phylogenetic relationships based on matrilineal, patrilineal, and biparental inheritance. We also identified several candidate genes that may contribute to species-specific phenotypes.