ABSTRACT
The recently enriched genomic history of Indigenous groups in the Americas is still meager concerning continental Central America. Here, we report ten pre-Hispanic (plus two early colonial) genomes and 84 genome-wide profiles from seven groups presently living in Panama. Our analyses reveal that pre-Hispanic demographic events contributed to the extensive genetic structure currently seen in the area, which is also characterized by a distinctive Isthmo-Colombian Indigenous component. This component drives these populations on a specific variability axis and derives from the local admixture of different ancestries of northern North American origin(s). Two of these ancestries were differentially associated with Pleistocene Indigenous groups that also moved into South America, leaving heterogeneous genetic footprints. An additional Pleistocene ancestry was brought by a still unsampled population of the Isthmus (UPopI) that remained restricted to the Isthmian area, expanded locally during the early Holocene, and left genomic traces up to the present day.
Subject(s)
American Indian or Alaska Native/genetics , Archaeology , Genomics/methods , American Indian or Alaska Native/classification , DNA, Mitochondrial/genetics , Genetic Variation , Genome, Human , Haplotypes , Humans , Phylogeny
ABSTRACT
The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation1,2. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes1,3-5 and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.
Subject(s)
Evolution, Molecular , Genome/genetics , Genomics , Pan paniscus/genetics , Phylogeny , Animals , Eukaryotic Initiation Factor-4A/genetics , Female , Genes , Gorilla gorilla/genetics , Molecular Sequence Annotation/standards , Pan troglodytes/genetics , Pongo/genetics , Segmental Duplications, Genomic , Sequence Analysis, DNA
ABSTRACT
Using contemporary people as proxies for ancient communities is a contentious but necessary practice in anthropology. In southern Africa, the distinction between the Cape KhoeSan and eastern KhoeSan remains unclear, as ethnicity labels have changed through time and most communities were decimated if not extirpated. The eastern KhoeSan may have had genetic distinctions from neighboring communities who speak Bantu languages and KhoeSan further away; alternatively, the identity may not have been tied to any notion of biology, instead denoting communities with a nomadic "lifeway" distinct from African agro-pastoralism. The Baphuthi of the 1800s in the Maloti-Drakensberg, southern Africa had a substantial KhoeSan constituency and a lifeway of nomadism, cattle raiding, and horticulture. Baphuthi heritage could provide insights into the history of the eastern KhoeSan. We examine genetic affinities of 23 Baphuthi to discern whether the narrative of KhoeSan descent reflects distinct genetic ancestry. Genome-wide SNP data (Illumina GSA) were merged with 52 global populations, for 160,000 SNPs. Genetic analyses show no support for a unique eastern KhoeSan ancestry distinct from other KhoeSan or southern Bantu speakers. The Baphuthi have strong affinities with early-arriving southern Bantu-speaking (Nguni) communities, as the later-arriving non-Nguni show strong evidence of recent African admixture possibly related to late-Iron Age migrations. The references to communities as "San" and "Bushman" in historic literature have often been misconstrued as notions of ethnic/biological distinctions. The terms may have reflected ambiguous references to non-sedentary polities instead, as seems to be the case for the eastern "Bushman" heritage of the Baphuthi.
Subject(s)
Genetic Variation , Genetics, Population , Humans , Africa, Southern , Black People/genetics , Ethnicity/genetics
ABSTRACT
Gibbons are the most speciose family of living apes, characterized by a diverse chromosome number and rapid rate of large-scale rearrangements. Here we performed single-cell template strand sequencing (Strand-seq), molecular cytogenetics, and deep in silico analysis of a southern white-cheeked gibbon genome, providing the first comprehensive map of 238 previously hidden small-scale inversions. We determined that more than half are gibbon specific, at least fivefold higher than shown for other primate lineage-specific inversions, with a significantly high number of small heterozygous inversions, suggesting that accelerated evolution of inversions may have played a role in the high sympatric diversity of gibbons. Although the precise mechanisms underlying these inversions are not yet understood, it is clear that segmental duplication-mediated NAHR only accounts for a small fraction of events. Several genomic features, including gene density and repeat (e.g., LINE-1) content, might render these regions more break-prone and susceptible to inversion formation. In the attempt to characterize interspecific variation between southern and northern white-cheeked gibbons, we identify several large assembly errors in the current GGSC Nleu3.0/nomLeu3 reference genome comprising more than 49 megabases of DNA. Finally, we provide a list of 182 candidate genes potentially involved in gibbon diversification and speciation.
Subject(s)
Hominidae , Hylobates , Animals , Hylobates/genetics , Genome , Primates/genetics , Chromosome Inversion/genetics , Chromosomes , Hominidae/genetics
ABSTRACT
Anatomically modern humans evolved around 300 thousand years ago in Africa. They started to appear in the fossil record outside of Africa as early as 100 thousand years ago, although other hominins existed throughout Eurasia much earlier. Recently, several studies argued in favor of a single out of Africa event for modern humans on the basis of whole-genome sequence analyses. However, the single out of Africa model is in contrast with some of the findings from fossil records, which support two out of Africa events, and uniparental data, which propose a back to Africa movement. Here, we used a deep-learning approach coupled with approximate Bayesian computation and sequential Monte Carlo to revisit these hypotheses from the whole-genome sequence perspective. Our results support the back to Africa model over other alternatives. We estimated that there are two sequential separations between Africa and out of African populations happening around 60-90 thousand years ago and separated by 13-15 thousand years. One of the populations resulting from the more recent split has replaced the older West African population to a large extent, while the other one has founded the out of Africa populations.
Subject(s)
Deep Learning , Evolution, Molecular , Africa , Algorithms , Bayes Theorem , Fossils , Genetic Variation , Humans , Monte Carlo Method , Whole Genome Sequencing/methods
ABSTRACT
Theropithecus gelada, the last surviving species of this genus, occupies a unique and highly specialised ecological niche in the Ethiopian highlands. A subdivision into three geographically defined populations (Northern, Central and Southern) has been tentatively proposed for this species on the basis of genetic analyses, but genomic data have been investigated only for two of these groups (Northern and Central). Here we combined newly generated whole genome sequences of individuals sampled from the population living south of the East Africa Great Rift Valley with available data from the other two gelada populations to reconstruct the evolutionary history of the species. Integrating genomic and paleoclimatic data, we found that gene flow across populations and with Papio species tracked past climate changes. The isolation and climatic conditions experienced by Southern geladas during the Holocene shaped local diversity and generated diet-related genomic signatures.
Subject(s)
Climate Change , Gene Flow , Genetics, Population , Hybridization, Genetic , Animals , Ethiopia , Theropithecus/genetics , Genome/genetics
ABSTRACT
Generative models have shown breakthroughs in a wide spectrum of domains due to recent advancements in machine learning algorithms and increased computational power. Despite these impressive achievements, the ability of generative models to create realistic synthetic data is still under-exploited in genetics and absent from population genetics. Yet a known limitation in the field is the reduced access to many genetic databases due to concerns about violations of individual privacy, although they would provide a rich resource for data mining and integration towards advancing genetic studies. In this study, we demonstrated that deep generative adversarial networks (GANs) and restricted Boltzmann machines (RBMs) can be trained to learn the complex distributions of real genomic datasets and generate novel high-quality artificial genomes (AGs) with little to no privacy loss. We show that our generated AGs replicate characteristics of the source dataset such as allele frequencies, linkage disequilibrium, pairwise haplotype distances and population structure. Moreover, they can also inherit complex features such as signals of selection. To illustrate the promising outcomes of our method, we showed that imputation quality for low-frequency alleles can be improved by augmenting reference panels with AGs and that the RBM latent space provides a relevant encoding of the data, hence allowing further exploration of the reference dataset and features for solving supervised tasks. Generative models and AGs have the potential to become valuable assets in genetic studies by providing a rich yet compact representation of existing genomes and high-quality, easy-access and anonymous alternatives for private databases.
Subject(s)
Computer Simulation , Genome, Human , Machine Learning , Population/genetics , Algorithms , Alleles , Chromosomes, Human, Pair 15/genetics , Databases, Factual , Databases, Genetic , Deep Learning , HapMap Project , Humans , Markov Chains , Neural Networks, Computer , Polymorphism, Single Nucleotide
ABSTRACT
The geographical location and shape of Apulia, a narrow land stretching out in the sea at the South of Italy, made this region a Mediterranean crossroads connecting Western Europe and the Balkans. Such movements culminated at the beginning of the Iron Age with the Iapygian civilization, which consisted of three cultures: Peucetians, Messapians, and Daunians. Among them, the Daunians left a peculiar cultural heritage, with one-of-a-kind stelae and pottery, but, despite the extensive archaeological literature, their origin has been lost to time. In order to shed light on this and to provide a genetic picture of Iron Age Southern Italy, we collected and sequenced human remains from three archaeological sites geographically located in Northern Apulia (the area historically inhabited by Daunians) and radiocarbon dated between 1157 and 275 calBCE. We find that Iron Age Apulian samples are still distant from the genetic variability of modern-day Apulians; they show a degree of genetic heterogeneity comparable with the cosmopolitan Republican and Imperial Roman civilization, even though a few kilometers and centuries separate them, and they are well inserted into the Iron Age Pan-Mediterranean genetic landscape. Our study provides for the first time a window on the genetic make-up of pre-Roman Apulia, whose increasing connectivity within the Mediterranean landscape would have contributed to laying the foundation for modern genetic variability. In this light, the genetic profile of Daunians may be compatible with an at least partial autochthonous origin, with plausible contributions from the Balkan peninsula.
Subject(s)
DNA, Mitochondrial , DNA, Mitochondrial/genetics , Europe , Italy
ABSTRACT
American populations are one of the most interesting examples of recently admixed groups, where ancestral components from three major continental human groups (Africans, Eurasians and Native Americans) have admixed within the last 15 generations. Recently, several genetic surveys focusing on thousands of individuals shed light on the geography, chronology and relevance of these events. However, even though gene flow could drive adaptive evolution, it is unclear whether and how natural selection acted on the resulting genetic variation in the Americas. In this study, we analysed the patterns of local ancestry of genomic fragments in genome-wide data for ~6000 admixed individuals from 10 American countries. In doing so, we identified regions characterized by a divergent ancestry profile (DAP), in which a significant over- or under-representation of an ancestry is evident. Our results highlighted a series of genomic regions with DAPs associated with immune system response and relevant medical traits, with the longest DAP region encompassing the human leukocyte antigen locus. Furthermore, we found that DAP regions are enriched in genes linked to cancer-related traits and autoimmune diseases. Then, analysing the biological impact of these regions, we showed that natural selection could have acted preferentially towards variants located in coding and non-coding transcripts and characterized by a high deleteriousness score. Taken together, our analyses suggest that shared patterns of post-admixture adaptation occurred at a continental scale in the Americas, more often affecting functional and impactful genomic variants.
Subject(s)
Genetics, Population , Genome, Human , Genomics , Racial Groups/genetics , Selection, Genetic , Americas , Computer Simulation , Genomics/methods , Humans , Models, Genetic , Polymorphism, Single Nucleotide
ABSTRACT
Genomic variation extends from single nucleotide variants to large chromosomal rearrangements, but the extent of structural variation in Homo sapiens is still unclear. Almarri et al. provide a worldwide catalogue of structural variants present in human populations. Most of the reported variation is novel, with some variants being inherited from Neanderthals and Denisovans. Drift and selection shaped the distribution of these variants with some suggested to have functional implications.
Subject(s)
Neanderthals , Genomics , Humans , Neanderthals/genetics
ABSTRACT
Southern Italy was characterised by a complex prehistory that started with different Palaeolithic cultures, later followed by the Neolithization and the demic dispersal from the Pontic-Caspian Steppe during the Bronze Age. Archaeological and historical evidence points to a link between Southern Italians and the Balkans still present in modern times. To shed light on these dynamics, we analysed around 700 South Mediterranean genomes combined with informative ancient DNAs. Our findings revealed high affinities of South-Eastern Italians with modern Eastern Peloponnesians, and a closer affinity of ancient Greek genomes with those from specific regions of South Italy than modern Greek genomes. The higher similarity could be associated with a Bronze Age component ultimately originating from the Caucasus with high Iranian and Anatolian Neolithic ancestries. Furthermore, extremely differentiated allele frequencies between Northern and Southern Italy revealed putatively adapted SNPs in genes involved in alcohol metabolism, nevi features and immunological traits.
Subject(s)
DNA, Ancient , Genome, Human , Archaeology , Humans , Iran , Italy
ABSTRACT
The Italian Peninsula, a natural pier across the Mediterranean Sea, witnessed intricate population events since the very beginning of the human occupation in Europe. In the last few years, an increasing number of modern and ancient genomes from the area have been published by the international research community. This genomic perspective started unveiling the relevance of Italy to understand the post-Last Glacial Maximum (LGM) re-peopling of Europe, the earlier phase of the Neolithic westward migrations, and its linking role between Eastern and Western Mediterranean areas after the Iron Age. However, many open questions are still waiting for more data to be addressed in full. With this review, we summarize the current knowledge emerging from the available ancient Italian individuals and, by re-analysing them all at once, we try to shed light on the avenues future research in the area should cover. In particular, open questions concern (1) the fate of pre-Villabruna Europeans and to what extent their genomic components were absorbed by the post-LGM hunter-gatherers; (2) the role of Sicily and Sardinia before LGM; (3) to what degree the documented genetic structure within the Early Neolithic settlers can be described as two separate migrations; (4) what are the population events behind the marked presence of an Iranian Neolithic-like component in Bronze Age and Iron Age Italian and Southern European samples.
Subject(s)
DNA, Ancient/analysis , Evolution, Molecular , Genetic Variation , Genome, Human , Genomics/history , White People/genetics , White People/history , History, Ancient , History, Medieval , Humans , Italy
ABSTRACT
Puberty is a complex developmental process that varies considerably among individuals and populations. Genetic factors explain a large proportion of the variability of several pubertal traits. Recent genome-wide association studies (GWAS) have identified hundreds of variants involved in traits that result from body growth, like adult height. However, they do not capture many genetic loci involved in growth changes over distinct growth phases. Further, such GWAS have been mostly performed in Europeans, but it is unknown how these findings relate to other continental populations. In this study, we analyzed the genetic basis of three pubertal traits, namely peak height velocity (PV), age at PV (APV) and height at APV (HAPV). We analyzed a cohort of 904 admixed Chilean children and adolescents with European and Mapuche Native American ancestries. Height was measured on roughly a [Formula: see text]month basis from childhood to adolescence between 2006 and 2019. We predict that, on average, HAPV is 4.3 cm higher in European than in Mapuche adolescents (P = 0.042), and APV is 0.73 years later in European compared with Mapuche adolescents (P = 0.023). Further, by performing a GWAS on 774,433 single-nucleotide polymorphisms, we identified a genetic signal harboring 3 linked variants significantly associated with PV in boys (P [Formula: see text]). This signal has never been associated with growth-related traits.
Subject(s)
Indians, South American/genetics , Puberty/genetics , Adolescent , Adolescent Development , Adult , Aging/genetics , Body Height/genetics , Chile , Cohort Studies , Female , Genetic Variation , Genome-Wide Association Study , Humans , Male , White People/genetics
ABSTRACT
The Indus Valley has been the backdrop for several historic and prehistoric population movements between South Asia and West Eurasia. However, the genetic structure of present-day populations from Northwest India is poorly characterized. Here we report new genome-wide genotype data for 45 modern individuals from four Northwest Indian populations, including the Ror, whose long-term occupation of the region can be traced back to the early Vedic scriptures. Our results suggest that although the genetic architecture of most Northwest Indian populations fits well on the broader North-South Indian genetic cline, culturally distinct groups such as the Ror stand out by being genetically more akin to populations living west of India; such populations include prehistorical and early historical ancient individuals from the Swat Valley near the Indus Valley. We argue that this affinity is more likely a result of genetic continuity since the Bronze Age migrations from the Steppe Belt than a result of recent admixture. The observed patterns of genetic relationships both with modern and ancient West Eurasians suggest that the Ror can be used as a proxy for a population descended from the Ancestral North Indian (ANI) population. Collectively, our results show that the Indus Valley populations are characterized by considerable genetic heterogeneity that has persisted over thousands of years.
Subject(s)
Ethnicity/genetics , Genetic Variation/genetics , Asia , Emigration and Immigration , Genetics, Population/methods , Genome-Wide Association Study/methods , Genotype , Geography , Humans , India
ABSTRACT
Recessive dystrophic epidermolysis bullosa (RDEB) is a rare genodermatosis caused by mutations in the gene coding for type VII collagen (COL7A1). More than 800 different pathogenic mutations in COL7A1 have been described to date; however, the ancestral origins of many of these mutations have not been precisely identified. In this study, 32 RDEB patient samples from the Southwestern United States, Mexico, Chile, and Colombia carrying common mutations in the COL7A1 gene were investigated to determine the origins of these mutations and the extent to which shared ancestry contributes to disease prevalence. The results demonstrate both shared European and American origins of RDEB mutations in distinct populations in the Americas and suggest the influence of Sephardic ancestry in at least some RDEB mutations of European origins. Knowledge of ancestry and relatedness among RDEB patient populations will be crucial for the development of future clinical trials and the advancement of novel therapeutics.
Subject(s)
Collagen Type VII/genetics , Epidermolysis Bullosa Dystrophica/genetics , Hispanic or Latino/genetics , Jews/genetics , Chile/epidemiology , Colombia/epidemiology , Epidermolysis Bullosa Dystrophica/epidemiology , Female , Genes, Recessive/genetics , Humans , Male , Mexico/epidemiology , Phenotype , United States/epidemiology
ABSTRACT
Genetic variation in contemporary South Asian populations follows a northwest to southeast decreasing cline of shared West Eurasian ancestry. A growing body of ancient DNA evidence is being used to build increasingly realistic models of demographic changes in the last few thousand years. Through high-quality modern genomes, these models can be tested for gene- and genome-level deviations. Using local ancestry deconvolution and masking, we reconstructed population-specific surrogates of the two main ancestral components for more than 500 samples from 25 South Asian populations and showed our approach to be robust via coalescent simulations. Our f3 and f4 statistics-based estimates reveal that the reconstructed haplotypes are good proxies for the source populations that admixed in the area and point to complex interpopulation relationships within the West Eurasian component, compatible with multiple waves of arrival, as opposed to a simpler one-wave scenario. Our approach also provides reliable local haplotypes for future downstream analyses. As one such example, the local ancestry deconvolution in South Asians reveals opposite selective pressures on two pigmentation genes (SLC45A2 and SLC24A5) that are common or fixed in West Eurasians, suggesting post-admixture purifying and positive selection signals, respectively.
Subject(s)
Genome, Human , Genomics/methods , Adaptation, Biological , Demography , Haplotypes , Humans , India , Pakistan , Phylogeography , Polymorphism, Single Nucleotide , Principal Component Analysis , Selection, Genetic
ABSTRACT
Context: Africa's role in the narrative of human evolution is indisputably emphasised in the emergence of Homo sapiens. However, once humans dispersed beyond Africa, the history of those who stayed remains vastly under-studied, lacking the proper attention the birthplace of both modern and archaic humans deserves. The sequencing of Neanderthal and Denisovan genomes has elucidated evidence of admixture between archaic and modern humans outside of Africa, but has not aided efforts in answering whether archaic admixture happened within Africa. Objectives: This article reviews the state of research for archaic introgression in African populations and discusses recent insights into this topic. Methods: Gathering published sources and recently released preprints, this review reports on the different methods developed for detecting archaic introgression. Particularly it discusses how relevant these are when implemented on African populations and what findings these studies have shown so far. Results: Methods for detecting archaic introgression have been predominantly developed and implemented on non-African populations. Recent preprints present new methods considering African populations. While a number of studies using these methods suggest archaic introgression in Africa, without an African archaic genome to validate these results, such findings remain as putative archaic introgression. Conclusion: In light of the caveats with implementing current archaic introgression detection methods in Africa, we recommend future studies to concentrate on unravelling the complicated demographic history of Africa through means of ancient DNA where possible and through more focused efforts to sequence modern DNA from more representative populations across the African continent.
Subject(s)
Black People/genetics , DNA, Ancient/analysis , Hominidae/genetics , Hybridization, Genetic , Africa , Animals , Genome, Human , Humans
ABSTRACT
BACKGROUND: A number of studies which have investigated isolation patterns in human populations rely on the analysis of intra- and inter-population genetic statistics of mtDNA polymorphisms. However, this approach makes it difficult to differentiate between the effects of long-term genetic isolation and the random fluctuations of statistics due to reduced sample size. AIM: To overcome the confounding effect of sample size when detecting signatures of genetic isolation. SUBJECTS AND METHODS: A re-sampling based procedure was employed to evaluate reduction in intra-population diversity, departure from surrounding genetic background and demographic stationarity in 34 Italian populations subject to isolation factors. RESULTS: Signatures of genetic isolation were detected for all three statistics in seven populations: Pusteria valley, Sappada, Sauris and Timau, settled in the eastern Italian Alps, and Cappadocia, Filettino and Vallepietra, settled in the Apennines. However, this study was unable to find signals for any of the statistics analysed in 19 populations. Finally, eight populations showed signals of isolation for one or two statistics. CONCLUSION: The analysis revealed that the use of population genetic statistics combined with a re-sampling procedure can help detect signatures of genetic isolation in human populations, even using a single, although highly informative, locus like mtDNA.
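The idea of controlling for sample size by re-sampling can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not detail. It assumes haplotypes are given as plain strings, uses Nei's haplotype diversity as the intra-population statistic, and compares a small candidate isolate against size-matched random subsamples of a larger reference population. The function names, the choice of statistic, and the default number of resamples are all illustrative assumptions.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def resampled_diversity_test(isolate, reference, n_resamples=1000, seed=0):
    """Compare the isolate's diversity with size-matched reference subsamples.

    Returns the observed diversity of the isolate and the fraction of
    reference subsamples (drawn without replacement, same size as the
    isolate) whose diversity is less than or equal to the observed value.
    A small fraction suggests reduced diversity that is not explained by
    the small sample size alone. Hypothetical helper, for illustration only.
    """
    k = len(isolate)
    if len(reference) < k:
        raise ValueError("reference must be at least as large as the isolate")
    rng = random.Random(seed)
    observed = haplotype_diversity(isolate)
    hits = 0
    for _ in range(n_resamples):
        subsample = rng.sample(reference, k)
        if haplotype_diversity(subsample) <= observed:
            hits += 1
    return observed, hits / n_resamples
```

Because every subsample has the same size as the isolate, any difference in the statistic reflects population composition rather than the number of individuals sampled, which is the confounding effect the abstract aims to remove.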
Subject(s)
DNA, Mitochondrial/genetics , Polymorphism, Genetic , Reproductive Isolation , Gene Flow , Humans , Italy , Sample Size
ABSTRACT
A consensus on Bantu-speaking populations being genetically similar has emerged in the last few years, but the demographic scenarios associated with their dispersal are still a matter of debate. The frontier model proposed by archeologists postulates different degrees of interaction among incoming agropastoralist and resident foraging groups in the presence of "static" and "moving" frontiers. By combining mitochondrial DNA and Y chromosome data collected from several southern African populations, we show that Bantu-speaking populations from regions characterized by a moving frontier developing after a long-term static frontier have larger hunter-gatherer contributions than groups from areas where a static frontier was not followed by further spatial expansion. Differences in the female and male components suggest that the process of assimilation of the long-term resident groups into agropastoralist societies was gender biased. Our results show that the diffusion of Bantu languages and culture in Southern Africa was a process more complex than previously described and suggest that the admixture dynamics between farmers and foragers played an important role in shaping the current patterns of genetic diversity.