ABSTRACT
The Levant is a region in the Near East with an impressive record of continuous human existence and major cultural developments since the Paleolithic period. Genetic and archeological studies present solid evidence placing the Middle East and the Arabian Peninsula as the first stepping-stone outside Africa. There is, however, little understanding of demographic changes in the Middle East, particularly the Levant, after the first Out-of-Africa expansion and how the Levantine peoples relate genetically to each other and to their neighbors. In this study we analyze more than 500,000 genome-wide SNPs in 1,341 new samples from the Levant and compare them to samples from 48 populations worldwide. Our results show that recent genetic stratifications in the Levant are driven by the religious affiliations of the populations within the region. Cultural changes within the last two millennia appear to have facilitated/maintained admixture between culturally similar populations from the Levant, Arabian Peninsula, and Africa. The same cultural changes seem to have resulted in genetic isolation of other groups by limiting admixture with culturally different neighboring populations. Consequently, Levant populations today fall into two main groups: one sharing more genetic characteristics with modern-day Europeans and Central Asians, and the other with closer genetic affinities to other Middle Easterners and Africans. Finally, we identify a putative Levantine ancestral component that diverged from other Middle Easterners ~23,700-15,500 years ago during the last glacial period, and diverged from Europeans ~15,900-9,100 years ago between the last glacial warming and the start of the Neolithic.
Subject(s)
Human Y Chromosomes/genetics , Mitochondrial DNA/genetics , Genetic Variation , Population Genetics , Archaeology , Black People , Cultural Evolution , Ethnicity/genetics , Human Genome , Haplotypes , Humans , Middle East , Phylogeny , White People
ABSTRACT
Basque people have received considerable attention from anthropologists, geneticists, and linguists during the last century due to the singularity of their language and to other cultural and biological characteristics. Despite the multidisciplinary efforts to address the questions of the origin, uniqueness, and heterogeneity of Basques, genetic studies to date have suffered from a weak study design in which populations are not analyzed in an adequate geographic and population context. To address these questions and to overcome these design limitations, we have analyzed the uniparentally inherited markers (Y chromosome and mitochondrial DNA) of ~900 individuals from 18 populations, including those where Basque is currently spoken and populations from adjacent regions where Basque might have been spoken in historical times. Our results indicate that Basque-speaking populations fall within the genetic Western European gene pool, that they are similar to geographically surrounding non-Basque populations, and also that their genetic uniqueness is based on a lower level of external influence compared with other Iberian and French populations. Our data suggest that the genetic heterogeneity and structure observed in the Basque region result from pre-Roman tribal structure related to geography and might be linked to the increased complexity of emerging societies during the Bronze Age. The rough overlap of the pre-Roman tribe locations and the current dialect limits supports the notion that the environmental diversity in the region has played a recurrent role in cultural differentiation and ethnogenesis at different time periods.
Subject(s)
Genetic Markers , White People/genetics , Human Y Chromosomes , Mitochondrial DNA , Ethnicity/genetics , Population Genetics , Geography , Haplotypes , Humans
ABSTRACT
We analyzed 40 single nucleotide polymorphism and 19 short tandem repeat Y-chromosomal markers in a large sample of 1,525 indigenous individuals from 14 populations in the Caucasus and 254 additional individuals representing potential source populations. We also employed a lexicostatistical approach to reconstruct the history of the languages of the North Caucasian family spoken by the Caucasus populations. We found a different major haplogroup to be prevalent in each of four sets of populations that occupy distinct geographic regions and belong to different linguistic branches. The haplogroup frequencies correlated with geography and, even more strongly, with language. Within haplogroups, a number of haplotype clusters were shown to be specific to individual populations and languages. The data suggested a direct origin of Caucasus male lineages from the Near East, followed by high levels of isolation, differentiation, and genetic drift in situ. Comparison of genetic and linguistic reconstructions covering the last few millennia showed striking correspondences between the topology and dates of the respective gene and language trees and with documented historical events. Overall, in the Caucasus region, unmatched levels of gene-language coevolution occurred within geographically isolated populations, probably due to its mountainous terrain.
Subject(s)
Molecular Evolution , Language , Phylogeny , White People/genetics , Asian People/genetics , Human Y Chromosomes , Gene Pool , Population Genetics , Haplotypes , Humans , Linguistics , Male , Microsatellite Repeats , Single Nucleotide Polymorphism , Russia , DNA Sequence Analysis
ABSTRACT
Population origins and ancestry have previously been found to be important determinants of coronary artery disease (CAD). This study investigates associations of Lebanese mitochondrial DNA lineages with CAD and studies their correlation with other populations, exploring population structures that may infer mitochondria functional associations and reveal population movements and origins. Sequencing the mitochondrial hypervariable sequence 1 (HVS-1) of 363 controls and 448 cases revealed that haplogroup W was more frequent (P = 0.013) in cases compared to controls, and was associated with increased risk of CAD (OR = 5.50, 95% CI = 1.50-35.30, P = 0.026) among Lebanese samples. Haplogroup A was only found in controls (P = 0.029). We have detected stronger geographic correlation between haplogroup W and CAD (Pearson's r = 0.316, P < 0.001) than between haplogroup A and CAD (r = 0.149, P < 0.001). HVS-1 phylogenetic network of haplogroup W shows controls are restricted to European clusters while cases belong mostly to Middle Eastern natives. The network of haplogroup A shows that the controls belong to a cluster dominated by Central Asians. Our results show evidence of a gene flow into Lebanon, creating CAD-associated population structures that are similar to those in the source populations, maintained by limited admixture, and probably encompassing variations on the nuclear and/or the mitochondrial genome that are correlated with the disease.
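The odds ratio reported above for haplogroup W can be illustrated with a minimal sketch of how an odds ratio and an approximate 95% confidence interval are obtained from a 2x2 case/control table. The abstract does not give the underlying carrier counts or say how its interval was computed, so the counts below are purely hypothetical and the Woolf (log-scale) interval is an assumption chosen for simplicity; the function name `odds_ratio_ci` is likewise illustrative.

```python
import math


def odds_ratio_ci(case_with, case_without, ctrl_with, ctrl_without, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for a 2x2 table.

    z=1.96 gives an approximate 95% interval.
    """
    cells = (case_with, case_without, ctrl_with, ctrl_without)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    # Cross-product ratio: odds of carrying the haplogroup in cases vs controls.
    or_ = (case_with * ctrl_without) / (case_without * ctrl_with)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Hypothetical counts for illustration only; these are NOT the study's data.
or_, lo, hi = odds_ratio_ci(case_with=12, case_without=88, ctrl_with=3, ctrl_without=97)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

With sparse cells (few carriers among controls, as is likely for a rare haplogroup), the interval becomes wide and exact or regression-based methods are generally preferred over this approximation.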
Subject(s)
Coronary Artery Disease/genetics , Mitochondrial DNA , Gene Flow , Haplotypes , Adult , Africa , Asian People/genetics , Case-Control Studies , Female , Genetic Predisposition to Disease , Genetic Variation , Humans , Lebanon , Male , Middle Aged , Middle East , Phylogeography , White People/genetics
ABSTRACT
We investigated whether relative rates of divergence were correlated between the mitochondrial and chloroplast genomes as expected under lineage effects or were genome specific as expected with locus-specific effects. Five mitochondrial noncoding regions (nad1B_C, nad4exon1_2, nad7exon2_3, nad7exon3_4, and rps14-cob) for 21 samples from Lecythidaceae were sequenced. Three chloroplast regions (rpl20-5'rps12, trnS-trnG, and psbA-trnH) were sequenced to expand the taxa in an existing data set. Absolute rates of nucleotide and insertion and deletion (indel) changes were 13 times faster in the chloroplast genome than in the mitochondrial genome. Similar indel length frequency distributions for both organelles suggested that common mechanisms were responsible for generating indels. Molecular clock tests applied to phylogenetic trees estimated from mitochondrial and chloroplast sequences revealed global rate heterogeneity of nucleotide substitution. Maximum likelihood and Tajima's 1D relative rate tests show that Lecythis zabucajo exhibited a rate acceleration for both the mitochondrial and chloroplast sequences. Whereas Eschweilera romeu-cardosoi showed a significant rate slowdown for chloroplast sequences, the mitochondrial sequences for 3 Eschweilera taxa showed evidence for a rate slowdown only when compared with L. zabucajo. Significant rate heterogeneity was also observed for indel changes in the mitochondrial genome but not for the chloroplast. The lack of mitochondrial nucleotide changes for some taxa as well as chloroplast indel homoplasy may have limited the power of relative rate tests to detect rate variation. Relative ratio tests consistently indicated rate proportionality among branch lengths between the mitochondrial and chloroplast phylogenetic trees. The relative ratio tests showed that taxa possessing rate heterogeneity had parallel relative divergence rates in both mitochondrial and chloroplast sequences as expected under lineage effects. 
A neutral replication-dependent model of rate heterogeneity for both nucleotide and indel changes provides a simple explanation for common patterns of rate heterogeneity across the 2 organelle genomes in Lecythidaceae. The lineage effects observed here were uncoupled from annual/perennial habit because all the species from this study are perennial.
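Tajima's 1D relative rate test, mentioned above, can be sketched as follows. This is a minimal illustration, assuming three pre-aligned sequences (two ingroup taxa and an outgroup) given as plain strings; the function name `tajima_1d` and the toy sequences are hypothetical and are not taken from the study. The test counts sites at which only the first ingroup sequence differs (m1) and sites at which only the second differs (m2), then compares (m1 - m2)^2 / (m1 + m2) with a chi-square distribution with one degree of freedom.

```python
import math


def tajima_1d(seq1, seq2, outgroup):
    """Tajima's 1D relative rate test on three aligned sequences.

    Returns (m1, m2, chi2, p), where m1 and m2 are the numbers of sites
    unique to seq1 and seq2 respectively.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, c in zip(seq1, seq2, outgroup):
        if "-" in (a, b, c):
            continue  # skip alignment gaps
        if a != b and b == c:
            m1 += 1  # change unique to the seq1 lineage
        elif a != b and a == c:
            m2 += 1  # change unique to the seq2 lineage
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0  # no informative sites
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p


# Toy sequences for illustration only.
print(tajima_1d("ACGTACGTAA", "ACGTACGTCC", "ACGTACGTCC"))
```

When m1 + m2 is small, as the abstract notes for mitochondrial nucleotide changes in some taxa, the chi-square approximation has little power and a non-significant result is weak evidence of rate constancy.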
Subject(s)
Bertholletia/genetics , Chloroplasts/genetics , Mitochondrial DNA/genetics , Plant DNA/genetics , Molecular Evolution , Mitochondria/genetics , Base Sequence , Plant Genome , Likelihood Functions , Molecular Sequence Data , Phylogeny , Sequence Alignment
ABSTRACT
The mitochondrial DNA hypervariable segment I (HVS-I) is widely used in studies of human evolutionary genetics, and therefore accurate estimates of mutation rates among nucleotide sites in this region are essential. We have developed a novel maximum-likelihood methodology for estimating site-specific mutation rates from partial phylogenetic information, such as haplogroup association. The resulting estimation problem is a generalized linear model, with a nonstandard link function. We develop inference and bias correction tools for our estimates and a hypothesis-testing approach for site independence. We demonstrate our methodology using 16,609 HVS-I samples from the Genographic Project. Our results suggest that mutation rates among nucleotide sites in HVS-I are highly variable. The 16,400-16,500 region exhibits significantly lower rates compared to other regions, suggesting potential functional constraints. Several loci identified in the literature as possible termination-associated sequences (TAS) do not yield statistically slower rates than the rest of HVS-I, casting doubt on their functional importance. Our tests do not reject the null hypothesis of independent mutation rates among nucleotide sites, supporting the use of site-independence assumption for analyzing HVS-I. Potential extensions of our methodology include its application to estimation of mutation rates in other genetic regions, like Y chromosome short tandem repeats.
Subject(s)
Mitochondrial DNA/genetics , Genetic Models , Phylogeny , Point Mutation/genetics , Computer Simulation , Genetic Variation , Haplotypes/genetics , Humans , Likelihood Functions
ABSTRACT
The biological role of the mitochondrial DNA (mtDNA) control region in mtDNA replication remains unclear. In a worldwide survey of mtDNA variation in the general population, we have identified a novel large control region deletion spanning positions 16154 to 16307 (m.16154_16307del154). The population prevalence of this deletion is low, since it was only observed in 1 out of over 120,000 mtDNA genomes studied. The deletion is present in a nonheteroplasmic state, and was transmitted by a mother to her two sons with no apparent past or present disease conditions. The identification of this large deletion in healthy individuals challenges the current view of the control region as playing a crucial role in the regulation of mtDNA replication, and supports the existence of a more complex system of multiple or epigenetically-determined replication origins.
Subject(s)
DNA Replication , Mitochondrial DNA/genetics , Locus Control Region , Sequence Deletion , Base Sequence , DNA Mutational Analysis , Female , Humans , Male , Mitochondria/genetics , Molecular Sequence Data
ABSTRACT
BACKGROUND: Differences in plant annual/perennial habit are hypothesized to cause a generation time effect on divergence rates. Previous studies that compared rates of divergence for internal transcribed spacer (ITS1 and ITS2) sequences of nuclear ribosomal DNA (nrDNA) in angiosperms have reached contradictory conclusions about whether differences in generation times (or other life history features) are associated with divergence rate heterogeneity. We compared annual/perennial ITS divergence rates using published sequence data, employing sampling criteria to control for possible artifacts that might obscure any actual rate variation caused by annual/perennial differences. RESULTS: Relative rate tests employing ITS sequences from 16 phylogenetically-independent annual/perennial species pairs rejected rate homogeneity in only a few comparisons, with annuals more frequently exhibiting faster substitution rates. Treating branch length differences categorically (annual faster or perennial faster regardless of magnitude) with a sign test often indicated an excess of annuals with faster substitution rates. Annuals showed an approximately 1.6-fold rate acceleration in nucleotide substitution models for ITS. Relative rates of three nuclear loci and two chloroplast regions for the annual Arabidopsis thaliana compared with two closely related Arabidopsis perennials indicated that divergence was faster for the annual. In contrast, A. thaliana ITS divergence rates were sometimes faster and sometimes slower than the perennial. In simulations, divergence rate differences of at least 3.5-fold were required to reject rate constancy in > 80% of replicates using a nucleotide substitution model observed for the combination of ITS1 and ITS2. Simulations also showed that categorical treatment of branch length differences detected rate heterogeneity > 80% of the time with a 1.5-fold or greater rate difference.
CONCLUSION: Although rate homogeneity was not rejected in many comparisons, in cases of significant rate heterogeneity annuals frequently exhibited faster substitution rates. Our results suggest that annual taxa may exhibit a less than 2-fold rate acceleration at ITS. Since the rate difference is small and ITS lacks statistical power to reject rate homogeneity, further studies with greater power will be required to adequately test the hypothesis that annual and perennial plants have heterogeneous substitution rates. Arabidopsis sequence data suggest that relative rate tests based on multiple loci may be able to distinguish a weak acceleration in annual plants. The failure to detect rate heterogeneity with ITS in past studies may be largely a product of low statistical power.
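The categorical treatment described in the RESULTS (annual faster or perennial faster, regardless of magnitude) amounts to a sign test. The following is a minimal sketch of an exact two-sided sign test, assuming ties are dropped beforehand; the function name `sign_test` and the 12-versus-4 split are hypothetical, chosen only to match the 16 species pairs mentioned above, and are not the study's observed counts.

```python
from math import comb


def sign_test(n_annual_faster, n_perennial_faster):
    """Exact two-sided sign test p-value under a 50:50 null.

    Under rate homogeneity, each pair is equally likely to show the annual
    or the perennial as the faster lineage, so the count of 'annual faster'
    pairs follows Binomial(n, 0.5).
    """
    n = n_annual_faster + n_perennial_faster
    k = max(n_annual_faster, n_perennial_faster)
    # One tail: probability of a split at least as extreme as the observed one.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Hypothetical split across 16 pairs; NOT the study's observed counts.
print(sign_test(12, 4))
```

Because the sign test ignores the magnitude of each branch length difference, it can pick up a consistent but small rate excess that individual relative rate tests lack the power to detect, which is in line with the simulation result reported above.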
Subject(s)
Cell Nucleus/genetics , Ribosomal Spacer DNA/chemistry , Molecular Evolution , Magnoliopsida/genetics , Arabidopsis/genetics , Computer Simulation , Plant DNA/chemistry , Plant Genes , Plant Genome , Magnoliopsida/classification , Ribosomes/genetics , Ribosomes/metabolism
ABSTRACT
The Middle East was a funnel of human expansion out of Africa, a staging area for the Neolithic Agricultural Revolution, and the home to some of the earliest world empires. Expansions into the region after the Last Glacial Maximum (LGM) and subsequent population movements created a striking genetic mosaic with distinct sex-based genetic differentiation. While prior studies have examined the mtDNA and Y-chromosome contrast in focal populations in the Middle East, none have undertaken a broad-spectrum survey including North and sub-Saharan Africa, Europe, and Middle Eastern populations. In this study 5,174 mtDNA and 4,658 Y-chromosome samples were investigated using PCA, MDS, mean-linkage clustering, AMOVA, and Fisher exact tests of F(ST)'s, R(ST)'s, and haplogroup frequencies. Geographic differentiation in affinities of Middle Eastern populations with Africa and Europe showed distinct contrasts between mtDNA and Y-chromosome data. Specifically, Lebanon's mtDNA shows a very strong association to Europe, while Yemen shows very strong affinity with Egypt and North and East Africa. Previous Y-chromosome results showed a Levantine coastal-inland contrast marked by J1 and J2, and a very strong North African component was evident throughout the Middle East. Neither of these patterns was observed in the mtDNA. While J2 has penetrated into Europe, the pattern of Y-chromosome diversity in Lebanon does not show the widespread affinities with Europe indicated by the mtDNA data. Lastly, while each population shows evidence of connections with expansions that now define the Middle East, Africa, and Europe, many of the populations in the Middle East show distinctive mtDNA and Y-haplogroup characteristics that indicate long-standing settlement with relatively little impact from and movement into other populations.
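As a rough illustration of how differentiation between two populations can be summarized from haplogroup frequencies, the sketch below computes a Nei-style G(ST), a simple F(ST) analogue. This is an assumption made for clarity: the abstract does not specify which F(ST) estimator was used, and the function names and frequency values here are hypothetical, not data from the study.

```python
def gene_diversity(freqs):
    """Expected diversity: probability that two random draws differ."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(pop_a, pop_b):
    """Nei-style G_ST for two populations with equal weights.

    pop_a and pop_b map haplogroup names to frequencies summing to 1.
    """
    haplogroups = set(pop_a) | set(pop_b)
    # Frequencies in the pooled (total) population.
    pooled = [(pop_a.get(h, 0.0) + pop_b.get(h, 0.0)) / 2 for h in haplogroups]
    h_total = gene_diversity(pooled)
    # Mean within-population diversity.
    h_within = (gene_diversity(pop_a.values()) + gene_diversity(pop_b.values())) / 2
    if h_total == 0:
        return 0.0  # both populations fixed for the same haplogroup
    return (h_total - h_within) / h_total


# Hypothetical haplogroup frequencies for illustration only.
coastal = {"J2": 0.6, "J1": 0.2, "E": 0.2}
inland = {"J2": 0.2, "J1": 0.6, "E": 0.2}
print(nei_gst(coastal, inland))
```

Values near 0 indicate that the two populations share similar haplogroup frequencies, while values near 1 indicate that most of the diversity lies between rather than within them.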
Subject(s)
Human Y Chromosomes , Mitochondrial DNA/genetics , Racial Groups/genetics , Africa , Cluster Analysis , Europe , Gene Frequency , Population Genetics , Haplotypes , Humans , Middle East , Phylogeny , Phylogeography , Single Nucleotide Polymorphism
ABSTRACT
Afghanistan has held a strategic position throughout history. It has been inhabited since the Paleolithic and later became a crossroad for expanding civilizations and empires. Afghanistan's location, history, and diverse ethnic groups present a unique opportunity to explore how nations and ethnic groups emerged, and how major cultural evolutions and technological developments in human history have influenced modern population structures. In this study we have analyzed, for the first time, the four major ethnic groups in present-day Afghanistan: Hazara, Pashtun, Tajik, and Uzbek, using 52 binary markers and 19 short tandem repeats on the non-recombinant segment of the Y-chromosome. A total of 204 Afghan samples were investigated along with more than 8,500 samples from surrounding populations important to Afghanistan's history through migrations and conquests, including Iranians, Greeks, Indians, Middle Easterners, East Europeans, and East Asians. Our results suggest that all current Afghans largely share a heritage derived from a common unstructured ancestral population that could have emerged during the Neolithic revolution and the formation of the first farming communities. Our results also indicate that inter-Afghan differentiation started during the Bronze Age, probably driven by the formation of the first civilizations in the region. Later migrations and invasions into the region have been assimilated differentially among the ethnic groups, increasing inter-population genetic differences, and giving the Afghans a unique genetic diversity in Central Asia.
Subject(s)
Human Y Chromosomes/genetics , Ethnicity/genetics , Afghanistan/ethnology , Humans , Principal Component Analysis
ABSTRACT
Previous studies that pooled Indian populations from a wide variety of geographical locations have obtained contradictory conclusions about the processes of the establishment of the Varna caste system and its genetic impact on the origins and demographic histories of Indian populations. To further investigate these questions we took advantage of the fact that both Y chromosome and caste designation are paternally inherited, and genotyped 1,680 Y chromosomes representing 12 tribal and 19 non-tribal (caste) endogamous populations from the predominantly Dravidian-speaking Tamil Nadu state in the southernmost part of India. Tribes and castes were both characterized by an overwhelming proportion of putatively Indian autochthonous Y-chromosomal haplogroups (H-M69, F-M89, R1a1-M17, L1-M27, R2-M124, and C5-M356; 81% combined) with a shared genetic heritage dating back to the late Pleistocene (10-30 Kya), suggesting that more recent Holocene migrations from western Eurasia contributed <20% of the male lineages. We found strong evidence for genetic structure, associated primarily with the current mode of subsistence. Coalescence analysis suggested that the social stratification was established 4-6 Kya and there was little admixture during the last 3 Kya, implying a minimal genetic impact of the Varna (caste) system from the historically-documented Brahmin migrations into the area. In contrast, the overall Y-chromosomal patterns, the time depth of population diversifications and the period of differentiation were best explained by the emergence of agricultural technology in South Asia. These results highlight the utility of detailed local genetic studies within India, without prior assumptions about the importance of Varna rank status for population grouping, to obtain new insights into the relative influences of past demographic events for the population structure of the whole of modern India.
Subject(s)
Human Y Chromosomes , Population Genetics , Agriculture , Mitochondrial DNA/genetics , Demography , Ethnicity/genetics , Genetic Variation , Geography , Haplotypes , Human Migration , Humans , India/ethnology , Male , Microsatellite Repeats/genetics , Statistical Models , Mutation , Phylogeny , Social Class
ABSTRACT
Cultural expansions, including of religions, frequently leave genetic traces of differentiation and in-migration. These expansions may be driven by complex doctrinal differentiation, together with major population migrations and gene flow. The aim of this study was to explore the genetic signature of the establishment of religious communities in a region where some of the most influential religions originated, using the Y chromosome as an informative male-lineage marker. A total of 3139 samples were analyzed, including 647 Lebanese and Iranian samples newly genotyped for 28 binary markers and 19 short tandem repeats on the non-recombinant segment of the Y chromosome. Genetic organization was identified by geography and religion across Lebanon in the context of surrounding populations important in the expansions of the major sects of Lebanon, including Italy, Turkey, the Balkans, Syria, and Iran by employing principal component analysis, multidimensional scaling, and AMOVA. Timing of population differentiations was estimated using BATWING, in comparison with dates of historical religious events to determine if these differentiations could be caused by religious conversion, or rather, whether religious conversion was facilitated within already differentiated populations. Our analysis shows that the great religions in Lebanon were adopted within already distinguishable communities. Once religious affiliations were established, subsequent genetic signatures of the older differentiations were reinforced. Post-establishment differentiations are most plausibly explained by migrations of peoples seeking refuge to avoid the turmoil of major historical events.
Subject(s)
Human Y Chromosomes , Population Genetics , Population Groups , Emigration and Immigration , Gene Flow , Genotype , Geography , Humans , Iran , Italy , Lebanon , Male , Microsatellite Repeats , Syria , Turkey
ABSTRACT
Insertions and deletions (indels) in chloroplast noncoding regions are common genetic markers to estimate population structure and gene flow, although relatively little is known about indel evolution among recently diverged lineages such as within plant families. Because indel events tend to occur nonrandomly along DNA sequences, recurrent mutations may generate homoplasy for indel haplotypes. This is a potential problem for population studies, because indel haplotypes may be shared among populations after recurrent mutation as well as gene flow. Furthermore, indel haplotypes may differ in fitness and therefore be subject to natural selection detectable as rate heterogeneity among lineages. Such selection could contribute to the spatial patterning of cpDNA haplotypes, greatly complicating the interpretation of cpDNA population structure. This study examined both nucleotide and indel cpDNA variation and divergence at six noncoding regions (psbB-psbH, atpB-rbcL, trnL-trnH, rpl20-5'rps12, trnS-trnG, and trnH-psbA) in 16 individuals from eight species in the Lecythidaceae and a Sapotaceae outgroup. We described patterns of cpDNA changes, assessed the level of indel homoplasy, and tested for rate heterogeneity among lineages and regions. Although regression analysis of branch lengths suggested some degree of indel homoplasy among the most divergent lineages, there was little evidence for indel homoplasy within the Lecythidaceae. Likelihood ratio tests applied to the entire phylogenetic tree revealed a consistent pattern rejecting a molecular clock. Tajima's 1D and 2D tests revealed two taxa with consistent rate heterogeneity, one showing relatively more and one relatively fewer changes than other taxa. In general, nucleotide changes showed more evidence of rate heterogeneity than did indel changes. 
The rate of evolution was highly variable among the six cpDNA regions examined, with the trnS-trnG and trnH-psbA regions showing as much as 10% and 15% divergence within the Lecythidaceae. Deviations from rate homogeneity in the two taxa were constant across cpDNA regions, consistent with lineage-specific rates of evolution rather than cpDNA region-specific natural selection. There is no evidence that indels are more likely than nucleotide changes to experience homoplasy within the Lecythidaceae. These results support a neutral interpretation of cpDNA indel and nucleotide variation in population studies within species such as Corythophora alta.