RESUMO
The expansion of people speaking Bantu languages is the most dramatic demographic event in Late Holocene Africa and fundamentally reshaped the linguistic, cultural and biological landscape of the continent1-7. With a comprehensive genomic dataset, including newly generated data of modern-day and ancient DNA from previously unsampled regions in Africa, we contribute insights into this expansion that started 6,000-4,000 years ago in western Africa. We genotyped 1,763 participants, including 1,526 Bantu speakers from 147 populations across 14 African countries, and generated whole-genome sequences from 12 Late Iron Age individuals8. We show that genetic diversity amongst Bantu-speaking populations declines with distance from western Africa, with current-day Zambia and the Democratic Republic of Congo as possible crossroads of interaction. Using spatially explicit methods9 and correlating genetic, linguistic and geographical data, we provide cross-disciplinary support for a serial-founder migration model. We further show that Bantu speakers received significant gene flow from local groups in regions they expanded into. Our genetic dataset provides an exhaustive modern-day African comparative dataset for ancient DNA studies10 and will be important to a wide range of disciplines from science and humanities, as well as to the medical sector studying human genetic variation and health in African and African-descendant populations.
Assuntos
DNA Antigo , Emigração e Imigração , Genética Populacional , Idioma , Humanos , África Ocidental , Conjuntos de Dados como Assunto , República Democrática do Congo , DNA Antigo/análise , Emigração e Imigração/história , Efeito Fundador , Fluxo Gênico/genética , Variação Genética/genética , História Antiga , Idioma/história , Linguística/história , Zâmbia , Mapeamento GeográficoRESUMO
South Eastern Bantu-speaking (SEB) groups constitute more than 80% of the population in South Africa. Despite clear linguistic and geographic diversity, the genetic differences between these groups have not been systematically investigated. Based on genome-wide data of over 5000 individuals, representing eight major SEB groups, we provide strong evidence for fine-scale population structure that broadly aligns with geographic distribution and is also congruent with linguistic phylogeny (separation of Nguni, Sotho-Tswana and Tsonga speakers). Although differential Khoe-San admixture plays a key role, the structure persists after Khoe-San ancestry-masking. The timing of admixture, levels of sex-biased gene flow and population size dynamics also highlight differences in the demographic histories of individual groups. The comparisons with five Iron Age farmer genomes further support genetic continuity over ~400 years in certain regions of the country. Simulated trait genome-wide association studies further show that the observed population structure could have major implications for biomedical genomics research in South Africa.