ABSTRACT
African antelope diversity is a globally unique vestige of a much richer world-wide Pleistocene megafauna. Despite this, the evolutionary processes leading to the prolific radiation of African antelopes are not well understood. Here, we sequenced 145 whole genomes from both subspecies of the waterbuck (Kobus ellipsiprymnus), an African antelope believed to be in the process of speciation. We investigated genetic structure and population divergence and found evidence of a mid-Pleistocene separation on either side of the eastern Great Rift Valley, consistent with vicariance caused by a rain shadow along the so-called 'Kingdon's Line'. However, we also found pervasive evidence of both recent and widespread historical gene flow across the Rift Valley barrier. By inferring the genome-wide landscape of variation among subspecies, we found 14 genomic regions of elevated differentiation, including a locus that may be related to each subspecies' distinctive coat pigmentation pattern. We investigated these regions as candidate speciation islands. However, we observed no significant reduction in gene flow in these regions, nor any indications of selection against hybrids. Altogether, these results suggest a pattern whereby climatically driven vicariance is the most important process driving the African antelope radiation, and suggest that reproductive isolation may not set in until very late in the divergence process. This has a significant impact on taxonomic inference, as many taxa will be in a gray area of ambiguous systematic status, possibly explaining why it has been hard to achieve consensus regarding the species status of many African antelopes. Our analyses demonstrate how population genetics based on low-depth whole genome sequencing can provide new insights that can help resolve how far lineages have gone along the path to speciation.
ABSTRACT
Impalas are unusual among bovids because they have remained morphologically similar over millions of years, a phenomenon referred to as evolutionary stasis. Here, we sequenced 119 whole genomes from the two extant subspecies of impala, the common (Aepyceros melampus melampus) and black-faced (A. m. petersi) impala. We investigated the evolutionary forces working within the species to explore how they might be associated with its evolutionary stasis as a taxon. Despite being one of the most abundant bovid species, we found low genetic diversity overall, and a phylogeographic signal of spatial expansion from southern to eastern Africa. Contrary to expectations under a scenario of evolutionary stasis, we found pronounced genetic structure between and within the two subspecies with indications of ancient, but not recent, gene flow. Black-faced impala and eastern African common impala populations had more runs of homozygosity than common impala in southern Africa, and, using a proxy for genetic load, we found that natural selection is working less efficiently in these populations compared to the southern African populations. Together with the fossil record, our results are consistent with a fixed-optimum model of evolutionary stasis, in which impalas in the southern African core of the range are able to stay near their evolutionary fitness optimum as a generalist ecotone species, whereas eastern African impalas may struggle to do so due to the effects of genetic drift and reduced adaptation to the local habitat, leading to recurrent local extinction in eastern Africa and re-colonisation from the south.
ABSTRACT
Strong genetic structure has prompted discussion regarding giraffe taxonomy [1,2,3], including a suggestion to split the giraffe into four species: Northern (Giraffa c. camelopardalis), Reticulated (G. c. reticulata), Masai (G. c. tippelskirchi), and Southern giraffes (G. c. giraffa) [4,5,6]. However, their evolutionary history is not yet fully resolved, as previous studies used a simple bifurcating model and did not explore the presence or extent of gene flow between lineages. We therefore inferred a model that incorporates various evolutionary processes to assess the drivers of contemporary giraffe diversity. We analyzed whole-genome sequencing data from 90 wild giraffes from 29 localities across their current distribution. The most basal divergence was dated to 280 kya. Genetic differentiation, FST, among major lineages ranged between 0.28 and 0.62, and we found significant levels of ancient gene flow between them. In particular, several analyses suggested that the Reticulated lineage evolved through admixture, with almost equal contribution from the Northern lineage and an ancestral lineage related to Masai and Southern giraffes. These new results highlight a scenario of strong differentiation despite gene flow, providing further context for the interpretation of giraffe diversity and the process of speciation in general. They also illustrate that conservation measures need to target various lineages and sublineages and that separate management strategies are needed to conserve giraffe diversity effectively. Given local extinctions and recent dramatic declines in many giraffe populations, this improved understanding of giraffe evolutionary history is relevant for conservation interventions, including reintroductions and reinforcements of existing populations.
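As an illustration of how a pairwise FST value like those quoted above can be computed, the following is a minimal sketch of Hudson's estimator, taken as a ratio of averages across sites. The abstract does not state which estimator the authors used, so the choice of Hudson's form, the function name `hudson_fst`, and the toy allele frequencies are all assumptions made for illustration.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST as a ratio of averages across sites.

    p1, p2: per-site allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes in each population.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that one allele drawn from each population differs.
        denominator += a * (1 - b) + b * (1 - a)
    return numerator / denominator


# Toy frequencies at three sites (hypothetical, not data from the study).
toy_fst = hudson_fst([0.9, 0.2, 0.5], [0.1, 0.8, 0.5], n1=20, n2=20)
print(round(toy_fst, 3))
```

Summing numerator and denominator over sites before dividing keeps low-frequency sites from dominating the estimate, which is why the ratio-of-averages form is often preferred over averaging per-site ratios.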
Subjects
Giraffes, Animals, Giraffes/genetics, Ruminants/genetics, Biological Evolution, Phylogeny, Genetic Drift
ABSTRACT
The 3-dimensional spatial and 2-dimensional frontal QRS-T angles are measures derived from the vectorcardiogram. They are independent risk predictors for arrhythmia, but the underlying biology is unknown. Using multi-ancestry genome-wide association studies, we identify 61 (58 previously unreported) loci for the spatial QRS-T angle (N = 118,780) and 11 for the frontal QRS-T angle (N = 159,715). Seven out of the 61 spatial QRS-T angle loci have not been reported for other electrocardiographic measures. Enrichments are observed in pathways related to cardiac and vascular development, muscle contraction, and hypertrophy. Pairwise genome-wide association studies with classical ECG traits identify shared genetic influences with PR interval and QRS duration. Phenome-wide scanning indicates associations with atrial fibrillation, atrioventricular block and arterial embolism, and genetically determined QRS-T angle measures are associated with fascicular and bundle branch block (and also atrioventricular block for the frontal QRS-T angle). We identify potential biology involved in the QRS-T angle, and its genetic relationships with cardiovascular traits and diseases may inform future research and risk prediction.
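For readers unfamiliar with the two measures, the sketch below shows one common way they are defined: the spatial angle as the angle between the QRS and T vectors in three dimensions, and the frontal angle as the absolute difference between the QRS and T axes in the frontal plane, folded into the range 0 to 180 degrees. The abstract does not describe how the vectors were derived from the ECG, so the function names and input conventions here are assumptions.

```python
import math


def spatial_qrs_t_angle(qrs, t):
    """Angle in degrees between a 3-D QRS vector and a 3-D T vector."""
    dot = sum(a * b for a, b in zip(qrs, t))
    norms = math.sqrt(sum(a * a for a in qrs)) * math.sqrt(sum(b * b for b in t))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cosine))


def frontal_qrs_t_angle(qrs_axis_deg, t_axis_deg):
    """Absolute difference between frontal QRS and T axes, folded to 0-180."""
    diff = abs(qrs_axis_deg - t_axis_deg) % 360
    return 360 - diff if diff > 180 else diff


# Hypothetical vectors and axes, for illustration only.
print(spatial_qrs_t_angle((1.0, 0.5, 0.2), (0.8, 0.6, -0.1)))
print(frontal_qrs_t_angle(-30, 170))
```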
Subjects
Atrioventricular Block, Cardiovascular Diseases, Humans, Cardiovascular Diseases/genetics, Genome-Wide Association Study, Risk Factors, Cardiac Arrhythmias/genetics, Electrocardiography/methods, Biomarkers
ABSTRACT
Being able to assign sex to individuals and identify autosomal and sex-linked scaffolds is essential in most population genomic analyses. Non-model organisms often have genome assemblies at scaffold-level and lack characterization of sex-linked scaffolds. Previous methods to identify sex and sex-linked scaffolds have relied on synteny between the non-model organism and a closely related species or prior knowledge about the sex of the samples to identify sex-linked scaffolds. In the latter case, the difference in depth of coverage between the autosomes and the sex chromosomes is used. Here, we present "sex assignment through coverage" (SATC), a method to assign sex to samples and identify sex-linked scaffolds from next generation sequencing (NGS) data. The method works for species with a homogametic/heterogametic sex determination system and only requires a scaffold-level reference assembly and sampling of both sexes with whole genome sequencing (WGS) data. We use the sequencing depth distribution across scaffolds to jointly identify: (i) male and female individuals, and (ii) sex-linked scaffolds. This is achieved through projecting the scaffold depths into a low-dimensional space using principal component analysis (PCA) and subsequent Gaussian mixture clustering. We demonstrate the applicability of our method using data from five mammal species and a bird species complex. The method is freely available at https://github.com/popgenDK/SATC as R code and a graphical user interface (GUI).
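The idea of separating sexes from scaffold depths can be sketched in a few lines of Python. This is not the SATC implementation (which is R code at the URL above); it is a toy illustration under several assumptions: depths are normalized by each sample's median scaffold depth, the first principal component is found by power iteration, and a simple two-means split stands in for the Gaussian mixture clustering named in the abstract. All function names, the `ratio_cutoff` value, and the depth matrix are hypothetical.

```python
import math
import statistics


def normalize_depths(depths):
    """Divide each sample's scaffold depths by that sample's median depth."""
    return [[d / statistics.median(row) for d in row] for row in depths]


def pc1_scores(rows, iterations=100):
    """Project samples onto the first principal component (power iteration)."""
    means = [statistics.fmean(col) for col in zip(*rows)]
    centred = [[v - m for v, m in zip(row, means)] for row in rows]
    axis = [1.0] * len(means)
    for _ in range(iterations):
        scores = [sum(v * a for v, a in zip(row, axis)) for row in centred]
        axis = [sum(s * row[j] for s, row in zip(scores, centred))
                for j in range(len(means))]
        norm = math.sqrt(sum(a * a for a in axis))
        axis = [a / norm for a in axis]
    return [sum(v * a for v, a in zip(row, axis)) for row in centred]


def two_means(values, iterations=20):
    """Split 1-D values into two clusters; returns a 0/1 label per value."""
    centres = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
                  for v in values]
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centres[k] = statistics.fmean(members)
    return labels


def assign_sex(depths, ratio_cutoff=0.75):
    """Return (sex label per sample, indices of putative sex-linked scaffolds)."""
    norm = normalize_depths(depths)
    labels = two_means(pc1_scores(norm))
    group_means = []
    for k in (0, 1):
        rows = [row for row, lab in zip(norm, labels) if lab == k]
        group_means.append([statistics.fmean(col) for col in zip(*rows)])
    # Scaffolds whose depth differs strongly between the two groups.
    sex_linked = [j for j, (a, b) in enumerate(zip(*group_means))
                  if min(a, b) / max(a, b) < ratio_cutoff]
    # The heterogametic group has the lower depth on those scaffolds.
    hetero = min((0, 1), key=lambda k: sum(group_means[k][j] for j in sex_linked))
    sexes = ["heterogametic" if lab == hetero else "homogametic" for lab in labels]
    return sexes, sex_linked


# Toy depth matrix: rows are samples, columns are scaffolds (hypothetical).
toy_depths = [
    [10.2, 9.8, 10.0, 10.1, 9.9],
    [20.1, 19.7, 20.3, 20.0, 20.2],
    [15.0, 15.3, 14.8, 15.1, 7.6],
    [8.1, 7.9, 8.0, 8.2, 4.1],
    [12.0, 12.2, 11.9, 12.1, 12.0],
    [30.2, 29.8, 30.0, 30.1, 15.2],
]
toy_sexes, toy_sex_linked = assign_sex(toy_depths)
print(toy_sexes, toy_sex_linked)
```

In the toy matrix the last scaffold has roughly half the depth in three samples, which is the signature expected for an X- or Z-linked scaffold in the heterogametic sex. The sketch assumes both sexes are present and that at least one scaffold passes the cutoff; it makes no attempt to handle the edge cases a real tool would need to.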
Subjects
Genome, Genomics, Sex Chromosomes, Sex Determination Analysis, Animals, Female, High-Throughput Nucleotide Sequencing, Male, Sex Chromosomes/genetics, Synteny
ABSTRACT
The QT interval is an electrocardiographic measure representing the sum of ventricular depolarization and repolarization, estimated by QRS duration and JT interval, respectively. QT interval abnormalities are associated with potentially fatal ventricular arrhythmia. Using genome-wide multi-ancestry analyses (>250,000 individuals), we identify 177, 156 and 121 independent loci for QT, JT and QRS, respectively, including a male-specific X-chromosome locus. Using gene-based rare-variant methods, we identify associations with Mendelian disease genes. Enrichments are observed in established pathways for QT and JT, and previously unreported genes indicated in insulin-receptor signalling and cardiac energy metabolism. In contrast, for QRS, connective tissue components and processes for cell growth and extracellular matrix interactions are significantly enriched. We demonstrate polygenic risk score associations with atrial fibrillation, conduction disease and sudden cardiac death. Prioritization of druggable genes highlights potential therapeutic targets for arrhythmia. Together, these results substantially advance our understanding of the genetic architecture of ventricular depolarization and repolarization.
Subjects
Cardiac Arrhythmias, Electrocardiography, Cardiac Arrhythmias/genetics, Sudden Cardiac Death, Electrocardiography/methods, Genetic Testing, Humans, Male
ABSTRACT
Genotyping-by-sequencing methods such as RADseq are popular for generating genomic and population-scale data sets from a diverse range of organisms. These often lack a usable reference genome, restricting users to RADseq-specific software for processing. However, these come with limitations compared to generic next generation sequencing (NGS) toolkits. Here, we describe and test a simple pipeline for reference-free RADseq data processing that blends de novo elements from STACKS with the full suite of state-of-the-art NGS tools. Specifically, we use the de novo RADseq assembly employed by STACKS to create a catalogue of RAD loci that serves as a reference for read mapping, variant calling and site filters. Using RADseq data from 28 zebra sequenced to ~8x depth-of-coverage, we evaluate our approach by comparing the site frequency spectra (SFS) to those from alternative pipelines. Most pipelines yielded similar SFS at 8x depth, but only a genotype-likelihood-based pipeline performed similarly at low sequencing depth (2-4x). We compared the RADseq SFS with medium-depth (~13x) shotgun sequencing of eight overlapping samples, revealing that the RADseq SFS was persistently slightly skewed towards rare and invariant alleles. Using simulations and human data, we confirm that this is expected when there is allelic dropout (AD) in the RADseq data. AD in the RADseq data caused a heterozygosity deficit of ~16%, which dropped to ~5% after filtering AD. Hence, AD was the most important source of bias in our RADseq data.
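The effect of allelic dropout on the SFS and on heterozygosity can be illustrated with a toy genotype matrix. The sketch below is not the authors' pipeline: it assumes hard genotype calls coded as alternate-allele counts (0, 1, 2), and it models dropout by turning chosen heterozygous calls into homozygous reference calls, which is only one of the two possible outcomes. The function names and the data are hypothetical.

```python
def site_frequency_spectrum(genotypes):
    """Count sites by alternate-allele count (index 0 = invariant sites).

    genotypes: list of sites, each a list of per-individual allele counts.
    """
    n_chromosomes = 2 * len(genotypes[0])
    sfs = [0] * (n_chromosomes + 1)
    for site in genotypes:
        sfs[sum(site)] += 1
    return sfs


def mean_heterozygosity(genotypes):
    """Fraction of all genotype calls that are heterozygous."""
    calls = [g for site in genotypes for g in site]
    return sum(1 for g in calls if g == 1) / len(calls)


def apply_allelic_dropout(genotypes, dropped):
    """Turn heterozygous calls at (site, individual) pairs into homozygous 0."""
    result = [list(site) for site in genotypes]
    for site_index, individual in dropped:
        if result[site_index][individual] == 1:
            result[site_index][individual] = 0
    return result


# Toy data: four sites, four diploid individuals (hypothetical).
true_calls = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [2, 1, 0, 1],
    [0, 0, 1, 0],
]
observed = apply_allelic_dropout(true_calls, dropped={(0, 1), (3, 2)})

true_het = mean_heterozygosity(true_calls)
observed_het = mean_heterozygosity(observed)
deficit = (true_het - observed_het) / true_het

print(site_frequency_spectrum(true_calls))
print(site_frequency_spectrum(observed))
print(f"heterozygosity deficit: {deficit:.1%}")
```

After dropout, sites move toward lower allele-count bins, including the invariant bin, which matches the direction of the skew described in the abstract. The size of the deficit in this toy example is arbitrary and unrelated to the ~16% reported.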
Subjects
High-Throughput Nucleotide Sequencing, DNA Sequence Analysis, Software, Animals, Equidae/genetics, Genomics, Humans, Likelihood Functions, Loss of Heterozygosity, Single Nucleotide Polymorphism
ABSTRACT
Large carnivores are generally sensitive to ecosystem changes because their specialized diet and position at the top of the trophic pyramid are associated with small population sizes. Accordingly, low genetic diversity at the whole-genome level has been reported for all big cat species, including the widely distributed leopard. However, all previous whole-genome analyses of leopards are based on the Far Eastern Amur leopards that live at the extremity of the species' distribution and therefore are not necessarily representative of the whole species. We sequenced 53 whole genomes of African leopards. Strikingly, we found that the genomic diversity in the African leopard is 2- to 5-fold higher than in other big cats, including the Amur leopard, likely because of an exceptionally high effective population size maintained by the African leopard throughout the Pleistocene. Furthermore, we detected ongoing gene flow and very low population differentiation within African leopards compared with those of other big cats. We corroborated this by showing a complete absence of an otherwise ubiquitous equatorial forest barrier to gene flow. This sets the leopard apart from most other widely distributed large African mammals, including lions. These results revise our understanding of trophic sensitivity and highlight the remarkable resilience of the African leopard, likely because of its extraordinary habitat versatility and broad dietary niche.