Results 1 - 20 of 27
1.
Genome Med ; 16(1): 71, 2024 05 23.
Article in English | MEDLINE | ID: mdl-38778393

ABSTRACT

BACKGROUND: Disease prevalence and mean phenotype values differ between many populations, including Inuit and Europeans. Whether these differences are partly explained by genetic differences or solely due to differences in environmental exposures is still unknown, because estimates of the genetic contribution to these means, which we will here refer to as mean genotypic values, are easily confounded, and because studies across genetically diverse populations are lacking. METHODS: Leveraging the unique genetic properties of the small, admixed and historically isolated Greenlandic population, we estimated the differences in mean genotypic value between Inuit and European genetic ancestry using an admixed sibling design. Analyses were performed across 26 metabolic phenotypes, in 1474 admixed sibling pairs present in a cohort of 5996 Greenlanders. RESULTS: After FDR correction for multiple testing, we found significantly lower mean genotypic values in Inuit genetic ancestry compared to European genetic ancestry for body weight (effect size per percentage of Inuit genetic ancestry (se), -0.51 (0.16) kg/%), body mass index (-0.20 (0.06) kg/m2/%), fat percentage (-0.38 (0.13) %/%), waist circumference (-0.42 (0.16) cm/%), hip circumference (-0.38 (0.11) cm/%) and fasting serum insulin levels (-1.07 (0.51) pmol/l/%). The direction of the effects was consistent with the observed mean phenotype differences between Inuit and European genetic ancestry. No difference in mean genotypic value was observed for height, markers of glucose homeostasis, or circulating lipid levels. CONCLUSIONS: We show that mean genotypic values for some metabolic phenotypes differ between two human populations using a method not easily confounded by possible differences in environmental exposures. Our study illustrates the importance of performing genetic studies in diverse populations.


Subject(s)
Genotype , Inuit , Phenotype , Siblings , White People , Humans , Inuit/genetics , Greenland , Male , Female , White People/genetics , Adult , Middle Aged , Body Mass Index , European People
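
The admixed sibling design in record 1 can be illustrated with a minimal sketch. It assumes that full siblings share their family environment but differ slightly in realized ancestry proportion, so within-pair differences in phenotype can be regressed on within-pair differences in ancestry. The function name and the through-the-origin least-squares fit are illustrative assumptions; the abstract does not specify the actual model, covariates, or standard-error calculation.

```python
def sibling_ancestry_effect(pairs):
    """Estimate the phenotype change per unit of ancestry from sibling pairs.

    pairs: iterable of ((ancestry_1, phenotype_1), (ancestry_2, phenotype_2)),
    one tuple per sibling pair (hypothetical input layout).
    Returns the least-squares slope of phenotype difference on ancestry
    difference, fitted through the origin because sibling order is arbitrary.
    """
    sum_xy = 0.0
    sum_xx = 0.0
    for (anc1, phen1), (anc2, phen2) in pairs:
        dx = anc1 - anc2    # within-pair ancestry difference
        dy = phen1 - phen2  # within-pair phenotype difference
        sum_xy += dx * dy
        sum_xx += dx * dx
    if sum_xx == 0.0:
        raise ValueError("siblings show no ancestry differences")
    return sum_xy / sum_xx
```

Anything shared by both siblings (a family-level offset, for example) cancels in the differences, which is why this kind of design is hard to confound with between-family environmental exposures.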
2.
Nat Commun ; 15(1): 2921, 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38609362

ABSTRACT

The blue wildebeest (Connochaetes taurinus) is a keystone species in savanna ecosystems from southern to eastern Africa, and is well known for its spectacular migrations and locally extreme abundance. In contrast, the black wildebeest (C. gnou) is endemic to southern Africa, barely escaped extinction in the 1900s and is feared to be in danger of genetic swamping from the blue wildebeest. Despite the ecological importance of the wildebeest, there is a lack of understanding of how its unique migratory ecology has affected its gene flow, genetic structure and phylogeography. Here, we analyze whole genomes from 121 blue and 22 black wildebeest across the genus' range. We find discrete genetic structure consistent with the morphologically defined subspecies. Unexpectedly, our analyses reveal no signs of recent interspecific admixture, but rather a late Pleistocene introgression of black wildebeest into the southern blue wildebeest populations. Finally, we find that migratory blue wildebeest populations exhibit a combination of long-range panmixia, higher genetic diversity and lower inbreeding levels compared to neighboring populations whose migration has recently been disrupted. These findings provide crucial insights into the evolutionary history of the wildebeest, and tangible genetic evidence for the negative effects of anthropogenic activities on highly migratory ungulates.


Subject(s)
Antelopes , Animals , Antelopes/genetics , Ecosystem , Africa, Eastern , Africa, Southern , Anthropogenic Effects
3.
Curr Biol ; 34(7): 1576-1586.e5, 2024 04 08.
Article in English | MEDLINE | ID: mdl-38479386

ABSTRACT

Strong genetic structure has prompted discussion regarding giraffe taxonomy [1,2,3], including a suggestion to split the giraffe into four species: Northern (Giraffa c. camelopardalis), Reticulated (G. c. reticulata), Masai (G. c. tippelskirchi), and Southern giraffes (G. c. giraffa) [4,5,6]. However, their evolutionary history is not yet fully resolved, as previous studies used a simple bifurcating model and did not explore the presence or extent of gene flow between lineages. We therefore inferred a model that incorporates various evolutionary processes to assess the drivers of contemporary giraffe diversity. We analyzed whole-genome sequencing data from 90 wild giraffes from 29 localities across their current distribution. The most basal divergence was dated to 280 kya. Genetic differentiation, FST, among major lineages ranged between 0.28 and 0.62, and we found significant levels of ancient gene flow between them. In particular, several analyses suggested that the Reticulated lineage evolved through admixture, with almost equal contribution from the Northern lineage and an ancestral lineage related to Masai and Southern giraffes. These new results highlight a scenario of strong differentiation despite gene flow, providing further context for the interpretation of giraffe diversity and the process of speciation in general. They also illustrate that conservation measures need to target various lineages and sublineages and that separate management strategies are needed to conserve giraffe diversity effectively. Given local extinctions and recent dramatic declines in many giraffe populations, this improved understanding of giraffe evolutionary history is relevant for conservation interventions, including reintroductions and reinforcements of existing populations.


Subject(s)
Giraffes , Animals , Giraffes/genetics , Ruminants/genetics , Biological Evolution , Phylogeny , Genetic Drift
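
Record 3 reports FST values between lineages without naming the estimator. As one possible illustration, the sketch below implements Hudson's estimator for biallelic sites, combined across sites as a ratio of sums. The choice of estimator, the function name, and the input layout are assumptions made here, not details taken from the abstract.

```python
def hudson_fst(sites):
    """Hudson's FST between two populations, pooled over biallelic sites.

    sites: iterable of (p1, n1, p2, n2), where p1 and p2 are sample allele
    frequencies and n1 and n2 are the numbers of sampled chromosomes
    (hypothetical input layout).
    """
    numerator = 0.0
    denominator = 0.0
    for p1, n1, p2, n2 in sites:
        # Squared frequency difference, corrected for finite sample size.
        numerator += ((p1 - p2) ** 2
                      - p1 * (1.0 - p1) / (n1 - 1)
                      - p2 * (1.0 - p2) / (n2 - 1))
        # Probability that one allele drawn from each population differs.
        denominator += p1 * (1.0 - p2) + p2 * (1.0 - p1)
    if denominator == 0.0:
        raise ValueError("no polymorphic sites")
    return numerator / denominator
```

A fixed difference at every site gives 1.0, while identical sample frequencies give a slightly negative value because of the sample-size correction.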
4.
Nat Commun ; 15(1): 172, 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38172616

ABSTRACT

Several African mammals exhibit a phylogeographic pattern where closely related taxa are split between West/Central and East/Southern Africa, but their evolutionary relationships and histories remain controversial. Bushpigs (Potamochoerus larvatus) and red river hogs (P. porcus) are recognised as separate species due to morphological distinctions, a perceived lack of interbreeding at contact, and putatively old divergence times, but historically, they were considered conspecific. Moreover, the presence of Malagasy bushpigs as the sole large terrestrial mammal shared with the African mainland raises intriguing questions about its origin and arrival in Madagascar. Analyses of 67 whole genomes revealed a genetic continuum between the two species, with putative signatures of historical gene flow, variable FST values, and a recent divergence time (<500,000 years). Thus, our study challenges key arguments for splitting Potamochoerus into two species and suggests their speciation might be incomplete. Our findings also indicate that Malagasy bushpigs diverged from southern African populations and underwent a limited bottleneck 1000-5000 years ago, concurrent with human arrival in Madagascar. These results shed light on the evolutionary history of an iconic and widespread African mammal and provide insight into the longstanding biogeographic puzzle surrounding the bushpig's presence in Madagascar.


Subject(s)
Mammals , Humans , Animals , Swine , Madagascar , Phylogeny , Porosity , Phylogeography , Mammals/genetics
5.
Mol Ecol ; 33(2): e17205, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37971141

ABSTRACT

Genomic studies of species threatened by extinction are providing crucial information about evolutionary mechanisms and genetic consequences of population declines and bottlenecks. However, to understand how species avoid the extinction vortex, insights can be drawn by studying species that thrive despite past declines. Here, we studied the population genomics of the muskox (Ovibos moschatus), an Ice Age relict that was at the brink of extinction for thousands of years at the end of the Pleistocene yet appears to be thriving today. We analysed 108 whole genomes, including present-day individuals representing the current native range of both muskox subspecies, the white-faced and the barren-ground muskox (O. moschatus wardi and O. moschatus moschatus) and a ~21,000-year-old ancient individual from Siberia. We found that the muskox' demographic history was profoundly shaped by past climate changes and post-glacial re-colonizations. In particular, the white-faced muskox has the lowest genome-wide heterozygosity recorded in an ungulate. Yet, there is no evidence of inbreeding depression in native muskox populations. We hypothesize that this can be explained by the effect of long-term gradual population declines that allowed for purging of strongly deleterious mutations. This study provides insights into how species with a history of population bottlenecks, small population sizes and low genetic diversity survive against all odds.


Subject(s)
Metagenomics , Resilience, Psychological , Humans , Animals , Infant, Newborn , Biological Evolution , Genomics , Ruminants/genetics , Genetic Variation/genetics
6.
Mol Ecol Resour ; 23(7): 1604-1619, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37400991

ABSTRACT

The genomes of recently admixed individuals or hybrids have characteristic genetic patterns that can be used to learn about their recent admixture history. One such pattern is interancestry heterozygosity, which can be inferred from SNP data from either called genotypes or genotype likelihoods, without the need for information on genomic location. This makes it applicable to a wide range of data that are often used in evolutionary and conservation genomic studies, such as low-depth sequencing mapped to scaffolds and reduced representation sequencing. Here we implement maximum likelihood estimation of interancestry heterozygosity patterns using two complementary models. We furthermore develop apoh (Admixture Pedigrees of Hybrids), software that uses estimates of paired ancestry proportions to detect recently admixed individuals or hybrids, and to suggest possible admixture pedigrees. It also calculates several hybrid indices that make it easier to identify and rank possible admixture pedigrees that could give rise to the estimated patterns. We implemented apoh both as a command line tool and as a Graphical User Interface that allows the user to automatically and interactively explore, rank and visualize compatible recent admixture pedigrees, and calculate the different summary indices. We validate the performance of the method using admixed family trios from the 1000 Genomes Project. In addition, we show its applicability to identifying recent hybrids from RAD-seq data of Grant's gazelle (Nanger granti and Nanger petersii) and whole genome low-depth data of waterbuck (Kobus ellipsiprymnus), which shows complex admixture of up to four populations.


Subject(s)
Genetics, Population , Genome , Humans , Pedigree , Genome/genetics , Genotype , Software
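
To make the link between paired ancestry proportions and admixture pedigrees in record 6 concrete, here is a minimal sketch for two source populations, A and B. It computes the expected paired ancestry proportions (both alleles from A, one from each, both from B) of an offspring from those of its parents. The function names are hypothetical and this is not the apoh implementation; it only shows why simple pedigrees leave recognizable signatures.

```python
PURE_A = (1.0, 0.0, 0.0)  # (AA, AB, BB) paired ancestry proportions
PURE_B = (0.0, 0.0, 1.0)

def transmit_prob_a(paired):
    """Chance that a gamete carries ancestry A at a random locus."""
    aa, ab, _bb = paired
    return aa + ab / 2.0

def offspring_paired_ancestry(parent1, parent2):
    """Expected (AA, AB, BB) proportions for the child of two parents."""
    p1 = transmit_prob_a(parent1)
    p2 = transmit_prob_a(parent2)
    return (p1 * p2,
            p1 * (1.0 - p2) + (1.0 - p1) * p2,
            (1.0 - p1) * (1.0 - p2))

# Expected signatures of three simple pedigrees.
f1 = offspring_paired_ancestry(PURE_A, PURE_B)   # first-generation hybrid
backcross = offspring_paired_ancestry(f1, PURE_A)
f2 = offspring_paired_ancestry(f1, f1)
```

Under these assumptions an F1 hybrid is interancestry heterozygous everywhere, (0, 1, 0), a backcross to A gives (0.5, 0.5, 0), and an F2 gives (0.25, 0.5, 0.25). These are expectations; realized proportions in a real genome vary around them.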
7.
Mol Ecol ; 32(8): 1860-1874, 2023 04.
Article in English | MEDLINE | ID: mdl-36651275

ABSTRACT

The iconic Cape buffalo has experienced several documented population declines in recent history. These declines have been largely attributed to the late 19th century rinderpest pandemic. However, the effect of the rinderpest pandemic on their genetic diversity remains contentious, and other factors that have potentially affected this diversity include environmental changes during the Pleistocene, range expansions and recent human activity. Motivated by this, we present analyses of whole genome sequencing data from 59 individuals from across the Cape buffalo range to assess present-day levels of genome-wide genetic diversity and what factors have influenced these levels. We found that the Cape buffalo has high average heterozygosity overall (0.40%), with the two southernmost populations having significantly lower heterozygosity levels (0.33% and 0.29%) on par with that of the domesticated water buffalo (0.29%). Interestingly, we found that these lower levels are probably due to recent inbreeding (average fraction of runs of homozygosity 23.7% and 19.9%) rather than factors further back in time during the Pleistocene. Moreover, detailed investigations of recent demographic history show that events across the past three centuries were the main drivers of the exceptional loss of genetic diversity in the southernmost populations, coincident with the onset of colonialism in the southern extreme of the Cape buffalo range. Hence, our results add to the growing body of studies suggesting that multiple recent human-mediated impacts during the colonial period caused massive losses of large mammal abundance in southern Africa.


Subject(s)
Genetics, Population , Rinderpest , Animals , Humans , South Africa , Genetic Variation , Buffaloes/genetics , Colonialism
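
The "fraction of runs of homozygosity" quoted in record 7 is commonly summarized as the share of the genome covered by ROH segments. The sketch below assumes ROH segments have already been called and do not overlap; the function name, the input layout, and the optional length threshold are illustrative assumptions, not the study's pipeline.

```python
def roh_fraction(segments, genome_length, min_length=0):
    """Fraction of the genome covered by runs of homozygosity.

    segments: iterable of (start, end) positions in base pairs, assumed
    non-overlapping and expressed on one concatenated coordinate axis.
    min_length: ignore segments shorter than this many base pairs.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = sum(end - start
                  for start, end in segments
                  if end - start >= min_length)
    return covered / genome_length
```

For example, 23.7 Mb of ROH in a 100 Mb genome gives 0.237, the same scale as the percentages quoted in the abstract.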
8.
HGG Adv ; 3(4): 100118, 2022 Oct 13.
Article in English | MEDLINE | ID: mdl-36267056

ABSTRACT

The common Arctic-specific LDLR p.G137S variant was recently shown to be associated with elevated lipid levels. Motivated by this, we aimed to investigate the effect of p.G137S on metabolic health and cardiovascular disease risk among Greenlanders to quantify its impact on the population. In a population-based Greenlandic cohort (n = 5,063), we tested for associations between the p.G137S variant and metabolic health traits as well as cardiovascular disease risk based on registry data. In addition, we explored the variant's impact on plasma NMR measured lipoprotein concentration and composition in another Greenlandic cohort (n = 1,629); 29.5% of the individuals in the cohort carried at least one copy of the p.G137S risk allele. Furthermore, 25.4% of the heterozygous and 54.7% of the homozygous carriers had high levels (>4.9 mmol/L) of serum LDL cholesterol, which is above the diagnostic level for familial hypercholesterolemia (FH). Moreover, p.G137S was associated with an overall atherosclerotic lipid profile, and increased risk of ischemic heart disease (HR [95% CI], 1.51 [1.18-1.92], p = 0.00096), peripheral artery disease (1.69 [1.01-2.82], p = 0.046), and coronary operations (1.78 [1.21-2.62], p = 0.0035). Due to its high frequency and large effect sizes, p.G137S has a marked population-level impact, increasing the risk of FH and cardiovascular disease for up to 30% of the Greenlandic population. Thus, p.G137S is a potential marker for early intervention in Arctic populations.

9.
Gastroenterology ; 162(4): 1171-1182.e3, 2022 04.
Article in English | MEDLINE | ID: mdl-34914943

ABSTRACT

BACKGROUND & AIMS: The sucrase-isomaltase (SI) c.273_274delAG loss-of-function variant is common in Arctic populations and causes congenital sucrase-isomaltase deficiency, which is an inability to break down and absorb sucrose and isomaltose. Children with this condition experience gastrointestinal symptoms when dietary sucrose is introduced. We aimed to describe the health of adults with sucrase-isomaltase deficiency. METHODS: The association between c.273_274delAG and phenotypes related to metabolic health was assessed in 2 cohorts of Greenlandic adults (n = 4922 and n = 1629). A sucrase-isomaltase knockout (Sis-KO) mouse model was used to further elucidate the findings. RESULTS: Homozygous carriers of the variant had a markedly healthier metabolic profile than the remaining population, including lower body mass index (β [standard error], -2.0 [0.5] kg/m²; P = 3.1 × 10⁻⁵), body weight (-4.8 [1.4] kg; P = 5.1 × 10⁻⁴), fat percentage (-3.3% [1.0%]; P = 3.7 × 10⁻⁴), fasting triglyceride (-0.27 [0.07] mmol/L; P = 2.3 × 10⁻⁶), and remnant cholesterol (-0.11 [0.03] mmol/L; P = 4.2 × 10⁻⁵). Further analyses suggested that this was likely mediated partly by higher circulating levels of acetate observed in homozygous carriers (β [standard error], 0.056 [0.002] mmol/L; P = 2.1 × 10⁻²⁶), and partly by reduced sucrose uptake, but not lower caloric intake. These findings were verified in Sis-KO mice, which, compared with wild-type mice, were leaner on a sucrose-containing diet, despite similar caloric intake, had significantly higher plasma acetate levels in response to a sucrose gavage, and had lower plasma glucose level in response to a sucrose-tolerance test. CONCLUSIONS: These results suggest that sucrase-isomaltase constitutes a promising drug target for improvement of metabolic health, and that the health benefits are mediated by reduced dietary sucrose uptake and possibly also by higher levels of circulating acetate.


Subject(s)
Dietary Sucrose , Sucrase-Isomaltase Complex , Acetates , Animals , Carbohydrate Metabolism, Inborn Errors , Dietary Sucrose/adverse effects , Humans , Mice , Oligo-1,6-Glucosidase , Sucrase-Isomaltase Complex/deficiency , Sucrase-Isomaltase Complex/genetics , Sucrase-Isomaltase Complex/metabolism
10.
BMC Bioinformatics ; 22(1): 470, 2021 Sep 29.
Article in English | MEDLINE | ID: mdl-34587903

ABSTRACT

BACKGROUND: Identification of selection signatures between populations is often an important part of a population genetic study. With high-throughput DNA sequencing, larger sample sizes of populations with similar ancestries have become increasingly common. This has led to the need for methods capable of identifying signals of selection in populations with a continuous cline of genetic differentiation. Individuals from continuous populations are inherently challenging to group into meaningful units, which is why existing methods rely on principal component analysis for inference of the selection signals. These existing methods require called genotypes as input, which is problematic for studies based on low-coverage sequencing data. MATERIALS AND METHODS: We have extended two principal component analysis based selection statistics to genotype likelihood data and applied them to low-coverage sequencing data from the 1000 Genomes Project for populations with European and East Asian ancestry to detect signals of selection in samples with continuous population structure. RESULTS: Here, we present two selection statistics which we have implemented in the PCAngsd framework. These methods account for genotype uncertainty, opening the opportunity to conduct selection scans in continuous populations from low and/or variable coverage sequencing data. To illustrate their use, we applied the methods to low-coverage sequencing data from human populations of East Asian and European ancestries and show that the implemented selection statistics can control the false positive rate and that they identify the same signatures of selection from low-coverage sequencing data as state-of-the-art software using high quality called genotypes. CONCLUSION: We show that selection scans of low-coverage sequencing data of populations with similar ancestry perform on par with those obtained from high quality genotype data. Moreover, we demonstrate that PCAngsd outperforms selection statistics obtained from called genotypes from low-coverage sequencing data without the need for ad-hoc filtering.


Subject(s)
Genetics, Population , High-Throughput Nucleotide Sequencing , Genome , Genotype , Humans , Polymorphism, Single Nucleotide , Principal Component Analysis
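
Record 10 relies on genotype likelihoods rather than called genotypes. A minimal sketch of that idea is shown below: the three genotype likelihoods at a site are combined with a Hardy-Weinberg prior to give posterior genotype probabilities and an expected allele dosage, which can stand in for a hard genotype call. Using a single population allele frequency as the prior is a simplifying assumption made here, and the function names are hypothetical; this is not the PCAngsd code.

```python
def genotype_posterior(likelihoods, alt_freq):
    """Posterior probabilities of carrying 0, 1 or 2 alternate alleles.

    likelihoods: (L0, L1, L2), the probability of the read data given each
    genotype. alt_freq: alternate allele frequency used for the prior.
    """
    prior = ((1.0 - alt_freq) ** 2,
             2.0 * alt_freq * (1.0 - alt_freq),
             alt_freq ** 2)
    joint = [lik * pri for lik, pri in zip(likelihoods, prior)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("data are incompatible with the prior")
    return [value / total for value in joint]

def expected_dosage(likelihoods, alt_freq):
    """Posterior mean number of alternate alleles (between 0 and 2)."""
    post = genotype_posterior(likelihoods, alt_freq)
    return post[1] + 2.0 * post[2]
```

With uninformative likelihoods the dosage falls back to twice the allele frequency, and with decisive likelihoods it matches the called genotype, so uncertainty is carried forward instead of being discarded at the calling step.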
11.
G3 (Bethesda) ; 11(8)2021 08 07.
Article in English | MEDLINE | ID: mdl-34015083

ABSTRACT

Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if this is present. However, the methods that can account for admixture are all based on genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data from which genotypes are called with high uncertainty. Here, we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4× or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate, all of which perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub at https://github.com/KHanghoj/NGSremix.


Subject(s)
Genetics, Population , High-Throughput Nucleotide Sequencing , Genotype , Humans , Polymorphism, Single Nucleotide , Probability , Software
12.
Curr Biol ; 31(9): 1862-1871.e5, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33636121

ABSTRACT

Large carnivores are generally sensitive to ecosystem changes because their specialized diet and position at the top of the trophic pyramid is associated with small population sizes. Accordingly, low genetic diversity at the whole-genome level has been reported for all big cat species, including the widely distributed leopard. However, all previous whole-genome analyses of leopards are based on the Far Eastern Amur leopards that live at the extremity of the species' distribution and therefore are not necessarily representative of the whole species. We sequenced 53 whole genomes of African leopards. Strikingly, we found that the genomic diversity in the African leopard is 2- to 5-fold higher than in other big cats, including the Amur leopard, likely because of an exceptionally high effective population size maintained by the African leopard throughout the Pleistocene. Furthermore, we detected ongoing gene flow and very low population differentiation within African leopards compared with those of other big cats. We corroborated this by showing a complete absence of an otherwise ubiquitous equatorial forest barrier to gene flow. This sets the leopard apart from most other widely distributed large African mammals, including lions. These results revise our understanding of trophic sensitivity and highlight the remarkable resilience of the African leopard, likely because of its extraordinary habitat versatility and broad dietary niche.


Subject(s)
Ecosystem , Genetic Variation , Panthera/anatomy & histology , Panthera/genetics , Africa , Animals , Female , Gene Flow , Male , Panthera/classification , Population Density
13.
Heredity (Edinb) ; 125(1-2): 15-27, 2020 08.
Article in English | MEDLINE | ID: mdl-32346130

ABSTRACT

Populations of the common chimpanzee (Pan troglodytes) are at imminent risk of going extinct in the wild as a consequence of damaging anthropogenic impact on their natural habitat and illegal pet and bushmeat trade. Conservation management programmes for the chimpanzee have been established outside their natural range (ex situ), and chimpanzees from these programmes could potentially be used to supplement future conservation initiatives in the wild (in situ). However, these programmes have often suffered from inadequate information about the geographical origin and subspecies ancestry of the founders. Here, we present a newly designed capture array with ~60,000 ancestry informative markers used to infer ancestry of individual chimpanzees in ex situ populations and determine geographical origin of confiscated sanctuary individuals. From a test panel of 167 chimpanzees with unknown origins or subspecies labels, we identify 90 suitable non-admixed individuals in the European Association of Zoos and Aquaria (EAZA) Ex situ Programme (EEP). Equally important, another 46 individuals have been identified with admixed subspecies ancestries, which should therefore, over time, be naturally phased out of the breeding populations. With potential for future re-introduction to the wild, we determine the geographical origin of 31 individuals that were confiscated from the illegal trade and demonstrate the promise of using non-invasive sampling in future conservation action plans. Collectively, our genomic approach provides an exemplar for ex situ management of endangered species and offers an efficient tool in future in situ efforts to combat the illegal wildlife trade.


Subject(s)
Conservation of Natural Resources , Endangered Species , Pan troglodytes , Animals , Ecosystem , Pan troglodytes/genetics
14.
PLoS Genet ; 16(1): e1008544, 2020 01.
Article in English | MEDLINE | ID: mdl-31978080

ABSTRACT

The genetic architecture of the small and isolated Greenlandic population is advantageous for identification of novel genetic variants associated with cardio-metabolic traits. We aimed to identify genetic loci associated with body mass index (BMI), to expand the knowledge of the genetic and biological mechanisms underlying obesity. Stage 1 BMI-association analyses were performed in 4,626 Greenlanders. Stage 2 replication and meta-analysis were performed in additional cohorts comprising 1,058 Yup'ik Alaska Native people, and 1,529 Greenlanders. Obesity-related traits were assessed in the stage 1 study population. We identified a common variant on chromosome 11, rs4936356, where the derived G-allele had a frequency of 24% in the stage 1 study population. The derived allele was genome-wide significantly associated with lower BMI (beta (SE), -0.14 SD (0.03), p = 3.2 × 10⁻⁸), corresponding to 0.64 kg/m² lower BMI per G allele in the stage 1 study population. We observed a similar effect in the Yup'ik cohort (-0.09 SD, p = 0.038), and a non-significant effect in the same direction in the independent Greenlandic stage 2 cohort (-0.03 SD, p = 0.514). The association remained genome-wide significant in meta-analysis of the Arctic cohorts (-0.10 SD (0.02), p = 4.7 × 10⁻⁸). Moreover, the variant was associated with a leaner body type (weight, -1.68 (0.37) kg; waist circumference, -1.52 (0.33) cm; hip circumference, -0.85 (0.24) cm; lean mass, -0.84 (0.19) kg; fat mass and percent, -1.66 (0.33) kg and -1.39 (0.27) %; visceral adipose tissue, -0.30 (0.07) cm; subcutaneous adipose tissue, -0.16 (0.05) cm, all p<0.0002), lower insulin resistance (HOMA-IR, -0.12 (0.04), p = 0.00021), and favorable lipid levels (triglyceride, -0.05 (0.02) mmol/l, p = 0.025; HDL-cholesterol, 0.04 (0.01) mmol/l, p = 0.0015). In conclusion, we identified a novel variant, where the derived G-allele is possibly associated with lower BMI in Arctic populations, and as a consequence also leaner body type, lower insulin resistance, and a favorable lipid profile.


Subject(s)
Body Mass Index , Chromosomes, Human, Pair 11/genetics , Inuit/genetics , Polymorphism, Single Nucleotide , Adiposity , Cholesterol/blood , DNA, Intergenic/genetics , Female , Greenland , Humans , Insulin Resistance , Male , Metabolome , Waist Circumference
15.
Cell ; 177(6): 1419-1435.e31, 2019 05 30.
Article in English | MEDLINE | ID: mdl-31056281

ABSTRACT

Horse domestication revolutionized warfare and accelerated travel, trade, and the geographic expansion of languages. Here, we present the largest DNA time series for a non-human organism to date, including genome-scale data from 149 ancient animals and 129 ancient genomes (≥1-fold coverage), 87 of which are new. This extensive dataset allows us to assess the modern legacy of past equestrian civilizations. We find that two extinct horse lineages existed during early domestication, one at the far western (Iberia) and the other at the far eastern range (Siberia) of Eurasia. None of these contributed significantly to modern diversity. We show that the influence of Persian-related horse lineages increased following the Islamic conquests in Europe and Asia. Multiple alleles associated with elite-racing, including at the MSTN "speed gene," only rose in popularity within the last millennium. Finally, the development of modern breeding impacted genetic diversity more dramatically than the previous millennia of human management.


Subject(s)
Horses/genetics , Animals , Asia , Biological Evolution , Breeding/history , DNA, Ancient/analysis , Domestication , Equidae/genetics , Europe , Female , Genetic Variation/genetics , Genome/genetics , History, Ancient , Male , Phylogeny
16.
Genetics ; 212(3): 587-614, 2019 07.
Article in English | MEDLINE | ID: mdl-31088861

ABSTRACT

Both the total amount and the distribution of heterozygous sites within individual genomes are informative about the genetic diversity of the population they belong to. Detecting true heterozygous sites in ancient genomes is complicated by the generally limited coverage achieved and the presence of post-mortem damage inflating sequencing errors. Additionally, large runs of homozygosity found in the genomes of particularly inbred individuals and of domestic animals can skew estimates of genome-wide heterozygosity rates. Current computational tools aimed at estimating runs of homozygosity and genome-wide heterozygosity levels are generally sensitive to such limitations. Here, we introduce ROHan, a probabilistic method which substantially improves the estimate of heterozygosity rates both genome-wide and for genomic local windows. It combines a local Bayesian model and a Hidden Markov Model at the genome-wide level and can work both on modern and ancient samples. We show that our algorithm outperforms currently available methods for predicting heterozygosity rates for ancient samples. Specifically, ROHan can delineate large runs of homozygosity (at megabase scales) and produce a reliable confidence interval for the genome-wide rate of heterozygosity outside of such regions from modern genomes with a depth of coverage as low as 5-6× and down to 7-8× for ancient samples showing moderate DNA damage. We apply ROHan to a series of modern and ancient genomes previously published and revise available estimates of heterozygosity for humans, chimpanzees and horses.


Subject(s)
DNA, Ancient , Genotyping Techniques/methods , Heterozygote , Homozygote , Animals , Bayes Theorem , Genotyping Techniques/standards , Humans , Markov Chains
17.
Gigascience ; 8(5)2019 05 01.
Article in English | MEDLINE | ID: mdl-31042285

ABSTRACT

BACKGROUND: The estimation of relatedness between pairs of possibly inbred individuals from high-throughput sequencing (HTS) data has previously not been possible for samples where we cannot obtain reliable genotype calls, as in the case of low-coverage data. RESULTS: We introduce ngsRelateV2, a major revision of ngsRelateV1, a program that originally allowed for estimation of relatedness from HTS data among non-inbred individuals only. The new revised version takes into account the possibility of individuals being inbred by estimating the 9 condensed Jacquard coefficients along with various other relatedness statistics. The program is threaded and scales linearly with the number of cores allocated to the process. CONCLUSION: The program is available as an open source C/C++ program under the GPL license and hosted at https://github.com/ANGSD/ngsRelate. To facilitate easy analysis, the program is able to work directly on the most commonly used container formats for raw sequence (BAM/CRAM) and summary data (VCF/BCF).


Subject(s)
Genetics, Population , Genotyping Techniques , High-Throughput Nucleotide Sequencing , Inbreeding , Genotype , Humans , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA , Software
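
The nine condensed Jacquard coefficients mentioned in record 17 map onto familiar relatedness summaries through standard identities. The sketch below assumes the conventional ordering J1 to J9, in which J7, J8 and J9 are the non-inbred states with two, one and zero alleles shared identical by descent. The function name and return layout are illustrative and do not reflect the ngsRelate output format.

```python
def relatedness_summary(jacquard):
    """Kinship and inbreeding coefficients from condensed Jacquard states.

    jacquard: sequence of nine probabilities (J1..J9) for individuals a
    and b, assumed to sum to one.
    """
    if len(jacquard) != 9:
        raise ValueError("expected nine coefficients")
    if abs(sum(jacquard) - 1.0) > 1e-6:
        raise ValueError("coefficients must sum to one")
    j1, j2, j3, j4, j5, j6, j7, j8, _j9 = jacquard
    return {
        # Probability that random alleles from a and b are identical by descent.
        "kinship": j1 + 0.5 * (j3 + j5 + j7) + 0.25 * j8,
        # States in which the two alleles within a (or within b) are IBD.
        "inbreeding_a": j1 + j2 + j3 + j4,
        "inbreeding_b": j1 + j2 + j5 + j6,
    }
```

For non-inbred full siblings (J7 = 0.25, J8 = 0.5, J9 = 0.25) this gives a kinship of 0.25 and inbreeding coefficients of zero, matching the usual textbook values.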
18.
Gigascience ; 8(4)2019 04 01.
Article in English | MEDLINE | ID: mdl-31004132

ABSTRACT

BACKGROUND: Recent computational advances in ancient DNA research have opened access to the detection of ancient DNA methylation footprints at the genome-wide scale. The most commonly used approach infers the methylation state of a given genomic region on the basis of the amount of nucleotide mis-incorporations observed at CpG dinucleotide sites. However, this approach overlooks a number of confounding factors, including the presence of sequencing errors and true variants. The scale and distribution of the inferred methylation measurements are also variable across samples, precluding direct comparisons. FINDINGS: Here, we present DamMet, an open-source software program retrieving maximum likelihood estimates of regional CpG methylation levels from ancient DNA sequencing data. It builds on a novel statistical model of post-mortem DNA damage for dinucleotides, accounting for sequencing errors, genotypes, and differential post-mortem cytosine deamination rates at both methylated and unmethylated sites. To validate DamMet, we extended gargammel, a sequence simulator for ancient DNA data, by introducing methylation-dependent features of post-mortem DNA decay. This new simulator provides direct validation of DamMet predictions. Additionally, the methylation levels inferred by DamMet were found to be correlated to those inferred by epiPALEOMIX and both on par and directly comparable to those measured from whole-genome bisulphite sequencing experiments of fresh tissues. CONCLUSIONS: DamMet provides genuine estimates for local DNA methylation levels in ancient individual genomes. The returned estimates are directly cross-sample comparable, and the software is available as an open-source C++ program hosted at https://gitlab.com/KHanghoj/DamMet along with a manual and tutorial.


Subject(s)
Computational Biology , DNA Methylation , Epigenomics/methods , Software , Algorithms , Autopsy , Computational Biology/methods , CpG Islands , DNA Damage , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing , Humans
19.
Science ; 360(6384): 111-114, 2018 Apr 06.
Article in English | MEDLINE | ID: mdl-29472442

ABSTRACT

The Eneolithic Botai culture of the Central Asian steppes provides the earliest archaeological evidence for horse husbandry, ~5500 years ago, but the exact nature of early horse domestication remains controversial. We generated 42 ancient-horse genomes, including 20 from Botai. Compared to 46 published ancient- and modern-horse genomes, our data indicate that Przewalski's horses are the feral descendants of horses herded at Botai and not truly wild horses. All domestic horses dated from ~4000 years ago to present only show ~2.7% of Botai-related ancestry. This indicates that a massive genomic turnover underpins the expansion of the horse stock that gave rise to modern domesticates, which coincides with large-scale human population expansions during the Early Bronze Age.


Subject(s)
Horses/classification , Horses/genetics , Animals , DNA, Ancient , Genome , Horses/anatomy & histology , Phenotype , Phylogeny
20.
Bioinformatics ; 33(19): 3148-3150, 2017 Oct 01.
Article in English | MEDLINE | ID: mdl-28957500

ABSTRACT

MOTIVATION: Estimation of admixture proportions and principal component analysis (PCA) are fundamental tools in population genetics. However, applying these methods to low- or mid-depth sequencing data without taking genotype uncertainty into account can introduce biases. RESULTS: Here we present fastNGSadmix, a tool to quickly and reliably estimate admixture proportions and perform PCA from next generation sequencing data of a single individual. The analyses are based on genotype likelihoods of the input sample and a set of predefined reference populations. The method has high accuracy, even at low sequencing depth, and corrects for the biases introduced by small reference populations. AVAILABILITY AND IMPLEMENTATION: The admixture estimation method is implemented in C++ and the PCA method is implemented in R. The code is freely available at http://www.popgen.dk/software/index.php/FastNGSadmix. CONTACT: emil.jorsboe@bio.ku.dk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
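Illustrative note: estimating admixture proportions for one individual from genotype likelihoods can be sketched with a standard expectation-maximization (EM) update. This is a minimal sketch and not the fastNGSadmix implementation: it assumes the reference allele frequencies are known exactly (so it omits the correction for small reference populations described above), assumes Hardy-Weinberg genotype priors, and uses hypothetical function and parameter names.

```python
def genotype_prior(h):
    """Hardy-Weinberg prior for 0, 1, 2 copies of an allele with frequency h."""
    return ((1 - h) ** 2, 2 * h * (1 - h), h ** 2)


def estimate_admixture(gls, ref_freqs, n_iter=200, eps=1e-6):
    """EM estimate of admixture proportions for one individual.

    gls:       per-site genotype likelihoods, [(L0, L1, L2), ...]
    ref_freqs: per-site allele frequencies in K reference populations,
               [(f_1, ..., f_K), ...], treated as known (an assumption)
    """
    k = len(ref_freqs[0])
    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(n_iter):
        new_q = [0.0] * k
        for gl, f in zip(gls, ref_freqs):
            # Individual allele frequency, clamped away from 0 and 1.
            h = sum(qk * fk for qk, fk in zip(q, f))
            h = min(max(h, eps), 1 - eps)
            # Posterior over genotypes and expected allele count.
            post = [l * p for l, p in zip(gl, genotype_prior(h))]
            total = sum(post)
            exp_g = (post[1] + 2 * post[2]) / total
            # Attribute the expected alleles to each ancestry.
            for i in range(k):
                new_q[i] += exp_g * q[i] * f[i] / h
                new_q[i] += (2 - exp_g) * q[i] * (1 - f[i]) / (1 - h)
        # Renormalize so the proportions sum to one.
        norm = sum(new_q)
        q = [v / norm for v in new_q]
    return q
```

Working on genotype likelihoods rather than called genotypes is what lets the estimate stay usable at low depth: a site with flat likelihoods contributes almost nothing, while a confidently observed site pulls the proportions toward the populations whose frequencies explain it.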


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Principal Component Analysis , Software , Genetics, Population/methods , Genotype , Humans , Probability