Results 1 - 20 of 20
1.
Nature ; 617(7962): 764-768, 2023 05.
Article in English | MEDLINE | ID: mdl-37198478

ABSTRACT

Critical illness in COVID-19 is an extreme and clinically homogeneous disease phenotype that we have previously shown [1] to be highly efficient for discovery of genetic associations [2]. Despite the advanced stage of illness at presentation, we have shown that host genetics in patients who are critically ill with COVID-19 can identify immunomodulatory therapies with strong beneficial effects in this group [3]. Here we analyse 24,202 cases of COVID-19 with critical illness comprising a combination of microarray genotype and whole-genome sequencing data from cases of critical illness in the international GenOMICC (11,440 cases) study, combined with other studies recruiting hospitalized patients with a strong focus on severe and critical disease: ISARIC4C (676 cases) and the SCOURGE consortium (5,934 cases). To put these results in the context of existing work, we conduct a meta-analysis of the new GenOMICC genome-wide association study (GWAS) results with previously published data. We find 49 genome-wide significant associations, of which 16 have not been reported previously. To investigate the therapeutic implications of these findings, we infer the structural consequences of protein-coding variants, and combine our GWAS results with gene expression data using a monocyte transcriptome-wide association study (TWAS) model, as well as gene and protein expression using Mendelian randomization. We identify potentially druggable targets in multiple systems, including inflammatory signalling (JAK1), monocyte-macrophage activation and endothelial permeability (PDE4A), immunometabolism (SLC2A5 and AK5), and host factors required for viral entry and replication (TMPRSS2 and RAB2A).


Subject(s)
COVID-19 , Critical Illness , Genetic Predisposition to Disease , Genetic Variation , Genome-Wide Association Study , Humans , COVID-19/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genotype , Genotyping Techniques , Monocytes/metabolism , Phenotype , rab GTP-Binding Proteins/genetics , Transcriptome , Whole Genome Sequencing
2.
Nature ; 607(7917): 97-103, 2022 07.
Article in English | MEDLINE | ID: mdl-35255492

ABSTRACT

Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care [1] or hospitalization [2-4] after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes in critical disease, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1). Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.


Subject(s)
COVID-19 , Critical Illness , Genome, Human , Host-Pathogen Interactions , Whole Genome Sequencing , ATP-Binding Cassette Transporters , COVID-19/genetics , COVID-19/mortality , COVID-19/pathology , COVID-19/virology , Cell Adhesion Molecules , Critical Care , Critical Illness/mortality , E-Selectin , Factor VIII , Fucosyltransferases , Genome, Human/genetics , Genome-Wide Association Study , Host-Pathogen Interactions/genetics , Humans , Interleukin-10 Receptor beta Subunit , Lectins, C-Type , Mucin-1 , Nerve Tissue Proteins , Phospholipid Transfer Proteins , Receptors, Cell Surface , Repressor Proteins , SARS-CoV-2/pathogenicity , Galactoside 2-alpha-L-fucosyltransferase
3.
Nature ; 591(7848): 92-98, 2021 03.
Article in English | MEDLINE | ID: mdl-33307546

ABSTRACT

Host-mediated lung inflammation is present [1], and drives mortality [2], in the critical illness caused by coronavirus disease 2019 (COVID-19). Host genetic variants associated with critical illness may identify mechanistic targets for therapeutic development [3]. Here we report the results of the GenOMICC (Genetics Of Mortality In Critical Care) genome-wide association study in 2,244 critically ill patients with COVID-19 from 208 UK intensive care units. We have identified and replicated the following new genome-wide significant associations: on chromosome 12q24.13 (rs10735079, P = 1.65 × 10^-8) in a gene cluster that encodes antiviral restriction enzyme activators (OAS1, OAS2 and OAS3); on chromosome 19p13.2 (rs74956615, P = 2.3 × 10^-8) near the gene that encodes tyrosine kinase 2 (TYK2); on chromosome 19p13.3 (rs2109069, P = 3.98 × 10^-12) within the gene that encodes dipeptidyl peptidase 9 (DPP9); and on chromosome 21q22.1 (rs2236757, P = 4.99 × 10^-8) in the interferon receptor gene IFNAR2. We identified potential targets for repurposing of licensed medications: using Mendelian randomization, we found evidence that low expression of IFNAR2, or high expression of TYK2, are associated with life-threatening disease; and transcriptome-wide association in lung tissue revealed that high expression of the monocyte-macrophage chemotactic receptor CCR2 is associated with severe COVID-19. Our results identify robust genetic signals relating to key host antiviral defence mechanisms and mediators of inflammatory organ damage in COVID-19. Both mechanisms may be amenable to targeted treatment with existing drugs. However, large-scale randomized clinical trials will be essential before any change to clinical practice.


Subject(s)
COVID-19/genetics , COVID-19/physiopathology , Critical Illness , 2',5'-Oligoadenylate Synthetase/genetics , COVID-19/pathology , Chromosomes, Human, Pair 12/genetics , Chromosomes, Human, Pair 19/genetics , Chromosomes, Human, Pair 21/genetics , Critical Care , Dipeptidyl-Peptidases and Tripeptidyl-Peptidases/genetics , Drug Repositioning , Female , Genome-Wide Association Study , Humans , Inflammation/genetics , Inflammation/pathology , Inflammation/physiopathology , Lung/pathology , Lung/physiopathology , Lung/virology , Male , Multigene Family/genetics , Receptor, Interferon alpha-beta/genetics , Receptors, CCR2/genetics , TYK2 Kinase/genetics , United Kingdom
5.
Proc Natl Acad Sci U S A ; 113(25): 6886-91, 2016 06 21.
Article in English | MEDLINE | ID: mdl-27274049

ABSTRACT

Farming and sedentism first appeared in southwestern Asia during the early Holocene and later spread to neighboring regions, including Europe, along multiple dispersal routes. Conspicuous uncertainties remain about the relative roles of migration, cultural diffusion, and admixture with local foragers in the early Neolithization of Europe. Here we present paleogenomic data for five Neolithic individuals from northern Greece and northwestern Turkey spanning the time and region of the earliest spread of farming into Europe. We use a novel approach to recalibrate raw reads and call genotypes from ancient DNA and observe striking genetic similarity both among Aegean early farmers and with those from across Europe. Our study demonstrates a direct genetic link between Mediterranean and Central European early farmers and those of Greece and Anatolia, extending the European Neolithic migratory chain all the way back to southwestern Asia.


Subject(s)
Agriculture , Anthropology , Europe , Genetics, Population , Humans , Mediterranean Region , Principal Component Analysis
6.
PLoS Genet ; 9(12): e1003995, 2013.
Article in English | MEDLINE | ID: mdl-24339797

ABSTRACT

The contribution of regulatory versus protein change to adaptive evolution has long been controversial. In principle, the rate and strength of adaptation within functional genetic elements can be quantified on the basis of an excess of nucleotide substitutions between species compared to the neutral expectation or from effects of recent substitutions on nucleotide diversity at linked sites. Here, we infer the nature of selective forces acting in proteins, their UTRs and conserved noncoding elements (CNEs) using genome-wide patterns of diversity in wild house mice and divergence to related species. By applying an extension of the McDonald-Kreitman test, we infer that adaptive substitutions are widespread in protein-coding genes, UTRs and CNEs, and we estimate that there are at least four times as many adaptive substitutions in CNEs and UTRs as in proteins. We observe pronounced reductions in mean diversity around nonsynonymous sites (whether or not they have experienced a recent substitution). This can be explained by selection on multiple, linked CNEs and exons. We also observe substantial dips in mean diversity (after controlling for divergence) around protein-coding exons and CNEs, which can also be explained by the combined effects of many linked exons and CNEs. A model of background selection (BGS) can adequately explain the reduction in mean diversity observed around CNEs. However, BGS fails to explain the wide reductions in mean diversity surrounding exons (encompassing ~100 Kb, on average), implying that there is a substantial role for adaptation within exons or closely linked sites. The wide dips in diversity around exons, which are hard to explain by BGS, suggest that the fitness effects of adaptive amino acid substitutions could be substantially larger than substitutions in CNEs. We conclude that although there appear to be many more adaptive noncoding changes, substitutions in proteins may dominate phenotypic evolution.


Subject(s)
Adaptation, Physiological/genetics , Evolution, Molecular , Muridae/genetics , Open Reading Frames/genetics , Regulatory Sequences, Nucleic Acid , Amino Acid Substitution/genetics , Animals , Exons , Genetic Variation , Mice , Mutation , Polymorphism, Genetic
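The McDonald-Kreitman framework mentioned in the abstract above can be illustrated with its simplest estimator of the proportion of adaptive substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). The sketch below shows only this basic form, not the extension the authors applied; the function name `mk_alpha` and the example counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman estimate of the proportion of adaptive substitutions.

    dn, ds: substitutions between species at selected / neutral sites
    pn, ps: polymorphisms within species at selected / neutral sites
    (All counts here are hypothetical illustration values.)
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    # Under neutrality Dn/Ds equals Pn/Ps; an excess of Dn implies alpha > 0.
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts: divergence ratio 100/200 vs polymorphism ratio 50/200.
alpha = mk_alpha(dn=100, ds=200, pn=50, ps=200)
print(alpha)  # 0.5
```

A value of 0.5 would mean that half of the substitutions at the selected sites exceed the neutral expectation. Slightly deleterious polymorphisms bias this simple estimator downward, which is one reason extended versions of the test exist.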
7.
Mol Biol Evol ; 31(12): 3148-63, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25158796

ABSTRACT

Native to Asia, the soft-skinned fruit pest Drosophila suzukii has recently invaded the United States and Europe. The eastern United States represents the most recent expansion of their range, and presents an opportunity to test alternative models of colonization history. Here, we investigate the genetic population structure of this invasive fruit fly, with a focus on the eastern United States. We sequenced six X-linked gene fragments from 246 individuals collected from a total of 12 populations. We examine patterns of genetic diversity within and between populations and explore alternative colonization scenarios using approximate Bayesian computation. Our results indicate high levels of nucleotide diversity in this species and suggest that the recent invasions of Europe and the continental United States are independent demographic events. More broadly speaking, our results highlight the importance of integrating population structure into demographic models, particularly when attempting to reconstruct invasion histories. Finally, our simulation results illustrate the general challenge in reconstructing invasion histories using genetic data and suggest that genome-level data are often required to distinguish among alternative demographic scenarios.


Subject(s)
Drosophila/genetics , Animals , Bayes Theorem , Genes, Insect , Genetic Variation , Haplotypes , Introduced Species , Male , Microsatellite Repeats , Models, Genetic , Spain , United States
8.
Mol Biol Evol ; 28(3): 1183-91, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21059791

ABSTRACT

During the past two decades, evidence has accumulated of adaptive evolution within protein-coding genes in a variety of species. However, with the exception of Drosophila and humans, little is known about the extent of adaptive evolution in noncoding DNA. Here, we study regions upstream and downstream of protein-coding genes in the house mouse Mus musculus castaneus, a species that has a much larger effective population size (N(e)) than humans. We analyze polymorphism data for 78 genes from 15 wild-caught M. m. castaneus individuals and divergence to a closely related species, Mus famulus. We find high levels of nucleotide diversity and moderate levels of selective constraint in upstream and downstream regions compared with nonsynonymous sites of protein-coding genes. From the polymorphism data, we estimate the distribution of fitness effects (DFE) of new mutations and infer that most new mutations in upstream and downstream regions behave as effectively neutral and that only a small fraction is strongly negatively selected. We also estimate the fraction of substitutions that have been driven to fixation by positive selection (α) and the ratio of adaptive to neutral divergence (ω(α)). We find that α for upstream and downstream regions (∼ 10%) is much lower than α for nonsynonymous sites (∼ 50%). However, ω(α) estimates are very similar for nonsynonymous sites (∼ 10%) and upstream and downstream regions (∼ 5%). We conclude that negative selection operating in upstream and downstream regions of M. m. castaneus is weak and that the low values of α for upstream and downstream regions relative to nonsynonymous sites are most likely due to the presence of a higher proportion of neutrally evolving sites and not due to lower absolute rates of adaptive substitution.


Subject(s)
3' Flanking Region , 5' Flanking Region , DNA, Intergenic , Mice/genetics , Selection, Genetic , Amino Acid Substitution , Animals , DNA/genetics , DNA, Intergenic/analysis , DNA, Intergenic/biosynthesis , Evolution, Molecular , Genetic Drift , Humans , Mutation , Open Reading Frames , Polymorphism, Genetic , Population Density
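The contrast between α and ω(α) in the abstract above can be made concrete under the common definition ω(α) = α × (d / d_neutral), where d / d_neutral is the ratio of divergence at the focal sites to divergence at neutral sites. The sketch below assumes that definition; the divergence ratios are hypothetical values chosen to be compatible with the rounded percentages in the abstract, not figures reported by the authors.

```python
def omega_alpha(alpha, divergence_ratio):
    """Rate of adaptive substitution relative to the neutral rate.

    alpha: fraction of substitutions fixed by positive selection
    divergence_ratio: d / d_neutral for the site class (hypothetical here)
    """
    return alpha * divergence_ratio


# Nonsynonymous sites: high alpha but strong constraint (low divergence ratio).
nonsyn = omega_alpha(alpha=0.5, divergence_ratio=0.2)
# Flanking regions: low alpha but weak constraint (high divergence ratio).
flanking = omega_alpha(alpha=0.1, divergence_ratio=0.5)
print(round(nonsyn, 3), round(flanking, 3))  # 0.1 0.05
```

A fivefold difference in α shrinks to a twofold difference in ω(α) in this toy calculation, because the larger pool of neutrally evolving substitutions in flanking regions dilutes α without lowering the absolute adaptive rate by the same factor.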
9.
Nat Ecol Evol ; 6(5): 565-578, 2022 05.
Article in English | MEDLINE | ID: mdl-35273366

ABSTRACT

Host-pathogen interactions impose recurrent selective pressures that lead to constant adaptation and counter-adaptation in both competing species. Here, we sought to study this evolutionary arms-race and assessed the impact of the innate immune system on viral population diversity and evolution, using Drosophila melanogaster as model host and its natural pathogen Drosophila C virus (DCV). We isogenized eight fly genotypes generating animals defective for RNAi, Imd and Toll innate immune pathways as well as pathogen-sensing and gut renewal pathways. Wild-type or mutant flies were then orally infected with DCV and the virus was serially passaged ten times via reinfection in naive flies. Viral population diversity was studied after each viral passage by high-throughput sequencing and infection phenotypes were assessed at the beginning and at the end of the evolution experiment. We found that the absence of any of the various immune pathways studied increased viral genetic diversity while attenuating virulence. Strikingly, these effects were observed in a range of host factors described as having mainly antiviral or antibacterial functions. Together, our results indicate that the innate immune system as a whole and not specific antiviral defence pathways in isolation, generally constrains viral diversity and evolution.


Subject(s)
Drosophila Proteins , RNA Viruses , Animals , Antiviral Agents/metabolism , Dicistroviridae , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Immunity, Innate , RNA Viruses/metabolism
10.
Elife ; 11, 2022 09 20.
Article in English | MEDLINE | ID: mdl-36124557

ABSTRACT

Posterior urethral valves (PUV) are the commonest cause of end-stage renal disease in children, but the genetic architecture of this rare disorder remains unknown. We performed a sequencing-based genome-wide association study (seqGWAS) in 132 unrelated male PUV cases and 23,727 controls of diverse ancestry, identifying statistically significant associations with common variants at 12q24.21 (p = 7.8 × 10^-12; OR 0.4) and rare variants at 6p21.1 (p = 2.0 × 10^-8; OR 7.2), that were replicated in an independent European cohort of 395 cases and 4151 controls. Fine mapping and functional genomic data mapped these loci to the transcription factor TBX5 and planar cell polarity gene PTK7, respectively, the encoded proteins of which were detected in the developing urinary tract of human embryos. We also observed enrichment of rare structural variation intersecting with candidate cis-regulatory elements, particularly inversions predicted to affect chromatin looping (p = 3.1 × 10^-5). These findings represent the first robust genetic associations of PUV, providing novel insights into the underlying biology of this poorly understood disorder and demonstrate how a diverse ancestry seqGWAS can be used for disease locus discovery in a rare disease.


Subject(s)
Genome-Wide Association Study , T-Box Domain Proteins/genetics , Urinary Tract , Cell Adhesion Molecules/genetics , Child , Chromatin , Humans , Male , Receptor Protein-Tyrosine Kinases/genetics , Transcription Factors/genetics
11.
Nat Commun ; 12(1): 6972, 2021 11 30.
Article in English | MEDLINE | ID: mdl-34848700

ABSTRACT

We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10kb upstream of genes, while 12-25% is attributed to coding regions, 32-44% to introns, and 22-28% to distal 10-500kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.


Subject(s)
Genome-Wide Association Study , Genomics , Multifactorial Inheritance/genetics , Bayes Theorem , Body Height , Body Mass Index , Cardiovascular Diseases , Diabetes Mellitus, Type 2 , Genetic Techniques , Genetic Variation , Genotype , Humans , Introns , Models, Statistical , Open Reading Frames , Phenotype , Software
12.
Nat Commun ; 12(1): 2337, 2021 04 20.
Article in English | MEDLINE | ID: mdl-33879782

ABSTRACT

While recent advancements in computation and modelling have improved the analysis of complex traits, our understanding of the genetic basis of the time at symptom onset remains limited. Here, we develop a Bayesian approach (BayesW) that provides probabilistic inference of the genetic architecture of age-at-onset phenotypes in a sampling scheme that facilitates biobank-scale time-to-event analyses. We show in extensive simulation work the benefits BayesW provides in terms of number of discoveries, model performance and genomic prediction. In the UK Biobank, we find many thousands of common genomic regions underlying the age-at-onset of high blood pressure (HBP), cardiac disease (CAD), and type-2 diabetes (T2D), and for the genetic basis of onset reflecting the underlying genetic liability to disease. Age-at-menopause and age-at-menarche are also highly polygenic, but with higher variance contributed by low frequency variants. Genomic prediction into the Estonian Biobank data shows that BayesW gives higher prediction accuracy than other approaches.


Subject(s)
Age of Onset , Genome, Human , Models, Genetic , Multifactorial Inheritance , Age Factors , Algorithms , Bayes Theorem , Cardiovascular Diseases/genetics , Computer Simulation , Databases, Genetic , Diabetes Mellitus, Type 2/genetics , Estonia , Female , Genetic Association Studies , Genome-Wide Association Study , Genomics , Humans , Hypertension/genetics , Menarche/genetics , Menopause/genetics , Phenotype , Polymorphism, Single Nucleotide , United Kingdom
13.
Genome Med ; 12(1): 60, 2020 07 08.
Article in English | MEDLINE | ID: mdl-32641083

ABSTRACT

BACKGROUND: The molecular factors which control circulating levels of inflammatory proteins are not well understood. Furthermore, association studies between molecular probes and human traits are often performed by linear model-based methods which may fail to account for complex structure and interrelationships within molecular datasets. METHODS: In this study, we perform genome- and epigenome-wide association studies (GWAS/EWAS) on the levels of 70 plasma-derived inflammatory protein biomarkers in healthy older adults (Lothian Birth Cohort 1936; n = 876; Olink® inflammation panel). We employ a Bayesian framework (BayesR+) which can account for issues pertaining to data structure and unknown confounding variables (with sensitivity analyses using ordinary least squares- (OLS) and mixed model-based approaches). RESULTS: We identified 13 SNPs associated with 13 proteins (n = 1 SNP each) concordant across OLS and Bayesian methods. We identified 3 CpG sites spread across 3 proteins (n = 1 CpG each) that were concordant across OLS, mixed-model and Bayesian analyses. Tagged genetic variants accounted for up to 45% of variance in protein levels (for MCP2, 36% of variance alone attributable to 1 polymorphism). Methylation data accounted for up to 46% of variation in protein levels (for CXCL10). Up to 66% of variation in protein levels (for VEGFA) was explained using genetic and epigenetic data combined. We demonstrated putative causal relationships between CD6 and IL18R1 with inflammatory bowel disease and between IL12B and Crohn's disease. CONCLUSIONS: Our data may aid understanding of the molecular regulation of the circulating inflammatory proteome as well as causal relationships between inflammatory mediators and disease.


Subject(s)
Biomarkers , Epigenomics , Genome-Wide Association Study , Genomics , Proteins/genetics , Age Factors , Aged , Aged, 80 and over , Blood Proteins/genetics , Computational Biology/methods , DNA Methylation , Disease Susceptibility , Epigenesis, Genetic , Epigenomics/methods , Female , Gene Expression Regulation , Genomics/methods , Healthy Volunteers , Humans , Inflammation/etiology , Inflammation/metabolism , Inflammation Mediators , Male , Middle Aged , Polymorphism, Single Nucleotide , Proteins/metabolism , Quantitative Trait Loci
14.
Nat Ecol Evol ; 2(4): 721-730, 2018 04.
Article in English | MEDLINE | ID: mdl-29531345

ABSTRACT

Understanding how deleterious genetic variation is distributed across human populations is of key importance in evolutionary biology and medical genetics. However, the impact of population size changes and gene flow on the corresponding mutational load remains a controversial topic. Here, we report high-coverage exomes from 300 rainforest hunter-gatherers and farmers of central Africa, whose distinct subsistence strategies are expected to have impacted their demographic pasts. Detailed demographic inference indicates that hunter-gatherers and farmers recently experienced population collapses and expansions, respectively, accompanied by increased gene flow. We show that the distribution of deleterious alleles across these populations is compatible with a similar efficacy of selection to remove deleterious variants with additive effects, and predict with simulations that their present-day additive mutation load is almost identical. For recessive mutations, although an increased load is predicted for hunter-gatherers, this increase has probably been partially counteracted by strong gene flow from expanding farmers. Collectively, our predicted and empirical observations suggest that the impact of the recent population decline of African hunter-gatherers on their mutation load has been modest and more restrained than would be expected under a fully recessive model of dominance.


Subject(s)
Exome/genetics , Gene Flow , Mutation , Africa , Farmers , Humans , Life Style , Population Dynamics , Rainforest
15.
Genetics ; 205(1): 317-332, 2017 01.
Article in English | MEDLINE | ID: mdl-27821432

ABSTRACT

While genetic diversity can be quantified accurately from high coverage sequencing data, it is often desirable to obtain such estimates from data with low coverage, either to save costs or because of low DNA quality, as is observed for ancient samples. Here, we introduce a method to accurately infer heterozygosity probabilistically from sequences with average coverage [Formula: see text] of a single individual. The method relaxes the infinite sites assumption of previous methods, does not require a reference sequence, except for the initial alignment of the sequencing data, and takes into account both variable sequencing errors and potential postmortem damage. It is thus also applicable to nonmodel organisms and ancient genomes. Since error rates as reported by sequencing machines are generally distorted and require recalibration, we also introduce a method to accurately infer recalibration parameters in the presence of postmortem damage. This method does not require knowledge about the underlying genome sequence, but instead works with haploid data (e.g., from the X-chromosome from mammalian males) and integrates over the unknown genotypes. Using extensive simulations we show that a few megabasepairs of haploid data are sufficient for accurate recalibration, even at average coverages as low as [Formula: see text]. At similar coverages, our method also produces very accurate estimates of heterozygosity down to [Formula: see text] within windows of about 1 Mbp. We further illustrate the usefulness of our approach by inferring genome-wide patterns of diversity for several ancient human samples, and we found that 3000-5000-year-old samples showed diversity patterns comparable to those of modern humans. In contrast, two European hunter-gatherer samples exhibited not only considerably lower levels of diversity than modern samples, but also highly distinct distributions of diversity along their genomes. Interestingly, these distributions were also very different between the two samples, supporting earlier conclusions of a highly diverse and structured population in Europe prior to the arrival of farming.


Subject(s)
DNA, Ancient/analysis , Genetic Carrier Screening/methods , Sequence Analysis, DNA/methods , Base Sequence , Chromosome Mapping/methods , Genetic Variation , Genetics, Population/methods , Genome, Human , Heterozygote , Humans , Male , Software
16.
Curr Biol ; 27(14): 2211-2218.e8, 2017 Jul 24.
Article in English | MEDLINE | ID: mdl-28712568

ABSTRACT

For many crops, wild relatives constitute an extraordinary resource for cultivar improvement [1, 2] and also help to better understand the history of their domestication [3]. However, the wild ancestor species of several perennial crops have not yet been identified. Perennial crops generally present a weak domestication syndrome allowing cultivated individuals to establish feral populations difficult to distinguish from truly wild populations, and there is frequently ongoing gene flow between wild relatives and the crop that might erode most genetic differences [4]. Here we report the discovery of populations of the wild ancestor species of the date palm (Phoenix dactylifera L.), one of the oldest and most important cultivated fruit plants in hot and arid regions of the Old World. We discovered these wild individuals in remote and isolated mountainous locations of Oman. They are genetically more diverse than and distinct from a representative sample of Middle Eastern cultivated date palms and exhibit rounded seed shapes resembling those of a close sister species and archeological samples, but not modern cultivars. Whole-genome sequencing of several wild and cultivated individuals revealed a complex domestication history involving the contribution of at least two wild sources to African cultivated date palms. The discovery of wild date palms offers a unique chance to further elucidate the history of this iconic crop that has constituted the cornerstone of traditional oasis polyculture systems for several thousand years [5].


Subject(s)
Domestication , Phoeniceae/anatomy & histology , Phoeniceae/genetics , Oman
17.
Genetics ; 203(2): 893-904, 2016 06.
Article in English | MEDLINE | ID: mdl-27052569

ABSTRACT

Methods that bypass analytical evaluations of the likelihood function have become an indispensable tool for statistical inference in many fields of science. These so-called likelihood-free methods rely on accepting and rejecting simulations based on summary statistics, which limits them to low-dimensional models for which the value of the likelihood is large enough to result in manageable acceptance rates. To get around these issues, we introduce a novel, likelihood-free Markov chain Monte Carlo (MCMC) method combining two key innovations: updating only one parameter per iteration and accepting or rejecting this update based on subsets of statistics approximately sufficient for this parameter. This increases acceptance rates dramatically, rendering this approach suitable even for models of very high dimensionality. We further derive that for linear models, a one-dimensional combination of statistics per parameter is sufficient and can be found empirically with simulations. Finally, we demonstrate that our method readily scales to models of very high dimensionality, using toy models as well as by jointly inferring the effective population size, the distribution of fitness effects (DFE) of segregating mutations, and selection coefficients for each locus from data of a recent experiment on the evolution of drug resistance in influenza.


Subject(s)
Drug Resistance, Viral/genetics , Models, Genetic , Genetic Fitness , Genetic Loci , Mutation , Orthomyxoviridae/drug effects , Orthomyxoviridae/genetics , Probability , Selection, Genetic
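The two ideas named in the abstract above, updating one parameter per iteration and accepting on a statistic informative for that parameter alone, can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions and is not the authors' implementation: the two-parameter normal model, the flat prior on [lo, hi], the uniform proposal, and all function names and tuning values are hypothetical.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy model: parameter k is the mean of its own unit-variance normal sample."""
    return [[rng.gauss(mu, 1.0) for _ in range(n)] for mu in theta]


def summaries(data):
    """One summary statistic per parameter: the sample mean of its data set."""
    return [statistics.fmean(x) for x in data]


def abc_mcmc(s_obs, n, iters, eps, step, lo, hi, seed=0):
    """Likelihood-free MCMC that updates a single parameter per iteration."""
    rng = random.Random(seed)
    theta = list(s_obs)  # start the chain at the observed summaries
    chain = []
    for it in range(iters):
        k = it % len(theta)                      # cycle through the parameters
        proposal = list(theta)
        proposal[k] += rng.uniform(-step, step)  # symmetric proposal
        if lo <= proposal[k] <= hi:              # flat prior on [lo, hi]
            s_sim = summaries(simulate(proposal, n, rng))
            # Accept using only the statistic informative for parameter k.
            if abs(s_sim[k] - s_obs[k]) < eps:
                theta = proposal
        chain.append(tuple(theta))
    return chain


# Hypothetical observed summaries and tuning values.
chain = abc_mcmc(s_obs=[1.0, -2.0], n=50, iters=4000, eps=0.2,
                 step=0.5, lo=-10.0, hi=10.0)
post_mean = [statistics.fmean([c[k] for c in chain]) for k in range(2)]
```

Because each acceptance test compares a single statistic, the acceptance rate does not fall as more parameters are added in the way it would if all statistics had to match at once. In this toy model the per-parameter statistic is obvious; for realistic models, finding such statistics is the hard part that the abstract addresses.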
18.
Science ; 353(6298): 499-503, 2016 Jul 29.
Article in English | MEDLINE | ID: mdl-27417496

ABSTRACT

We sequenced Early Neolithic genomes from the Zagros region of Iran (eastern Fertile Crescent), where some of the earliest evidence for farming is found, and identify a previously uncharacterized population that is neither ancestral to the first European farmers nor has contributed substantially to the ancestry of modern Europeans. These people are estimated to have separated from Early Neolithic farmers in Anatolia some 46,000 to 77,000 years ago and show affinities to modern-day Pakistani and Afghan populations, but particularly to Iranian Zoroastrians. We conclude that multiple, genetically differentiated hunter-gatherer populations adopted farming in southwestern Asia, that components of pre-Neolithic population structure were preserved as farming spread into neighboring regions, and that the Zagros region was the cradle of eastward expansion.


Subject(s)
Agriculture , Genome, Human , Afghanistan/ethnology , Agriculture/history , Ethnicity/genetics , Genetic Variation , History, Ancient , Human Migration , Humans , Iran/ethnology , Pakistan/ethnology , White People/genetics
19.
Genetics ; 196(4): 1131-43, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24361937

ABSTRACT

The causes of the large effect of the X chromosome in reproductive isolation and speciation have long been debated. The faster-X hypothesis predicts that X-linked loci are expected to have higher rates of adaptive evolution than autosomal loci if new beneficial mutations are on average recessive. Reproductive isolation should therefore evolve faster when contributing loci are located on the X chromosome. In this study, we have analyzed genome-wide nucleotide polymorphism data from the house mouse subspecies Mus musculus castaneus and nucleotide divergence from Mus famulus and Rattus norvegicus to compare rates of adaptive evolution for autosomal and X-linked protein-coding genes. We found significantly faster adaptive evolution for X-linked loci, particularly for genes with expression in male-specific tissues, but autosomal and X-linked genes with expression in female-specific tissues evolve at similar rates. We also estimated rates of adaptive evolution for genes expressed during spermatogenesis and found that X-linked genes that escape meiotic sex chromosome inactivation (MSCI) show rapid adaptive evolution. Our results suggest that faster-X adaptive evolution is either due to net recessivity of new advantageous mutations or due to a special gene content of the X chromosome, which regulates male function and spermatogenesis. We discuss how our results help to explain the large effect of the X chromosome in speciation.


Subject(s)
Genes, X-Linked , Genetic Speciation , Murinae/classification , Murinae/genetics , X Chromosome/genetics , Animals , Chromosomes, Mammalian , Evolution, Molecular , Female , Genetic Variation , Genome , Humans , Male , Mice , Phylogeny , Polymorphism, Single Nucleotide , Rats , Spermatogenesis , X Chromosome Inactivation
20.
Genetics ; 193(4): 1197-208, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23341416

ABSTRACT

Knowing the distribution of fitness effects (DFE) of new mutations is important for several topics in evolutionary genetics. Existing computational methods with which to infer the DFE based on DNA polymorphism data have frequently assumed that the DFE can be approximated by a unimodal distribution, such as a lognormal or a gamma distribution. However, if the true DFE departs substantially from the assumed distribution (e.g., if the DFE is multimodal), this could lead to misleading inferences about its properties. We conducted simulations to test the performance of parametric and nonparametric discretized distribution models to infer the properties of the DFE for cases in which the true DFE is unimodal, bimodal, or multimodal. We found that lognormal and gamma distribution models can perform poorly in recovering the properties of the distribution if the true DFE is bimodal or multimodal, whereas discretized distribution models perform better. If there is a sufficient amount of data, the discretized models can detect a multimodal DFE and can accurately infer the mean effect and the average fixation probability of a new deleterious mutation. We fitted several models for the DFE of amino acid-changing mutations using whole-genome polymorphism data from Drosophila melanogaster and the house mouse subspecies Mus musculus castaneus. A lognormal DFE best explains the data for D. melanogaster, whereas we find evidence for a bimodal DFE in M. m. castaneus.


Subject(s)
Gene Frequency , Genetic Fitness , Models, Genetic , Mutation, Missense , Animals , Drosophila melanogaster/genetics , Genome , Mice , Polymorphism, Genetic , Population/genetics