ABSTRACT
Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species sampled, phylogenetic method and the choice of genomic regions [1-3]. Here we address these issues by analysing the genomes of 363 bird species [4] (218 taxonomic families, 92% of total). Using intergenic regions and coalescent methods, we present a well-supported tree but also a marked degree of discordance. The tree confirms that Neoaves experienced rapid radiation at or near the Cretaceous-Palaeogene boundary. Sufficient loci rather than extensive taxon sampling were more effective in resolving difficult nodes. Remaining recalcitrant nodes involve species that are a challenge to model due to either extreme DNA composition, variable substitution rates, incomplete lineage sorting or complex evolutionary events such as ancient hybridization. Assessment of the effects of different genomic partitions showed high heterogeneity across the genome. We discovered sharp increases in effective population size, substitution rates and relative brain size following the Cretaceous-Palaeogene extinction event, supporting the hypothesis that emerging ecological opportunities catalysed the diversification of modern birds. The resulting phylogenetic estimate offers fresh insights into the rapid radiation of modern birds and provides a taxon-rich backbone tree for future comparative studies.
Subject(s)
Birds, Molecular Evolution, Genome, Phylogeny, Animals, Birds/genetics, Birds/classification, Birds/anatomy & histology, Brain/anatomy & histology, Biological Extinction, Genome/genetics, Genomics, Population Density, Male, Female
ABSTRACT
The germline mutation rate determines the pace of genome evolution and is an evolving parameter itself [1]. However, little is known about what determines its evolution, as most studies of mutation rates have focused on single species with different methodologies [2]. Here we quantify germline mutation rates across vertebrates by sequencing and comparing the high-coverage genomes of 151 parent-offspring trios from 68 species of mammals, fishes, birds and reptiles. We show that the per-generation mutation rate varies among species by a factor of 40, with mutation rates being higher for males than for females in mammals and birds, but not in reptiles and fishes. The generation time, age at maturity and species-level fecundity are the key life-history traits affecting this variation among species. Furthermore, species with higher long-term effective population sizes tend to have lower mutation rates per generation, providing support for the drift barrier hypothesis [3]. The exceptionally high yearly mutation rates of domesticated animals, which have been continually selected on fecundity traits including shorter generation times, further support the importance of generation time in the evolution of mutation rates. Overall, our comparative analysis of pedigree-based mutation rates provides ecological insights into the evolution of mutation rates in vertebrates.
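To make the distinction between per-generation and yearly rates concrete, the following is a minimal sketch of how a trio-based rate is commonly computed. It assumes the simple estimator "de novo mutations divided by twice the callable genome size" and leaves out the false-positive and false-negative corrections a real pipeline would need. The function names and the input values are hypothetical and are not taken from the study.

```python
def per_generation_rate(n_de_novo: int, callable_sites: int) -> float:
    """De novo mutations per site per generation for a single trio."""
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    # The offspring carries two haploid genomes (one per parent),
    # so the number of sites at risk is twice the callable genome size.
    return n_de_novo / (2 * callable_sites)


def yearly_rate(rate_per_generation: float, generation_time_years: float) -> float:
    """Convert a per-generation rate into a per-year rate."""
    if generation_time_years <= 0:
        raise ValueError("generation_time_years must be positive")
    return rate_per_generation / generation_time_years


# Hypothetical trio: 60 de novo mutations over 2.5 Gb of callable sequence,
# in a species with an assumed 30-year generation time.
mu_generation = per_generation_rate(n_de_novo=60, callable_sites=2_500_000_000)
mu_year = yearly_rate(mu_generation, generation_time_years=30)
print(mu_generation, mu_year)
```

Under this conversion, two species with the same per-generation rate differ in yearly rate in inverse proportion to their generation times, which is why short generation times can produce high yearly rates.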
Subject(s)
Molecular Evolution, Germ-Line Mutation, Mutation Rate, Vertebrates, Animals, Female, Male, Birds/genetics, Fishes/genetics, Germ-Line Mutation/genetics, Mammals/genetics, Reptiles/genetics, Vertebrates/genetics
ABSTRACT
Genome-wide genealogies of multiple species carry detailed information about demographic and selection processes on individual branches of the phylogeny. Here, we introduce TRAILS, a hidden Markov model that accurately infers time-resolved population genetics parameters, such as ancestral effective population sizes and speciation times, for ancestral branches using a multi-species alignment of three species and an outgroup. TRAILS leverages the information contained in incomplete lineage sorting fragments by modelling genealogies along the genome as rooted three-leaved trees, each with a topology and two coalescent events happening in discretized time intervals within the phylogeny. Posterior decoding of the hidden Markov model can be used to infer the ancestral recombination graph for the alignment and details on demographic changes within a branch. Since TRAILS performs posterior decoding at the base-pair level, genome-wide scans based on the posterior probabilities can be devised to detect deviations from neutrality. Using TRAILS on a human-chimp-gorilla-orangutan alignment, we recover speciation parameters and extract information about the topology and coalescent times at high resolution.
Subject(s)
Genetic Speciation, Hominidae, Animals, Humans, Hominidae/genetics, Pan troglodytes/genetics, Phylogeny, Population Genetics, Genetic Models
ABSTRACT
Genomes are typically mosaics of regions with different evolutionary histories. When speciation events are closely spaced in time, recombination makes the regions sharing the same history small, and the evolutionary history changes rapidly as we move along the genome. When examining rapid radiations such as the early diversification of Neoaves 66 Mya, typically no consistent history is observed across segments exceeding kilobases of the genome. Here, we report an exception. We found that a 21-Mb region in avian genomes, mapped to chicken chromosome 4, shows an extremely strong and discordance-free signal for a history different from that of the inferred species tree. Such a strong discordance-free signal, indicative of suppressed recombination across many millions of base pairs, is not observed elsewhere in the genome for any deep avian relationships. Although long regions with suppressed recombination have been documented in recently diverged species, our results pertain to relationships dating circa 65 Mya. We provide evidence that this strong signal may be due to an ancient rearrangement that blocked recombination and remained polymorphic for several million years prior to fixation. We show that the presence of this region has misled previous phylogenomic efforts with lower taxon sampling, showing the interplay between taxon and locus sampling. We predict that similar ancient rearrangements may confound phylogenetic analyses in other clades, pointing to a need for new analytical models that incorporate the possibility of such events.
Subject(s)
Biological Evolution, Genome, Animals, Phylogeny, Genome/genetics, Birds, Genetic Recombination
ABSTRACT
In most mammals and likely throughout vertebrates, the gene PRDM9 specifies the locations of meiotic double strand breaks; in mice and humans at least, it also aids in their repair. For both roles, many of the molecular partners remain unknown. Here, we take a phylogenetic approach to identify genes that may be interacting with PRDM9 by leveraging the fact that PRDM9 arose before the origin of vertebrates but was lost many times, either partially or entirely, and with it its role in recombination. As a first step, we characterize PRDM9 domain composition across 446 vertebrate species, inferring at least 13 independent losses. We then use the interdigitation of PRDM9 orthologs across vertebrates to test whether it coevolved with any of 241 candidate genes coexpressed with PRDM9 in mice or associated with recombination phenotypes in mammals. Accounting for the phylogenetic relationship among a subsample of 189 species, we find two genes whose presence and absence is unexpectedly coincident with that of PRDM9: ZCWPW1, which was recently shown to facilitate double strand break repair, and its paralog ZCWPW2, as well as, more tentatively, TEX15 and FBXO47. ZCWPW2 is expected to be recruited to sites of PRDM9 binding; its tight coevolution with PRDM9 across vertebrates suggests that it is a key interactor within mammals and beyond, with a role either in recruiting the recombination machinery or in double strand break repair.
Subject(s)
Cell Cycle Proteins/genetics, Gene Deletion, Histone-Lysine N-Methyltransferase/genetics, Animals, Molecular Evolution, Humans, Mice, Phylogeny, Genetic Recombination, RNA Sequence Analysis/methods
ABSTRACT
The merging of distinct genomes, allopolyploidization, is a widespread phenomenon in plants. It generates adaptive potential through increased genetic diversity, but examples demonstrating its exploitation remain scarce. White clover (Trifolium repens) is a ubiquitous temperate allotetraploid forage crop derived from two European diploid progenitors confined to extreme coastal or alpine habitats. We sequenced and assembled the genomes and transcriptomes of this species complex to gain insight into the genesis of white clover and the consequences of allopolyploidization. Based on these data, we estimate that white clover originated ~15,000 to 28,000 years ago during the last glaciation when alpine and coastal progenitors were likely colocated in glacial refugia. We found evidence of progenitor diversity carryover through multiple hybridization events and show that the progenitor subgenomes have retained integrity and gene expression activity as they traveled within white clover from their original confined habitats to a global presence. At the transcriptional level, we observed remarkably stable subgenome expression ratios across tissues. Among the few genes that show tissue-specific switching between homeologous gene copies, we found flavonoid biosynthesis genes strongly overrepresented, suggesting an adaptive role of some allopolyploidy-associated transcriptional changes. Our results highlight white clover as an example of allopolyploidy-facilitated niche expansion, where two progenitor genomes, adapted and confined to disparate and highly specialized habitats, expanded to a ubiquitous global presence after glaciation-associated allopolyploidization.
Subject(s)
Genomics, Polyploidy, Trifolium/genetics, Biosynthetic Pathways/genetics, Chromosome Mapping, Flavonoids/biosynthesis, Plant Gene Expression Regulation, Plant Genome, Geography, Genetic Hybridization, Ice Cover, Time Factors
ABSTRACT
Azoospermia is a condition defined as the absence of spermatozoa in the ejaculate, but the testicular phenotype of men with azoospermia may be very variable, ranging from full spermatogenesis, through arrested maturation of germ cells at different stages, to completely degenerated tissue with ghost tubules. Hence, information regarding the cell-type-specific expression patterns is needed to prioritise potential pathogenic variants that contribute to the pathogenesis of azoospermia. Thanks to technological advances within next-generation sequencing, it is now possible to obtain detailed cell-type-specific expression patterns in the testis by single-cell RNA sequencing. However, to interpret single-cell RNA sequencing data properly, substantial knowledge of the highly sophisticated data processing and visualisation methods is needed. Here we review the complex cellular structure of the human testis in different types of azoospermia and outline how known genetic alterations affect the pathology of the testis. We combined the currently available single-cell RNA sequencing datasets originating from the human testis into one dataset covering 62,751 testicular cells, each with a median of 2637 transcripts quantified. We show what effects the most common data-processing steps have, and how different visualisation methods can be used. Furthermore, we calculated expression patterns in pseudotime, and show how splicing rates can be used to determine the velocity of differentiation during spermatogenesis. With the combined dataset we show expression patterns and network analysis of genes known to be involved in the pathogenesis of azoospermia. Finally, we provide the combined dataset as an interactive online resource where expression of genes and different visualisation methods can be explored ( https://testis.cells.ucsc.edu/ ).
Subject(s)
Azoospermia/genetics, Testis/pathology, Transcriptome/genetics, Animals, Humans, Male, Spermatogenesis/genetics, Spermatozoa/pathology
ABSTRACT
Klinefelter syndrome (KS; 47,XXY) is the most common sex chromosomal anomaly and causes a multitude of symptoms. Often the most noticeable symptom is infertility caused by azoospermia with testicular histology showing hyalinization of tubules, germ cell loss, and Leydig cell hyperplasia. The germ cell loss begins early in life leading to partial hyalinization of the testis at puberty, but the mechanistic drivers behind this remain poorly understood. In this systematic review, we summarize the current knowledge on developmental changes in the cellularity of KS gonads supplemented by a comparative analysis of the fetal and adult gonadal transcriptome, and blood transcriptome and methylome of men with KS. We identified a high fraction of upregulated genes that escape X-chromosome inactivation, thus supporting previous hypotheses that these are the main drivers of the testicular phenotype in KS. Enrichment analysis showed overrepresentation of genes from the X- and Y-chromosome and testicular transcription factors. Furthermore, by re-evaluation of recent single cell RNA-sequencing data originating from adult KS testis, we found novel evidence that the Sertoli cell is the most affected cell type. Our results are consistent with disturbed cross-talk between somatic and germ cells in the KS testis, and with X-escapee genes acting as mediators.
Subject(s)
DNA Methylation/genetics, Male Infertility/genetics, Klinefelter Syndrome/blood, Transcriptome/genetics, Adult, Human X Chromosomes/genetics, Human Y Chromosomes/genetics, Gonads/growth & development, Gonads/metabolism, Humans, Male Infertility/pathology, Klinefelter Syndrome/genetics, Klinefelter Syndrome/pathology, Male, Sertoli Cells/metabolism, Sertoli Cells/pathology, Testis/metabolism, Testis/pathology
ABSTRACT
In humans, the most common sex chromosomal disorder is Klinefelter syndrome (KS), caused by the presence of one or more extra X-chromosomes. KS patients display a varying adult phenotype but usually present with azoospermia due to testicular degeneration, which accelerates at puberty. The timing of the germ cell loss and whether it is caused by dysgenetic fetal development of the testes is not known. We investigated eight fetal KS testes and found a marked reduction in MAGE-A4-positive pre-spermatogonia compared with testes from 15 age-matched controls, indicating a failure of the gonocytes to differentiate into pre-spermatogonia. Transcriptome analysis by RNA-sequencing of formalin-fixed, paraffin-embedded testes originating from four fetal KS and five age-matched controls revealed 211 differentially expressed transcripts in the fetal KS testis. We found a significant enrichment of upregulated X-chromosomal transcripts and validated the expression of the pseudoautosomal region 1 (PAR1) gene, AKAP17A. Moreover, we found enrichment of long non-coding RNAs in the KS testes (e.g. LINC01569 and RP11-485F13.1). In conclusion, our data indicate that the testicular phenotype observed among adult men with KS is initiated already in fetal life by failure of the gonocyte differentiation into pre-spermatogonia, which could be due to aberrant expression of long non-coding RNAs.
Subject(s)
Gene Expression Profiling/methods, Klinefelter Syndrome/genetics, Long Noncoding RNA/genetics, Testis/metabolism, Adolescent, Adult, Antigens/genetics, Germ Cells/metabolism, Humans, Male, Membrane Glycoproteins/genetics, Sexual Maturation, Spermatogenesis/genetics, Spermatogonia/metabolism, Young Adult
ABSTRACT
Genes in the major histocompatibility complex (MHC, also known as HLA) play a critical role in the immune response and variation within the extended 4-Mb region shows association with major risks of many diseases. Yet, deciphering the underlying causes of these associations is difficult because the MHC is the most polymorphic region of the genome with a complex linkage disequilibrium structure. Here, we reconstruct full MHC haplotypes from de novo assembled trios without relying on a reference genome and perform evolutionary analyses. We report 100 full MHC haplotypes and call a large set of structural variants in the regions for future use in imputation with GWAS data. We also present the first complete analysis of the recombination landscape in the entire region and show how balancing selection at classical genes has linked effects on the frequency of variants throughout the region.
Subject(s)
Genetic Variation/genetics, Population Genetics, Linkage Disequilibrium/genetics, Major Histocompatibility Complex/genetics, Alleles, Chromosome Mapping, Denmark, Haplotypes/genetics, Humans, Single Nucleotide Polymorphism/genetics
ABSTRACT
Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
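The idea of a posterior mean effect size can be illustrated in the simplest special case: unlinked markers (no LD) and an infinitesimal Gaussian prior in which every marker explains an equal share of the heritability. Assuming standardized genotypes and phenotypes, the prior is beta ~ N(0, h2/M) and the marginal estimate has sampling variance of about 1/N, so the posterior mean is the marginal estimate multiplied by h2 / (h2 + M/N). The sketch below implements only this special case; it is not the LDpred software, and the function names are hypothetical.

```python
from typing import List


def infinitesimal_shrinkage(h2: float, n_markers: int, n_samples: int) -> float:
    """Shrinkage factor h2 / (h2 + M/N) for unlinked markers.

    Assumes a Gaussian prior N(0, h2/M) on each effect and a sampling
    variance of roughly 1/N for each marginal estimate.
    """
    if not 0 < h2 <= 1:
        raise ValueError("h2 must be in (0, 1]")
    if n_markers <= 0 or n_samples <= 0:
        raise ValueError("n_markers and n_samples must be positive")
    return h2 / (h2 + n_markers / n_samples)


def posterior_mean_effects(marginal_betas: List[float], h2: float,
                           n_samples: int) -> List[float]:
    """Shrink each marginal effect estimate toward zero by the same factor."""
    factor = infinitesimal_shrinkage(h2, len(marginal_betas), n_samples)
    return [factor * beta for beta in marginal_betas]


# Illustrative values only: the factor approaches 1 as the sample size grows.
for n in (10_000, 100_000, 1_000_000):
    print(n, infinitesimal_shrinkage(h2=0.5, n_markers=500_000, n_samples=n))
```

With LD between markers the per-marker factor is replaced by a computation involving the local LD matrix, which this sketch does not attempt.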
Subject(s)
Linkage Disequilibrium/genetics, Theoretical Models, Multifactorial Inheritance/genetics, Multiple Sclerosis/genetics, Single Nucleotide Polymorphism/genetics, Schizophrenia/genetics, Genome-Wide Association Study, Genotype, Humans, Phenotype, Prognosis, Quantitative Trait Loci
ABSTRACT
Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.
Subject(s)
Molecular Evolution, Genetic Variation/genetics, Human Genome/genetics, Genome/genetics, Pan paniscus/genetics, Pan troglodytes/genetics, Animals, DNA Transposable Elements/genetics, Gene Duplication/genetics, Genotype, Humans, Molecular Sequence Data, Phenotype, Phylogeny, Species Specificity
ABSTRACT
The human and chimpanzee X chromosomes are less divergent than expected based on autosomal divergence. We study incomplete lineage sorting patterns between humans, chimpanzees and gorillas to show that this low divergence can be entirely explained by megabase-sized regions comprising one-third of the X chromosome, where polymorphism in the human-chimpanzee ancestral species was severely reduced. We show that background selection can explain at most 10% of this reduction of diversity in the ancestor. Instead, we show that several strong selective sweeps in the ancestral species can explain it. We also report evidence of population specific sweeps in extant humans that overlap the regions of low diversity in the ancestral species. These regions further correspond to chromosomal sections shown to be devoid of Neanderthal introgression into modern humans. This suggests that the same X-linked regions that undergo selective sweeps are among the first to form reproductive barriers between diverging species. We hypothesize that meiotic drive is the underlying mechanism causing these two observations.
Subject(s)
Human X Chromosomes/genetics, Animals, Female, Genetic Drift, Genetic Speciation, Genetic Variation, Humans, Male, Neanderthals, Genetic Recombination, Genetic Selection, Species Specificity
ABSTRACT
We present a new approach to indel calling that explicitly exploits that indel differences between a reference and a sequenced sample make the mapping of reads less efficient. We assign all unmapped reads with a mapped partner to their expected genomic positions and then perform extensive de novo assembly on the regions with many unmapped reads to resolve homozygous, heterozygous, and complex indels by exhaustive traversal of the de Bruijn graph. The method is implemented in the software SOAPindel and provides a list of candidate indels with quality scores. We compare SOAPindel to Dindel, Pindel, and GATK on simulated data and find similar or better performance for short indels (<10 bp) and higher sensitivity and specificity for long indels. A validation experiment suggests that SOAPindel has a false-positive rate of ~10% for long indels (>5 bp), while still providing many more candidate indels than other approaches.
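The phrase "exhaustive traversal of the de Bruijn graph" can be illustrated with a toy example. The sketch below builds a de Bruijn graph from a handful of error-free sequences and enumerates every path between two anchor nodes flanking a region, so that a reference allele and an insertion allele both appear as separate paths. It is a generic illustration under simplifying assumptions (no sequencing errors, no coverage weighting, distinct start and end anchors) and is not SOAPindel's implementation; all names and sequences are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_de_bruijn(reads: List[str], k: int) -> Dict[str, Set[str]]:
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph: Dict[str, Set[str]], start: str, end: str,
                    max_len: int) -> List[str]:
    """List every sequence spelled by a path from start to end.

    max_len bounds the spelled sequence so that cycles cannot loop forever.
    Assumes start and end are different nodes.
    """
    results = []
    stack = [(start, start)]  # (current node, sequence spelled so far)
    while stack:
        node, seq = stack.pop()
        if node == end:
            results.append(seq)
            continue
        if len(seq) >= max_len:
            continue
        for nxt in sorted(graph.get(node, ())):
            # Moving along an edge appends the last base of the next node.
            stack.append((nxt, seq + nxt[-1]))
    return sorted(results)


# Hypothetical heterozygous sample: one haplotype matches the reference,
# the other carries a 2-bp insertion ("AA") after "ACGT".
reference_allele = "ACGTTGCA"
insertion_allele = "ACGTAATGCA"
graph = build_de_bruijn([reference_allele, insertion_allele], k=4)
haplotypes = enumerate_paths(graph, start="ACG", end="GCA", max_len=20)
print(haplotypes)
```

Comparing each enumerated path against the reference sequence for the same region then yields candidate indels; a heterozygous site shows up as two paths, a homozygous one as a single non-reference path.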
Subject(s)
INDEL Mutation, Physical Chromosome Mapping/methods, DNA Sequence Analysis/methods, Software, False Positive Reactions, Human Genome, Genotyping Techniques/methods, Heterozygote, Homozygote, Humans
ABSTRACT
Recombination maps of ancestral species can be constructed from comparative analyses of genomes from closely related species, exemplified by a recently published map of the human-chimpanzee ancestor. Such maps resolve differences in recombination rate between species into changes along individual branches in the speciation tree, and allow identification of associated changes in the genomic sequences. We describe how coalescent hidden Markov models are able to call individual recombination events in ancestral species through inference of incomplete lineage sorting along a genomic alignment. In the great apes, speciation events are sufficiently close in time that a map can be inferred for the ancestral species at each internal branch - allowing evolution of recombination rate to be tracked over evolutionary time scales from speciation event to speciation event. We see this approach as a way of characterizing the evolution of recombination rate and the genomic properties that influence it.
Subject(s)
Molecular Evolution, Genetic Recombination, Animals, Human Chromosomes/genetics, Human Genome, Humans, Markov Chains, Genetic Models
ABSTRACT
We present a hidden Markov model (HMM) for inferring gradual isolation between two populations during speciation, modelled as a time interval with restricted gene flow. The HMM describes the history of adjacent nucleotides in two genomic sequences, such that the nucleotides can be separated by recombination, can migrate between populations, or can coalesce at variable time points, all dependent on the parameters of the model, which are the effective population sizes, splitting times, recombination rate, and migration rate. We show by extensive simulations that the HMM can accurately infer all parameters except the recombination rate, which is biased downwards. Inference is robust to variation in the mutation rate and the recombination rate over the sequence and also robust to unknown phase of genomes unless they are very closely related. We provide a test for whether divergence is gradual or instantaneous, and we apply the model to three key divergence processes in great apes: (a) the bonobo and common chimpanzee, (b) the eastern and western gorilla, and (c) the Sumatran and Bornean orang-utan. We find that the bonobo and chimpanzee appear to have undergone a clear split, whereas the divergence processes of the gorilla and orang-utan species occurred over several hundred thousand years with gene flow stopping quite recently. We also apply the model to the Homo/Pan speciation event and find that the most likely scenario involves an extended period of gene flow during speciation.
Subject(s)
Molecular Evolution, Genetic Speciation, Genetic Variation, Genome, Animals, Gene Flow, Population Genetics, Gorilla gorilla/genetics, Humans, Markov Chains, Theoretical Models, Pan paniscus/genetics, Pan troglodytes/genetics, Phylogeny, Pongo/genetics, Population Density
ABSTRACT
We search the complete orangutan genome for regions where humans are more closely related to orangutans than to chimpanzees due to incomplete lineage sorting (ILS) in the ancestor of human and chimpanzees. The search uses our recently developed coalescent hidden Markov model (HMM) framework. We find ILS present in ~1% of the genome, and that the ancestral species of human and chimpanzees never experienced a severe population bottleneck. The existence of ILS is validated with simulations, site pattern analysis, and analysis of rare genomic events. The existence of ILS allows us to disentangle the time of isolation of humans and orangutans (the speciation time) from the genetic divergence time, and we find speciation to be as recent as 9-13 million years ago (Mya; contingent on the calibration point). The analyses provide further support for a recent speciation of human and chimpanzee at ~4 Mya and a diverse ancestor of human and chimpanzee with an effective population size of about 50,000 individuals. Posterior decoding infers ILS for each nucleotide in the genome, and we use this to deduce patterns of selection in the ancestral species. We demonstrate the effect of background selection in the common ancestor of humans and chimpanzees. In agreement with predictions from population genetics, ILS was found to be reduced in exons and gene-dense regions when we control for confounding factors such as GC content and recombination rate. Finally, we find the broad-scale recombination rate to be conserved through the complete ape phylogeny.
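The dependence of ILS on the time between two speciation events and on the ancestral effective population size can be sketched with the textbook three-species coalescent result. Assuming a constant-size, panmictic ancestral population with no selection or gene flow, two lineages fail to coalesce within an internal branch of length T = Δt / (2 · N_e · g) coalescent units with probability exp(-T), after which each of the three possible topologies is equally likely. The sketch below encodes that formula only; it is not the coalescent HMM used in the study, and the generation time and other inputs are assumptions chosen for illustration.

```python
import math


def branch_length_coalescent_units(delta_t_years: float, ne: float,
                                   generation_time_years: float) -> float:
    """Internal branch length in units of 2*Ne generations."""
    if ne <= 0 or generation_time_years <= 0 or delta_t_years < 0:
        raise ValueError("invalid parameter values")
    generations = delta_t_years / generation_time_years
    return generations / (2.0 * ne)


def discordant_fraction(delta_t_years: float, ne: float,
                        generation_time_years: float) -> float:
    """Expected fraction of the genome with a gene tree unlike the species tree."""
    t = branch_length_coalescent_units(delta_t_years, ne, generation_time_years)
    # No coalescence in the branch (exp(-t)), then 2 of the 3 equally likely
    # topologies disagree with the species tree.
    return (2.0 / 3.0) * math.exp(-t)


def single_discordant_topology_fraction(delta_t_years: float, ne: float,
                                        generation_time_years: float) -> float:
    """Expected fraction supporting one specific discordant topology."""
    return 0.5 * discordant_fraction(delta_t_years, ne, generation_time_years)


# Illustrative values: larger Ne or shorter intervals give more ILS.
for ne in (10_000, 50_000, 100_000):
    print(ne, discordant_fraction(delta_t_years=2_000_000, ne=ne,
                                  generation_time_years=25))
```

The formula makes clear why the observed amount of ILS carries information about both the interval between speciation events and the ancestral population size: the two enter only through their ratio, so a calibration of one is needed to estimate the other.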
Subject(s)
Genetic Speciation, Nucleotides/analysis, Pan troglodytes/genetics, Phylogeny, Pongo/genetics, Animals, Base Composition, Base Sequence, Conserved Sequence/genetics, Genetic Drift, Genetic Variation, Genome, Humans, Statistical Models, Molecular Sequence Data, Population Density, Genetic Recombination, Genetic Selection
ABSTRACT
The fungus Mycosphaerella graminicola emerged as a new pathogen of cultivated wheat during its domestication ~11,000 yr ago. We assembled 12 high-quality full genome sequences to investigate the genetic footprints of selection in this wheat pathogen and closely related sister species that infect wild grasses. We demonstrate a strong effect of natural selection in shaping the pathogen genomes with only ~3% of nonsynonymous mutations being effectively neutral. Forty percent of all fixed nonsynonymous substitutions, on the other hand, are driven by positive selection. Adaptive evolution has affected M. graminicola to the highest extent, consistent with recent host specialization. Positive selection has prominently altered genes encoding secreted proteins and putative pathogen effectors, supporting the premise that molecular host-pathogen interaction is a strong driver of pathogen evolution. Recent divergence between pathogen sister species is attested by the high degree of incomplete lineage sorting (ILS) in their genomes. We exploit ILS to generate a genetic map of the species without any crossing data, document recent times of species divergence relative to genome divergence, and show that gene-rich regions or regions with low recombination experience stronger effects of natural selection on neutral diversity. Emergence of a new agricultural host selected a highly specialized and fast-evolving pathogen with unique evolutionary patterns compared with its wild relatives. The strong impact of natural selection that we document is at odds with the small effective population sizes estimated and suggests that population sizes were historically large but likely unstable.
Subject(s)
Ascomycota/genetics, Molecular Evolution, Fungal Genome, Plant Diseases/microbiology, Genetic Selection, Triticum/microbiology
ABSTRACT
Due to genetic variation in the ancestor of two populations or two species, the divergence time for DNA sequences from two populations is variable along the genome. Within genomic segments all bases will share the same divergence (because they share a most recent common ancestor) when no recombination event has occurred to split them apart. The size of these segments of constant divergence depends on the recombination rate, but also on the speciation time, the effective population size of the ancestral population, as well as demographic effects and selection. Thus, inference of these parameters may be possible if we can decode the divergence times along a genomic alignment. Here, we present a new hidden Markov model that infers the changing divergence (coalescence) times along the genome alignment using a coalescent framework, in order to estimate the speciation time, the recombination rate, and the ancestral effective population size. The model is efficient enough to allow inference on whole-genome data sets. We first investigate the power and consistency of the model with coalescent simulations and then apply it to the whole-genome sequences of the two orangutan sub-species, Bornean (P. p. pygmaeus) and Sumatran (P. p. abelii) orangutans from the Orangutan Genome Project. We estimate the speciation time between the two sub-species to be thousand years ago and the effective population size of the ancestral orangutan species to be , consistent with recent results based on smaller data sets. We also report a negative correlation between chromosome size and ancestral effective population size, which we interpret as a signature of recombination increasing the efficacy of selection.
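Decoding divergence times along an alignment amounts to posterior decoding in a hidden Markov model: hidden states stand for discretized coalescence times, transitions between states stand for recombination events, and each alignment column is emitted as "identical" or "different". The sketch below is a generic forward-backward implementation for a small discrete HMM, meant only to show the mechanics. The two-state setup, the transition and emission probabilities, and all names are hypothetical and do not reproduce the coalescent-derived parameterization of the published model.

```python
from typing import List, Sequence


def posterior_decode(obs: Sequence[int], init: Sequence[float],
                     trans: Sequence[Sequence[float]],
                     emit: Sequence[Sequence[float]]) -> List[List[float]]:
    """Posterior state probabilities per site via scaled forward-backward.

    obs[t] is an observation index, init[i] the initial state probability,
    trans[i][j] the i -> j transition probability, emit[i][o] the probability
    that state i emits observation o.
    """
    n, k = len(obs), len(init)
    fwd, scales = [], []
    # Forward pass, rescaling each column to sum to 1 to avoid underflow.
    col = [init[i] * emit[i][obs[0]] for i in range(k)]
    for t in range(n):
        if t > 0:
            prev = fwd[-1]
            col = [emit[j][obs[t]] * sum(prev[i] * trans[i][j] for i in range(k))
                   for j in range(k)]
        scale = sum(col)
        fwd.append([c / scale for c in col])
        scales.append(scale)
    # Backward pass, reusing the forward scaling constants.
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for i in range(k):
            bwd[t][i] = sum(trans[i][j] * emit[j][obs[t + 1]] * bwd[t + 1][j]
                            for j in range(k)) / scales[t + 1]
    # Combine and normalize per site.
    posterior = []
    for t in range(n):
        row = [fwd[t][i] * bwd[t][i] for i in range(k)]
        total = sum(row)
        posterior.append([r / total for r in row])
    return posterior


# Hypothetical setup: state 0 = recent coalescence, state 1 = old coalescence.
# Observation 0 = identical column, 1 = differing column.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]    # state changes model recombination
emit = [[0.95, 0.05], [0.7, 0.3]]   # older coalescence gives more differences
alignment = [0] * 10 + [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
post = posterior_decode(alignment, init, trans, emit)
print([round(p[1], 3) for p in post])
```

In this toy alignment the posterior probability of the "old" state rises where differing columns cluster, which is the same signal a coalescent HMM uses to locate segments of deeper divergence.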
Subject(s)
Molecular Evolution, Genetic Speciation, Genome, Pongo abelii/genetics, Pongo pygmaeus/genetics, Algorithms, Animals, Chromosomes/metabolism, Genetic Variation, Population Genetics, Markov Chains, Genetic Models, Statistical Models, Population Density, Genetic Recombination, Sequence Alignment, Nucleic Acid Sequence Homology, Time Factors
ABSTRACT
The fungus Mycosphaerella graminicola has been a pathogen of wheat since host domestication 10,000-12,000 years ago in the Fertile Crescent. The wheat-infecting lineage emerged from closely related Mycosphaerella pathogens infecting wild grasses. We use a comparative genomics approach to assess how the process of host specialization affected the genome structure of M. graminicola since divergence from the closest known progenitor species named M. graminicola S1. The genome of S1 was obtained by Illumina sequencing resulting in a 35 Mb draft genome sequence of 32X. Assembled contigs were aligned to the previously sequenced M. graminicola genome. The alignment covered >90% of the non-repetitive portion of the M. graminicola genome with an average divergence of 7%. The sequenced M. graminicola strain is known to harbor thirteen essential chromosomes plus eight dispensable chromosomes. We found evidence that structural rearrangements significantly affected the dispensable chromosomes while the essential chromosomes were syntenic. At the nucleotide level, the essential and dispensable chromosomes have evolved differently. The average synonymous substitution rate in dispensable chromosomes is considerably lower than in essential chromosomes, whereas the average non-synonymous substitution rate is three times higher. Differences in molecular evolution can be related to different transmission and recombination patterns, as well as to differences in effective population sizes of essential and dispensable chromosomes. In order to identify genes potentially involved in host specialization or speciation, we calculated ratios of synonymous and non-synonymous substitution rates in the >9,500 aligned protein coding genes. The genes are generally under strong purifying selection. We identified 43 candidate genes showing evidence of positive selection, one encoding a potential pathogen effector protein. 
We conclude that divergence of these pathogens was accompanied by structural rearrangements in the small dispensable chromosomes, while footprints of positive selection were present in only a small number of protein coding genes.