Search | BVS Regional Portal

Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations.

Lauterbur, M Elise; Cavassim, Maria Izabel A; Gladstein, Ariella L; Gower, Graham; Pope, Nathaniel S; Tsambos, Georgia; Adrion, Jeffrey; Belsare, Saurabh; Biddanda, Arjun; Caudill, Victoria; Cury, Jean; Echevarria, Ignacio; Haller, Benjamin C; Hasan, Ahmed R; Huang, Xin; Iasi, Leonardo Nicola Martin; Noskova, Ekaterina; Obsteter, Jana; Pavinato, Vitor Antonio Correa; Pearson, Alice; Peede, David; Perez, Manolo F; Rodrigues, Murillo F; Smith, Chris C R; Spence, Jeffrey P; Teterina, Anastasia; Tittes, Silas; Unneberg, Per; Vazquez, Juan Manuel; Waples, Ryan K; Wohns, Anthony Wilder; Wong, Yan; Baumdicker, Franz; Cartwright, Reed A; Gorjanc, Gregor; Gutenkunst, Ryan N; Kelleher, Jerome; Kern, Andrew D; Ragsdale, Aaron P; Ralph, Peter L; Schrider, Daniel R; Gronau, Ilan.

Elife ; 12, 2023 Jun 21.

Article in English | MEDLINE | ID: mdl-37342968

ABSTRACT

Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.

Subject(s)

Genome , Software , Computer Simulation , Population Genetics , Genomics

Quantifying the fraction of new mutations that are recessive lethal.

Wade, Emma E; Kyriazis, Christopher C; Cavassim, Maria Izabel A; Lohmueller, Kirk E.

Evolution ; 77(7): 1539-1549, 2023 Jun 29.

Article in English | MEDLINE | ID: mdl-37074880

ABSTRACT

The presence and impact of recessive lethal mutations have been widely documented in diploid outcrossing species. However, precise estimates of the proportion of new mutations that are recessive lethal remain limited. Here, we evaluate the performance of Fit∂a∂i, a commonly used method for inferring the distribution of fitness effects (DFE), in the presence of lethal mutations. Using simulations, we demonstrate that in both additive and recessive cases, inference of the deleterious nonlethal portion of the DFE is minimally affected by a small proportion (<10%) of lethal mutations. Additionally, we demonstrate that while Fit∂a∂i cannot estimate the fraction of recessive lethal mutations, Fit∂a∂i can accurately infer the fraction of additive lethal mutations. Finally, as an alternative approach to estimate the proportion of mutations that are recessive lethal, we employ models of mutation-selection-drift balance using existing genomic parameters and estimates of segregating recessive lethals for humans and Drosophila melanogaster. In both species, the segregating recessive lethal load can be explained by a very small fraction (<1%) of new nonsynonymous mutations being recessive lethal. Our results refute recent assertions of a much higher proportion of mutations being recessive lethal (4%-5%), while highlighting the need for additional information on the joint distribution of selection and dominance coefficients.
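As a rough illustration of the mutation-selection balance idea mentioned in this abstract, the following is a minimal sketch of the textbook deterministic approximations, which ignore drift and are therefore not the mutation-selection-drift models used in the paper. The function name `equilibrium_frequency` and the per-locus mutation rate of 1e-6 are assumptions chosen for illustration only.

```python
import math


def equilibrium_frequency(mu, s=1.0, h=0.0):
    """Approximate deterministic equilibrium frequency of a deleterious allele.

    mu: mutation rate to the deleterious allele (per locus, per generation)
    s:  selection coefficient against homozygotes (1.0 means lethal)
    h:  dominance coefficient (0.0 means fully recessive)

    Textbook approximations for an infinite, randomly mating population.
    """
    if mu < 0 or not (0 < s <= 1) or not (0 <= h <= 1):
        raise ValueError("expected mu >= 0, 0 < s <= 1 and 0 <= h <= 1")
    if h == 0.0:
        # Fully recessive: selection acts only on homozygotes, q ~ sqrt(mu / s).
        return math.sqrt(mu / s)
    # Partially dominant: selection acts mainly on heterozygotes, q ~ mu / (h * s).
    # This is only a good approximation when h * s is much larger than sqrt(mu * s).
    return mu / (h * s)


if __name__ == "__main__":
    mu = 1e-6  # illustrative value, not taken from the paper
    print("fully recessive lethal:", equilibrium_frequency(mu))
    print("lethal with h = 0.02:  ", equilibrium_frequency(mu, h=0.02))
```

Even slight selection against heterozygotes lowers the expected frequency substantially in this approximation, which is one reason the joint distribution of selection and dominance coefficients matters for estimates of this kind.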

Subject(s)

Hominidae , Genetic Selection , Animals , Humans , Drosophila melanogaster/genetics , Mutation , Hominidae/genetics , Lethal Genes , Genetic Models

PRDM9 losses in vertebrates are coupled to those of paralogs ZCWPW1 and ZCWPW2.

Cavassim, Maria Izabel A; Baker, Zachary; Hoge, Carla; Schierup, Mikkel H; Schumer, Molly; Przeworski, Molly.

Proc Natl Acad Sci U S A ; 119(9), 2022 Mar 01.

Article in English | MEDLINE | ID: mdl-35217607

ABSTRACT

In most mammals and likely throughout vertebrates, the gene PRDM9 specifies the locations of meiotic double strand breaks; in mice and humans at least, it also aids in their repair. For both roles, many of the molecular partners remain unknown. Here, we take a phylogenetic approach to identify genes that may be interacting with PRDM9 by leveraging the fact that PRDM9 arose before the origin of vertebrates but was lost many times, either partially or entirely, and with it, its role in recombination. As a first step, we characterize PRDM9 domain composition across 446 vertebrate species, inferring at least 13 independent losses. We then use the interdigitation of PRDM9 orthologs across vertebrates to test whether it coevolved with any of 241 candidate genes coexpressed with PRDM9 in mice or associated with recombination phenotypes in mammals. Accounting for the phylogenetic relationship among a subsample of 189 species, we find two genes whose presence and absence are unexpectedly coincident with that of PRDM9: ZCWPW1, which was recently shown to facilitate double strand break repair, and its paralog ZCWPW2, as well as, more tentatively, TEX15 and FBXO47. ZCWPW2 is expected to be recruited to sites of PRDM9 binding; its tight coevolution with PRDM9 across vertebrates suggests that it is a key interactor within mammals and beyond, with a role either in recruiting the recombination machinery or in double strand break repair.

Subject(s)

Cell Cycle Proteins/genetics , Gene Deletion , Histone-Lysine N-Methyltransferase/genetics , Animals , Molecular Evolution , Humans , Mice , Phylogeny , Genetic Recombination , RNA Sequence Analysis/methods

Recombination Facilitates Adaptive Evolution in Rhizobial Soil Bacteria.

Cavassim, Maria Izabel A; Andersen, Stig U; Bataillon, Thomas; Schierup, Mikkel Heide.

Mol Biol Evol ; 38(12): 5480-5490, 2021 Dec 09.

Article in English | MEDLINE | ID: mdl-34410427

ABSTRACT

Homologous recombination is expected to increase natural selection efficacy by decoupling the fate of beneficial and deleterious mutations and by readily creating new combinations of beneficial alleles. Here, we investigate how the proportion of amino acid substitutions fixed by adaptive evolution (α) depends on the recombination rate in bacteria. We analyze 3,086 core protein-coding sequences from 196 genomes belonging to five closely related species of the genus Rhizobium. These genes are found in all species and do not display any signs of introgression between species. We estimate α using the site frequency spectrum (SFS) and divergence data for all pairs of species. We evaluate the impact of recombination within each species by dividing genes into three equally sized recombination classes based on their average level of intragenic linkage disequilibrium. We find that α varies from 0.07 to 0.39 across species and is positively correlated with the level of recombination. This is both due to a higher estimated rate of adaptive evolution and a lower estimated rate of nonadaptive evolution, suggesting that recombination both increases the fixation probability of advantageous variants and decreases the probability of fixation of deleterious variants. Our results demonstrate that homologous recombination facilitates adaptive evolution measured by α in the core genome of prokaryote species in agreement with studies in eukaryotes.
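To make the quantity α concrete, here is a minimal sketch of the classic McDonald-Kreitman-style point estimate, α = 1 − (Ds·Pn)/(Dn·Ps). It is a simpler stand-in for, and not the same as, the SFS-based estimation described in the abstract. The function name `mk_alpha` and the counts in the usage example are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """McDonald-Kreitman-style estimate of the proportion of adaptive substitutions.

    dn, ds: nonsynonymous and synonymous divergence counts between two species
    pn, ps: nonsynonymous and synonymous polymorphism counts within a species

    Returns alpha = 1 - (ds * pn) / (dn * ps). The estimate can be negative,
    for example when slightly deleterious polymorphisms inflate pn.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive to compute alpha")
    if ds < 0 or pn < 0:
        raise ValueError("counts must be non-negative")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts for a single recombination class of genes.
    print(mk_alpha(dn=40, ds=100, pn=20, ps=100))
```

Applying such an estimator separately to genes binned by linkage disequilibrium would give one α per recombination class, which is the kind of comparison the abstract describes.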

Subject(s)

Genetic Recombination , Rhizobium , Molecular Evolution , Mutation , Rhizobium/genetics , Genetic Selection , Soil

Defining the Rhizobium leguminosarum Species Complex.

Young, J Peter W; Moeskjær, Sara; Afonin, Alexey; Rahi, Praveen; Maluk, Marta; James, Euan K; Cavassim, Maria Izabel A; Rashid, M Harun-Or; Aserse, Aregu Amsalu; Perry, Benjamin J; Wang, En Tao; Velázquez, Encarna; Andronov, Evgeny E; Tampakaki, Anastasia; Flores Félix, José David; Rivas González, Raúl; Youseif, Sameh H; Lepetit, Marc; Boivin, Stéphane; Jorrin, Beatriz; Kenicer, Gregory J; Peix, Álvaro; Hynes, Michael F; Ramírez-Bahena, Martha Helena; Gulati, Arvind; Tian, Chang-Fu.

Genes (Basel) ; 12(1), 2021 Jan 18.

Article in English | MEDLINE | ID: mdl-33477547

ABSTRACT

Bacteria currently included in Rhizobium leguminosarum are too diverse to be considered a single species, so we can refer to this as a species complex (the Rlc). We have found 429 publicly available genome sequences that fall within the Rlc and these show that the Rlc is a distinct entity, well separated from other species in the genus. Its sister taxon is R. anhuiense. We constructed a phylogeny based on concatenated sequences of 120 universal (core) genes, and calculated pairwise average nucleotide identity (ANI) between all genomes. From these analyses, we concluded that the Rlc includes 18 distinct genospecies, plus 7 unique strains that are not placed in these genospecies. Each genospecies is separated by a distinct gap in ANI values, usually at approximately 96% ANI, implying that it is a 'natural' unit. Five of the genospecies include the type strains of named species: R. laguerreae, R. sophorae, R. ruizarguesonis, "R. indicum" and R. leguminosarum itself. The 16S ribosomal RNA sequence is remarkably diverse within the Rlc, but does not distinguish the genospecies. Partial sequences of housekeeping genes, which have frequently been used to characterize isolate collections, can mostly be assigned unambiguously to a genospecies, but alleles within a genospecies do not always form a clade, so single genes are not a reliable guide to the true phylogeny of the strains. We conclude that access to a large number of genome sequences is a powerful tool for characterizing the diversity of bacteria, and that taxonomic conclusions should be based on all available genome sequences, not just those of type strains.
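The idea of delimiting genospecies by a gap in pairwise ANI can be sketched as a simple threshold grouping. The example below uses single-linkage clustering at 96% ANI, which is an assumption made for illustration; the abstract does not state which clustering procedure the authors used. The function name `cluster_by_ani`, the genome labels, and the ANI values are all hypothetical.

```python
from itertools import combinations


def cluster_by_ani(names, ani, threshold=96.0):
    """Group genomes by single linkage: join any pair with ANI >= threshold.

    names: list of genome identifiers
    ani:   dict mapping (name_a, name_b) pairs to ANI in percent;
           each pair needs to appear in one orientation only
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Toy ANI values, not taken from the paper.
    toy_names = ["g1", "g2", "g3", "g4"]
    toy_ani = {
        ("g1", "g2"): 98.5, ("g3", "g4"): 97.1,
        ("g1", "g3"): 93.0, ("g1", "g4"): 93.2,
        ("g2", "g3"): 92.8, ("g2", "g4"): 92.9,
    }
    print(cluster_by_ani(toy_names, toy_ani))
```

Single linkage only yields clean groups when a clear gap in ANI values separates within-group from between-group comparisons, which is the situation the abstract reports for the Rlc.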

Subject(s)

Bacterial DNA/genetics , Bacterial Genome , Phylogeny , Rhizobium leguminosarum/classification , Rhizobium leguminosarum/genetics , DNA Sequence Analysis

Symbiosis genes show a unique pattern of introgression and selection within a Rhizobium leguminosarum species complex.

Cavassim, Maria Izabel A; Moeskjær, Sara; Moslemi, Camous; Fields, Bryden; Bachmann, Asger; Vilhjálmsson, Bjarni J; Schierup, Mikkel Heide; Young, J Peter W; Andersen, Stig U.

Microb Genom ; 6(4), 2020 Apr.

Article in English | MEDLINE | ID: mdl-32176601

ABSTRACT

Rhizobia supply legumes with fixed nitrogen using a set of symbiosis genes. These can cross rhizobium species boundaries, but it is unclear how many other genes show similar mobility. Here, we investigate inter-species introgression using de novo assembly of 196 Rhizobium leguminosarum sv. trifolii genomes. The 196 strains constituted a five-species complex, and we calculated introgression scores based on gene-tree traversal to identify 171 genes that frequently cross species boundaries. Rather than relying on the gene order of a single reference strain, we clustered the introgressing genes into four blocks based on population structure-corrected linkage disequilibrium patterns. The two largest blocks comprised 125 genes and included the symbiosis genes, a smaller block contained 43 mainly chromosomal genes, and the last block consisted of three genes with variable genomic location. All introgression events were likely mediated by conjugation, but only the genes in the symbiosis linkage blocks displayed overrepresentation of distinct, high-frequency haplotypes. The three genes in the last block were core genes essential for symbiosis that had, in some cases, been mobilized on symbiosis plasmids. Inter-species introgression is thus not limited to symbiosis genes and plasmids, but other cases are infrequent and show distinct selection signatures.

Subject(s)

Bacterial Proteins/genetics , Plasmids/genetics , Rhizobium leguminosarum/genetics , Trifolium/microbiology , Whole Genome Sequencing/methods , Genetic Introgression , Haplotypes , High-Throughput Nucleotide Sequencing , Linkage Disequilibrium , Phylogeny , Plant Roots/microbiology , Rhizobium leguminosarum/classification , Genetic Selection , Symbiosis
