Results 1 - 9 of 9
1.
PLoS Genet ; 19(10): e1010999, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37816069

ABSTRACT

Identifying regions of the genome that act as barriers to gene flow between recently diverged taxa has remained challenging given the many evolutionary forces that generate variation in genetic diversity and divergence along the genome, and the stochastic nature of this variation. Progress has been impeded by a conceptual and methodological divide between analyses that infer the demographic history of speciation and genome scans aimed at identifying locally maladaptive alleles i.e. genomic barriers to gene flow. Here we implement genomewide IM blockwise likelihood estimation (gIMble), a composite likelihood approach for the quantification of barriers, that bridges this divide. This analytic framework captures background selection and selection against barriers in a model of isolation with migration (IM) as heterogeneity in effective population size (Ne) and effective migration rate (me), respectively. Variation in both effective demographic parameters is estimated in sliding windows via pre-computed likelihood grids. gIMble includes modules for pre-processing/filtering of genomic data and performing parametric bootstraps using coalescent simulations. To demonstrate the new approach, we analyse data from a well-studied pair of sister species of tropical butterflies with a known history of post-divergence gene flow: Heliconius melpomene and H. cydno. Our analyses uncover both large-effect barrier loci (including well-known wing-pattern genes) and a genome-wide signal of a polygenic barrier architecture.


Subject(s)
Butterflies, Gene Flow, Animals, Likelihood Functions, Genetic Speciation, Butterflies/genetics, Biological Evolution
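
The window scan over pre-computed likelihood grids described in the abstract above can be illustrated with a toy sketch. This is not gIMble's implementation or API: the model is reduced to a single parameter `theta` for a pair of lineages in one panmictic population (no migration, no population split), and the function names, the grid and the block counts are all hypothetical. It only shows the pattern of tabulating block likelihoods once and then summing them per window as a composite likelihood.

```python
import math

def block_log_prob(k, theta):
    """log P(k mutations) in a non-recombining block for a pair of lineages
    in one panmictic population: P(k) = theta**k / (1 + theta)**(k + 1)."""
    return k * math.log(theta) - (k + 1) * math.log1p(theta)

def precompute_grid(thetas, max_k):
    """Tabulate log-probabilities once for every grid point and count 0..max_k."""
    return [[block_log_prob(k, th) for k in range(max_k + 1)] for th in thetas]

def scan_windows(block_counts, thetas, window, step):
    """Return (start_index, best_theta) for each sliding window of blocks."""
    grid = precompute_grid(thetas, max(block_counts))
    results = []
    for start in range(0, len(block_counts) - window + 1, step):
        counts = block_counts[start:start + window]
        # Composite log-likelihood: sum over blocks, treating them as independent.
        scores = [sum(row[k] for k in counts) for row in grid]
        best = max(range(len(thetas)), key=scores.__getitem__)
        results.append((start, thetas[best]))
    return results

thetas = [0.25, 0.5, 1.0, 2.0, 4.0]
# Hypothetical per-block mutation counts: low diversity, then high diversity.
counts = [0, 1, 0, 0, 1, 0, 3, 5, 2, 4, 6, 3]
print(scan_windows(counts, thetas, window=6, step=6))
```

With these made-up counts the first window selects the smallest grid value and the second the largest, giving `[(0, 0.25), (6, 4.0)]`. A real IM analysis would use a multi-dimensional grid over population sizes, migration rate and split time.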
2.
Bioinformatics ; 40(6)2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38796683

ABSTRACT

SUMMARY: Ancestral recombination graphs (ARGs) encode the ensemble of correlated genealogical trees arising from recombination in a compact and efficient structure and are of fundamental importance in population and statistical genetics. Recent breakthroughs have made it possible to simulate and infer ARGs at biobank scale, and there is now intense interest in using ARG-based methods across a broad range of applications, particularly in genome-wide association studies (GWAS). Sophisticated methods exist to simulate ARGs using population genetics models, but there is currently no software to simulate quantitative traits directly from these ARGs. To apply existing quantitative trait simulators users must export genotype data, losing important information about ancestral processes and producing prohibitively large files when applied to the biobank-scale datasets currently of interest in GWAS. We present tstrait, an open-source Python library to simulate quantitative traits on ARGs, and show how this user-friendly software can quickly simulate phenotypes for biobank-scale datasets on a laptop computer. AVAILABILITY AND IMPLEMENTATION: tstrait is available for download on the Python Package Index. Full documentation with examples and workflow templates is available on https://tskit.dev/tstrait/docs/, and the development version is maintained on GitHub (https://github.com/tskit-dev/tstrait).


Subject(s)
Genome-Wide Association Study, Genetic Recombination, Software, Genome-Wide Association Study/methods, Quantitative Trait Loci, Humans, Population Genetics/methods, Phenotype, Genotype, Computer Simulation
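
As a rough illustration of what simulating a quantitative trait involves, the sketch below applies a standard additive model: each individual's genetic value is the sum of effect sizes weighted by allele counts, and Gaussian environmental noise is scaled to reach a target narrow-sense heritability `h2`. It deliberately does not use tstrait or an ARG. The genotype matrix is random, and `simulate_phenotypes` and all parameter names are hypothetical, chosen only for this example.

```python
import random
import statistics

def simulate_phenotypes(genotypes, effect_sizes, h2, rng):
    """genotypes[i][j] = number of causal alleles (0, 1 or 2) carried by
    individual i at causal site j; effect_sizes[j] = additive effect."""
    genetic = [sum(g * b for g, b in zip(row, effect_sizes)) for row in genotypes]
    var_g = statistics.pvariance(genetic)
    # Choose the noise variance so that var_g / (var_g + var_e) == h2.
    var_e = var_g * (1.0 - h2) / h2
    noise = [rng.gauss(0.0, var_e ** 0.5) for _ in genetic]
    return genetic, [g + e for g, e in zip(genetic, noise)]

rng = random.Random(42)
n_individuals, n_causal = 2000, 10
# Hypothetical genotypes drawn uniformly; real data would come from an ARG.
genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_causal)]
             for _ in range(n_individuals)]
effects = [rng.gauss(0.0, 1.0) for _ in range(n_causal)]
genetic, phenotypes = simulate_phenotypes(genotypes, effects, h2=0.5, rng=rng)
realised_h2 = statistics.pvariance(genetic) / statistics.pvariance(phenotypes)
print(round(realised_h2, 2))
```

The realised heritability fluctuates around the target because the noise is sampled. Exporting a full genotype matrix as done here is exactly the step the abstract says becomes prohibitive at biobank scale.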
3.
PLoS Comput Biol ; 18(9): e1010532, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36108047

ABSTRACT

Extracting information on the selective and demographic past of populations that is contained in samples of genome sequences requires a description of the distribution of the underlying genealogies. Using the Laplace transform, this distribution can be generated with a simple recursive procedure, regardless of model complexity. Assuming an infinite-sites mutation model, the probability of observing specific configurations of linked variants within small haplotype blocks can be recovered from the Laplace transform of the joint distribution of branch lengths. However, the repeated differentiation required to compute these probabilities has proven to be a serious computational bottleneck in earlier implementations. Here, I show that the state space diagram can be turned into a computational graph, allowing efficient evaluation of the Laplace transform by means of a graph traversal algorithm. This general algorithm can, for example, be applied to tabulate the likelihoods of mutational configurations in non-recombining blocks. This work provides a crucial speed up for existing composite likelihood approaches that rely on the joint distribution of branch lengths to fit isolation with migration models and estimate the parameters of selective sweeps. The associated software is available as an open-source Python library, agemo.


Subject(s)
Algorithms, Genetic Models, Genome/genetics, Likelihood Functions, Software
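
To see why repeated differentiation of the Laplace transform yields mutation-configuration probabilities, consider the simplest textbook case, which is an assumption of this example and far simpler than the models the abstract above addresses: two lineages in one panmictic population, with coalescence time $T \sim \mathrm{Exp}(1)$ and infinite-sites mutations arriving at rate $\theta/2$ per lineage, so that the number of mutations given $T$ is Poisson with mean $\theta T$.

```latex
\begin{align}
\tilde{f}(\omega) &= \mathbb{E}\left[e^{-\omega T}\right] = \frac{1}{1+\omega}, \\
P(k) &= \mathbb{E}\left[e^{-\theta T}\frac{(\theta T)^k}{k!}\right]
      = \frac{(-\theta)^k}{k!}\,
        \frac{\mathrm{d}^k \tilde{f}}{\mathrm{d}\omega^k}\bigg|_{\omega=\theta}
      = \frac{\theta^k}{(1+\theta)^{k+1}} .
\end{align}
```

Each additional mutation costs one more derivative with respect to $\omega$. With several branch types the derivatives become mixed partials of a multivariate transform, which is consistent with the bottleneck the abstract describes.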
4.
bioRxiv ; 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38559118

ABSTRACT

Summary: Ancestral recombination graphs (ARGs) encode the ensemble of correlated genealogical trees arising from recombination in a compact and efficient structure, and are of fundamental importance in population and statistical genetics. Recent breakthroughs have made it possible to simulate and infer ARGs at biobank scale, and there is now intense interest in using ARG-based methods across a broad range of applications, particularly in genome-wide association studies (GWAS). Sophisticated methods exist to simulate ARGs using population genetics models, but there is currently no software to simulate quantitative traits directly from these ARGs. To apply existing quantitative trait simulators users must export genotype data, losing important information about ancestral processes and producing prohibitively large files when applied to the biobank-scale datasets currently of interest in GWAS. We present tstrait, an open-source Python library to simulate quantitative traits on ARGs, and show how this user-friendly software can quickly simulate phenotypes for biobank-scale datasets on a laptop computer. Availability and Implementation: tstrait is available for download on the Python Package Index. Full documentation with examples and workflow templates is available on https://tskit.dev/tstrait/docs/, and the development version is maintained on GitHub (https://github.com/tskit-dev/tstrait). Contact: daiki.tagami@hertford.ox.ac.uk.

5.
Wellcome Open Res ; 7: 260, 2022.
Article in English | MEDLINE | ID: mdl-36408293

ABSTRACT

We present a genome assembly from an individual female Anthocharis cardamines (the orange-tip; Arthropoda; Insecta; Lepidoptera; Pieridae). The genome sequence is 360 megabases in span. The majority (99.74%) of the assembly is scaffolded into 31 chromosomal pseudomolecules, with the W and Z sex chromosomes assembled. Gene annotation of this assembly on Ensembl has identified 12,477 protein coding genes.

6.
Genetics ; 222(3)2022 Nov 01.
Article in English | MEDLINE | ID: mdl-36173327

ABSTRACT

Understanding the demographic history of populations is a key goal in population genetics, and with improving methods and data, ever more complex models are being proposed and tested. Demographic models of current interest typically consist of a set of discrete populations, their sizes and growth rates, and continuous and pulse migrations between those populations over a number of epochs, which can require dozens of parameters to fully describe. There is currently no standard format to define such models, significantly hampering progress in the field. In particular, the important task of translating the model descriptions in published work into input suitable for population genetic simulators is labor intensive and error prone. We propose the Demes data model and file format, built on widely used technologies, to alleviate these issues. Demes provides a well-defined and unambiguous model of populations and their properties that is straightforward to implement in software, and a text file format that is designed for simplicity and clarity. We provide thoroughly tested implementations of Demes parsers in multiple languages including Python and C, and showcase initial support in several simulators and inference methods. An introduction to the file format and a detailed specification are available at https://popsim-consortium.github.io/demes-spec-docs/.


Subject(s)
Population Genetics, Software, Demography
7.
Genetics ; 220(3)2022 Mar 03.
Article in English | MEDLINE | ID: mdl-34897427

ABSTRACT

Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime's many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.


Subject(s)
Algorithms, Genetic Models, Computer Simulation, Population Genetics, Mutation, Software
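
For readers unfamiliar with ancestry simulation, the following standard-library sketch simulates only the time to the most recent common ancestor under the basic Kingman coalescent (one population of constant size, no recombination) and compares the simulated mean with the known expectation 2(1 - 1/n). It is an illustration of the kind of stochastic simulation the abstract above refers to, not msprime's algorithm or API; `simulate_tmrca` is a hypothetical helper written for this example.

```python
import random

def simulate_tmrca(n_samples, rng):
    """Time to the most recent common ancestor under the standard coalescent,
    in units where each pair of lineages coalesces at rate 1."""
    t = 0.0
    k = n_samples
    while k > 1:
        # With k lineages there are k*(k-1)/2 pairs, each coalescing at rate 1.
        rate = k * (k - 1) / 2
        t += rng.expovariate(rate)
        k -= 1
    return t

rng = random.Random(1)
reps = 20000
mean_tmrca = sum(simulate_tmrca(10, rng) for _ in range(reps)) / reps
expected = 2 * (1 - 1 / 10)  # analytical expectation for n = 10 samples
print(mean_tmrca, expected)
```

Checking a simulator against an analytical expectation in a case where one exists, as done here, is one way to validate it before using it on models that are analytically intractable.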
8.
Genetics ; 219(2)2021 Oct 02.
Article in English | MEDLINE | ID: mdl-34849880

ABSTRACT

Current methods of identifying positively selected regions in the genome are limited in two key ways: the underlying models cannot account for the timing of adaptive events and the comparison between models of selective sweeps and sequence data is generally made via simple summaries of genetic diversity. Here, we develop a tractable method of describing the effect of positive selection on the genealogical histories in the surrounding genome, explicitly modeling both the timing and context of an adaptive event. In addition, our framework allows us to go beyond analyzing polymorphism data via the site frequency spectrum or summaries thereof and instead leverage information contained in patterns of linked variants. Tests on both simulations and a human data example, as well as a comparison to SweepFinder2, show that even with very small sample sizes, our analytic framework has higher power to identify old selective sweeps and to correctly infer both the time and strength of selection. Finally, we derived the marginal distribution of genealogical branch lengths at a locus affected by selection acting at a linked site. This provides a much-needed link between our analytic understanding of the effects of sweeps on sequence variation and recent advances in simulation and heuristic inference procedures that allow researchers to examine the sequence of genealogical histories along the genome.


Subject(s)
Human Genome, Genetic Models, Genetic Selection, Molecular Evolution, Humans, Pedigree
9.
Philos Trans R Soc Lond B Biol Sci ; 375(1806): 20190531, 2020 Aug 31.
Article in English | MEDLINE | ID: mdl-32654652

ABSTRACT

Despite the homogenizing effect of strong gene flow between two populations, adaptation under symmetric divergent selection pressures results in partial reproductive isolation: adaptive substitutions act as local barriers to gene flow, and if divergent selection continues unimpeded, this will result in complete reproductive isolation of the two populations, i.e. speciation. However, a key issue in framing the process of speciation as a tension between local adaptation and the homogenizing force of gene flow is that the mutation process is blind to changes in the environment and therefore tends to limit adaptation. Here we investigate how globally beneficial mutations (GBMs) affect divergent local adaptation and reproductive isolation. When phenotypic divergence is finite, we show that the presence of GBMs limits local adaptation, generating a persistent genetic load at the loci that contribute to the trait under divergent selection and reducing genome-wide divergence. Furthermore, we show that while GBMs cannot prohibit the process of continuous differentiation, they induce a substantial delay in the genome-wide shutdown of gene flow. This article is part of the theme issue 'Towards the completion of speciation: the evolution of reproductive isolation beyond the first barriers'.


Subject(s)
Biological Adaptation/genetics, Gene Flow, Reproductive Isolation, Genetic Selection/physiology, Genetic Models