Results 1 - 20 of 36
1.
Trends Genet ; 39(6): 491-504, 2023 06.
Article in English | MEDLINE | ID: mdl-36890036

ABSTRACT

Recent studies of cosmopolitan Drosophila populations have found hundreds to thousands of genetic loci with seasonally fluctuating allele frequencies, bringing temporally fluctuating selection to the forefront of the historical debate surrounding the maintenance of genetic variation in natural populations. Numerous mechanisms have been explored in this longstanding area of research, but these exciting empirical findings have prompted several recent theoretical and experimental studies that seek to better understand the drivers, dynamics, and genome-wide influence of fluctuating selection. In this review, we evaluate the latest evidence for multilocus fluctuating selection in Drosophila and other taxa, highlighting the role of potential genetic and ecological mechanisms in maintaining these loci and their impacts on neutral genetic variation.


Subjects
Genetic Variation , Animals , Drosophila melanogaster/genetics , Humans , Seasons , Physiological Adaptation , Genetic Selection , Genome
2.
Proc Natl Acad Sci U S A ; 120(22): e2213061120, 2023 05 30.
Article in English | MEDLINE | ID: mdl-37220274

ABSTRACT

The evolutionarily recent dispersal of anatomically modern humans (AMH) out of Africa (OoA) and across Eurasia provides a unique opportunity to examine the impacts of genetic selection as humans adapted to multiple new environments. Analysis of ancient Eurasian genomic datasets (~1,000 to 45,000 y old) reveals signatures of strong selection, including at least 57 hard sweeps after the initial AMH movement OoA, which have been obscured in modern populations by extensive admixture during the Holocene. The spatiotemporal patterns of these hard sweeps provide a means to reconstruct early AMH population dispersals OoA. We identify a previously unsuspected extended period of genetic adaptation lasting ~30,000 y, potentially in the Arabian Peninsula area, prior to a major Neandertal genetic introgression and subsequent rapid dispersal across Eurasia as far as Australia. Consistent functional targets of selection initiated during this period, which we term the Arabian Standstill, include loci involved in the regulation of fat storage, neural development, skin physiology, and cilia function. Similar adaptive signatures are also evident in introgressed archaic hominin loci and modern Arctic human groups, and we suggest that this signal represents selection for cold adaptation. Surprisingly, many of the candidate selected loci across these groups appear to directly interact and coordinately regulate biological processes, with a number associated with major modern diseases including the ciliopathies, metabolic syndrome, and neurodegenerative disorders. This expands the potential for ancestral human adaptation to directly impact modern diseases, providing a platform for evolutionary medicine.


Subjects
Neanderthals , Humans , Animals , Africa , Acclimatization , Arabia , Genetic Selection
3.
Proc Natl Acad Sci U S A ; 120(16): e2206808120, 2023 04 18.
Article in English | MEDLINE | ID: mdl-37043536

ABSTRACT

Repeated herbicide applications in agricultural fields exert strong selection on weeds such as blackgrass (Alopecurus myosuroides), which is a major threat for temperate climate cereal crops. This inadvertent selection pressure provides an opportunity for investigating the underlying genetic mechanisms and evolutionary processes of rapid adaptation, which can occur both through mutations in the direct targets of herbicides and through changes in other, often metabolic, pathways, known as non-target-site resistance. How much target-site resistance (TSR) relies on de novo mutations vs. standing variation is important for developing strategies to manage herbicide resistance. We first generated a chromosome-level reference genome for A. myosuroides for population genomic studies of herbicide resistance and genome-wide diversity across Europe in this species. Next, through empirical data in the form of highly accurate long-read amplicons of alleles encoding acetyl-CoA carboxylase (ACCase) and acetolactate synthase (ALS) variants, we showed that most populations with resistance due to TSR mutations (23 out of 27 and six out of nine populations for ACCase and ALS, respectively) contained at least two TSR haplotypes, indicating that soft sweeps are the norm. Finally, through forward-in-time simulations, we inferred that TSR is likely to mainly result from standing genetic variation, with only a minor role for de novo mutations.


Subjects
Herbicide Resistance , Herbicides , Herbicide Resistance/genetics , Poaceae/genetics , Poaceae/metabolism , Mutation , Haplotypes , Europe , Herbicides/pharmacology , Acetyl-CoA Carboxylase/genetics , Acetyl-CoA Carboxylase/metabolism
4.
Genome Res ; 31(1): 110-120, 2021 01.
Article in English | MEDLINE | ID: mdl-33208456

ABSTRACT

Quantifying and comparing the amount of adaptive evolution among different species is key to understanding how evolution works. Previous studies have shown differences in adaptive evolution across species; however, their specific causes remain elusive. Here, we use improved modeling of weakly deleterious mutations and the demographic history of the outgroup species and ancestral population and estimate that at least 20% of nonsynonymous substitutions between humans and an outgroup species were fixed by positive selection. This estimate is much higher than previous estimates, which did not correct for the sizes of the outgroup species and ancestral population. Next, we jointly estimate the proportion and selection coefficient (p+ and s+, respectively) of newly arising beneficial nonsynonymous mutations in humans, mice, and Drosophila melanogaster by examining patterns of polymorphism and divergence. We develop a novel composite likelihood framework to test whether these parameters differ across species. Overall, we reject a model with the same p+ and s+ of beneficial mutations across species and estimate that humans have a higher p+s+ compared with that of D. melanogaster and mice. We show that this result cannot be caused by biased gene conversion or hypermutable CpG sites. We discuss possible biological explanations that could generate the observed differences in the amount of adaptive evolution across species.
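
For orientation, the proportion of nonsynonymous substitutions fixed by positive selection is often introduced through the classical McDonald-Kreitman estimator. The sketch below shows only that textbook baseline. It is not the method of the abstract above, which additionally models weakly deleterious mutations and demographic history. The function name and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Classical McDonald-Kreitman estimate of the adaptive fraction.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence).
    pn, ps: nonsynonymous and synonymous polymorphisms.
    Returns alpha = 1 - (ds * pn) / (dn * ps).
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, chosen only to illustrate the arithmetic.
alpha = mk_alpha(dn=120, ds=300, pn=40, ps=125)
print(f"alpha = {alpha:.2f}")
```

This baseline is known to be biased downward when weakly deleterious polymorphisms segregate, which is one motivation for the more detailed modeling the abstract describes.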


Subjects
Drosophila melanogaster , Mutation , Amino Acids , Animals , Drosophila melanogaster/genetics , Molecular Evolution , Humans , Mice , Genetic Polymorphism
5.
Proc Natl Acad Sci U S A ; 118(10)2021 03 09.
Article in English | MEDLINE | ID: mdl-33608481

ABSTRACT

The current rate of species extinction is rapidly approaching unprecedented highs, and life on Earth presently faces a sixth mass extinction event driven by anthropogenic activity, climate change, and ecological collapse. The field of conservation genetics aims at preserving species by using their levels of genetic diversity, usually measured as neutral genome-wide diversity, as a barometer for evaluating population health and extinction risk. A fundamental assumption is that higher levels of genetic diversity lead to an increase in fitness and long-term survival of a species. Here, we argue against the perceived importance of neutral genetic diversity for the conservation of wild populations and species. We demonstrate that no simple general relationship exists between neutral genetic diversity and the risk of species extinction. Instead, a better understanding of the properties of functional genetic diversity, demographic history, and ecological relationships is necessary for developing and implementing effective conservation genetic strategies.


Subjects
Genetic Variation , Genome , Inbreeding , Genetic Models , Animals , Population Genetics
6.
PLoS Genet ; 16(5): e1008827, 2020 05.
Article in English | MEDLINE | ID: mdl-32469868

ABSTRACT

Comparative genomic approaches have been used to identify sites where mutations are under purifying selection and of functional consequence by searching for sequences that are conserved across distantly related species. However, the performance of these approaches has not been rigorously evaluated under population genetic models. Further, short-lived functional elements may not leave a footprint of sequence conservation across many species. We use simulations to study how one measure of conservation, the Genomic Evolutionary Rate Profiling (GERP) score, relates to the strength of selection (N_e s). We show that the GERP score is related to the strength of purifying selection. However, changes in selection coefficients or functional elements over time (i.e. functional turnover) can strongly affect the GERP distribution, leading to unexpected relationships between GERP and N_e s. Further, we show that for functional elements that have a high turnover rate, adding more species to the analysis does not necessarily increase statistical power. Finally, we use the distribution of GERP scores across the human genome to compare models with and without turnover of sites where mutations are under purifying selection. We show that mutations in 4.51% of the noncoding human genome are under purifying selection and that most of this sequence has likely experienced changes in selection coefficients throughout mammalian evolution. Our work reveals limitations to using comparative genomic approaches to identify deleterious mutations. Commonly used GERP score thresholds miss over half of the noncoding sites in the human genome where mutations are under purifying selection.


Subjects
Computational Biology/methods , Mammals/genetics , Mutation , Animals , Conserved Sequence , Molecular Evolution , Population Genetics , Human Genome , Humans , Genetic Models , Genetic Selection , Sequence Alignment
7.
Am J Hum Genet ; 103(5): 707-726, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30401458

ABSTRACT

Most population isolates examined to date were founded from a single ancestral population. Consequently, there is limited knowledge about the demographic history of admixed population isolates. Here we investigate genomic diversity of recently admixed population isolates from Costa Rica and Colombia and compare their diversity to a benchmark population isolate, the Finnish. These Latin American isolates originated during the 16th century from admixture between a few hundred European males and Amerindian females, with a limited contribution from African founders. We examine whole-genome sequence data from 449 individuals, ascertained as families to build multigenerational pedigrees, with a mean sequencing depth of coverage of approximately 36×. We find that Latin American isolates have increased genetic diversity relative to the Finnish. However, there is an increase in the amount of identity by descent (IBD) segments in the Latin American isolates relative to the Finnish. The increase in IBD segments is likely a consequence of a very recent and severe population bottleneck during the founding of the admixed population isolates. Furthermore, the proportion of the genome that falls within a long run of homozygosity (ROH) in Costa Rican and Colombian individuals is significantly greater than that in the Finnish, suggesting more recent consanguinity in the Latin American isolates relative to that seen in the Finnish. Lastly, we find that recent consanguinity increased the number of deleterious variants found in the homozygous state, which is relevant if deleterious variants are recessive. Our study suggests that there is no single genetic signature of a population isolate.
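
The quantity "proportion of the genome that falls within a long run of homozygosity" can be made concrete with a small sketch. It assumes that ROH segments have already been called by some upstream tool and are non-overlapping. The 1 Mb length threshold, the function name, and the example coordinates are illustrative assumptions, not values taken from the study.

```python
def froh(segments, genome_length, min_length=1_000_000):
    """Fraction of the genome covered by long runs of homozygosity.

    segments: iterable of (start, end) half-open intervals in base pairs,
              assumed non-overlapping (an assumption of this sketch).
    genome_length: total callable genome length in base pairs.
    min_length: shortest segment counted as a "long" ROH (assumed cutoff).
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = sum(end - start for start, end in segments
                  if end - start >= min_length)
    return covered / genome_length


# Hypothetical segments for one individual on a 100 Mb genome.
example = [(0, 2_000_000), (5_000_000, 5_500_000), (10_000_000, 13_000_000)]
print(froh(example, genome_length=100_000_000))
```

Only the 2 Mb and 3 Mb segments pass the threshold here; the 0.5 Mb segment is ignored, which is how a length cutoff separates recent consanguinity from older background homozygosity.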


Subjects
Human Genome/genetics , Colombia , Consanguinity , Costa Rica , Female , Population Genetics/methods , Genomics/methods , Homozygote , Humans , Male , Pedigree , White People/genetics , Whole Genome Sequencing/methods
8.
PLoS Genet ; 14(10): e1007741, 2018 10.
Article in English | MEDLINE | ID: mdl-30346959

ABSTRACT

While it is appreciated that population size changes can impact patterns of deleterious variation in natural populations, less attention has been paid to how gene flow affects and is affected by the dynamics of deleterious variation. Here we use population genetic simulations to examine how gene flow impacts deleterious variation under a variety of demographic scenarios, mating systems, dominance coefficients, and recombination rates. Our results show that admixture between populations can temporarily reduce the genetic load of smaller populations and cause increases in the frequency of introgressed ancestry, especially if deleterious mutations are recessive. Additionally, when fitness effects of new mutations are recessive, between-population differences in the sites at which deleterious variants exist create heterosis in hybrid individuals. Together, these factors lead to an increase in introgressed ancestry, particularly when recombination rates are low. Under certain scenarios, introgressed ancestry can increase from an initial frequency of 5% to 30-75% and fix at many loci, even in the absence of beneficial mutations. Further, deleterious variation and admixture can generate correlations between the frequency of introgressed ancestry and recombination rate or exon density, even in the absence of other types of selection. The direction of these correlations is determined by the specific demography and whether mutations are additive or recessive. Therefore, it is essential that null models of admixture include both demography and deleterious variation before invoking other mechanisms to explain unusual patterns of genetic variation.


Subjects
Gene Flow/genetics , Genetic Selection/genetics , Alleles , Computer Simulation , Demography , Molecular Evolution , Gene Frequency/genetics , Genetic Load , Genetic Variation/genetics , Population Genetics/methods , Genomics , Humans , Hybrid Vigor , Genetic Hybridization/genetics , Genetic Models , Mutation , Population Density
9.
Proc Natl Acad Sci U S A ; 114(17): 4465-4470, 2017 04 25.
Article in English | MEDLINE | ID: mdl-28400513

ABSTRACT

The distribution of fitness effects (DFE) of new mutations plays a fundamental role in evolutionary genetics. However, the extent to which the DFE differs across species has yet to be systematically investigated. Furthermore, the biological mechanisms determining the DFE in natural populations remain unclear. Here, we show that theoretical models emphasizing different biological factors in determining the DFE, such as protein stability, back-mutations, species complexity, and mutational robustness, make distinct predictions about how the DFE will differ between species. Analyzing amino acid-changing variants from natural populations in a comparative population genomic framework, we find that humans have a higher proportion of strongly deleterious mutations than Drosophila melanogaster. Furthermore, when comparing the DFE across yeast, Drosophila, mice, and humans, the average selection coefficient becomes more deleterious with increasing species complexity. Last, pleiotropic genes have a DFE that is less variable than that of nonpleiotropic genes. Comparing four categories of theoretical models, only Fisher's geometrical model (FGM) is consistent with our findings. FGM assumes that multiple phenotypes are under stabilizing selection, with the number of phenotypes defining the complexity of the organism. Our results suggest that long-term population size and cost of complexity drive the evolution of the DFE, with many implications for evolutionary and medical genomics.


Subjects
Drosophila melanogaster/genetics , Genetic Models , Yeasts/genetics , Physiological Adaptation/genetics , Animals , Molecular Evolution , Genetic Fitness , Humans , Mice , Mutation , Genetic Selection , Species Specificity
10.
PLoS Genet ; 12(8): e1006199, 2016 08.
Article in English | MEDLINE | ID: mdl-27508305

ABSTRACT

A major goal in evolutionary biology is to understand how natural selection has shaped patterns of genetic variation across genomes. Studies in a variety of species have shown that neutral genetic diversity (intra-species differences) has been reduced at sites linked to those under direct selection. However, the effect of linked selection on neutral sequence divergence (inter-species differences) remains ambiguous. While empirical studies have reported correlations between divergence and recombination, which is interpreted as evidence for natural selection reducing linked neutral divergence, theory argues otherwise, especially for species that have diverged long ago. Here we address these outstanding issues by examining whether natural selection can affect divergence between both closely and distantly related species. We show that neutral divergence between closely related species (e.g. human-primate) is negatively correlated with functional content and positively correlated with human recombination rate. We also find that neutral divergence between distantly related species (e.g. human-rodent) is negatively correlated with functional content and positively correlated with estimates of background selection from primates. These patterns persist after accounting for the confounding factors of hypermutable CpG sites, GC content, and biased gene conversion. Coalescent models indicate that even when the contribution of ancestral polymorphism to divergence is small, background selection in the ancestral population can still explain a large proportion of the variance in divergence across the genome, generating the observed correlations. Our findings reveal that, contrary to previous intuition, natural selection can indirectly affect linked neutral divergence between both closely and distantly related species. Though we cannot formally exclude the possibility that the direct effects of purifying selection drive some of these patterns, such a scenario would be possible only if more of the genome is under purifying selection than currently believed. Our work has implications for understanding the evolution of genomes and interpreting patterns of genetic variation.


Subjects
Molecular Evolution , Genetic Drift , Genetic Recombination , Genetic Selection/genetics , Animals , CpG Islands/genetics , Gene Conversion/genetics , Genetic Variation , Genome , Humans , Mutation/genetics , Genetic Polymorphism , Primates/genetics , Rodentia/genetics , Species Specificity
11.
Bioinformatics ; 32(12): 1895-7, 2016 06 15.
Article in English | MEDLINE | ID: mdl-27153702

ABSTRACT

SweepFinder is a widely used program that implements a powerful likelihood-based method for detecting recent positive selection, or selective sweeps. Here, we present SweepFinder2, an extension of SweepFinder with increased sensitivity and robustness to the confounding effects of mutation rate variation and background selection. Moreover, SweepFinder2 has increased flexibility that enables the user to specify test sites, set the distance between test sites and utilize a recombination map. AVAILABILITY AND IMPLEMENTATION: SweepFinder2 is a freely available (www.personal.psu.edu/mxd60/sf2.html) software package that is written in C and can be run from a Unix command line. CONTACT: mxd60@psu.edu.


Subjects
Mutation Rate , Genetic Selection , Software , Molecular Evolution , Humans , Likelihood Functions
12.
Mol Ecol ; 25(1): 142-56, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26290347

ABSTRACT

A composite likelihood ratio test implemented in the program SweepFinder is a commonly used method for scanning a genome for recent selective sweeps. SweepFinder uses information on the spatial pattern (along the chromosome) of the site frequency spectrum around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson-Kreitman-Aguadé test, we suggest adding fixed differences relative to an out-group to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection, modelled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method also provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.
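
The site frequency spectrum that this kind of scan relies on is simple to tabulate. The following minimal sketch counts, for a sample of n chromosomes, how many sites carry the derived allele exactly i times. It is not SweepFinder's implementation; the function name, the optional flag for fixed differences, and the example counts are assumptions made for illustration.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n, include_fixed=False):
    """Unfolded site frequency spectrum for a sample of n chromosomes.

    derived_counts: derived-allele count at each site (0..n).
    Returns a list whose k-th entry (0-based) is the number of sites with
    derived count k + 1. By default only segregating classes 1..n-1 are
    reported; include_fixed=True appends class n (fixed differences).
    """
    for count in derived_counts:
        if not 0 <= count <= n:
            raise ValueError(f"count {count} outside 0..{n}")
    tally = Counter(derived_counts)
    upper = n if include_fixed else n - 1
    return [tally.get(i, 0) for i in range(1, upper + 1)]


# Hypothetical derived-allele counts at seven sites, n = 4 chromosomes.
counts = [1, 1, 2, 3, 1, 0, 4]
print(site_frequency_spectrum(counts, n=4))
print(site_frequency_spectrum(counts, n=4, include_fixed=True))
```

The `include_fixed` option loosely mirrors the idea in the abstract of adding fixed differences relative to an out-group, while the default corresponds to using only sites that are variable within the species.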


Subjects
Population Genetics , Genetic Models , Mutation Rate , Genetic Selection , Gene Frequency , Humans
13.
Mol Biol Evol ; 31(11): 3026-39, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25158800

ABSTRACT

Detecting positive selection in species with heterogeneous habitats and complex demography is notoriously difficult and prone to statistical biases. The model plant Arabidopsis thaliana exemplifies this problem: In spite of the large amounts of data, little evidence for classic selective sweeps has been found. Moreover, many aspects of the demography are unclear, which makes it hard to judge whether the few signals are indeed signs of selection, or false positives caused by demographic events. Here, we focus on Swedish A. thaliana and we find that the demography can be approximated as a two-population model. Careful analysis of the data shows that such a two-island model is characterized by a very old split time that significantly predates the last glacial maximum followed by secondary contact with strong migration. We evaluate selection based on this demography and find that this secondary contact model strongly affects the power to detect sweeps. Moreover, it affects the power differently for northern Sweden (more false positives) as compared with southern Sweden (more false negatives). However, even when the demographic history is accounted for, sweep signals in northern Sweden are stronger than in southern Sweden, with little or no positional overlap. Further simulations including the complex demography and selection confirm that this is not compatible with global selection acting on both populations, and thus can be taken as evidence for local selection within subpopulations of Swedish A. thaliana. This study demonstrates the necessity of combining demographic analyses and sweep scans for the detection of selection, particularly when selection acts predominantly locally.


Subjects
Arabidopsis/genetics , Genetic Models , Plant Dispersal/genetics , Genetic Selection , Arabidopsis/classification , Gene Flow , Genetic Variation , Phylogeography , Sweden
14.
Genetics ; 228(1)2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39013011

ABSTRACT

Our knowledge of human evolutionary history has been greatly advanced by paleogenomics. Since the 2020s, the study of ancient DNA has increasingly focused on reconstructing the recent past. However, the accuracy of paleogenomic methods in resolving questions of historical and archaeological importance amidst the increased demographic complexity and decreased genetic differentiation remains an open question. We evaluated the performance and behavior of two commonly used methods, qpAdm and the f3-statistic, on admixture inference under a diversity of demographic models and data conditions. We used two complementary simulation approaches: first, we explored a wide demographic parameter space under four simple demographic models of varying complexities and configurations using branch-length data from two chromosomes; second, we analyzed a model of Eurasian history composed of 59 populations using whole-genome data modified with ancient DNA conditions such as SNP ascertainment, data missingness, and pseudohaploidization. We observe that population differentiation is the primary factor driving qpAdm performance. Notably, while complex gene flow histories influence which models are classified as plausible, they do not reduce overall performance. Under conditions reflective of the historical period, qpAdm most frequently identifies the true model as plausible among a small candidate set of closely related populations. To increase the utility for resolving fine-scaled hypotheses, we provide a heuristic for further distinguishing between candidate models that incorporates qpAdm model P-values and f3-statistics. Finally, we demonstrate a significant performance increase for qpAdm using whole-genome branch-length f2-statistics, highlighting the potential for improved demographic inference that could be achieved with future advancements in f-statistic estimations.
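
As background for the f3-statistic mentioned above, the admixture form f3(target; source1, source2) is the average over SNPs of (c - a)(c - b), where c, a and b are allele frequencies in the target and the two candidate sources. A clearly negative value is evidence that the target is admixed between populations related to the two sources. The sketch below is a naive estimator on known frequencies. It deliberately omits the finite-sample bias correction and block-jackknife standard errors that production tools apply, and its names and data are hypothetical.

```python
def f3(target, source1, source2):
    """Naive f3(target; source1, source2) from per-SNP allele frequencies.

    Each argument is a sequence of allele frequencies (0..1) at the same
    SNPs. Returns the mean of (c - a) * (c - b). No bias correction or
    standard error is computed (a simplification of this sketch).
    """
    if not (len(target) == len(source1) == len(source2)) or not target:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((c - a) * (c - b)
                for c, a, b in zip(target, source1, source2))
    return total / len(target)


# Hypothetical frequencies: the target sits between two diverged sources.
src_a = [0.1, 0.9]
src_b = [0.9, 0.1]
admixed = [0.5, 0.5]
print(f3(admixed, src_a, src_b))   # negative: consistent with admixture
print(f3(src_a, admixed, src_b))   # positive: no admixture signal
```

In practice a Z-score from a block jackknife, not the sign alone, is used to decide whether a negative value is significant.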


Subjects
Ancient DNA , Genetic Models , Ancient DNA/analysis , Humans , Population Genetics/methods , Gene Flow , Single Nucleotide Polymorphism , Human Genome , Molecular Evolution
15.
Mol Ecol Resour ; 24(3): e13930, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38247258

ABSTRACT

Population genetic simulation has emerged as a common tool for investigating increasingly complex evolutionary and demographic models. Software capable of handling high-level model complexity has recently been developed, and the advancement of tree sequence recording now allows simulations to merge the efficiency and genealogical insight of coalescent simulations with the flexibility of forward simulations. However, frameworks utilizing these features have not yet been compared and benchmarked. Here, we evaluate various simulation workflows using the coalescent simulator msprime and the forward simulator SLiM, to assess resource efficiency and determine an optimal simulation framework. Three aspects were evaluated: (1) the burn-in, to establish an equilibrium level of neutral diversity in the population; (2) the forward simulation, in which temporally fluctuating selection is acting; and (3) the final computation of summary statistics. We provide typical memory and computation time requirements for each step. We find that the fastest framework, a combination of coalescent and forward simulation with tree sequence recording, increases simulation speed by over twenty times compared to classical forward simulations without tree sequence recording, although it does require six times more memory. Overall, using efficient simulation workflows can lead to a substantial improvement when modelling complex evolutionary scenarios, although the optimal framework ultimately depends on the available computational resources.


Subjects
Benchmarking , Population Genetics , Computer Simulation , Software , Genetic Selection , Genetic Models
16.
Genome Biol Evol ; 16(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38242694

ABSTRACT

The ancestral recombination graph (ARG) is a structure that represents the history of coalescent and recombination events connecting a set of sequences (Hudson RR. Gene genealogies and the coalescent process. In: Futuyma D, Antonovics J, editors. Oxford Surveys in Evolutionary Biology; 1991. p. 1-44.). The full ARG can be represented as a set of genealogical trees at every locus in the genome, annotated with recombination events that change the topology of the trees between adjacent loci and the mutations that occurred along the branches of those trees (Griffiths RC, Marjoram P. An ancestral recombination graph. In: Donnelly P, Tavare S, editors. Progress in population genetics and human evolution. Springer; 1997. p. 257-270.). Valuable insights can be gained into past evolutionary processes, such as demographic events or the influence of natural selection, by studying the ARG. It is regarded as the "holy grail" of population genetics (Hubisz M, Siepel A. Inference of ancestral recombination graphs using ARGweaver. In: Dutheil JY, editors. Statistical population genomics. New York, NY: Springer US; 2020. p. 231-266.) since it encodes the processes that generate all patterns of allelic and haplotypic variation from which all commonly used summary statistics in population genetic research (e.g. heterozygosity and linkage disequilibrium) can be derived. Many previous evolutionary inferences relied on summary statistics extracted from the genotype matrix. Evolutionary inferences using the ARG represent a significant advancement as the ARG is a representation of the evolutionary history of a sample that shows the past history of recombination, coalescence, and mutation events across a particular sequence. This representation in theory contains as much information, if not more, than the combination of all independent summary statistics that could be derived from the genotype matrix. Consistent with this idea, some of the first ARG-based analyses have proven to be more powerful than summary statistic-based analyses (Speidel L, Forest M, Shi S, Myers SR. A method for genome-wide genealogy estimation for thousands of samples. Nat Genet. 2019:51(9):1321-1329.; Stern AJ, Wilton PR, Nielsen R. An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. PLoS Genet. 2019:15(9):e1008384.; Hubisz MJ, Williams AL, Siepel A. Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph. PLoS Genet. 2020:16(8):e1008895.; Fan C, Mancuso N, Chiang CWK. A genealogical estimate of genetic relationships. Am J Hum Genet. 2022:109(5):812-824.; Fan C, Cahoon JL, Dinh BL, Ortega-Del Vecchyo D, Huber C, Edge MD, Mancuso N, Chiang CWK. A likelihood-based framework for demographic inference from genealogical trees. bioRxiv. 2023.10.10.561787. 2023.; Hejase HA, Mo Z, Campagna L, Siepel A. A deep-learning approach for inference of selective sweeps from the ancestral recombination graph. Mol Biol Evol. 2022:39(1):msab332.; Link V, Schraiber JG, Fan C, Dinh B, Mancuso N, Chiang CWK, Edge MD. Tree-based QTL mapping with expected local genetic relatedness matrices. bioRxiv. 2023.04.07.536093. 2023.; Zhang BC, Biddanda A, Gunnarsson ÁF, Cooper F, Palamara PF. Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits. Nat Genet. 2023:55(5):768-776.). As such, there has been significant interest in the field to investigate 2 main problems related to the ARG: (i) How can we estimate the ARG based on genomic data, and (ii) how can we extract information about past evolutionary processes from the ARG? In this perspective, we highlight 3 topics that pertain to these main issues: the development of computational innovations that enable the estimation of the ARG; remaining challenges in estimating the ARG; and methodological advances for deducing evolutionary forces and mechanisms using the ARG. This perspective serves to introduce the readers to the types of questions that can be explored using the ARG and to highlight some of the most pressing issues that must be addressed in order to make ARG-based inference an indispensable tool for evolutionary research.


Subjects
Algorithms , Genetic Recombination , Humans , Likelihood Functions , Chromosome Mapping , Mutation , Genetic Models
17.
bioRxiv ; 2024 Oct 12.
Article in English | MEDLINE | ID: mdl-39416208

ABSTRACT

Modern sequencing instruments bring unprecedented opportunity to study within-host viral evolution in conjunction with viral transmissions between hosts. However, no computational simulators are available to assist the characterization of within-host dynamics. This limits our ability to interpret epidemiological predictions incorporating within-host evolution and to validate computational inference tools. To fill this need we developed Apollo, a GPU-accelerated, out-of-core tool for within-host simulation of viral evolution and infection dynamics across population, tissue, and cellular levels. Apollo is scalable to hundreds of millions of viral genomes and can handle complex demographic and population genetic models. Apollo can replicate real within-host viral evolution, accurately recapturing observed viral sequences from an HIV cohort derived from initial population-genetic configurations. For practical applications, using Apollo-simulated viral genomes and transmission networks, we validated and uncovered the limitations of a widely used viral transmission inference tool.

18.
bioRxiv ; 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-38014190

ABSTRACT

Paleogenomics has expanded our knowledge of human evolutionary history. Since the 2020s, the study of ancient DNA has increased its focus on reconstructing the recent past. However, it remains an open question how accurately paleogenomic methods can answer questions of historical and archaeological importance, given the increased demographic complexity and decreased genetic differentiation within the historical period. We used two simulation approaches to evaluate the limitations and behavior of two commonly used methods, qpAdm and the f3-statistic, for admixture inference. The first is based on branch-length data simulated from four simple demographic models of varying complexities and configurations. The second is an analysis of a Eurasian history composed of 59 populations, using whole-genome data modified with ancient DNA conditions such as SNP ascertainment, data missingness, and pseudo-haploidization. We show that under conditions resembling historical populations, qpAdm can identify a small candidate set of true sources and populations closely related to them. However, in typical ancient DNA conditions, qpAdm is unable to further distinguish between them, limiting its utility for resolving fine-scaled hypotheses. Notably, we find that complex gene-flow histories generally lead to improvements in the performance of qpAdm, and we observe no bias in the estimation of admixture weights. We offer a heuristic for admixture inference that incorporates admixture weight estimates and P-values of qpAdm models, together with f3-statistics, to enhance the power to distinguish between multiple plausible candidates. Finally, we highlight the future potential of qpAdm through whole-genome branch-length f2-statistics, demonstrating the improved demographic inference that could be achieved with advancements in f-statistic estimations.
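
As a rough illustration of the f3-statistic mentioned in this abstract, the sketch below computes the naive admixture form f3(C; A, B) as the per-SNP average of (c - a)(c - b), where a, b, and c are allele frequencies in the two candidate sources and the target. It is a minimal sketch under stated assumptions, not the estimator used in the study: the function name `f3_naive` and the toy frequencies are hypothetical, and the finite-sample bias correction and block-jackknife standard errors used in practice are omitted.

```python
def f3_naive(target, source_a, source_b):
    """Naive f3(target; source_a, source_b) from per-SNP allele frequencies.

    Assumption: inputs are equal-length sequences of population allele
    frequencies in [0, 1]. No finite-sample correction is applied.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("all frequency lists must have the same length")
    if len(target) == 0:
        raise ValueError("at least one SNP is required")
    # Average the product of frequency differences across SNPs.
    products = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


# Toy example (hypothetical frequencies): the target is an equal mixture
# of two strongly differentiated sources, so the statistic is negative.
freq_a = [0.1, 0.9]
freq_b = [0.9, 0.1]
freq_c = [0.5, 0.5]
value = f3_naive(freq_c, freq_a, freq_b)
print(value < 0)
```

A clearly negative value is commonly read as evidence that the target is admixed between populations related to the two sources; a non-negative value does not rule admixture out.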

19.
bioRxiv ; 2023 Oct 18.
Article in English | MEDLINE | ID: mdl-37904998

ABSTRACT

Although a broad range of methods exists for reconstructing population history from genome-wide single nucleotide polymorphism data, only a few have gained popularity in archaeogenetics: principal component analysis (PCA); ADMIXTURE, an algorithm that models individuals as mixtures of multiple ancestral sources represented by actual or inferred populations; formal tests for admixture such as f3-statistics and D/f4-statistics; and qpAdm, a tool for fitting two-component and more complex admixture models to groups or individuals. Despite their popularity in archaeogenetics, which is explained by modest computational requirements and the ability to analyze data of various types and qualities, protocols relying on qpAdm that screen numerous alternative models of varying complexity and find "fitting" models (often considering both estimated admixture proportions and p-values as a composite criterion of model fit) remain untested on complex simulated population histories in the form of admixture graphs of random topology. We analyzed genotype data extracted from such simulations and tested various types of high-throughput qpAdm protocols ("rotating" and "non-rotating", with or without temporal stratification of target groups and proxy ancestry sources, and with or without a "model competition" step). We caution that high-throughput qpAdm protocols may be inappropriate for exploratory analyses in poorly studied regions/periods, since their false discovery rates varied between 12% and 68% depending on the details of the protocol and on the amount and quality of simulated data (i.e., >12% of fitting two-way admixture models imply gene flows that were not simulated).
We demonstrate that, to reduce false discovery rates of qpAdm protocols to nearly 0%, it is advisable to use large SNP sets with low missing data rates, the rotating qpAdm protocol with a strictly enforced rule that target groups do not pre-date their proxy sources, and an unsupervised ADMIXTURE analysis as a way to verify feasible qpAdm models. Our study has a number of limitations: for instance, these recommendations depend on the assumption that the underlying genetic history is a complex admixture graph and not a stepping-stone model.
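
The false discovery rate described in this abstract is the share of "fitting" admixture models that imply gene flows which were not simulated. The sketch below shows that arithmetic on made-up data. The function name, the `(fits, is_true)` pair representation, and the choice to return 0.0 when no model fits are illustrative assumptions, not details taken from the study.

```python
def false_discovery_rate(models):
    """Fraction of fitting models that do not match the simulated history.

    Assumption: `models` is an iterable of (fits, is_true) boolean pairs,
    where `fits` means the model passed the protocol's fit criteria and
    `is_true` means the implied gene flow was actually simulated.
    """
    fitting = [is_true for fits, is_true in models if fits]
    if not fitting:
        # Convention chosen for this sketch: no discoveries, so FDR is 0.
        return 0.0
    false_discoveries = sum(1 for is_true in fitting if not is_true)
    return false_discoveries / len(fitting)


# Hypothetical screening outcome: three models fit, one of them is false;
# the two rejected models do not enter the calculation.
screened = [
    (True, True),
    (True, True),
    (True, False),
    (False, True),
    (False, False),
]
fdr = false_discovery_rate(screened)
print(round(fdr, 3))
```

Rejected models are ignored by construction, so the rate says nothing about true models that the protocol failed to recover.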

20.
Nat Ecol Evol ; 6(12): 2003-2015, 2022 12.
Article in English | MEDLINE | ID: mdl-36316412

ABSTRACT

The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In particular, admixture between diverged populations can distort or hide prior selection events in modern genomes, though this process is not explicitly accounted for in most selection studies despite its apparent ubiquity in humans and other species. Through analyses of ancient and modern human genomes, we show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes. Our results imply that this canonical mode of selection has probably been underappreciated in the evolutionary history of humans and suggest that our current understanding of the tempo and mode of selection in natural populations may be inaccurate.


Subjects
Hominidae , Genetic Selection , Animals , Humans , Biological Evolution , Human Genome , Genomics