Results 1 - 20 of 64
1.
PLoS Genet ; 20(3): e1011144, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38507461

ABSTRACT

Across the human genome, there are large-scale fluctuations in genetic diversity caused by the indirect effects of selection. This "linked selection signal" reflects the impact of selection according to the physical placement of functional regions and recombination rates along chromosomes. Previous work has shown that purifying selection acting against the steady influx of new deleterious mutations at functional portions of the genome shapes patterns of genomic variation. To date, statistical efforts to estimate purifying selection parameters from linked selection models have relied on classic Background Selection theory, which is only applicable when new mutations are so deleterious that they cannot fix in the population. Here, we develop a statistical method based on a quantitative genetics view of linked selection, that models how polygenic additive fitness variance distributed along the genome increases the rate of stochastic allele frequency change. By jointly predicting the equilibrium fitness variance and substitution rate due to both strong and weakly deleterious mutations, we estimate the distribution of fitness effects (DFE) and mutation rate across three geographically distinct human samples. While our model can accommodate weaker selection, we find evidence of strong selection operating similarly across all human samples. Although our quantitative genetic model of linked selection fits better than previous models, substitution rates of the most constrained sites disagree with observed divergence levels. We find that a model incorporating selective interference better predicts observed divergence in conserved regions, but overall our results suggest uncertainty remains about the processes generating fitness variation in humans.


Subject(s)
Models, Genetic , Selection, Genetic , Humans , Evolution, Molecular , Gene Frequency/genetics , Mutation , Genome, Human/genetics , Genetic Variation , Genetic Fitness
2.
Proc Natl Acad Sci U S A ; 120(11): e2219835120, 2023 03 14.
Article in English | MEDLINE | ID: mdl-36881629

ABSTRACT

Species distributed across heterogeneous environments often evolve locally adapted ecotypes, but understanding of the genetic mechanisms involved in their formation and maintenance in the face of gene flow is incomplete. In Burkina Faso, the major African malaria mosquito Anopheles funestus comprises two strictly sympatric and morphologically indistinguishable yet karyotypically differentiated forms reported to differ in ecology and behavior. However, knowledge of the genetic basis and environmental determinants of An. funestus diversification was impeded by lack of modern genomic resources. Here, we applied deep whole-genome sequencing and analysis to test the hypothesis that these two forms are ecotypes differentially adapted to breeding in natural swamps versus irrigated rice fields. We demonstrate genome-wide differentiation despite extensive microsympatry, synchronicity, and ongoing hybridization. Demographic inference supports a split only ~1,300 y ago, closely following the massive expansion of domesticated African rice cultivation ~1,850 y ago. Regions of highest divergence, concentrated in chromosomal inversions, were under selection during lineage splitting, consistent with local adaptation. The origin of nearly all variations implicated in adaptation, including chromosomal inversions, substantially predates the ecotype split, suggesting that rapid adaptation was fueled mainly by standing genetic variation. Sharp inversion frequency differences likely facilitated adaptive divergence between ecotypes by suppressing recombination between opposing chromosomal orientations of the two ecotypes, while permitting free recombination within the structurally monomorphic rice ecotype. Our results align with growing evidence from diverse taxa that rapid ecological diversification can arise from evolutionarily old structural genetic variants that modify genetic recombination.


Subject(s)
Anopheles , Malaria , Oryza , Animals , Chromosome Inversion , Ecotype , Plant Breeding , Anopheles/genetics , Oryza/genetics
3.
BMC Bioinformatics ; 24(1): 385, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37817115

ABSTRACT

Spatial genetic variation is shaped in part by an organism's dispersal ability. We present a deep learning tool, disperseNN2, for estimating the mean per-generation dispersal distance from georeferenced polymorphism data. Our neural network performs feature extraction on pairs of genotypes, and uses the geographic information that comes with each sample. These attributes led disperseNN2 to outperform a state-of-the-art deep learning method that does not use explicit spatial information: the mean relative absolute error was reduced by 33% and 48% using sample sizes of 10 and 100 individuals, respectively. disperseNN2 is particularly useful for non-model organisms or systems with sparse genomic resources, as it uses unphased, single nucleotide polymorphisms as its input. The software is open source and available from https://github.com/kr-colab/disperseNN2 , with documentation located at https://dispersenn2.readthedocs.io/en/latest/ .


Subject(s)
Neural Networks, Computer , Software , Humans , Genomics/methods , Genome , Polymorphism, Single Nucleotide
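
The accuracy metric quoted in the abstract above, mean relative absolute error, can be illustrated with a minimal sketch. The function name and the example values are hypothetical and are not taken from the disperseNN2 code base; the definition assumed here is the mean of |predicted - true| / |true| over paired values.

```python
from typing import Sequence


def mean_relative_absolute_error(true: Sequence[float], pred: Sequence[float]) -> float:
    """Mean of |pred - true| / |true| over paired values (assumed definition)."""
    if len(true) != len(pred) or len(true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in true):
        raise ValueError("true values must be non-zero")
    return sum(abs(p - t) / abs(t) for t, p in zip(true, pred)) / len(true)


# Hypothetical dispersal distances in arbitrary units, for illustration only.
true_sigma = [1.0, 2.0, 4.0]
pred_sigma = [1.5, 2.0, 3.0]
mrae = mean_relative_absolute_error(true_sigma, pred_sigma)
```

Because each error is divided by the true value, the metric weights a given absolute miss more heavily for short dispersal distances than for long ones.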
4.
Mol Biol Evol ; 38(3): 1168-1183, 2021 03 09.
Article in English | MEDLINE | ID: mdl-33022051

ABSTRACT

Identification of partial sweeps, which include both hard and soft sweeps that have not currently reached fixation, provides crucial information about ongoing evolutionary responses. To this end, we introduce partialS/HIC, a deep learning method to discover selective sweeps from population genomic data. partialS/HIC uses a convolutional neural network for image processing, which is trained with a large suite of summary statistics derived from coalescent simulations incorporating population-specific history, to distinguish between completed versus partial sweeps, hard versus soft sweeps, and regions directly affected by selection versus those merely linked to nearby selective sweeps. We perform several simulation experiments under various demographic scenarios to demonstrate partialS/HIC's performance, which exhibits excellent resolution for detecting partial sweeps. We also apply our classifier to whole genomes from eight mosquito populations sampled across sub-Saharan Africa by the Anopheles gambiae 1000 Genomes Consortium, elucidating both continent-wide patterns as well as sweeps unique to specific geographic regions. These populations have experienced intense insecticide exposure over the past two decades, and we observe a strong overrepresentation of sweeps at insecticide resistance loci. Our analysis thus provides a list of candidate adaptive loci that may be relevant to mosquito control efforts. More broadly, our supervised machine learning approach introduces a method to distinguish between completed and partial sweeps, as well as between hard and soft sweeps, under a variety of demographic scenarios. As whole-genome data rapidly accumulate for a greater diversity of organisms, partialS/HIC addresses an increasing demand for useful selection scan tools that can track in-progress evolutionary dynamics.


Subject(s)
Anopheles/genetics , Deep Learning , Insecticide Resistance/genetics , Selection, Genetic , Animals , Genome, Insect
5.
Mol Biol Evol ; 37(6): 1790-1808, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32077950

ABSTRACT

Accurately inferring the genome-wide landscape of recombination rates in natural populations is a central aim in genomics, as patterns of linkage influence everything from genetic mapping to understanding evolutionary history. Here, we describe recombination landscape estimation using recurrent neural networks (ReLERNN), a deep learning method for estimating a genome-wide recombination map that is accurate even with small numbers of pooled or individually sequenced genomes. Rather than use summaries of linkage disequilibrium as its input, ReLERNN takes columns from a genotype alignment, which are then modeled as a sequence across the genome using a recurrent neural network. We demonstrate that ReLERNN improves accuracy and reduces bias relative to existing methods and maintains high accuracy in the face of demographic model misspecification, missing genotype calls, and genome inaccessibility. We apply ReLERNN to natural populations of African Drosophila melanogaster and show that genome-wide recombination landscapes, although largely correlated among populations, exhibit important population-specific differences. Lastly, we connect the inferred patterns of recombination with the frequencies of major inversions segregating in natural Drosophila populations.


Subject(s)
Deep Learning , Genomics/methods , Recombination, Genetic , Animals , Drosophila melanogaster
6.
Trends Genet ; 34(4): 301-312, 2018 04.
Article in English | MEDLINE | ID: mdl-29331490

ABSTRACT

As population genomic datasets grow in size, researchers are faced with the daunting task of making sense of a flood of information. To keep pace with this explosion of data, computational methodologies for population genetic inference are rapidly being developed to best utilize genomic sequence data. In this review we discuss a new paradigm that has emerged in computational population genomics: that of supervised machine learning (ML). We review the fundamentals of ML, discuss recent applications of supervised ML to population genetics that outperform competing methods, and describe promising future directions in this area. Ultimately, we argue that supervised ML is an important and underutilized tool that has considerable potential for the world of evolutionary genomics.


Subject(s)
Data Mining/methods , Genetics, Population , Genome, Human , Supervised Machine Learning , Biological Evolution , Datasets as Topic , Humans , Selection, Genetic
7.
PLoS Genet ; 14(4): e1007341, 2018 04.
Article in English | MEDLINE | ID: mdl-29684059

ABSTRACT

Hybridization and gene flow between species appear to be common. Even though it is clear that hybridization is widespread across all surveyed taxonomic groups, the magnitude and consequences of introgression are still largely unknown. Thus, it is crucial to develop the statistical machinery required to uncover which genomic regions have recently acquired haplotypes via introgression from a sister population. We developed a novel machine learning framework, called FILET (Finding Introgressed Loci via Extra-Trees), capable of revealing genomic introgression with far greater power than competing methods. FILET works by combining information from a number of population genetic summary statistics, including several new statistics that we introduce, that capture patterns of variation across two populations. We show that FILET is able to identify loci that have experienced gene flow between related species with high accuracy, and in most situations can correctly infer which population was the donor and which was the recipient. Here we describe a data set of outbred diploid Drosophila sechellia genomes, and combine them with data from D. simulans to examine recent introgression between these species using FILET. Although we find that these populations may have split more recently than previously appreciated, FILET confirms that there has indeed been appreciable recent introgression (some of which might have been adaptive) between these species, and reveals that this gene flow is primarily in the direction of D. simulans to D. sechellia.


Subject(s)
Drosophila simulans/genetics , Drosophila/genetics , Genome, Insect , Supervised Machine Learning , Animals , Computer Simulation , Drosophila/classification , Drosophila simulans/classification , Evolution, Molecular , Gene Flow , Genetic Speciation , Genetic Variation , Genetics, Population , Haplotypes , Hybridization, Genetic , Models, Genetic , Software , Species Specificity , Supervised Machine Learning/statistics & numerical data
8.
Proc Natl Acad Sci U S A ; 115(19): 5028-5033, 2018 05 08.
Article in English | MEDLINE | ID: mdl-29686078

ABSTRACT

Evidence for adaptation to different climates in the model species Arabidopsis thaliana is seen in reciprocal transplant experiments, but the genetic basis of this adaptation remains poorly understood. Field-based quantitative trait locus (QTL) studies provide direct but low-resolution evidence for the genetic basis of local adaptation. Using high-resolution population genomic approaches, we examine local adaptation along previously identified genetic trade-off (GT) and conditionally neutral (CN) QTLs for fitness between locally adapted Italian and Swedish A. thaliana populations [Ågren J, et al. (2013) Proc Natl Acad Sci USA 110:21077-21082]. We find that genomic regions enriched in high FST SNPs colocalize with GT QTL peaks. Many of these high FST regions also colocalize with regions enriched for SNPs significantly correlated to climate in Eurasia and evidence of recent selective sweeps in Sweden. Examining unfolded site frequency spectra across genes containing high FST SNPs suggests GTs may be due to more recent adaptation in Sweden than Italy. Finally, we collapse a list of thousands of genes spanning GT QTLs to 42 genes that likely underlie the observed GTs and explore potential biological processes driving these trade-offs, from protein phosphorylation, to seed dormancy and longevity. Our analyses link population genomic analyses and field-based QTL studies of local adaptation, and emphasize that GTs play an important role in the process of local adaptation.


Subject(s)
Adaptation, Physiological/genetics , Arabidopsis/genetics , Genome, Plant , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Italy , Sweden
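
The abstract above relies on per-SNP FST between two populations but does not say which estimator was used. As a hedged illustration only, the sketch below computes the textbook heterozygosity-based form, FST = (H_T - H_S) / H_T, from two population allele frequencies with equal weights and no sample-size correction. The function name and the frequencies are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP and two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    This ignores sampling error; it is not the estimator of any specific paper.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    if h_total == 0:
        # Site is monomorphic overall, so FST is undefined; 0.0 is a convention chosen here.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two populations, for illustration only.
fst_example = fst_two_populations(0.9, 0.1)
```

A SNP fixed for different alleles in the two populations gives 1, and identical frequencies give 0, which is why windows enriched for high-FST SNPs are read as candidates for local adaptation.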
9.
Mol Biol Evol ; 35(6): 1366-1371, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29722831

ABSTRACT

In this perspective, we evaluate the explanatory power of the neutral theory of molecular evolution, 50 years after its introduction by Kimura. We argue that the neutral theory was supported by unreliable theoretical and empirical evidence from the beginning, and that in light of modern, genome-scale data, we can firmly reject its universality. The ubiquity of adaptive variation both within and between species means that a more comprehensive theory of molecular evolution must be sought.


Subject(s)
Evolution, Molecular , Genetic Drift , Selection, Genetic , Animals , Humans
10.
PLoS Genet ; 12(3): e1005928, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26977894

ABSTRACT

Detecting the targets of adaptive natural selection from whole genome sequencing data is a central problem for population genetics. However, to date most methods have shown sub-optimal performance under realistic demographic scenarios. Moreover, over the past decade there has been a renewed interest in determining the importance of selection from standing variation in adaptation of natural populations, yet very few methods for inferring this model of adaptation at the genome scale have been introduced. Here we introduce a new method, S/HIC, which uses supervised machine learning to precisely infer the location of both hard and soft selective sweeps. We show that S/HIC has unrivaled accuracy for detecting sweeps under demographic histories that are relevant to human populations, and distinguishing sweeps from linked as well as neutrally evolving regions. Moreover, we show that S/HIC is uniquely robust among its competitors to model misspecification. Thus, even if the true demographic model of a population differs catastrophically from that specified by the user, S/HIC still retains impressive discriminatory power. Finally, we apply S/HIC to the case of resequencing data from human chromosome 18 in a European population sample, and demonstrate that we can reliably recover selective sweeps that have been identified earlier using less specific and sensitive methods.


Subject(s)
Genetic Drift , Genetics, Population , Machine Learning , Selection, Genetic/genetics , Chromosomes, Human, Pair 18/genetics , Genome, Human , Haplotypes/genetics , Humans
11.
Mol Biol Evol ; 34(8): 1863-1877, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28482049

ABSTRACT

The degree to which adaptation in recent human evolution shapes genetic variation remains controversial. This is in part due to the limited evidence in humans for classic "hard selective sweeps", wherein a novel beneficial mutation rapidly sweeps through a population to fixation. However, positive selection may often proceed via "soft sweeps" acting on mutations already present within a population. Here, we examine recent positive selection across six human populations using a powerful machine learning approach that is sensitive to both hard and soft sweeps. We found evidence that soft sweeps are widespread and account for the vast majority of recent human adaptation. Surprisingly, our results also suggest that linked positive selection affects patterns of variation across much of the genome, and may increase the frequencies of deleterious mutations. Our results also reveal insights into the role of sexual selection, cancer risk, and central nervous system development in recent human evolution.


Subject(s)
Adaptation, Physiological/genetics , Genome, Human/genetics , Acclimatization , Adaptation, Biological/genetics , Databases, Nucleic Acid , Evolution, Molecular , Genetic Variation/genetics , Genetics, Population , Humans , Machine Learning , Mutation , Selection, Genetic/genetics
12.
Bioinformatics ; 32(24): 3839-3841, 2016 12 15.
Article in English | MEDLINE | ID: mdl-27559153

ABSTRACT

Here we describe discoal, a coalescent simulator able to generate population samples that include selective sweeps in a feature-rich, flexible manner. discoal can perform simulations conditioning on the fixation of an allele due to drift or either hard or soft sweeps, even those occurring a large genetic distance away from the simulated locus. discoal can simulate sweeps with recurrent mutation to the adaptive allele, recombination, and gene conversion, under non-equilibrium demographic histories and without specifying an allele frequency trajectory in advance. AVAILABILITY AND IMPLEMENTATION: discoal is implemented in the C programming language. Source code is freely available on GitHub (https://github.com/kern-lab/discoal) under a GNU General Public License. CONTACT: kern@dls.rutgers.edu or dan.schrider@rutgers.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Alleles , Computational Biology/methods , Genetics, Population/methods , Software , Computer Simulation , Gene Frequency , Models, Genetic , Mutation , Programming Languages , Stochastic Processes
13.
Genome Res ; 22(12): 2455-66, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22987666

ABSTRACT

Ultraconserved elements (UCEs), stretches of DNA that are identical between distantly related species, are enigmatic genomic features whose function is not well understood. First identified and characterized in mammals, UCEs have been proposed to play important roles in gene regulation, RNA processing, and maintaining genome integrity. However, because all of these functions can tolerate some sequence variation, their ultraconserved and ultraselected nature is not explained. We investigated whether there are highly conserved DNA elements without genic function in distantly related plant genomes. We compared the genomes of Arabidopsis thaliana and Vitis vinifera, species that diverged ∼115 million years ago (Mya). We identified 36 highly conserved elements with at least 85% similarity that are longer than 55 bp. Interestingly, these elements exhibit properties similar to mammalian UCEs, such that we named them UCE-like elements (ULEs). ULEs are located in intergenic or intronic regions and are depleted from segmental duplications. Like UCEs, ULEs are under strong purifying selection, suggesting a functional role for these elements. Like their mammalian counterparts, ULEs show a sharp drop of A+T content at their borders and are enriched close to genes encoding transcription factors and genes involved in development, the latter showing preferential expression in undifferentiated tissues. By comparing the genomes of Brachypodium distachyon and Oryza sativa, species that diverged ∼50 Mya, we identified a different set of ULEs with similar properties in monocots. The identification of ULEs in plant genomes offers new opportunities to study their possible roles in genome function, integrity, and regulation.


Subject(s)
Computational Biology/methods , Conserved Sequence , Genome, Plant , Arabidopsis/genetics , Brachypodium/genetics , DNA Methylation , Evolution, Molecular , Genetic Variation , Introns , Oryza/genetics , Selection, Genetic , Sequence Analysis, DNA , Sorghum/genetics , Vitis/genetics , Zea mays/genetics
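
The thresholds stated in the abstract above (at least 85% similarity, longer than 55 bp) can be expressed as a simple filter. This is a minimal sketch, not the authors' pipeline: it assumes the two sequences are already aligned to equal length, treats "similarity" as plain percent identity, and uses hypothetical function names.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns with the same base; gap columns count as mismatches."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y and x != "-" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def passes_thresholds(a: str, b: str, min_identity: float = 85.0, min_length: int = 55) -> bool:
    """True if the aligned pair is longer than min_length and at least min_identity percent identical."""
    return len(a) > min_length and percent_identity(a, b) >= min_identity


# Hypothetical 60 bp aligned pair with 9 mismatches (51/60 = 85% identity).
swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
seq_a = "ACGT" * 15
seq_b = "".join(swap[c] for c in seq_a[:9]) + seq_a[9:]
keep = passes_thresholds(seq_a, seq_b)
```

The length test is strict and the identity test is inclusive, mirroring the wording "longer than 55 bp" and "at least 85%".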
14.
Mol Biol Evol ; 30(7): 1729-44, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23640124

ABSTRACT

Here, we describe the construction of a phylogenetically deep, whole-genome alignment of 20 flowering plants, along with an analysis of plant genome conservation. Each included angiosperm genome was aligned to a reference genome, Arabidopsis thaliana, using the LASTZ/MULTIZ paradigm and tools from the University of California-Santa Cruz Genome Browser source code. In addition to the multiple alignment, we created a local genome browser displaying multiple tracks of newly generated genome annotation, as well as annotation sourced from published data of other research groups. An investigation into A. thaliana gene features present in the aligned A. lyrata genome revealed better conservation of start codons, stop codons, and splice sites within our alignments (51% of features from A. thaliana conserved without interruption in A. lyrata) when compared with previous publicly available plant pairwise alignments (34% of features conserved). The detailed view of conservation across angiosperms revealed not only high coding-sequence conservation but also a large set of previously uncharacterized intergenic conservation. From this, we annotated the collection of conserved features, revealing dozens of putative noncoding RNAs, including some with recorded small RNA expression. Comparing conservation between kingdoms revealed a faster decay of vertebrate genome features when compared with angiosperm genomes. Finally, conserved sequences were searched for folding RNA features, including but not limited to noncoding RNA (ncRNA) genes. Among these, we highlight a double hairpin in the 5'-untranslated region (5'-UTR) of the PRIN2 gene and a putative ncRNA with homology targeting the LAF3 protein.


Subject(s)
Arabidopsis/genetics , Codon/genetics , Conserved Sequence/genetics , Genome, Plant , Animals , Databases, Genetic , Magnoliopsida/genetics , RNA, Untranslated/genetics , Sequence Alignment , Vertebrates
15.
Genetics ; 226(4)2024 04 03.
Article in English | MEDLINE | ID: mdl-38242701

ABSTRACT

For at least the past 5 decades, population genetics, as a field, has worked to describe the precise balance of forces that shape patterns of variation in genomes. The problem is challenging because modeling the interactions between evolutionary processes is difficult, and different processes can impact genetic variation in similar ways. In this paper, we describe how diversity and divergence between closely related species change with time, using correlations between landscapes of genetic variation as a tool to understand the interplay between evolutionary processes. We find strong correlations between landscapes of diversity and divergence in a well-sampled set of great ape genomes, and explore how various processes such as incomplete lineage sorting, mutation rate variation, GC-biased gene conversion and selection contribute to these correlations. Through highly realistic, chromosome-scale, forward-in-time simulations, we show that the landscapes of diversity and divergence in the great apes are too well correlated to be explained via strictly neutral processes alone. Our best fitting simulation includes both deleterious and beneficial mutations in functional portions of the genome, in which 9% of fixations within those regions are driven by positive selection. This study provides a framework for modeling genetic variation in closely related species, an approach which can shed light on the complex balance of forces that have shaped genetic variation.


Subject(s)
Genetic Variation , Hominidae , Animals , Selection, Genetic , Hominidae/genetics , Mutation , Genomics
16.
G3 (Bethesda) ; 14(3)2024 03 06.
Article in English | MEDLINE | ID: mdl-38230808

ABSTRACT

The often tight association between parasites and their hosts means that under certain scenarios, the evolutionary histories of the two species can become closely coupled both through time and across space. Using spatial genetic inference, we identify a potential signal of common dispersal patterns in the Anopheles gambiae and Plasmodium falciparum host-parasite system as seen through a between-species correlation of the differences between geographic sampling location and geographic location predicted from the genome. This correlation may be due to coupled dispersal dynamics between host and parasite but may also reflect statistical artifacts due to uneven spatial distribution of sampling locations. Using continuous-space population genetics simulations, we investigate the degree to which uneven distribution of sampling locations leads to bias in prediction of spatial location from genetic data and implement methods to counter this effect. We demonstrate that while algorithmic bias presents a problem in inference from spatio-genetic data, the correlation structure between A. gambiae and P. falciparum predictions cannot be attributed to spatial bias alone and is thus likely a genetic signal of co-dispersal in a host-parasite system.


Subject(s)
Anopheles , Malaria, Falciparum , Parasites , Plasmodium , Animals , Parasites/genetics , Anopheles/genetics , Anopheles/parasitology , Host-Parasite Interactions/genetics , Plasmodium/genetics , Plasmodium falciparum/genetics , Geography
17.
bioRxiv ; 2024 Mar 17.
Article in English | MEDLINE | ID: mdl-38559192

ABSTRACT

A fundamental goal in population genetics is to understand how variation is arrayed over natural landscapes. From first principles we know that common features such as heterogeneous population densities and source-sink dynamics of dispersal should shape genetic variation over space; however, there are few tools currently available that can deal with these ubiquitous complexities. Geographically referenced single nucleotide polymorphism (SNP) data are increasingly accessible, presenting an opportunity to study genetic variation across geographic space in myriad species. We present a new inference method that uses geo-referenced SNPs and a deep neural network to estimate spatially heterogeneous maps of population density and dispersal rate. Our neural network trains on simulated input and output pairings, where the input consists of genotypes and sampling locations generated from a continuous-space population genetic simulator, and the output is a map of the true demographic parameters. We benchmark our tool against existing methods and discuss qualitative differences between the different approaches; in particular, our program is unique because it infers the magnitude of both dispersal and density as well as their variation over the landscape, and it does so using SNP data. Similar methods are constrained to estimating relative migration rates, or require identity by descent blocks as input. We applied our tool to empirical data from North American grey wolves, for which it estimated mostly reasonable demographic parameters, but was affected by incomplete spatial sampling. Genetic-based methods like ours complement other, direct methods for estimating past and present demography, and we believe will serve as valuable tools for applications in conservation, ecology, and evolutionary biology. An open source software package implementing our method is available from https://github.com/kr-colab/mapNN.

18.
bioRxiv ; 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38463997

ABSTRACT

Sex chromosomes are critical elements of sexual reproduction in many animal and plant taxa; however, they show incredible diversity and rapid turnover even within clades. Here, using a chromosome-level assembly generated with long read sequencing, we report the first evidence for genetic sex determination in cephalopods. We have uncovered a sex chromosome in the California two-spot octopus (Octopus bimaculoides) in which males and females show ZZ and ZO karyotypes, respectively. We show that the octopus Z chromosome is an evolutionary outlier with respect to divergence and repetitive element content as compared to other chromosomes and that it is present in all coleoid cephalopods that we have examined. Our results suggest that the cephalopod Z chromosome originated between 455 and 248 million years ago and has been conserved to the present, making it among the oldest conserved animal sex chromosomes known.

19.
PLoS Genet ; 6(5): e1000960, 2010 May 20.
Article in English | MEDLINE | ID: mdl-20502635

ABSTRACT

Regions of the genome that have been the target of positive selection specifically along the human lineage are of special importance in human biology. We used high throughput sequencing combined with methods to enrich human genomic samples for particular targets to obtain the sequence of 22 chromosomal samples at high depth in 40 kb neighborhoods of 49 previously identified 100-400 bp elements that show evidence for human accelerated evolution. In addition to selection, the pattern of nucleotide substitutions in several of these elements suggested an historical bias favoring the conversion of weak (A or T) alleles into strong (G or C) alleles. Here we found strong evidence in the derived allele frequency spectra of many of these 40 kb regions for ongoing weak-to-strong fixation bias. Comparison of the nucleotide composition at polymorphic loci to the composition at sites of fixed substitutions additionally reveals the signature of historical weak-to-strong fixation bias in a subset of these regions. Most of the regions with evidence for historical bias do not also have signatures of ongoing bias, suggesting that the evolutionary forces generating weak-to-strong bias are not constant over time. To investigate the role of selection in shaping these regions, we analyzed the spatial pattern of polymorphism in our samples. We found no significant evidence for selective sweeps, possibly because the signal of such sweeps has decayed beyond the power of our tests to detect them. Together, these results do not rule out functional roles for the observed changes in these regions (indeed, there is good evidence that the first two are functional elements in humans), but they suggest that a fixation process (such as biased gene conversion) that is biased at the nucleotide level, but is otherwise selectively neutral, could be an important evolutionary force at play in them, both historically and at present.


Subject(s)
Evolution, Molecular , GC Rich Sequence , Chromosome Mapping , Gene Frequency , Humans , Mutation
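
The weak-to-strong terminology in the abstract above (weak = A or T, strong = G or C) can be made concrete with a short sketch. The function names and the example changes are hypothetical; this only labels individual ancestral-to-derived changes and is not the statistical test used in the paper.

```python
from typing import Iterable, Tuple

WEAK = frozenset("AT")    # A:T pairs form two hydrogen bonds
STRONG = frozenset("GC")  # G:C pairs form three hydrogen bonds


def classify_substitution(ancestral: str, derived: str) -> str:
    """Label a single-base change as 'W->S', 'S->W', or 'other' (W->W or S->S)."""
    a, d = ancestral.upper(), derived.upper()
    if a not in WEAK | STRONG or d not in WEAK | STRONG or a == d:
        raise ValueError("expected two different bases from A, C, G, T")
    if a in WEAK and d in STRONG:
        return "W->S"
    if a in STRONG and d in WEAK:
        return "S->W"
    return "other"


def weak_to_strong_fraction(changes: Iterable[Tuple[str, str]]) -> float:
    """Fraction of W->S among changes that alter GC content (W->S plus S->W)."""
    labels = [classify_substitution(a, d) for a, d in changes]
    informative = [x for x in labels if x != "other"]
    if not informative:
        raise ValueError("no W->S or S->W changes to summarize")
    return informative.count("W->S") / len(informative)


# Hypothetical (ancestral, derived) pairs, for illustration only.
example_changes = [("A", "G"), ("T", "C"), ("G", "A"), ("A", "T")]
fraction = weak_to_strong_fraction(example_changes)
```

Comparing this fraction between polymorphic sites and fixed substitutions is one simple way to picture the contrast between ongoing and historical bias described above.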
20.
Nat Genet ; 36(11): 1207-12, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15502829

ABSTRACT

The function of protein and RNA molecules depends on complex epistatic interactions between sites. Therefore, the deleterious effect of a mutation can be suppressed by a compensatory second-site substitution. In relating a list of 86 pathogenic mutations in human tRNAs encoded by mitochondrial genes to the sequences of their mammalian orthologs, we noted that 52 pathogenic mutations were present in normal tRNAs of one or several nonhuman mammals. We found at least five mechanisms of compensation for 32 pathogenic mutations that destroyed a Watson-Crick pair in one of the four tRNA stems: restoration of the affected Watson-Crick interaction (25 cases), strengthening of another pair (4 cases), creation of a new pair (8 cases), changes of multiple interactions in the affected stem (11 cases) and changes involving the interaction between the loop and stem structures (3 cases). A pathogenic mutation and its compensating substitution are fixed in a lineage in rapid succession, and often a compensatory interaction evolves convergently in different clades. At least 10%, and perhaps as many as 50%, of all nucleotide substitutions in evolving mammalian tRNAs participate in such interactions, indicating that the evolution of tRNAs proceeds along highly epistatic fitness ridges.


Subject(s)
Mutation , RNA, Transfer/genetics , RNA/genetics , Animals , DNA, Mitochondrial , Epistasis, Genetic , Evolution, Molecular , Humans , Mammals , Molecular Sequence Data , Nucleic Acid Conformation , Phylogeny , RNA, Mitochondrial
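
The first compensation mechanism listed in the abstract above, restoration of the affected Watson-Crick interaction, can be sketched as a check on a single stem pair. The function names are hypothetical, only canonical A-U and G-C pairs are counted (G-U wobble pairs are deliberately ignored), and the example is illustrative rather than a mutation from the paper's data set.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_watson_crick(x: str, y: str) -> bool:
    """True for canonical RNA base pairs A-U and G-C, in either orientation."""
    return (x.upper(), y.upper()) in WATSON_CRICK


def is_restored_by_compensation(
    five_prime: str,
    three_prime: str,
    mutant_five_prime: str,
    compensating_three_prime: str,
) -> bool:
    """True if the pair is Watson-Crick at first, is broken by a mutation on the
    5' side, and is Watson-Crick again after a second change on the 3' side."""
    return (
        is_watson_crick(five_prime, three_prime)
        and not is_watson_crick(mutant_five_prime, three_prime)
        and is_watson_crick(mutant_five_prime, compensating_three_prime)
    )


# Illustrative case: G-C is broken by G->A (giving A-C) and restored by C->U (giving A-U).
restored = is_restored_by_compensation("G", "C", "A", "U")
```

Each substitution is harmful on its own but acceptable in combination, which is the sense in which the abstract describes evolution along epistatic fitness ridges.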