Results 1 - 4 of 4
1.
Mob DNA ; 11: 11, 2020.
Article in English | MEDLINE | ID: mdl-32095164

ABSTRACT

BACKGROUND: Previously, 3% of the human genome has been annotated as simple sequence repeats (SSRs), similar to the proportion annotated as protein coding. The origin of much of the genome is not well annotated, however, and some of the unidentified regions are likely to be ancient SSR-derived regions not identified by current methods. The identification of these regions is complicated because SSRs appear to evolve through complex cycles of expansion and contraction, often interrupted by mutations that alter both the repeated motif and mutation rate. We applied an empirical, kmer-based approach to identify genome regions that are likely derived from SSRs.

RESULTS: The sequences flanking annotated SSRs are enriched for similar sequences and for SSRs with similar motifs, suggesting that the evolutionary remains of SSR activity abound in regions near obvious SSRs. Using our previously described P-clouds approach, we identified 'SSR-clouds', groups of similar kmers (or 'oligos') that are enriched near a training set of unbroken SSR loci, and then used the SSR-clouds to detect likely SSR-derived regions throughout the genome.

CONCLUSIONS: Our analysis indicates that the amount of likely SSR-derived sequence in the human genome is 6.77%, over twice as much as previous estimates, including millions of newly identified ancient SSR-derived loci. SSR-clouds identified poly-A sequences adjacent to transposable element termini in over 74% of the oldest class of Alu (roughly, AluJ), validating the sensitivity of the approach. Poly-A's annotated by SSR-clouds also had a length distribution that was more consistent with their poly-A origins, with mean about 35 bp even in older Alus. This work demonstrates that the high sensitivity provided by SSR-clouds improves the detection of SSR-derived regions and will enable deeper analysis of how decaying repeats contribute to genome structure.
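
The idea of kmers being "enriched near" SSR loci can be illustrated with a small Python sketch. This is not the authors' P-clouds or SSR-clouds implementation; it only compares kmer frequencies in a set of flanking sequences against a background set. The function names (`kmer_counts`, `enriched_kmers`), the parameters (`k`, `min_ratio`, `pseudocount`), and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count every overlapping kmer of length k across the sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def enriched_kmers(flanks, background, k=4, min_ratio=2.0, pseudocount=1.0):
    """Return {kmer: ratio} for kmers over-represented in the flanks.

    Hypothetical scoring: the ratio of smoothed kmer frequencies
    (flank frequency / background frequency). The pseudocount keeps
    kmers that are absent from the background from dividing by zero.
    """
    fc = kmer_counts(flanks, k)
    bc = kmer_counts(background, k)
    f_total = sum(fc.values())
    b_total = sum(bc.values())
    vocab_size = len(set(fc) | set(bc))
    enriched = {}
    for kmer in fc:
        f_freq = (fc[kmer] + pseudocount) / (f_total + pseudocount * vocab_size)
        b_freq = (bc[kmer] + pseudocount) / (b_total + pseudocount * vocab_size)
        ratio = f_freq / b_freq
        if ratio >= min_ratio:
            enriched[kmer] = ratio
    return enriched


# Toy data: a decayed (AC)n repeat as the "flank", an unrelated background.
flanks = ["ACACACATACACAC"]
background = ["GATTACAGGCTTAGCCGATA"]
result = enriched_kmers(flanks, background, k=4, min_ratio=3.0)
print(sorted(result))
```

On this toy input only the kmers that dominate the interrupted repeat pass the threshold; a real analysis would need genome-scale counts and a principled significance cutoff.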

2.
J Hered ; 105 Suppl 1: 810-20, 2014.
Article in English | MEDLINE | ID: mdl-25149256

ABSTRACT

Our current understanding of speciation is often based on considering a relatively small number of genes, sometimes in isolation from one another. Here, we describe a possible emergent genome process involving the aggregate effect of many genes contributing to the evolution of reproductive isolation across the speciation continuum. When a threshold number of divergently selected mutations of modest to low fitness effects accumulate between populations diverging with gene flow, nonlinear transitions can occur in which levels of adaptive differentiation, linkage disequilibrium, and reproductive isolation dramatically increase. In effect, the genomes of the populations start to "congeal" into distinct entities representing different species. At this stage, reproductive isolation changes from being a characteristic of specific, divergently selected genes to a property of the genome. We examine conditions conducive to such genome-wide congealing (GWC), describe how to empirically test for GWC, and highlight a putative empirical example involving Rhagoletis fruit flies. We conclude with cautious optimism that the models and concepts discussed here, once extended to large numbers of neutral markers, may provide a framework for integrating information from genome scans, selection experiments, quantitative trait loci mapping, association studies, and natural history to develop a deeper understanding of the genomics of speciation.
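
Linkage disequilibrium, one of the quantities said to increase during the transition, is commonly measured for a pair of biallelic loci as D = p_AB - p_A * p_B. The Python sketch below computes that standard two-locus coefficient from phased haplotypes. It is not taken from the article; the function name and the 0/1 haplotype encoding are assumptions made for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Two-locus LD coefficient D = p_AB - p_A * p_B.

    `haplotypes` is a list of (a, b) tuples, where 1 marks the focal
    allele at each locus and 0 marks the alternative allele.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    return p_ab - p_a * p_b


# Alleles always travel together: strong association.
coupled = [(1, 1), (1, 1), (0, 0), (0, 0)]
# Every combination equally common: no association.
shuffled = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(linkage_disequilibrium(coupled), linkage_disequilibrium(shuffled))
```

A genome-wide analysis would compute such pairwise associations across many loci; this sketch covers a single pair only.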


Subject(s)
Gene Flow; Genetic Speciation; Genome; Animals; Genes; Genetics, Population; Genome, Insect; Linkage Disequilibrium; Microsatellite Repeats; Mutation; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Reproductive Isolation; Selection, Genetic; Sympatry; Tephritidae/genetics

3.
PLoS Genet ; 10(8): e1004482, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25121584

ABSTRACT

Most common methods for inferring transposable element (TE) evolutionary relationships are based on dividing TEs into subfamilies using shared diagnostic nucleotides. Although originally justified based on the "master gene" model of TE evolution, computational and experimental work indicates that many of the subfamilies generated by these methods contain multiple source elements. This implies that subfamily-based methods give an incomplete picture of TE relationships. Studies on selection, functional exaptation, and predictions of horizontal transfer may all be affected. Here, we develop a Bayesian method for inferring TE ancestry that gives the probability that each sequence was replicative, its frequency of replication, and the probability that each extant TE sequence came from each possible ancestral sequence. Applying our method to 986 members of the newly-discovered LAVA family of TEs, we show that there were far more source elements in the history of LAVA expansion than subfamilies identified using the CoSeg subfamily-classification program. We also identify multiple replicative elements in the AluSc subfamily in humans. Our results strongly indicate that a reassessment of subfamily structures is necessary to obtain accurate estimates of mutation processes, phylogenetic relationships and historical times of activity.
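
One output described above, the probability that an extant sequence came from each possible ancestral sequence, can be illustrated with a toy Bayes calculation in Python. This is not the authors' model: it assumes aligned sequences of equal length, an independent per-site substitution probability `mu` with the three alternative bases equally likely, and uniform priors unless others are supplied. All names and values are hypothetical.

```python
import math


def hamming(a, b):
    """Number of mismatching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def ancestor_posterior(descendant, candidates, mu=0.05, priors=None):
    """Posterior probability of each candidate source for one descendant.

    Assumed likelihood: each site mutates independently with probability
    mu to one specific other base (mu / 3 each), else stays unchanged.
    """
    names = list(candidates)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    length = len(descendant)
    log_weights = {}
    for name in names:
        d = hamming(descendant, candidates[name])
        log_like = d * math.log(mu / 3) + (length - d) * math.log(1 - mu)
        log_weights[name] = math.log(priors[name]) + log_like
    # Normalize in log space to avoid underflow on long sequences.
    top = max(log_weights.values())
    total = sum(math.exp(w - top) for w in log_weights.values())
    return {name: math.exp(log_weights[name] - top) / total for name in names}


candidates = {"src1": "ACGTACGT", "src2": "ACGTTCGT"}
posterior = ancestor_posterior("ACGTACGT", candidates)
print(posterior)
```

Here the descendant matches `src1` exactly and differs from `src2` at one site, so nearly all of the posterior mass falls on `src1`. Inferring which sequences were replicative and how often, as the abstract describes, requires a fuller model than this sketch.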


Subject(s)
DNA Transposable Elements/genetics; Evolution, Molecular; Phylogeny; Bayes Theorem; Gene Transfer, Horizontal/genetics; Humans; Mutation

4.
Mol Ecol ; 23(16): 4074-88, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24724861

ABSTRACT

A long-standing problem in evolutionary biology has been determining whether and how gradual, incremental changes at the gene level can account for rapid speciation and bursts of adaptive radiation. Using genome-scale computer simulations, we extend previous theory showing how gradual adaptive change can generate nonlinear population transitions, resulting in the rapid formation of new, reproductively isolated species. We show that these transitions occur via a mechanism rooted in a basic property of biological heredity: the organization of genes in genomes. Genomic organization of genes facilitates two processes: (i) the build-up of statistical associations among large numbers of genes and (ii) the action of divergent selection on persistent combinations of alleles. When a population has accumulated a critical amount of standing, divergently selected variation, the combination of these two processes allows many mutations of small effect to act synergistically and precipitously split one population into two discontinuous, reproductively isolated groups. Periods of allopatry, chromosomal linkage among loci, and large-effect alleles can facilitate this process under some conditions, but are not required for it. Our results complement and extend existing theory on alternative stable states during population divergence, distinct phases of speciation and the rapid emergence of multilocus barriers to gene flow. The results are thus a step towards aligning population genomic theory with modern empirical studies.


Subject(s)
Biological Evolution; Genetic Speciation; Genetics, Population/methods; Models, Genetic; Cluster Analysis; Computer Simulation; Gene Flow; Genetic Linkage; Mutation