Results 1 - 20 of 25
1.
Theor Popul Biol ; 145: 80-94, 2022 06.
Article in English | MEDLINE | ID: mdl-35314171

ABSTRACT

Gene gain-loss-duplication models are commonly based on continuous-time birth-death processes. Employed in a phylogenetic context, such models have been increasingly popular in studies of gene content evolution across multiple genomes. While the applications are becoming more varied and demanding, bioinformatics methods for probabilistic inference on copy numbers (or integer-valued evolutionary characters, in general) are scarce. We describe a flexible probabilistic framework for phylogenetic gain-loss-duplication models. The framework is based on a novel elementary representation by dependent random variables with well-characterized conditional distributions: binomial, Pólya (negative binomial), and Poisson. The corresponding graphical model yields exact numerical procedures for computing the likelihood and the posterior distribution of ancestral copy numbers. The resulting algorithms take quadratic time in the total number of copies. In addition, we show how the likelihood gradient can be computed by a linear-time algorithm.
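As a rough illustration of the kind of process these models describe, the following minimal sketch simulates the copy number of one gene family along a single branch. It is a generic Gillespie-style simulation of a linear birth-death process with immigration, not the exact inference procedure of the abstract; the function names and the rate parameters `gain`, `loss`, and `dup` are assumptions introduced here for illustration.

```python
import math
import random


def simulate_copy_number(n0, gain, loss, dup, t, rng):
    """Simulate the copy number after time t on one branch.

    Hypothetical parameters: `gain` is a constant acquisition rate,
    `loss` and `dup` are per-copy loss and duplication rates.
    """
    n, clock = n0, 0.0
    while True:
        total = gain + n * (loss + dup)  # total event rate in the current state
        if total <= 0.0:
            return n  # no event can occur any more
        clock += rng.expovariate(total)  # waiting time to the next event
        if clock > t:
            return n
        u = rng.random() * total
        if u < gain + n * dup:
            n += 1  # gain of a new copy, or duplication of an existing one
        else:
            n -= 1  # loss of one copy


def expected_survivors(n0, loss, t):
    """Mean copy number under pure loss: each copy survives with prob. exp(-loss*t)."""
    return n0 * math.exp(-loss * t)
```

Under pure loss (`gain = dup = 0`), each of the initial copies survives independently, so the simulated counts should follow a binomial distribution with mean `expected_survivors(n0, loss, t)`; this is one simple instance of the binomial component mentioned in the abstract.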


Subject(s)
Algorithms , DNA Copy Number Variations , Molecular Evolution , Gene Duplication , Genetic Models , Phylogeny
2.
Theor Popul Biol ; 92: 22-9, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24269334

ABSTRACT

Shared genealogies introduce allele dependences in diploid genotypes, as alleles within an individual or between different individuals will likely match when they originate from a recent common ancestor. At a locus shared by a pair of diploid individuals, there are nine combinatorially distinct modes of identity-by-descent (IBD), capturing all possible combinations of coancestry and inbreeding. A distribution over the IBD modes is described by the nine associated probabilities, known as (Jacquard's) identity coefficients. The genetic relatedness between two individuals can be succinctly characterized by the identity coefficients corresponding to a pedigree that contains both individuals. The identity coefficients (together with allele frequencies) determine the distribution of joint genotypes at a locus. At a locus with two possible alleles, identity coefficients are not identifiable because different coefficients can generate the same genotype distribution. We analyze precisely how different IBD modes combine into identical genotype distributions at diallelic loci. In particular, we describe IBD mode mixtures that result in identical genotype distributions at all allele frequencies, implying the non-identifiability of the identity coefficients from independent loci. Our analysis yields an exhaustive characterization of relatedness statistics that are always identifiable. Importantly, we show that identifiable relatedness statistics include the kinship coefficient (probability that a random pair of alleles are identical by descent between individuals) and inbreeding-related measures, which can thus be estimated consistently from genotype distributions at independent loci.
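The relation between the nine identity coefficients and the identifiable statistics named above can be made concrete with a short sketch. It assumes the conventional ordering of Jacquard's condensed coefficients (state 1: all four alleles IBD; states 3 and 5: one individual autozygous and sharing with the other; state 7: both allele pairs shared; state 8: one pair shared; state 9: no IBD); the abstract does not fix an ordering, and the function name is hypothetical.

```python
def kinship_and_inbreeding(delta):
    """Return (kinship, inbreeding of A, inbreeding of B).

    `delta` lists the nine identity coefficients in the conventional
    Jacquard ordering (an assumption made for this sketch).
    """
    if len(delta) != 9:
        raise ValueError("expected nine identity coefficients")
    if abs(sum(delta) - 1.0) > 1e-9:
        raise ValueError("identity coefficients must sum to 1")
    d1, d2, d3, d4, d5, d6, d7, d8, d9 = delta
    # Probability that a random allele of A and a random allele of B are IBD.
    kinship = d1 + 0.5 * (d3 + d5 + d7) + 0.25 * d8
    # Probability that the two alleles within an individual are IBD.
    inbreeding_a = d1 + d2 + d3 + d4
    inbreeding_b = d1 + d2 + d5 + d6
    return kinship, inbreeding_a, inbreeding_b
```

For non-inbred full siblings, with coefficients (0, 0, 0, 0, 0, 0, 1/4, 1/2, 1/4), this gives a kinship coefficient of 1/4 and zero inbreeding for both individuals.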


Subject(s)
Alleles , Genotype , Humans , Theoretical Models , Probability
3.
BMC Bioinformatics ; 14 Suppl 5: S3, 2013.
Article in English | MEDLINE | ID: mdl-23734724

ABSTRACT

The joint sequencing of related genomes has become an important means to discover rare variants. Normal-tumor genome pairs are routinely sequenced together to find somatic mutations and their associations with different cancers. Parental and sibling genomes reveal de novo germline mutations and inheritance patterns related to Mendelian diseases. Acute lymphoblastic leukemia (ALL) is the most common paediatric cancer and the leading cause of cancer-related death among children. With the aim of uncovering the full spectrum of germline and somatic genetic alterations in childhood ALL genomes, we conducted whole-exome re-sequencing on a unique cohort of over 120 exomes of childhood ALL quartets, each comprising a patient's tumor and matched-normal material, and DNA from both parents. We developed a general probabilistic model for such quartet sequencing reads mapped to the reference human genome. The model is used to infer joint genotypes at homologous loci across a normal-tumor genome pair and two parental genomes. We describe the algorithms and data structures for genotype inference and model parameter training. We implemented the methods in an open-source software package (QUADGT) that uses the standard file formats of the 1000 Genomes Project. Our method's utility is illustrated on quartets from the ALL cohort.


Subject(s)
DNA Mutational Analysis/methods , Genotyping Techniques , Germ-Line Mutation , Mutation , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Algorithms , Child , Exome , Human Genome , Genotype , Humans , Software
4.
PLoS Comput Biol ; 7(9): e1002150, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21935348

ABSTRACT

Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6-7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.


Subject(s)
Eukaryota/genetics , Genome , Introns , Alternative Splicing , Animals , Molecular Evolution , Humans , Markov Chains
5.
Bioinformatics ; 26(15): 1910-2, 2010 Aug 01.
Article in English | MEDLINE | ID: mdl-20551134

ABSTRACT

SUMMARY: Count is a software package for the analysis of numerical profiles on a phylogeny. It is primarily designed to deal with profiles derived from the phyletic distribution of homologous gene families, but is suited to study any other integer-valued evolutionary characters. Count performs ancestral reconstruction, and infers family- and lineage-specific characteristics along the evolutionary tree. It implements popular methods employed in gene content analysis such as Dollo and Wagner parsimony, propensity for gene loss, as well as probabilistic methods involving a phylogenetic birth-and-death model. AVAILABILITY: Count is available as a stand-alone Java application, as well as an application bundle for MacOS X, at the web site http://www.iro.umontreal.ca/~csuros/gene_content/count.html. It can also be launched using Java Webstart from the same site. The software is distributed under a BSD-style license. Source code is available upon request from the author.


Subject(s)
Algorithms , Computational Biology/methods , Phylogeny , Software , Probability
6.
Mol Biol Evol ; 26(9): 2087-95, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19570746

ABSTRACT

Homologous genes originate from a common ancestor through vertical inheritance, duplication, or horizontal gene transfer. Entire homolog families spawned by a single ancestral gene can be identified across multiple genomes based on protein sequence similarity. The sequences, however, do not always reveal conclusively the history of large families. To study the evolution of complete gene repertoires, we propose here a mathematical framework that does not rely on resolved gene family histories. We show that so-called phylogenetic profiles, formed by family sizes across multiple genomes, are sufficient to infer principal evolutionary trends. The main novelty in our approach is an efficient algorithm to compute the likelihood of a phylogenetic profile in a model of birth-and-death processes acting on a phylogeny. We examine known gene families in 28 archaeal genomes using a probabilistic model that involves lineage- and family-specific components of gene acquisition, duplication, and loss. The model enables us to consider all possible histories when inferring statistics about archaeal evolution. According to our reconstruction, most lineages are characterized by a net loss of gene families. Major increases in gene repertoire have occurred only a few times. Our reconstruction underlines the importance of persistent streamlining processes in shaping genome composition in Archaea. It also suggests that early archaeal genomes were as complex as typical modern ones, and even show signs, in the case of the methanogenic ancestor, of an extremely large gene repertoire.


Subject(s)
Archaea/genetics , Archaeal Genome/genetics , Genetic Models , Phylogeny , Amino Acid Substitution/genetics , Base Sequence , Computational Biology , Molecular Evolution , Archaeal Genes
7.
Trends Genet ; 23(11): 543-6, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17964682

ABSTRACT

By conventional wisdom, a feature that occurs too often or too rarely in a genome can indicate a functional element. To infer functionality from frequency, it is crucial to precisely characterize occurrences in randomly evolving DNA. We find that the frequency of oligonucleotides in a genomic sequence follows primarily a Pareto-lognormal distribution, which encapsulates lognormal and power-law features found across all known genomes. Such a distribution could be the result of completely random evolution by a copying process. Our characterization of the entire frequency distribution of genomic words opens a way to a more accurate reasoning about their over- and underrepresentation in genomic sequences.
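The raw data behind such a frequency distribution can be sketched as follows: count every oligonucleotide of a fixed length, then tabulate how many distinct words occur once, twice, and so on. This is only the counting step, under the assumption of a plain sliding window over one strand; it does not fit a Pareto-lognormal distribution, and the function names are hypothetical.

```python
from collections import Counter


def word_counts(sequence, k):
    """Count overlapping k-letter words, skipping windows with non-ACGT symbols."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def count_spectrum(counts):
    """Map each occurrence count to the number of distinct words having it."""
    return Counter(counts.values())
```

A word whose count lies far in the tail of the fitted distribution would be a candidate for over- or underrepresentation; the fitting itself is beyond this sketch.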


Subject(s)
Genomics , Animals , Molecular Evolution , Gene Duplication , Genome , Humans , Markov Chains , Oligonucleotides/metabolism
8.
Trends Genet ; 23(3): 105-8, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17239982

ABSTRACT

Comparison of the exon-intron structures of ancient eukaryotic paralogs reveals the absence of conserved intron positions in these genes. This is in contrast to the conservation of intron positions in orthologous genes from even the most evolutionarily distant eukaryotes and in more recent paralogs. The lack of conserved intron positions in ancient paralogs probably reflects the origination of these genes during the earliest phase of eukaryotic evolution, which was characterized by concomitant invasion of genes by group II self-splicing elements (which were to become introns in the future) and extensive duplication of genes.


Subject(s)
Molecular Evolution , Introns , Animals , Gene Duplication , Humans , Genetic Models
9.
Mol Biol Evol ; 25(5): 903-11, 2008 May.
Article in English | MEDLINE | ID: mdl-18296415

ABSTRACT

Chromalveolates are a large, diverse supergroup of unicellular eukaryotes that includes Apicomplexa, dinoflagellates, ciliates (three lineages that form the alveolate branch), heterokonts, haptophytes, and cryptomonads (three lineages comprising the chromist branch). All sequenced genomes of chromalveolates have relatively low intron density in protein-coding genes, and few intron positions are shared between chromalveolate lineages. In contrast, genes of different chromalveolates share many intron positions with orthologous genes from other eukaryotic supergroups, in particular, the intron-rich orthologs from animals and plants. Reconstruction of the history of intron gain and loss during the evolution of chromalveolates using a general and flexible maximum-likelihood approach indicates that genes of the ancestors of chromalveolates and, particularly, alveolates had unexpectedly high intron densities. It is estimated that the chromalveolate ancestor had approximately two-thirds of the human intron density, whereas the intron density in the genes of the alveolate ancestor is estimated to be slightly greater than the human intron density. Accordingly, it is inferred that the evolution of chromalveolates was dominated by intron loss. The conclusion that ancestral chromalveolate forms had high intron densities is unexpected because all extant unicellular eukaryotes have relatively few introns and are thought to be unable to maintain numerous introns due to intense purifying selection in their typically large populations. It is suggested that, at early stages of evolution, chromalveolates went through major population bottlenecks that were accompanied by intron invasion.


Subject(s)
Eukaryota/genetics , Molecular Evolution , Introns , Animals , Eukaryotic Cells , Horizontal Gene Transfer , Protozoan Genes , Likelihood Functions , Plants/genetics
10.
Bioinformatics ; 24(13): 1538-9, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18474506

ABSTRACT

Malin is a software package for the analysis of eukaryotic gene structure evolution. It provides a graphical user interface for various tasks commonly used to infer the evolution of exon-intron structure in protein-coding orthologs. Implemented tasks include the identification of conserved homologous intron sites in protein alignments, as well as the estimation of ancestral intron content, lineage-specific intron losses and gains. Estimates are computed either with parsimony, or with a probabilistic model that incorporates rate variation across lineages and intron sites. AVAILABILITY: Malin is available as a stand-alone Java application, as well as an application bundle for MacOS X, at the website http://www.iro.umontreal.ca/~csuros/introns/malin/. The software is distributed under a BSD-style license.


Subject(s)
Algorithms , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Molecular Evolution , Introns/genetics , Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , Animals , Base Sequence , Humans , Likelihood Functions , Molecular Sequence Data
11.
Bioinformatics ; 23(13): i87-96, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17646350

ABSTRACT

Many fundamental questions concerning the emergence and subsequent evolution of eukaryotic exon-intron organization are still unsettled. Genome-scale comparative studies, which can shed light on crucial aspects of eukaryotic evolution, require adequate computational tools. We describe novel computational methods for studying spliceosomal intron evolution. Our goal is to give a reliable characterization of the dynamics of intron evolution. Our algorithmic innovations address the identification of orthologous introns, and the likelihood-based analysis of intron data. We discuss a compression method for the evaluation of the likelihood function, which is noteworthy for phylogenetic likelihood problems in general. We prove that after O(n l) preprocessing time, subsequent evaluations take O(n l/log l) time almost surely in the Yule-Harding random model of n-taxon phylogenies, where l is the input sequence length. We illustrate the practicality of our methods by compiling and analyzing a data set involving 18 eukaryotes, which is more than in any other study to date. The study yields the surprising result that ancestral eukaryotes were fairly intron-rich. For example, the bilaterian ancestor is estimated to have had more than 90% as many introns as vertebrates do now. AVAILABILITY: The Java implementations of the algorithms are publicly available from the corresponding author's site http://www.iro.umontreal.ca/~csuros/introns/. Data are available on request.


Subject(s)
Chromosome Mapping/methods , DNA Mutational Analysis/methods , Molecular Evolution , Genetic Variation/genetics , Introns/genetics , DNA Sequence Analysis/methods , Algorithms , Base Sequence , Molecular Sequence Data
12.
Genome Biol Evol ; 8(8): 2340-50, 2016 08 25.
Article in English | MEDLINE | ID: mdl-27412607

ABSTRACT

We examine exon junctions near apparent amino acid insertions and deletions in alignments of orthologous protein-coding genes. In 1,917 ortholog families across nine oomycete genomes, 10-20% of introns are near an alignment gap, indicating at first sight that splice-site displacements are frequent. We designed a robust algorithmic procedure for the delineation of intron-containing homologous regions, and combined it with a parsimony-based reconstruction of intron loss, gain, and splice-site shift events on a phylogeny. The reconstruction implies that 12% of introns underwent an acceptor-site shift, and 10% underwent a donor-site shift. In order to offset gene annotation problems, we amended the procedure with the reannotation of intron boundaries using alignment evidence. The corresponding reconstruction involves far fewer intron gain and splice-site shift events. The frequencies of acceptor- and donor-side shifts drop to 4% and 3%, respectively, which are not much different from what one would expect by random codon insertions and deletions. In other words, gaps near exon junctions are mostly artifacts of gene annotation rather than evidence of sliding intron boundaries. Our study underscores the importance of using well-supported gene structure annotations in comparative studies. When transcription evidence is not available, we propose a robust ancestral reconstruction procedure that corrects misannotated intron boundaries using sequence alignments. The results corroborate the view that boundary shifts and complete intron sliding are only accidental in eukaryotic genome evolution and have a negligible impact on protein diversity.


Subject(s)
Molecular Evolution , Oomycetes/genetics , Phylogeny , RNA Splice Sites/genetics , Amino Acid Sequence/genetics , Eukaryota/genetics , Exons/genetics , Genome , INDEL Mutation/genetics , Introns/genetics , Molecular Sequence Annotation , Sequence Alignment , Sequence Homology
13.
J Comput Biol ; 9(2): 277-97, 2002.
Article in English | MEDLINE | ID: mdl-12015882

ABSTRACT

We present a novel distance-based algorithm for evolutionary tree reconstruction. Our algorithm reconstructs the topology of a tree with n leaves in O(n^2) time using O(n) working space. In the general Markov model of evolution, the algorithm recovers the topology successfully with (1 - o(1)) probability from sequences with polynomial length in n. Moreover, for almost all trees, our algorithm achieves the same success probability on polylogarithmic sample sizes. The theoretical results are supported by simulation experiments involving trees with 500, 1,895, and 3,135 leaves. The topologies of the trees are recovered with high success from 2,000 bp DNA sequences.


Subject(s)
Algorithms , Molecular Evolution , Base Sequence , Computational Biology , DNA/genetics , Markov Chains , Genetic Models , Phylogeny
14.
J Comput Biol ; 11(5): 1001-21, 2004.
Article in English | MEDLINE | ID: mdl-15700414

ABSTRACT

Pooled Genomic Indexing (PGI) is a novel method for physical mapping of clones onto known sequences. PGI is carried out by pooling arrayed clones and generating shotgun sequence reads from the pools. The shotgun sequences are compared to a reference sequence. In the simplest case, clones are placed on an array and are pooled by rows and columns. If a shotgun sequence from a row pool and another shotgun sequence from a column pool match the reference sequence at a close distance, they are both assigned to the clone at the intersection of the two pools. Accordingly, the clone is mapped onto the region of the reference sequence between the two matches. A probabilistic model for PGI is developed, and several pooling designs are described and analyzed, including transversal designs and designs from linear codes. The probabilistic model and the pooling schemes are validated in simulated experiments where 625 rat bacterial artificial chromosome (BAC) clones and 207 mouse BAC clones are mapped onto homologous human sequence.
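The simplest row-and-column case described above can be sketched in a few lines. The sketch assumes that each shotgun read has already been aligned to the reference and is summarized as a (pool index, reference position) pair; the function name, the input layout, and the `max_gap` threshold are hypothetical, and the probabilistic model of the abstract is not reproduced.

```python
def map_clones(row_hits, col_hits, max_gap):
    """Assign close row/column read pairs to the clone at the pool intersection.

    row_hits: list of (row_index, reference_position) for reads from row pools.
    col_hits: list of (col_index, reference_position) for reads from column pools.
    Returns {(row, col): [(start, end), ...]} with the reference intervals
    spanned by each matching pair.
    """
    assignments = {}
    for row, row_pos in row_hits:
        for col, col_pos in col_hits:
            if abs(row_pos - col_pos) <= max_gap:
                interval = (min(row_pos, col_pos), max(row_pos, col_pos))
                assignments.setdefault((row, col), []).append(interval)
    return assignments
```

With real data, unrelated clones can produce nearby matches by chance, which is why ambiguous assignments call for a probabilistic treatment and for more elaborate pooling designs.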


Subject(s)
Computational Biology , Physical Chromosome Mapping , Research Design , Animals , Bacterial Artificial Chromosomes , Statistical Data Interpretation , Mice , Phylogeny , Probability , Rats
15.
Article in English | MEDLINE | ID: mdl-17051696

ABSTRACT

We examine the problem of finding maximum-scoring sets of disjoint segments in a sequence of scores. The problem arises in DNA and protein segmentation and in postprocessing of sequence alignments. Our key result states a simple recursive relationship between maximum-scoring segment sets. The statement leads to fast algorithms for finding such segment sets. We apply our methods to the identification of noncoding RNA genes in thermophiles.
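To make the problem statement concrete, here is a minimal sketch that computes the best total score of at most k disjoint segments with a textbook dynamic program. It is not the recursive relationship or the fast algorithm of the abstract, only a straightforward O(nk) baseline; the function name is hypothetical.

```python
def max_scoring_segments(scores, k):
    """Best total score of at most k disjoint contiguous segments (0 if none help)."""
    # free[j]: best total with at most j segments in the prefix read so far.
    # tail[j]: best total with at most j segments, the last one ending at the
    #          most recently read position.
    free = [0] * (k + 1)
    tail = [float("-inf")] * (k + 1)
    for s in scores:
        # Go through j in decreasing order so free[j - 1] still refers to the
        # prefix without the current score.
        for j in range(k, 0, -1):
            tail[j] = max(tail[j], free[j - 1]) + s  # extend, or open a new segment
            free[j] = max(free[j], tail[j])
    return free[k]
```

For the scores 2, -3, 4, -1, 2, -5, 3, a single segment scores at most 5 (4, -1, 2), and allowing a second segment adds the final 3 for a total of 8.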


Subject(s)
Computational Biology/methods , DNA/chemistry , Sequence Alignment , Algorithms , Animals , Statistical Data Interpretation , Humans , Likelihood Functions , Methanococcus/genetics , Statistical Models , Theoretical Models , Probability , Sulfolobus/genetics
16.
Genome Inform ; 14: 186-95, 2003.
Article in English | MEDLINE | ID: mdl-15706533

ABSTRACT

This paper studies sequencing and mapping methods that rely solely on pooling and shotgun sequencing of clones. First, we scrutinize and improve the recently proposed Clone-Array Pooled Shotgun Sequencing (CAPSS) method, which delivers a BAC-linked assembly of a whole genome sequence. Secondly, we introduce a novel physical mapping method, called Clone-Array Pooled Shotgun Mapping (CAPS-MAP), which computes the physical ordering of BACs in a random library. Both CAPSS and CAPS-MAP construct subclone libraries from pooled genomic BAC clones.


Subject(s)
Computational Biology/methods , Animals , Bacterial Artificial Chromosomes , Computer Simulation , Drosophila melanogaster/genetics , Genome , Reproducibility of Results , Research Design , Sequence Alignment
17.
mBio ; 4(4)2013 Jul 09.
Article in English | MEDLINE | ID: mdl-23839216

ABSTRACT

Marine bacteria in the Roseobacter and SAR11 lineages successfully exploit the ocean habitat, together accounting for ~40% of bacteria in surface waters, yet have divergent life histories that exemplify patch-adapted versus free-living ecological roles. Here, we use a phylogenetic birth-and-death model to understand how genome content supporting different life history strategies evolved in these related alphaproteobacterial taxa, showing that the streamlined genomes of free-living SAR11 were gradually downsized from a common ancestral genome only slightly larger than the extant members (~2,000 genes), while the larger and variably sized genomes of roseobacters evolved along dynamic pathways from a sizeable common ancestor (~8,000 genes). Genome changes in the SAR11 lineage occurred gradually over ~800 million years, whereas Roseobacter genomes underwent more substantial modifications, including major periods of expansion, over ~260 million years. The timing of the first Roseobacter genome expansion was coincident with the predicted radiation of modern marine eukaryotic phytoplankton of sufficient size to create nutrient-enriched microzones and is consistent with present-day ecological associations between these microbial groups. We suggest that diversification of red-lineage phytoplankton is an important driver of divergent life history strategies among the heterotrophic bacterioplankton taxa that dominate the present-day ocean.
IMPORTANCE: One-half of global primary production occurs in the oceans, and more than half of this is processed by heterotrophic bacterioplankton through the marine microbial food web. The diversity of life history strategies that characterize different bacterioplankton taxa is an important subject, since the locations and mechanisms whereby bacteria interact with seawater organic matter have effects on microbial growth rates, metabolic pathways, and growth efficiencies, and these in turn affect rates of carbon mineralization to the atmosphere and sequestration into the deep sea. Understanding the evolutionary origins of the ecological strategies that underlie biochemical interactions of bacteria with the ocean system, and which scale up to affect globally important biogeochemical processes, will improve understanding of how microbial diversity is maintained and enable useful predictions about microbial response in the future ocean.


Subject(s)
Alphaproteobacteria/genetics , Aquatic Organisms/genetics , Molecular Evolution , Seawater/microbiology , Bacterial Genome
18.
Wiley Interdiscip Rev RNA ; 4(1): 93-105, 2013.
Article in English | MEDLINE | ID: mdl-23139082

ABSTRACT

In eukaryotes, protein-coding sequences are interrupted by non-coding sequences known as introns. During mRNA maturation, introns are excised by the spliceosome and the coding regions, exons, are spliced to form the mature coding region. The intron densities widely differ between eukaryotic lineages, from 6 to 7 introns per kb of coding sequence in vertebrates, some invertebrates and green plants, to only a few introns across the entire genome in many unicellular eukaryotes. Evolutionary reconstructions using maximum likelihood methods suggest intron-rich ancestors for each major group of eukaryotes. For the last common ancestor of animals, the highest intron density of all extant and extinct eukaryotes was inferred, at 120-130% of the human intron density. Furthermore, an intron density within 53-74% of the human values was inferred for the last eukaryotic common ancestor. Accordingly, evolution of eukaryotic genes in all lines of descent involved primarily intron loss, with substantial gain only at the bases of several branches including plants and animals. These conclusions have substantial biological implications indicating that the common ancestor of all modern eukaryotes was a complex organism with a gene architecture resembling those in multicellular organisms. Alternative splicing most likely initially appeared as an inevitable result of splicing errors and only later was employed to generate structural and functional diversification of proteins.


Subject(s)
Eukaryota/genetics , Exons/genetics , Introns/genetics , Alternative Splicing , Animals , Biological Evolution , Conserved Sequence/genetics , Molecular Evolution , Genome , Humans , Spliceosomes/genetics
20.
Biol Direct ; 7: 11, 2012 Apr 16.
Article in English | MEDLINE | ID: mdl-22507701

ABSTRACT

Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded 'introns first', held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes.


Subject(s)
Molecular Evolution , Introns , Spliceosomes/genetics , Alternative Splicing , Animals , Base Sequence , Conserved Sequence , Eukaryota/chemistry , Eukaryota/classification , Eukaryota/genetics , Exons , Population Genetics , Genome , Phylogeny , RNA Splice Sites , Genetic Selection , Spliceosomes/chemistry , Untranslated Regions