Results 1 - 10 of 10
1.
RNA ; 18(1): 1-15, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22128342

ABSTRACT

Pre-mRNA structure impacts many cellular processes, including splicing in genes associated with disease. The contemporary paradigm of RNA structure prediction is biased toward secondary structures that occur within short ranges of pre-mRNA, although long-range base-pairings are known to be at least as important. Recently, we developed an efficient method for detecting conserved RNA structures on the genome-wide scale, one that does not require multiple sequence alignments and works equally well for the detection of local and long-range base-pairings. Using an enhanced method that detects base-pairings at all possible combinations of splice sites within each gene, we now report RNA structures that could be involved in the regulation of splicing in mammals. Statistically, we demonstrate strong association between the occurrence of conserved RNA structures and alternative splicing, where local RNA structures are generally more frequent at alternative donor splice sites, while long-range structures are more associated with weak alternative acceptor splice sites. As an example, we validated the RNA structure in the human SF1 gene using minigenes in the HEK293 cell line. Point mutations that disrupted the base-pairing of two complementary boxes between exons 9 and 10 of this gene altered the splicing pattern, while the compensatory mutations that reestablished the base-pairing reverted splicing to that of the wild-type. There is statistical evidence for a Dscam-like class of mammalian genes, in which mutually exclusive RNA structures control mutually exclusive alternative splicing. In sum, we propose that long-range base-pairings carry an important, yet unconsidered part of the splicing code, and that, even by modest estimates, there must be thousands of such potentially regulatory structures conserved throughout the evolutionary history of mammals.


Subject(s)
Alternative Splicing , RNA Precursors/chemistry , RNA Precursors/genetics , RNA Splicing , Animals , Base Sequence , Conserved Sequence , Doublecortin-Like Kinases , HEK293 Cells , Humans , Intracellular Signaling Peptides and Proteins/genetics , Molecular Sequence Data , Nucleic Acid Conformation , Protein Serine-Threonine Kinases/genetics , RNA Splice Sites , Sequence Analysis, RNA
2.
Am J Hum Genet ; 83(1): 94-8, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18571144

ABSTRACT

Alternative splicing is a well-recognized mechanism of accelerated genome evolution. We have studied single-nucleotide polymorphisms and human-chimpanzee divergence in the exons of 6672 alternatively spliced human genes, with the aim of understanding the forces driving the evolution of alternatively spliced sequences. Here, we show that alternatively spliced exons and exon fragments (alternative exons) from minor isoforms experience lower selective pressure at the amino acid level, accompanied by selection against synonymous sequence variation. The results of the McDonald-Kreitman test suggest that alternatively spliced exons, unlike exons constitutively included in the mRNA, are also subject to positive selection, with up to 27% of amino acids fixed by positive selection.


Subject(s)
Alternative Splicing/genetics , Exons , Genes/genetics , Selection, Genetic , Amino Acid Sequence , Amino Acid Substitution , Codon , Databases, Factual , Expressed Sequence Tags , Humans , Molecular Sequence Data , Polymorphism, Single Nucleotide , Sequence Homology, Amino Acid
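
The McDonald-Kreitman comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used estimator alpha = 1 - (Ds * Pn) / (Dn * Ps), where Dn and Ds are nonsynonymous and synonymous fixed differences and Pn and Ps are nonsynonymous and synonymous polymorphisms. The function name `mk_alpha` and the counts below are hypothetical; they are not taken from the paper, which may have used a different variant of the estimator.

```python
def mk_alpha(dn, ds, pn, ps):
    """Estimate the fraction of nonsynonymous fixed differences
    attributed to positive selection (assumed estimator:
    alpha = 1 - (ds * pn) / (dn * ps))."""
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts for illustration only.
alpha = mk_alpha(dn=50, ds=100, pn=30, ps=100)
print(f"alpha = {alpha:.2f}")
```

A value near zero is what neutral evolution would predict under this estimator, while a positive value suggests an excess of nonsynonymous fixed differences.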
3.
Nucleic Acids Res ; 37(14): 4533-44, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19465384

ABSTRACT

Accurate and efficient recognition of splice sites during pre-mRNA splicing is essential for proper transcriptome expression. Splice site usage can be modulated by secondary structures, but it is unclear if this type of modulation is commonly used or occurs to a significant degree with secondary structures forming over long distances. Using phylogenetic comparisons of intronic sequences among 12 Drosophila genomes, we elucidated a group of 202 highly conserved pairs of sequences, each at least nine nucleotides long, capable of forming stable stem structures. This set was highly enriched in alternatively spliced introns and introns with weak acceptor sites and long introns, and most occurred over long distances (>150 nucleotides). Experimentally, we analyzed the splicing of several of these introns using mini-genes in Drosophila S2 cells. Wild-type splicing patterns were changed by mutations that opened the stem structure, and restored by compensatory mutations that re-established the base-pairing potential, demonstrating that these secondary structures were indeed implicated in the splice site choice. Mechanistically, the RNA structures masked splice sites, brought together distant splice sites and/or looped out introns. Thus, base-pairing interactions within introns, even those occurring over long distances, are more frequent modulators of alternative splicing than is currently assumed.


Subject(s)
Alternative Splicing , Drosophila melanogaster/genetics , Introns , RNA Precursors/chemistry , RNA, Messenger/chemistry , Animals , Base Pairing , Base Sequence , Conserved Sequence , Molecular Sequence Data , RNA Splice Sites
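
The idea of paired sequences "at least nine nucleotides long" that can form a stem can be sketched as a search for reverse-complementary boxes. This is only an illustration under strong assumptions: it considers Watson-Crick pairs only (no G-U wobble), ignores conservation across genomes and folding energy, and measures distance between box start positions. The names `find_complementary_boxes`, `k`, and `min_distance` are hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Watson-Crick complement table for RNA (assumption: no G-U wobble pairs).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def find_complementary_boxes(rna, k=9, min_distance=150):
    """Return (i, j) start positions where the k-mer at j is the reverse
    complement of the k-mer at i and j - i >= min_distance."""
    positions = defaultdict(list)
    for i in range(len(rna) - k + 1):
        positions[rna[i:i + k]].append(i)

    hits = []
    for i in range(len(rna) - k + 1):
        partner = reverse_complement(rna[i:i + k])
        for j in positions.get(partner, []):
            if j - i >= min_distance:
                hits.append((i, j))
    return hits


# Toy sequence: a 9-nt box, a short spacer, and the box's reverse complement.
toy = "GGCAUCGAC" + "A" * 20 + "GUCGAUGCC"
print(find_complementary_boxes(toy, k=9, min_distance=10))
```

Indexing all k-mers first keeps the search linear in sequence length for exact matches; a realistic screen would also need to tolerate mismatches and score stem stability.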
4.
RNA ; 14(4): 717-35, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18359782

ABSTRACT

T-box antitermination is one of the main mechanisms of regulation of genes involved in amino acid metabolism in Gram-positive bacteria. T-box regulatory sites consist of conserved sequence and RNA secondary structure elements. Using a set of known T-box sites, we constructed the common pattern and used it to scan available bacterial genomes. New T-boxes were found in various Gram-positive bacteria, some Gram-negative bacteria (delta-proteobacteria), and some other bacterial groups (Deinococcales/Thermales, Chloroflexi, Dictyoglomi). The majority of T-box-regulated genes encode aminoacyl-tRNA synthetases. Two other groups of T-box-regulated genes are amino acid biosynthetic genes and transporters, as well as genes with unknown function. Analysis of candidate T-box sites resulted in new functional annotations. We assigned the amino acid specificity to a large number of candidate amino acid transporters and a possible function to amino acid biosynthesis genes. We then studied the evolution of the T-boxes. Analysis of the constructed phylogenetic trees demonstrated that in addition to the normal evolution consistent with the evolution of regulated genes, T-boxes may be duplicated, transferred to other genes, and change specificity. We observed several cases of recent T-box regulon expansion following the loss of a previously existing regulatory system, in particular, arginine regulon in Clostridium difficile and methionine regulon in Lactobacillaceae. Finally, we described a new structural class of T-boxes containing duplicated terminator-antiterminator elements and unusual reduced T-boxes regulating initiation of translation in the Actinobacteria.


Subject(s)
Bacteria/genetics , Bacteria/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , T-Box Domain Proteins/genetics , T-Box Domain Proteins/metabolism , 5' Untranslated Regions , Amino Acid Transport Systems/genetics , Amino Acid Transport Systems/metabolism , Amino Acids/metabolism , Base Sequence , DNA, Bacterial/genetics , Evolution, Molecular , Gene Expression Regulation, Bacterial , Genome, Bacterial , Genomics , Models, Biological , Models, Molecular , Molecular Sequence Data , Nucleic Acid Conformation , Phylogeny , RNA, Bacterial/chemistry , RNA, Bacterial/genetics , RNA, Messenger/chemistry , RNA, Messenger/genetics , Regulon , Sequence Homology, Nucleic Acid
5.
Nat Biotechnol ; 23(1): 137-44, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15637633

ABSTRACT

The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark of data sets for assessing future tools.


Subject(s)
Computational Biology/methods , Gene Expression , Transcription, Genetic , Amino Acid Motifs , Animals , Binding Sites , Databases, Protein , Drosophila , Fungal Proteins/chemistry , Humans , Internet , Mice , Reproducibility of Results , Software
6.
J Bioinform Comput Biol ; 4(2): 589-96, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16819804

ABSTRACT

The RNAKinetics server (http://www.ig-msk.ru/RNA/kinetics) is a web interface for the newly developed RNAKinetics software. The software models the dynamics of RNA secondary structure by means of kinetic analysis of folding transitions of a growing RNA molecule. The result of the modeling is a kinetic ensemble, i.e. a collection of RNA structures that are endowed with probabilities, which depend on time. This approach gives a comprehensive probabilistic description of RNA folding pathways, revealing important kinetic details that are not captured by the traditional structure prediction methods. Access to the RNAKinetics server is free.


Subject(s)
Models, Chemical , Models, Molecular , RNA/chemistry , Sequence Analysis, RNA/methods , Software , User-Computer Interface , Base Sequence , Computer Graphics , Computer Simulation , Kinetics , Molecular Sequence Data , Motion , Nucleic Acid Conformation
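
The notion of a kinetic ensemble, in which each structure carries a time-dependent probability, can be illustrated with a toy two-structure system. This sketch is not the RNAKinetics algorithm: it assumes a fixed-length molecule, only two structures A and B, first-order interconversion A <-> B, and all molecules starting in A. The function name and rate constants are hypothetical.

```python
import math


def two_state_probabilities(k_forward, k_reverse, t):
    """Return (p_A, p_B) at time t for A <-> B with first-order rates,
    starting from p_A = 1 (closed-form solution of the master equation)."""
    total = k_forward + k_reverse
    if total <= 0:
        raise ValueError("at least one rate constant must be positive")
    p_b = (k_forward / total) * (1.0 - math.exp(-total * t))
    return 1.0 - p_b, p_b


# Hypothetical rate constants in arbitrary time units.
for t in (0.0, 0.5, 5.0):
    p_a, p_b = two_state_probabilities(k_forward=2.0, k_reverse=1.0, t=t)
    print(f"t={t}: p_A={p_a:.3f}, p_B={p_b:.3f}")
```

At long times the probabilities approach the equilibrium ratio set by the two rates, while at short times the distribution still reflects the starting structure, which is the kind of detail a purely thermodynamic prediction omits.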
7.
PLoS One ; 8(1): e54835, 2013.
Article in English | MEDLINE | ID: mdl-23382983

ABSTRACT

Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3-14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing.


Subject(s)
Algorithms , Computational Biology/methods , Sequence Analysis, DNA , Genotype , HIV-1/genetics , Hepatitis Viruses/classification , Hepatitis Viruses/genetics , Humans , INDEL Mutation , Mycobacterium tuberculosis/classification , Mycobacterium tuberculosis/genetics , Phylogeny , RNA, Ribosomal, 16S
8.
Genome Res ; 12(10): 1507-16, 2002 Oct.
Article in English | MEDLINE | ID: mdl-12368242

ABSTRACT

Biotin is a necessary cofactor of numerous biotin-dependent carboxylases in a variety of microorganisms. The strict control of biotin biosynthesis in Escherichia coli is mediated by the bifunctional BirA protein, which acts both as a biotin-protein ligase and as a transcriptional repressor of the biotin operon. Little is known about regulation of biotin biosynthesis in other bacteria. Using comparative genomics and phylogenetic analysis, we describe the biotin biosynthetic pathway and the BirA regulon in most available bacterial genomes. Existence of an N-terminal DNA-binding domain in BirA strictly correlates with the presence of putative BirA-binding sites upstream of biotin operons. The predicted BirA-binding sites are well conserved among various eubacterial and archaeal genomes. The possible role of the hypothetical genes bioY and yhfS-yhfT, newly identified members of the BirA regulon, in the biotin metabolism is discussed. Based on analysis of co-occurrence of the biotin biosynthetic genes and bioY in complete genomes, we predict involvement of the transmembrane protein BioY in biotin transport. Various nonorthologous substitutes of the bioC-coupled gene bioH from E. coli, observed in several genomes, possibly represent the existence of different pathways for pimeloyl-CoA biosynthesis. Another interesting result of analysis of operon structures and BirA sites is that some biotin-dependent carboxylases from Rhodobacter capsulatus, actinomycetes, and archaea are possibly coregulated with BirA. BirA is the first example of a transcriptional regulator with a conserved binding signal in eubacteria and archaea.


Subject(s)
Archaeal Proteins/genetics , Biotin/genetics , Carbon-Nitrogen Ligases/genetics , Conserved Sequence/physiology , Escherichia coli Proteins/genetics , Regulon/genetics , Repressor Proteins/genetics , Signal Transduction/genetics , Transcription Factors/genetics , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Computational Biology/methods , Computational Biology/statistics & numerical data , Conserved Sequence/genetics , Gene Order/genetics , Genes, Archaeal/genetics , Genes, Bacterial/genetics , Likelihood Functions
9.
Hum Mol Genet ; 12(11): 1313-20, 2003 Jun 01.
Article in English | MEDLINE | ID: mdl-12761046

ABSTRACT

Alternative splicing has recently emerged as a major mechanism of generating protein diversity in higher eukaryotes. We compared alternative splicing isoforms of 166 pairs of orthologous human and mouse genes. As the mRNA and EST libraries of human and mouse are not complete and thus cannot be compared directly, we instead analyzed whether known cassette exons or alternative splicing sites from one genome are conserved in the other genome. We demonstrate that about half of the analyzed genes have species-specific isoforms, and about a quarter of elementary alternatives are not conserved between the human and mouse genomes. The detailed results of this study are available at www.ig-msk.ru:8005/HMG_paper.


Subject(s)
Alternative Splicing , Conserved Sequence , Genome, Human , Animals , Base Sequence , DNA-Binding Proteins/genetics , Exons , Expressed Sequence Tags , Humans , Membrane Proteins/genetics , Mice , Nerve Tissue Proteins/genetics , Proto-Oncogene Proteins/genetics , RNA Splicing Factors , RNA, Messenger/genetics , RNA-Binding Proteins/genetics , Sodium-Potassium-Exchanging ATPase/genetics , Transcription Factors/genetics , AIRE Protein
10.
J Bacteriol ; 186(19): 6575-85, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15375139

ABSTRACT

We describe a simple theoretical framework for identifying orthologous sets of genes that deviate from a clock-like model of evolution. The approach used is based on comparing the evolutionary distances within a set of orthologs to a standard intergenomic distance, which was defined as the median of the distribution of the distances between all one-to-one orthologs. Under the clock-like model, the points on a plot of intergenic distances versus intergenomic distances are expected to fit a straight line. A statistical technique to identify significant deviations from the clock-like behavior is described. For several hundred analyzed orthologous sets representing three well-defined bacterial lineages, the alpha-Proteobacteria, the gamma-Proteobacteria, and the Bacillus-Clostridium group, the clock-like null hypothesis could not be rejected for approximately 70% of the sets, whereas the rest showed substantial anomalies. Subsequent detailed phylogenetic analysis of the genes with the strongest deviations indicated that over one-half of these genes probably underwent a distinct form of horizontal gene transfer, xenologous gene displacement, in which a gene is displaced by an ortholog from a different lineage. The remaining deviations from the clock-like model could be explained by lineage-specific acceleration of evolution. The results indicate that although xenologous gene displacement is a major force in bacterial evolution, a significant majority of orthologous gene sets in three major bacterial lineages evolved in accordance with the clock-like model. The approach described here allows rapid detection of deviations from this mode of evolution on the genome scale.


Subject(s)
Evolution, Molecular , Gene Transfer, Horizontal , Genome, Bacterial , Models, Genetic , Phylogeny
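
The comparison of intergenic and intergenomic distances described in this abstract can be sketched as follows. The intergenomic distance for a genome pair is taken as the median over all one-to-one ortholog distances, as the abstract states. Everything else here is an assumption for illustration: the line is fitted through the origin by ordinary least squares, residuals stand in for the paper's unspecified statistical technique, and the function names and distance values are hypothetical.

```python
from statistics import median


def intergenomic_distances(ortholog_distances):
    """Map each genome pair to the median distance over its one-to-one orthologs."""
    return {pair: median(values) for pair, values in ortholog_distances.items()}


def clock_residuals(gene_distances, genomic):
    """Fit gene distance = slope * intergenomic distance (line through the
    origin, an assumption) and return the slope and per-pair residuals."""
    pairs = sorted(gene_distances)
    x = [genomic[p] for p in pairs]
    y = [gene_distances[p] for p in pairs]
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    residuals = {p: yi - slope * xi for p, xi, yi in zip(pairs, x, y)}
    return slope, residuals


# Hypothetical distances for three genomes A, B, C.
all_orthologs = {
    ("A", "B"): [0.1, 0.2, 0.3],
    ("A", "C"): [0.3, 0.4, 0.5],
    ("B", "C"): [0.2, 0.3, 0.4],
}
genomic = intergenomic_distances(all_orthologs)

# One ortholog set that evolves twice as fast but still clock-like.
one_gene = {("A", "B"): 0.4, ("A", "C"): 0.8, ("B", "C"): 0.6}
slope, residuals = clock_residuals(one_gene, genomic)
print(slope, residuals)
```

Large residuals for particular genome pairs would flag an ortholog set as a candidate for closer phylogenetic inspection; small residuals with a slope different from one indicate a gene that is faster or slower overall but still consistent with a clock.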