Search | BVS - Ministry of Health

Characteristics of transposable element exonization within human and mouse.

Sela, Noa; Mersch, Britta; Hotz-Wagenblatt, Agnes; Ast, Gil.

PLoS One ; 5(6): e10907, 2010 Jun 01.

Article in English | MEDLINE | ID: mdl-20532223

ABSTRACT

Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidating the evolutionary constraints that have shaped the fixation of transposed elements within human and mouse protein-coding genes, and their subsequent exonization, is important for understanding how the exonization process has affected transcriptome and proteome complexity. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes.

Subjects

DNA Transposable Elements, Animals, Exons, Humans, Introns, Mice, Single Nucleotide Polymorphism, RNA Editing

Automatic detection of exonic splicing enhancers (ESEs) using SVMs.

Mersch, Britta; Gepperth, Alexander; Suhai, Sándor; Hotz-Wagenblatt, Agnes.

BMC Bioinformatics ; 9: 369, 2008 Sep 10.

Article in English | MEDLINE | ID: mdl-18783607

ABSTRACT

BACKGROUND: Exonic splicing enhancers (ESEs) activate nearby splice sites and promote the inclusion (vs. exclusion) of the exons in which they reside, and they serve as binding sites for SR proteins. To study the impact of ESEs on alternative splicing, it would be useful to be able to detect them in exons. Identifying SR protein-binding sites in human DNA sequences by machine learning techniques is a formidable task, since the exon sequences are also constrained by their functional role in coding for proteins. RESULTS: Choosing training examples for machine learning approaches is difficult, since only a few exact locations of human ESEs that could serve as positive examples are described in the literature. Additionally, it is unclear which sequences are suitable as negative examples. Therefore, we developed a motif-oriented data-extraction method that extracts exon sequences around experimentally or theoretically determined ESE patterns. Positive examples are restricted by heuristics based on known properties of ESEs, e.g. location in the vicinity of a splice site, whereas negative examples are taken in the same way from the middle of long exons. We show that a suitably chosen SVM using optimized sequence kernels (e.g., the combined oligo kernel) can extract meaningful properties from these training examples. Once the classifier is trained, every potential ESE sequence can be passed to the SVM for verification. Using SVMs with the combined oligo kernel yields a high accuracy of about 90 percent and readily interpretable parameters. CONCLUSION: The motif-oriented data-extraction method seems to produce consistent training and test data, leading to good classification rates, and thus allows verification of potential ESE motifs. The best results were obtained using an SVM with the combined oligo kernel, while oligo kernels with oligomers of a certain length could be used to extract relevant features.
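
The motif-oriented extraction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `extract_examples`, the window size `flank`, the splice-site distance `edge`, the long-exon threshold `min_long`, and the "middle half" rule are all hypothetical values chosen for illustration, since the abstract gives no concrete numbers.

```python
import re


def extract_examples(exon, motif, flank=10, edge=30, min_long=200):
    """Return (positives, negatives): sequence windows around motif hits.

    Hypothetical heuristics (assumptions, not from the paper):
      - a hit within `edge` nt of either exon end counts as "near a splice site"
      - a hit in the middle half of an exon of at least `min_long` nt is a negative
    """
    positives, negatives = [], []
    n = len(exon)
    # A lookahead pattern lets overlapping motif occurrences be found as well.
    for match in re.finditer(f"(?={re.escape(motif)})", exon):
        start = match.start()
        end = start + len(motif)
        lo, hi = start - flank, end + flank
        if lo < 0 or hi > n:
            continue  # skip hits whose window would run past the exon
        window = exon[lo:hi]
        near_splice_site = start < edge or end > n - edge
        in_middle = n >= min_long and start >= n // 4 and end <= (3 * n) // 4
        if near_splice_site:
            positives.append(window)
        elif in_middle:
            negatives.append(window)
    return positives, negatives


# Toy exons built around the hexamer GAAGAA, used here only as a placeholder motif.
short_exon = "A" * 12 + "GAAGAA" + "C" * 12
long_exon = "C" * 120 + "GAAGAA" + "C" * 120
print(extract_examples(short_exon, "GAAGAA"))
print(extract_examples(long_exon, "GAAGAA"))
```

The resulting windows would then be fed to a sequence-kernel SVM; that step is omitted here because the abstract does not specify the kernel parameters.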

Subjects

Algorithms, Artificial Intelligence, Exons/genetics, Automated Pattern Recognition/methods, RNA Splice Sites/genetics, RNA Splicing/genetics, DNA Sequence Analysis/methods, Base Sequence, Molecular Sequence Data

Evolutionary optimization of sequence kernels for detection of bacterial gene starts.

Mersch, Britta; Glasmachers, Tobias; Meinicke, Peter; Igel, Christian.

Int J Neural Syst ; 17(5): 369-81, 2007 Oct.

Article in English | MEDLINE | ID: mdl-18098369

ABSTRACT

Oligo kernels for biological sequence classification have a high discriminative power. A new parameterization for the K-mer oligo kernel is presented, where all oligomers of length K are weighted individually. The task-specific choice of these parameters increases the classification performance and reveals information about discriminative features. For adapting the multiple kernel parameters based on cross-validation, the covariance matrix adaptation evolution strategy is proposed. It is applied to optimize the trimer oligo kernels for the detection of bacterial gene starts. The resulting kernels lead to higher classification rates, and the adapted parameters reveal the importance of particular triplets for classification, for example those occurring in the Shine-Dalgarno sequence.
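
As a rough illustration of a per-oligomer parameterization, the sketch below computes a position-aware K-mer kernel in which each shared oligomer contributes Gaussian-smoothed position matches scaled by its own weight. The abstract does not give the exact form, so this is an assumption: the function names, the single shared smoothing width `sigma`, and the `weights` dictionary are hypothetical, and the evolutionary tuning of the parameters is not shown.

```python
import math
from collections import defaultdict


def kmer_positions(seq, k):
    """Map each K-mer in seq to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def weighted_oligo_kernel(s, t, k=3, sigma=1.0, weights=None):
    """Sum over shared K-mers of w * exp(-(p - q)^2 / (4 * sigma^2)).

    `weights` maps an oligomer to a non-negative weight (default 1.0).
    This weighting scheme is an illustrative assumption.
    """
    weights = weights or {}
    pos_s, pos_t = kmer_positions(s, k), kmer_positions(t, k)
    total = 0.0
    for oligo, starts_s in pos_s.items():
        if oligo not in pos_t:
            continue  # oligomers absent from t contribute nothing
        w = weights.get(oligo, 1.0)
        for p in starts_s:
            for q in pos_t[oligo]:
                total += w * math.exp(-((p - q) ** 2) / (4.0 * sigma ** 2))
    return total


# Up-weighting a triplet (here AGG, purely as an example) increases its share.
print(weighted_oligo_kernel("AGGAGG", "AGGAGG"))
print(weighted_oligo_kernel("AGGAGG", "AGGAGG", weights={"AGG": 2.0}))
```

With weights like these exposed as parameters, a search procedure can raise the influence of triplets that help classification and lower the rest, which is what makes the adapted values interpretable.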

Subjects

Base Sequence, Biological Evolution, Bacterial Genes/genetics, DNA Sequence Analysis/methods, Algorithms, Molecular Sequence Data

SERpredict: detection of tissue- or tumor-specific isoforms generated through exonization of transposable elements.

Mersch, Britta; Sela, Noa; Ast, Gil; Suhai, Sándor; Hotz-Wagenblatt, Agnes.

BMC Genet ; 8: 78, 2007 Nov 06.

Article in English | MEDLINE | ID: mdl-17986331

ABSTRACT

BACKGROUND: Transposed elements (TEs) are known to affect transcriptomes, because either new exons are generated from intronic transposed elements (this is called exonization), or the element inserts into the exon, leading to a new transcript. Several examples in the literature show that isoforms generated by an exonization are specific to a certain tissue (for example the heart muscle) or cause a disease. Thus, exonizations can have negative effects on the transcriptome of an organism. RESULTS: As we aimed at detecting other tissue- or tumor-specific isoforms in human and mouse genomes that were generated through exonization of a transposed element, we designed the automated analysis pipeline SERpredict (SER = Specific Exonized Retroelement), which makes use of Bayesian statistics. With this pipeline, we found several genes in which a transposed element formed a tissue- or tumor-specific isoform. CONCLUSION: Our results show that SERpredict produces relevant results, demonstrating the importance of transposed elements in shaping both the human and the mouse transcriptomes. The effect of transposed elements on the human transcriptome is several times higher than the effect on the mouse transcriptome, due to the contribution of the primate-specific Alu elements.

Subjects

DNA Transposable Elements, Genetic Databases, Exons, Neoplasm Genes, Protein Isoforms/genetics, Retroelements, Alternative Splicing, Alu Elements, Animals, Expressed Sequence Tags, Gene Library, Humans, Mice, Organ Specificity

Comparative analysis of transposed element insertion within human and mouse genomes reveals Alu's unique role in shaping the human transcriptome.

Sela, Noa; Mersch, Britta; Gal-Mark, Nurit; Lev-Maor, Galit; Hotz-Wagenblatt, Agnes; Ast, Gil.

Genome Biol ; 8(6): R127, 2007.

Article in English | MEDLINE | ID: mdl-17594509

ABSTRACT

BACKGROUND: Transposed elements (TEs) have a substantial impact on mammalian evolution and are involved in numerous genetic diseases. We compared the impact of TEs on the human transcriptome and the mouse transcriptome. RESULTS: We compiled a dataset of all TEs in the human and mouse genomes, identifying 3,932,058 and 3,122,416 TEs, respectively. We then extracted TEs located within human and mouse genes and, surprisingly, we found that 60% of TEs in both human and mouse are located in intronic sequences, even though introns comprise only 24% of the human genome. All TE families in both human and mouse can exonize. TE families that are shared between human and mouse exhibit the same percentage of TE exonization in the two species, but the exonization level of Alu, a primate-specific retroelement, is significantly greater than that of other TEs within the human genome, leading to a higher level of TE exonization in human than in mouse (1,824 exons compared with 506 exons, respectively). We detected a primate-specific mechanism for intron gain, in which Alu insertion into an exon creates a new intron located in the 3' untranslated region (termed 'intronization'). Finally, the insertion of TEs into the first and last exons of a gene is more frequent in human than in mouse, leading to longer exons in human. CONCLUSION: Our findings reveal many effects of TEs on these two transcriptomes. These effects are substantially greater in human than in mouse, which is due to the presence of Alu elements in human.
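
The figures quoted in this abstract can be put side by side with a few lines of Python. The input numbers come from the abstract; the derived quantities (the exon ratio, the intronic fold enrichment, and the exons-per-TE rates) are back-of-the-envelope illustrations with hypothetical variable names, not statistics reported by the authors.

```python
# Counts and percentages as stated in the abstract.
human_tes, mouse_tes = 3_932_058, 3_122_416
human_te_exons, mouse_te_exons = 1_824, 506
intronic_te_fraction = 0.60          # share of TEs found in introns
human_intron_genome_fraction = 0.24  # share of the human genome that is intronic

# Derived values (illustrative only).
exon_ratio = human_te_exons / mouse_te_exons
intron_enrichment = intronic_te_fraction / human_intron_genome_fraction
human_rate = human_te_exons / human_tes  # exonized exons per annotated TE
mouse_rate = mouse_te_exons / mouse_tes

print(f"human/mouse TE-derived exon ratio: {exon_ratio:.2f}")
print(f"intronic TE fold enrichment (human): {intron_enrichment:.2f}")
print(f"exons per TE, human vs. mouse: {human_rate:.2e} vs. {mouse_rate:.2e}")
```

On these numbers, human has roughly 3.6 times as many TE-derived exons as mouse, and TEs sit in introns about 2.5 times more often than the intronic share of the human genome alone would suggest.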

Subjects

Alu Elements, DNA Transposable Elements, Gene Expression Profiling, Alternative Splicing, Animals, Base Sequence, Exons, Humans, Introns, Mice, Molecular Sequence Data, Sequence Alignment

Gradient-based optimization of kernel-target alignment for sequence kernels applied to bacterial gene start detection.

Igel, Christian; Glasmachers, Tobias; Mersch, Britta; Pfeifer, Nico; Meinicke, Peter.

IEEE/ACM Trans Comput Biol Bioinform ; 4(2): 216-26, 2007.

Article in English | MEDLINE | ID: mdl-17473315

ABSTRACT

Biological data mining using kernel methods can be improved by a task-specific choice of the kernel function. Oligo kernels for genomic sequence analysis have proven to have a high discriminative power and to provide interpretable results. Oligo kernels that consider subsequences of different lengths can be combined and parameterized to increase their flexibility. For adapting these parameters efficiently, gradient-based optimization of the kernel-target alignment is proposed. The power of this new, general model selection procedure and the benefits of fitting kernels to problem classes are demonstrated by adapting oligo kernels for bacterial gene start detection.
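
The quantity being optimized here, kernel-target alignment, is commonly defined as the normalized Frobenius inner product between the kernel matrix K and the label matrix y yᵀ. The sketch below computes that standard definition in plain Python; it covers only the evaluation of the alignment, not the gradient with respect to kernel parameters, and the function name is a hypothetical choice.

```python
import math


def kernel_target_alignment(K, y):
    """Alignment A = <K, y y^T>_F / (||K||_F * ||y y^T||_F).

    K is a square kernel matrix (list of lists); y holds labels in {-1, +1}.
    """
    n = len(y)
    inner = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    yy_norm = sum(v * v for v in y)  # ||y y^T||_F equals ||y||^2
    return inner / (k_norm * yy_norm)


labels = [1, 1, -1, -1]
ideal = [[a * b for b in labels] for a in labels]  # K = y y^T
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(kernel_target_alignment(ideal, labels))     # perfectly aligned kernel
print(kernel_target_alignment(identity, labels))  # uninformative kernel
```

Because the alignment is a smooth function of the kernel matrix entries, it can be differentiated with respect to kernel parameters, which is what makes a gradient-based model selection procedure possible.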

Subjects

Algorithms, Artificial Intelligence, Initiator Codon/genetics, Bacterial DNA/genetics, Automated Pattern Recognition/methods, Sequence Alignment/methods, DNA Sequence Analysis/methods, Base Sequence, Molecular Sequence Data, Transcription Initiation Site
