A pipeline of programs for collecting and analyzing group II intron retroelement sequences from GenBank.
Abebe, Michael; Candales, Manuel A; Duong, Adrian; Hood, Keyar S; Li, Tony; Neufeld, Ryan A E; Shakenov, Abat; Sun, Runda; Wu, Li; Jarding, Ashley M; Semper, Cameron; Zimmerly, Steven.
Affiliation
  • Abebe M; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Candales MA; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Duong A; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Hood KS; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Li T; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Neufeld RAE; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Shakenov A; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Sun R; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Wu L; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Jarding AM; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Semper C; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
  • Zimmerly S; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada.
Mob DNA; 4(1): 28, 2013 Dec 20.
Article in En | MEDLINE | ID: mdl-24359548
ABSTRACT

BACKGROUND:

Accurate and complete identification of mobile elements is a challenging task in the current era of sequencing, given their large numbers and frequent truncations. Group II intron retroelements, which consist of a ribozyme and an intron-encoded protein (IEP), are usually identified in bacterial genomes through their IEP; however, the RNA component that defines the intron boundaries is often difficult to identify because of a lack of strong sequence conservation corresponding to the RNA structure. Compounding the problem of boundary definition is the fact that a majority of group II intron copies in bacteria are truncated.

RESULTS:

Here we present a pipeline of 11 programs that collect and analyze group II intron sequences from GenBank. The pipeline begins with a BLAST search of GenBank using a set of representative group II IEPs as queries. Subsequent steps download the corresponding genomic sequences and flanks, filter out non-group II introns, assign introns to phylogenetic subclasses, filter out incomplete and/or non-functional introns, and assign IEP sequences and RNA boundaries to the full-length introns. In the final step, the redundancy in the data set is reduced by grouping introns into sets of ≥95% identity, with one example sequence chosen to be the representative.
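The final redundancy-reduction step can be illustrated with a minimal sketch. The abstract does not say how identity is computed or how the representative is chosen, so everything below is an assumption made for illustration: sequences are taken to be pre-aligned to equal length, identity is the fraction of matching columns, clustering is greedy, and the first sequence encountered in each set serves as its representative. The function names `percent_identity` and `cluster_by_identity` are hypothetical and are not taken from the pipeline.

```python
from typing import Dict, List


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical, non-gap columns between two pre-aligned sequences.

    Assumption: both sequences are already aligned to the same length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def cluster_by_identity(
    seqs: Dict[str, str], threshold: float = 0.95
) -> Dict[str, List[str]]:
    """Greedy clustering keyed by representative name.

    Each sequence joins the first existing representative it matches at or
    above the threshold; otherwise it starts a new set and becomes that
    set's representative (an illustrative choice, not the paper's rule).
    """
    clusters: Dict[str, List[str]] = {}
    for name, seq in seqs.items():
        for rep in clusters:
            if percent_identity(seqs[rep], seq) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was close enough: start a new set.
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    demo = {
        "intron_a": "ACGTACGTACGTACGTACGT",
        "intron_b": "ACGTACGTACGTACGTACGA",  # 1 mismatch in 20 columns
        "intron_c": "TTGTACGTACGTACGTACGT",  # 2 mismatches in 20 columns
    }
    for rep, members in cluster_by_identity(demo).items():
        print(rep, members)
```

Greedy clustering depends on input order, so a real implementation would probably sort the sequences first (for example, by length) to make the choice of representative reproducible.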

CONCLUSIONS:

These programs should be useful for comprehensive identification of group II introns in sequence databases as data continue to rapidly accumulate.

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Mob DNA Publication year: 2013 Document type: Article