Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies.
Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Bertozzi, Terry; Adelson, David L.
Affiliation
  • Zeng L; School of Biological Sciences, The University of Adelaide, Adelaide, SA 5005, Australia.
  • Kortschak RD; School of Biological Sciences, The University of Adelaide, Adelaide, SA 5005, Australia.
  • Raison JM; School of Biological Sciences, The University of Adelaide, Adelaide, SA 5005, Australia.
  • Bertozzi T; School of Biological Sciences, The University of Adelaide, Adelaide, SA 5005, Australia.
  • Adelson DL; Evolutionary Biology Unit, South Australian Museum, Adelaide, SA 5005, Australia.
PLoS One ; 13(3): e0193588, 2018.
Article in En | MEDLINE | ID: mdl-29538441
ABSTRACT
Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning their consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics; 2) consensus sequences represent low-divergence, recently/currently active TE families; 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package.
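The single linkage clustering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the CARP or krishna implementation; it only shows the general idea that any two sequences connected by a chain of pairwise alignment hits end up in the same family. The function name `single_linkage_families`, the interval labels, and the example hits are all hypothetical.

```python
from collections import defaultdict


def single_linkage_families(pairs):
    """Group sequence IDs into families from pairwise hits.

    Two IDs share a family if they are linked by any chain of hits
    (single linkage). Implemented with a simple union-find structure.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)

    families = defaultdict(set)
    for seq_id in list(parent):
        families[find(seq_id)].add(seq_id)

    # Largest families first; members sorted for stable output.
    return sorted((sorted(m) for m in families.values()),
                  key=lambda m: (-len(m), m))


# Hypothetical pairwise alignment hits between genomic intervals.
hits = [
    ("chr1:100-400", "chr2:50-350"),
    ("chr2:50-350", "chr5:900-1200"),
    ("chr3:10-210", "chr4:700-900"),
]
families = single_linkage_families(hits)
for members in families:
    print(len(members), members)
```

Because the family members here are genomic intervals rather than only a consensus, a sketch like this also reflects the first advantage listed in the abstract: the original aligned intervals remain available for downstream analysis.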
Subjects

Full text: 1 Database: MEDLINE Main subject: DNA Transposable Elements / Genome Type of study: Diagnostic_studies / Prognostic_studies Limits: Animals / Humans Language: En Journal: PLoS One Journal subject: SCIENCE / MEDICINE Publication year: 2018 Document type: Article Affiliation country: Australia
