Removal of redundant contigs from de novo RNA-Seq assemblies via homology search improves accurate detection of differentially expressed genes.
Ono, Hanako; Ishii, Kazuo; Kozaki, Toshinori; Ogiwara, Isao; Kanekatsu, Motoki; Yamada, Tetsuya.
Affiliation
  • Ono H; United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, Fuchu, Tokyo, 183-8509, Japan. 50013951002@st.tuat.ac.jp.
  • Ishii K; Department of Applied Biological Science, Faculty of Agriculture, Tokyo University of Agriculture and Technology, Fuchu, Tokyo, 183-8509, Japan. kishii@cc.tuat.ac.jp.
  • Kozaki T; Department of Applied Biological Science, Faculty of Agriculture, Tokyo University of Agriculture and Technology, Fuchu, Tokyo, 183-8509, Japan. kozakit@cc.tuat.ac.jp.
  • Ogiwara I; United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, Fuchu, Tokyo, 183-8509, Japan. ogiwara@cc.tuat.ac.jp.
  • Kanekatsu M; United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, Fuchu, Tokyo, 183-8509, Japan. kanekatu@cc.tuat.ac.jp.
  • Yamada T; United Graduate School of Agricultural Science, Tokyo University of Agriculture and Technology, Fuchu, Tokyo, 183-8509, Japan. teyamada@cc.tuat.ac.jp.
BMC Genomics ; 16: 1031, 2015 Dec 04.
Article in English | MEDLINE | ID: mdl-26637306
BACKGROUND: For plant species with unsequenced genomes, cDNA contigs created by de novo assembly of RNA-Seq reads are used as reference sequences for comparative analysis of RNA-Seq datasets and the detection of differentially expressed genes (DEGs). Redundancies in such contigs are evident in previous RNA-Seq studies, and such redundancies can lead to difficulties in subsequent analysis. Nevertheless, the effects of removing redundancy from contig assemblies on comparative RNA-Seq analysis have not been evaluated.
RESULTS: Here we describe a method for removing redundancy from raw contigs that were primarily created by de novo assembly of Arabidopsis thaliana RNA-Seq reads. Specifically, the contigs with the highest bit scores were selected from raw contigs by a homology search against the gene dataset in the TAIR10 database. Two existing methods for removal of redundancy, based on contig length or clustering analysis, were also used to eliminate redundancies from raw contigs. Contig number was reduced most effectively with the method based on homology search. In a comparative analysis of RNA-Seq datasets, DEGs detected in contigs that underwent redundancy removal via the homology search method showed the highest identity to the DEGs detected when the TAIR10 gene dataset was used as an exact reference. Redundancy in raw contigs could also be removed by a homology search against integrated protein datasets from several plant species other than A. thaliana. DEGs detected using contigs that underwent such redundancy removal also showed high homology to DEGs detected using the TAIR10 gene dataset.
CONCLUSION: Here we describe a method for removing redundant contigs within raw contigs; this method involves a homology search against a gene or protein database. In principle, this method can be used for plants with unsequenced genomes that lack a well-developed gene database. Redundant contigs were not removed adequately via either of two existing methods, but our method allowed for removal of all redundant contigs. To our knowledge, this is the first reported improvement in accurate detection of DEGs via comparative RNA-Seq analysis that involved preparation of a non-redundant reference sequence. This method could be used to rapidly and cost-effectively detect useful genes in unsequenced plants.
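The selection step described in RESULTS (keeping the contigs with the highest bit scores from a homology search) can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the following is an assumption: each contig is first assigned to its best-scoring reference gene, and then one contig is kept per gene. The input is assumed to be BLAST tabular output with the standard 12 columns (query ID first, subject ID second, bit score last). The function name, contig IDs, and score values are hypothetical and made up for illustration only; they are not data from the study.

```python
import csv
import io


def select_best_contigs(blast_tsv):
    """Keep one contig per reference gene, chosen by highest bit score.

    Assumes 12-column BLAST tabular rows: column 1 is the contig (query),
    column 2 is the reference gene (subject), column 12 is the bit score.
    """
    # Step 1 (assumed): assign each contig to its best-scoring gene.
    best_hit = {}  # contig_id -> (gene_id, bit_score)
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        contig_id, gene_id, bit_score = row[0], row[1], float(row[11])
        if contig_id not in best_hit or bit_score > best_hit[contig_id][1]:
            best_hit[contig_id] = (gene_id, bit_score)

    # Step 2 (assumed): for each gene, keep only the top-scoring contig.
    kept = {}  # gene_id -> (contig_id, bit_score)
    for contig_id, (gene_id, bit_score) in best_hit.items():
        if gene_id not in kept or bit_score > kept[gene_id][1]:
            kept[gene_id] = (contig_id, bit_score)
    return kept


# Illustrative, made-up hits: two contigs match the same gene, and one
# contig also has a weaker secondary hit to another gene.
rows = [
    ["contig_1", "AT1G01010.1", "98.0", "500", "10", "0", "1", "500", "1", "500", "1e-100", "350"],
    ["contig_2", "AT1G01010.1", "99.0", "600", "6", "0", "1", "600", "1", "600", "1e-120", "410"],
    ["contig_2", "AT1G01020.1", "80.0", "200", "40", "0", "1", "200", "1", "200", "1e-20", "90"],
    ["contig_3", "AT1G01020.1", "97.0", "400", "12", "0", "1", "400", "1", "400", "1e-70", "250"],
]
example = io.StringIO("\n".join("\t".join(r) for r in rows) + "\n")

kept = select_best_contigs(example)
for gene_id in sorted(kept):
    print(gene_id, kept[gene_id][0], kept[gene_id][1])
```

In this toy input, contig_1 and contig_2 both match the same gene, so only the higher-scoring contig_2 is retained as its representative; contigs that are redundant with respect to the reference are dropped from the non-redundant set.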
Subjects

Full text: 1 Database: MEDLINE Main subject: Gene Expression / Sequence Analysis, RNA / Arabidopsis / Computational Biology Type of study: Diagnostic_studies Language: En Publication year: 2015 Document type: Article
