FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome.
Wucher, Valentin; Legeai, Fabrice; Hédan, Benoît; Rizk, Guillaume; Lagoutte, Lætitia; Leeb, Tosso; Jagannathan, Vidhya; Cadieu, Edouard; David, Audrey; Lohi, Hannes; Cirera, Susanna; Fredholm, Merete; Botherel, Nadine; Leegwater, Peter A J; Le Béguec, Céline; Fieten, Hille; Johnson, Jeremy; Alföldi, Jessica; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Derrien, Thomas.
Affiliation
  • Wucher V; Institut Génétique et Développement de Rennes, CNRS, UMR6290, University Rennes1, Rennes, Cedex 35043, France.
  • Legeai F; IGEPP, BIPAA, INRA, Campus Beaulieu, Le Rheu 35653, France.
  • Hédan B; Institut National de Recherche en Informatique et en Automatique, Institut de Recherche en Informatique et Systèmes Aléatoires, Genscale, Campus Beaulieu, Rennes 35042, France.
  • Rizk G; Institut Génétique et Développement de Rennes, CNRS, UMR6290, University Rennes1, Rennes, Cedex 35043, France.
  • Lagoutte L; Institut National de Recherche en Informatique et en Automatique, Institut de Recherche en Informatique et Systèmes Aléatoires, Genscale, Campus Beaulieu, Rennes 35042, France.
  • Leeb T; Institut Génétique et Développement de Rennes, CNRS, UMR6290, University Rennes1, Rennes, Cedex 35043, France.
  • Jagannathan V; Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern 3001, Switzerland.
  • Cadieu E; Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern 3001, Switzerland.
  • David A; Institut Génétique et Développement de Rennes, CNRS, UMR6290, University Rennes1, Rennes, Cedex 35043, France.
  • Lohi H; IGEPP, BIPAA, INRA, Campus Beaulieu, Le Rheu 35653, France.
  • Cirera S; Department of Veterinary Biosciences and Research Programs Unit, Molecular Neurology, University of Helsinki, PO Box 63, Helsinki 00014, Finland.
  • Fredholm M; The Folkhälsan Institute of Genetics, Helsinki 00014, Finland.
  • Botherel N; Department of Veterinary Clinical and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 1870, Denmark.
  • Leegwater PAJ; Department of Veterinary Clinical and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 1870, Denmark.
  • Le Béguec C; Institut Génétique et Développement de Rennes, CNRS, UMR6290, University Rennes1, Rennes, Cedex 35043, France.
  • Fieten H; Department of Clinical Sciences of Companion Animals, Faculty of Veterinary Medicine, Utrecht University, Utrecht 3584CM, the Netherlands.
  • Johnson J; Institut Génétique et Développement de Rennes, CNRS, UMR6290, University Rennes1, Rennes, Cedex 35043, France.
  • Alföldi J; Department of Clinical Sciences of Companion Animals, Faculty of Veterinary Medicine, Utrecht University, Utrecht 3584CM, the Netherlands.
  • André C; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
  • Lindblad-Toh K; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
  • Hitte C; Institut Génétique et Développement de Rennes, CNRS, UMR6290, University Rennes1, Rennes, Cedex 35043, France.
  • Derrien T; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Nucleic Acids Res; 45(8): e57, 2017 05 05.
Article in En | MEDLINE | ID: mdl-28053114
ABSTRACT
Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRNAs) from the class of long non-coding RNAs (lncRNAs). Here, we present FEELnc (FlExible Extraction of LncRNAs), an alignment-free program that accurately annotates lncRNAs based on a Random Forest model trained with general features such as multi k-mer frequencies and relaxed open reading frames. Benchmarking versus five state-of-the-art tools shows that FEELnc achieves similar or better classification performance on GENCODE and NONCODE data sets. The program also provides specific modules that enable the user to fine-tune classification accuracy, to formalize the annotation of lncRNA classes and to identify lncRNAs even in the absence of a training set of non-coding RNAs. We used FEELnc on a real data set comprising 20 canine RNA-seq samples produced by the European LUPA consortium to substantially expand the canine genome annotation to include 10 374 novel lncRNAs and 58 640 mRNA transcripts. FEELnc moves beyond conventional coding potential classifiers by providing a standardized and complete solution for annotating lncRNAs and is freely available at https://github.com/tderrien/FEELnc.
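As a rough illustration of the two feature families named in the abstract (k-mer frequencies and open reading frames), the following Python sketch computes them for a single transcript sequence. It is not FEELnc's code: the function names, the restriction to the forward strand, and the reading of "relaxed" as "an ORF may lack a stop codon" are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_frequencies(seq, k):
    """Relative frequency of each A/C/G/T k-mer in seq (hypothetical helper)."""
    seq = seq.upper()
    # Count overlapping k-mers, skipping any that contain ambiguous bases.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


def longest_orf_length(seq, require_stop=True):
    """Length (nt) of the longest ATG-initiated ORF on the forward strand.

    With require_stop=False, an ORF that runs off the end of the transcript
    is also counted. This is an assumed, simplified notion of a "relaxed" ORF.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
        if start is not None and not require_stop:
            # Count only complete codons up to the transcript end.
            end = start + ((len(seq) - start) // 3) * 3
            best = max(best, end - start)
    return best
```

Per-transcript values like these, computed for several values of k, could be assembled into a feature vector for a classifier such as a random forest; the actual features and parameters used by FEELnc are described in the paper and its repository.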
Subjects

Full text: 1 Database: MEDLINE Main subject: Software / Genome / Molecular Sequence Annotation / Transcriptome / RNA, Long Noncoding Language: En Publication year: 2017 Document type: Article