FGGA-lnc: automatic gene ontology annotation of lncRNA sequences based on secondary structures.
Spetale, Flavio E; Murillo, Javier; Villanova, Gabriela V; Bulacio, Pilar; Tapia, Elizabeth.
Affiliation
  • Spetale FE; CIFASIS-Conicet-UNR, 27 de Febrero 210 bis, S2000EZP Rosario, Santa Fe, Argentina.
  • Murillo J; Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, Riobamba 245 bis, S2000EZP Rosario, Argentina.
  • Villanova GV; CIFASIS-Conicet-UNR, 27 de Febrero 210 bis, S2000EZP Rosario, Santa Fe, Argentina.
  • Bulacio P; Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, Riobamba 245 bis, S2000EZP Rosario, Argentina.
  • Tapia E; Laboratorio Mixto de Biotecnología Acuática (FCByF-UNR), Av. Eduardo Carrasco S/N, S2000EZP Rosario, Argentina.
Interface Focus; 11(4): 20200064, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34123354
ABSTRACT
The study of long non-coding RNAs (lncRNAs), greater than 200 nucleotides, is central to understanding the development and progression of many complex diseases. Unlike proteins, the functionality of lncRNAs is only subtly encoded in their primary sequence. Current in-silico lncRNA annotation methods mostly rely on annotations inferred from interaction networks. But extensive experimental studies are required to build these networks. In this work, we present a graph-based machine learning method called FGGA-lnc for the automatic gene ontology (GO) annotation of lncRNAs across the three GO subdomains. We build upon FGGA (factor graph GO annotation), a computational method originally developed to annotate protein sequences from non-model organisms. In the FGGA-lnc version, a coding-based approach is introduced to fuse primary sequence and secondary structure information of lncRNA molecules. As a result, lncRNA sequences become sequences of a higher-order alphabet allowing supervised learning methods to assess individual GO-term annotations. Raw GO annotations obtained in this way are unaware of the GO structure and therefore likely to be inconsistent with it. The message-passing algorithm embodied by factor graph models overcomes this problem. Evaluations of the FGGA-lnc method on lncRNA data, from model and non-model organisms, showed promising results suggesting it as a candidate to satisfy the huge demand for functional annotations arising from high-throughput sequencing technologies.
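The coding step described in the abstract, in which primary sequence and secondary structure are fused so that an lncRNA becomes a sequence over a higher-order alphabet, can be illustrated with a minimal sketch. The abstract does not specify the actual encoding used by FGGA-lnc, so everything below is an assumption made for illustration: the dot-bracket structure notation, the 12-symbol fused alphabet, the normalized k-mer features, and the names `fuse` and `kmer_features` are all hypothetical.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGU"
STRUCT_SYMBOLS = "(.)"  # dot-bracket notation (assumed input format)

# Hypothetical fused alphabet: one symbol per (nucleotide, structure) pair,
# giving 4 * 3 = 12 symbols.
FUSED_ALPHABET = [n + s for n, s in product(NUCLEOTIDES, STRUCT_SYMBOLS)]


def fuse(sequence, structure):
    """Pair each nucleotide with its structure symbol (illustrative only)."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = sequence.upper().replace("T", "U")  # accept DNA-style input
    fused = []
    for nucleotide, state in zip(seq, structure):
        if nucleotide not in NUCLEOTIDES or state not in STRUCT_SYMBOLS:
            raise ValueError(f"unsupported symbol pair: {nucleotide!r}, {state!r}")
        fused.append(nucleotide + state)
    return fused


def kmer_features(fused, k=2):
    """Normalized k-mer frequencies over the fused alphabet.

    The result is a fixed-length vector that a supervised per-GO-term
    classifier could consume (an assumption, not the paper's feature set).
    """
    all_kmers = list(product(FUSED_ALPHABET, repeat=k))
    counts = Counter(tuple(fused[i:i + k]) for i in range(len(fused) - k + 1))
    total = max(sum(counts.values()), 1)  # avoid division by zero
    return [counts[kmer] / total for kmer in all_kmers]


# Toy hairpin: three paired bases, a three-base loop, three paired bases.
sequence = "GGGAAACCC"
structure = "(((...)))"
fused = fuse(sequence, structure)
features = kmer_features(fused, k=2)
```

In this sketch each position carries both its base and its pairing state, so a classifier sees patterns such as "paired G followed by unpaired A" that the primary sequence alone does not expose. The structure string would in practice come from a secondary-structure prediction tool; that step is outside the sketch.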
Full text: 1 Collections: 01-international Database: MEDLINE Study type: Prognostic_studies Language: En Journal: Interface Focus Publication year: 2021 Document type: Article Affiliation country: Argentina
