A multi-view genomic data simulator.
Fratello, Michele; Serra, Angela; Fortino, Vittorio; Raiconi, Giancarlo; Tagliaferri, Roberto; Greco, Dario.
Affiliation
  • Fratello M; Department of Medical, Surgical, Neurological, Metabolic and Ageing Sciences, Second University of Napoli, Napoli, Italy. michele.fratello@unina2.it.
  • Serra A; Department of Computer Science, Fisciano, Italy. michele.fratello@unina2.it.
  • Fortino V; Department of Computer Science, Fisciano, Italy. aserra@unisa.it.
  • Raiconi G; Unit of Systems Toxicology and Nanosafety Research Centre, Finnish Institute of Occupational Health, FIOH, Helsinki, Finland. vittorio.fortino@ttl.fi.
  • Tagliaferri R; Department of Computer Science, Fisciano, Italy. gianni@unisa.it.
  • Greco D; Department of Computer Science, Fisciano, Italy. rtagliaferr@unisa.it.
BMC Bioinformatics ; 16: 151, 2015 May 12.
Article in En | MEDLINE | ID: mdl-25962835
ABSTRACT

BACKGROUND:

OMICs technologies make it possible to assay the state of a large number of different features (e.g., mRNA expression, miRNA expression, copy number variation, DNA methylation) from the same samples. The objective of these experiments is usually to find a reduced set of significant features that can be used to differentiate the conditions assayed. For the development of novel computational feature selection methods, this task is challenging because of the lack of fully annotated biological datasets to use for benchmarking. One way to tackle this problem is to generate appropriate synthetic datasets whose composition and behaviour are fully controlled and known a priori.

RESULTS:

Here we propose a novel method centred on generating networks of interactions among different biological molecules, especially those involved in regulating gene expression. Synthetic datasets are obtained from models based on ordinary differential equations with known parameters. Our results show that the generated datasets mimic the behaviour of real data well, since popular data analysis methods are able to selectively identify existing interactions.
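
As a rough illustration of the general idea of drawing synthetic samples from an ordinary differential equation model with known parameters, the following Python sketch may help. It is not the MVBioDataSim implementation and does not use its API: the three-node network (a transcription factor, a miRNA and a target mRNA), the linear rate law, the weights, the decay rates, the noise level and the function names `steady_state` and `make_dataset` are all assumptions made for this example.

```python
import random

# Hypothetical 3-node network (illustrative values, not taken from the paper):
# index 0 = transcription factor, 1 = miRNA, 2 = target mRNA.
# WEIGHTS[i][j] is the known effect of node j on the production of node i.
WEIGHTS = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.8, -0.5, 0.0],  # the TF activates the target, the miRNA represses it
]
DECAY = [1.0, 1.0, 1.0]  # assumed first-order degradation rates


def steady_state(basal, dt=0.01, steps=3000):
    """Euler-integrate dx_i/dt = basal_i + sum_j W[i][j] * x_j - decay_i * x_i."""
    n = len(basal)
    x = [0.0] * n
    for _ in range(steps):
        dx = [
            basal[i]
            + sum(WEIGHTS[i][j] * x[j] for j in range(n))
            - DECAY[i] * x[i]
            for i in range(n)
        ]
        # Clamp at zero so that abundances never become negative.
        x = [max(0.0, xi + dt * dxi) for xi, dxi in zip(x, dx)]
    return x


def make_dataset(n_per_condition=5, noise_sd=0.05, seed=0):
    """Return (label, values) rows: noisy copies of each condition's end state."""
    rng = random.Random(seed)
    # The two conditions differ only in the basal production of the miRNA.
    conditions = {
        "control": [1.0, 0.5, 0.2],
        "treated": [1.0, 1.5, 0.2],
    }
    rows = []
    for label, basal in conditions.items():
        clean = steady_state(basal)
        for _ in range(n_per_condition):
            noisy = [max(0.0, v + rng.gauss(0.0, noise_sd)) for v in clean]
            rows.append((label, noisy))
    return rows


if __name__ == "__main__":
    for label, values in make_dataset():
        print(label, [round(v, 3) for v in values])
```

Because every parameter is fixed in advance, the ground truth is known: in this toy setting the target mRNA settles near 0.75 in the control condition and near 0.25 in the treated one, so a feature selection method can be scored on whether it recovers the miRNA and its target as the discriminating features.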

CONCLUSIONS:

The proposed method can be used in conjunction with real biological datasets to assess data mining techniques. Its main strength is full control over the simulated data while retaining coherence with real biological processes. The R package MVBioDataSim is freely available to the scientific community at http://neuronelab.unisa.it/?p=1722.
Subjects

Full text: 1 Database: MEDLINE Main subject: Algorithms / Computer Simulation / Computational Biology / Gene Expression Profiling / Genomics / Gene Regulatory Networks Limit: Humans Language: En Publication year: 2015 Document type: Article