Results 1 - 7 of 7
1.
BMC Genomics ; 16: 1041, 2015 Dec 09.
Article in English | MEDLINE | ID: mdl-26647162

ABSTRACT

BACKGROUND: Gene expression profiling using high-throughput screening (HTS) technologies allows clinical researchers to find prognostic gene signatures that could better discriminate between different phenotypes and serve as potential biological markers in disease diagnosis. In recent years, many feature selection methods have been devised for finding such discriminative genes, and more recently information theoretic filters have also been introduced for capturing feature-to-class relevance and feature-to-feature correlations in microarray-based classification. METHODS: In this paper, we present and fully formulate a new multivariate filter, iRDA, for the discovery of HTS gene-expression candidate genes. The filter constitutes a four-step framework and includes feature relevance, feature redundancy, and feature interdependence in the context of feature pairs. The method is based upon approximate Markov blankets, information theory, and several heuristic search strategies with forward, backward and insertion phases, and it targets higher-order gene interactions. RESULTS: To show the strengths of iRDA, three performance measures, two evaluation schemes, two stability index sets, and the gene set enrichment analysis (GSEA) are all employed in our experimental studies. Its effectiveness has been validated by using seven well-known cancer gene-expression benchmarks and four other disease experiments, including a comparison to three popular information theoretic filters. In terms of classification performance, candidate genes selected by iRDA perform better than the sets discovered by the other three filters. Two stability measures indicate that iRDA is the most robust with the least variance. GSEA shows that iRDA produces more statistically enriched gene sets on five out of the six benchmark datasets.
CONCLUSIONS: The classification performance, the stability performance, and the enrichment analysis indicate that iRDA is a promising filter for finding predictive, stable, and enriched gene-expression candidate genes.
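
As a rough illustration of what an information theoretic filter measures, the sketch below scores discretised features by their mutual information with the class labels (relevance) and penalises mutual information with already selected features (redundancy). This is a generic greedy relevance-minus-redundancy scheme, not the iRDA algorithm; the function names, the toy data, and the scoring rule are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def greedy_select(features, labels, k):
    """Greedily pick k feature names by relevance minus mean redundancy.

    `features` maps a (hypothetical) gene name to its discretised expression
    values; `labels` holds the class of each sample.
    """
    selected = []
    candidates = set(features)
    while candidates and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], labels)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: g2 duplicates g1, g3 carries partly different information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "g1": [0, 0, 0, 1, 1, 1, 1, 1],
    "g2": [0, 0, 0, 1, 1, 1, 1, 1],
    "g3": [1, 0, 0, 0, 1, 1, 1, 1],
}
selected = greedy_select(features, labels, k=2)
```

On this toy input the duplicate pair g1/g2 contributes only one member to the selection, because the second copy is fully redundant with the first.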


Subject(s)
Computational Biology/methods , Algorithms , Computational Biology/standards , Gene Expression , Genetic Association Studies/methods , High-Throughput Nucleotide Sequencing , Humans , Oligonucleotide Array Sequence Analysis
2.
Bioinformatics ; 30(3): 343-52, 2014 Feb 01.
Article in English | MEDLINE | ID: mdl-24292936

ABSTRACT

MOTIVATION: We study microRNA (miRNA) bindings to metastable RNA secondary structures close to minimum free energy (MFE) conformations in the context of single nucleotide polymorphisms (SNPs) and messenger RNA (mRNA) concentration levels, i.e. whether features of miRNA bindings to metastable conformations could provide additional information supporting the differences in expression levels of the two sequences defined by a SNP. In our study, the instances [mRNA/3'UTR; SNP; miRNA] were selected based on strong expression level analyses, SNP locations within binding regions and the computationally feasible identification of metastable conformations. RESULTS: We identified 14 basic cases [mRNA; SNP; miRNA] of 3' UTR lengths ranging from 124 up to 1078 nt reported in recent literature, and we analyzed the number, structure and miRNA binding to metastable conformations within an energy offset above MFE conformations. For each of the 14 instances, the miRNA binding characteristics are determined by the corresponding STarMir output. Among the different parameters we introduced and analyzed, we found that three of them, related to the average depth and average opening energy of metastable conformations, may provide supporting information for a stronger separation between miRNA bindings to the two alleles defined by a given SNP. AVAILABILITY AND IMPLEMENTATION: At http://kks.inf.kcl.ac.uk/MSbind.html the MSbind tool is available for calculating features of metastable conformations determined by putative miRNA binding sites.


Subject(s)
3' Untranslated Regions , MicroRNAs/metabolism , Single Nucleotide Polymorphism , Messenger RNA/chemistry , Alleles , Binding Sites , Nucleic Acid Conformation , Messenger RNA/metabolism
3.
Comput Biol Chem ; 60: 43-52, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26657221

ABSTRACT

The analysis of energy landscapes plays an important role in mathematical modelling, simulation and optimisation. Among the main features of interest are the number and distribution of local minima within the energy landscape. In 2002, Granier and Kallel proposed a new sampling procedure for estimating the number of local minima. In the present paper, we focus on improved heuristic implementations of the general framework devised by Granier and Kallel with regard to run-time behaviour and accuracy of predictions. The new heuristic method is demonstrated for the case of partial energy landscapes induced by RNA secondary structures. While the computation of minimum free energy RNA secondary structures has been studied for a long time, the analysis of folding landscapes has gained momentum over the past years in the context of co-transcriptional folding and deeper insights into cell processes. The new approach has been applied to ten RNA instances of length between 99 nt and 504 nt and their respective partial energy landscapes defined by secondary structures within an energy offset ΔE above the minimum free energy conformation. The number of local minima within the partial energy landscapes ranges from 1440 to 3441. For the best approximations, our heuristic method deviates on average by less than 3.0% from the true number of local minima.


Subject(s)
Heuristics , Chemical Models , RNA/chemistry , Algorithms , RNA Folding
4.
Adv Bioinformatics ; 2016: 9654921, 2016.
Article in English | MEDLINE | ID: mdl-27110241

ABSTRACT

Identifying sets of metastable conformations is a major research topic in RNA energy landscape analysis, and recently several methods have been proposed for finding local minima in landscapes spawned by RNA secondary structures. An important and time-critical component of such methods is steepest, or gradient, descent in attraction basins of local minima. We analyse the speed-up achievable by randomised descent in attraction basins in the context of large sample sets whose size is of the order of ~10^6. While the gain for each individual sample might be marginal, the overall run-time improvement can be significant. Moreover, for the two nongradient methods we analysed for partial energy landscapes induced by ten different RNA sequences, we found that the number of observed local minima is on average larger by 7.3% and 3.5%, respectively. The run-time improvement is approximately 16.6% and 6.8% on average over the ten partial energy landscapes. For the large sample size we selected for descent procedures, the coverage of local minima is very high up to energy values of the region where the samples were randomly selected from the partial energy landscapes; that is, the difference to the total set of local minima is mainly due to the upper area of the energy landscapes.
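
The contrast between steepest descent and a randomised descent can be sketched on a toy landscape. In the sketch below, states are bit strings, neighbours differ by one bit, and energies come from a fixed pseudo-random table; none of this models RNA secondary structures. The randomised variant shown is a first-improvement rule (move to the first randomly drawn neighbour with lower energy), which is an assumption and may differ from the nongradient methods analysed in the paper.

```python
import random

N_BITS = 8
_table_rng = random.Random(0)
# Toy energy table: one pseudo-random energy per state (illustrative only).
ENERGY_TABLE = [_table_rng.random() for _ in range(2 ** N_BITS)]


def energy(state):
    return ENERGY_TABLE[state]


def neighbours(state, n_bits):
    """All states reachable by flipping exactly one bit."""
    return [state ^ (1 << i) for i in range(n_bits)]


def is_local_minimum(state):
    return all(energy(n) >= energy(state) for n in neighbours(state, N_BITS))


def steepest_descent(state, energy, n_bits):
    """Always move to the lowest-energy neighbour; count neighbour evaluations."""
    evaluations = 0
    while True:
        nbrs = neighbours(state, n_bits)
        evaluations += len(nbrs)  # every neighbour is inspected at each step
        best = min(nbrs, key=energy)
        if energy(best) >= energy(state):
            return state, evaluations
        state = best


def randomised_descent(state, energy, n_bits, rng):
    """Move to the first randomly ordered neighbour that lowers the energy."""
    evaluations = 0
    while True:
        nbrs = neighbours(state, n_bits)
        rng.shuffle(nbrs)
        for candidate in nbrs:
            evaluations += 1
            if energy(candidate) < energy(state):
                state = candidate
                break
        else:
            # No neighbour improves: the current state is a local minimum.
            return state, evaluations
```

Both procedures stop in a local minimum, but they may stop in different ones from the same start state, and the randomised variant usually inspects fewer neighbours per step. This matches the abstract's observation that a small per-sample gain can add up over a large sample set.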

5.
Biomolecules ; 4(1): 56-75, 2014 Jan 07.
Article in English | MEDLINE | ID: mdl-24970205

ABSTRACT

We introduce a Firefly-inspired algorithmic approach for protein structure prediction over two different lattice models in three-dimensional space. In particular, we consider three-dimensional cubic and three-dimensional face-centred-cubic (FCC) lattices. The underlying energy models are the Hydrophobic-Polar (H-P) model, the Miyazawa-Jernigan (M-J) model and a related matrix model. The implementation of our approach is tested on ten H-P benchmark problems of length 48 and ten M-J benchmark problems of lengths ranging from 48 to 61. The key complexity parameter we investigate is the total number of objective function evaluations required to achieve the optimum energy values for the H-P model or competitive results in comparison to published values for the M-J model. For H-P instances and cubic lattices, where data for comparison are available, we obtain an average speed-up over eight instances of 2.1, leaving out two extreme values (otherwise, 8.8). For six M-J instances, data for comparison are available for cubic lattices and runs with a population size of 100, where, a priori, the minimum free energy is a termination criterion. The average speed-up over four instances is 1.2 (leaving out two extreme values, otherwise 1.1), which is achieved for a population size of only eight instances. The present study is a test case with initial results for ad hoc parameter settings, with the aim of justifying future research on larger instances within lattice model settings, eventually leading to the ultimate goal of implementations for off-lattice models.
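
For readers unfamiliar with the H-P model, the objective function on a cubic lattice can be sketched as follows: the energy is minus the number of H-H pairs that are lattice neighbours but not consecutive in the chain. This is the standard textbook definition, not the paper's implementation; the function name and the input conventions (a string over "H"/"P" and a list of integer coordinate triples) are assumptions.

```python
def hp_energy(sequence, coords):
    """Energy of an H-P chain on the 3D cubic lattice.

    sequence: string over {"H", "P"}
    coords:   list of (x, y, z) integer triples, one per residue
    Returns minus the number of non-consecutive H-H lattice contacts.
    """
    def lattice_distance(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    if len(sequence) != len(coords):
        raise ValueError("one coordinate triple is needed per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for a, b in zip(coords, coords[1:]):
        if lattice_distance(a, b) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    contacts = 0
    for i in range(len(coords)):
        # j starts at i + 2 so that chain neighbours are not counted
        for j in range(i + 2, len(coords)):
            if (sequence[i] == "H" and sequence[j] == "H"
                    and lattice_distance(coords[i], coords[j]) == 1):
                contacts += 1
    return -contacts


# A four-residue chain folded into a square brings residues 0 and 3 together.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
line = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
```

Each call to such a function corresponds to one objective function evaluation, the complexity parameter counted in the abstract.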


Subject(s)
Proteins/chemistry , Algorithms , Animals , Computer Simulation , Fireflies , Theoretical Models , Protein Folding
6.
Int J Bioinform Res Appl ; 8(3-4): 171-91, 2012.
Article in English | MEDLINE | ID: mdl-22961450

ABSTRACT

In the present study, we define derivative scoring functions from PITA and STarMir predictions. The scoring functions are evaluated for up to five selected miRNAs with a relatively large number of validated targets reported by TarBase and miRecords. The average ranking of validated targets returned by PITA and STarMir is compared to the average ranking produced by the new derivative scores. We obtain an average improvement of 13.6% (STD ~5.7%) relative to the average ranking of validated targets produced by PITA and STarMir.


Subject(s)
MicroRNAs/chemistry , Software , Computational Biology , RNA Sequence Analysis
7.
Comput Biol Chem ; 34(5-6): 284-92, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21035401

ABSTRACT

Over the past ten years, a variety of microRNA target prediction methods has been developed, and many of the methods are constantly improved and adapted to recent insights into miRNA-mRNA interactions. In a typical scenario, different methods return different rankings of putative targets, even if the ranking is reduced to selected mRNAs that are related to a specific disease or cell type. For the experimental validation it is then difficult to decide in which order to process the predicted miRNA-mRNA bindings, since each validation is a laborious task and therefore only a limited number of mRNAs can be analysed. We propose a new ranking scheme that combines ranked predictions from several methods and - unlike standard thresholding methods - utilises the concept of Pareto fronts as defined in multi-objective optimisation. In the present study, we attempt a proof of concept by applying the new ranking scheme to hsa-miR-21, hsa-miR-125b, and hsa-miR-373 and prediction scores supplied by PITA and RNAhybrid. The scores are interpreted as a two-objective optimisation problem, and the elements of the Pareto front are ranked by the STarMir score with a subsequent re-calculation of the Pareto front after removal of the top-ranked mRNA from the basic set of prediction scores. The method is evaluated on validated targets of the three miRNAs, and the ranking is compared to scores from DIANA-microT and TargetScan. We observed that the new ranking method performs well and consistently, and the first validated targets are elements of Pareto fronts at a relatively early stage of the recurrent procedure, which encourages further research towards a higher-dimensional analysis of Pareto fronts.
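
The recurrent procedure described above (compute the Pareto front of two scores, pick the front element with the best third score, remove it, recompute) can be sketched as follows. This is a minimal reading of the abstract, not the authors' code. It assumes that lower values are better for all three scores; the target names and numbers are made up for illustration.

```python
def pareto_front(scores):
    """Names of non-dominated items; scores maps name -> (a, b), lower is better."""
    front = []
    for name, (a, b) in scores.items():
        dominated = any(
            a2 <= a and b2 <= b and (a2 < a or b2 < b)
            for other, (a2, b2) in scores.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


def pareto_ranking(scores, third_score):
    """Rank all items by repeatedly removing the best element of the current front.

    third_score maps name -> value used to order elements within a front
    (assumed: lower is better); ties are broken by name.
    """
    remaining = dict(scores)
    ranking = []
    while remaining:
        front = pareto_front(remaining)
        top = min(front, key=lambda name: (third_score[name], name))
        ranking.append(top)
        del remaining[top]  # the front is recomputed on the reduced set
    return ranking


# Hypothetical two-objective scores and third scores for four candidate mRNAs.
scores = {"m1": (-10, -5), "m2": (-8, -9), "m3": (-7, -4), "m4": (-12, -1)}
third = {"m1": 0.3, "m2": 0.1, "m3": 0.2, "m4": 0.4}
ranking = pareto_ranking(scores, third)
```

In this toy input m3 has the second-best third score but is dominated by m1, so it only enters the ranking after m1 has been removed and the front has been recomputed.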


Subject(s)
Gene Targeting/methods , MicroRNAs/analysis , RNA Sequence Analysis/methods , Algorithms , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Messenger RNA/analysis , Messenger RNA/genetics , Messenger RNA/metabolism , Software