Results 1 - 7 of 7
1.
BMC Bioinformatics ; 25(1): 23, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38216898

ABSTRACT

BACKGROUND: With the exponential growth of high-throughput technologies, multiple pathway analysis methods have been proposed to estimate pathway activities from gene expression profiles. These pathway activity inference methods can be divided into two main categories: non-Topology-Based (non-TB) and Pathway Topology-Based (PTB) methods. Although some review and survey articles discussed the topic from different aspects, there is a lack of systematic assessment and comparisons on the robustness of these approaches. RESULTS: Thus, this study presents comprehensive robustness evaluations of seven widely used pathway activity inference methods using six cancer datasets based on two assessments. The first assessment seeks to investigate the robustness of pathway activity in pathway activity inference methods, while the second assessment aims to assess the robustness of risk-active pathways and genes predicted by these methods. The mean reproducibility power and total number of identified informative pathways and genes were evaluated. Based on the first assessment, the mean reproducibility power of pathway activity inference methods generally decreased as the number of pathway selections increased. Entropy-based Directed Random Walk (e-DRW) distinctly outperformed other methods in exhibiting the greatest reproducibility power across all cancer datasets. On the other hand, the second assessment shows that no methods provide satisfactory results across datasets. CONCLUSION: However, PTB methods generally appear to perform better in producing greater reproducibility power and identifying potential cancer markers compared to non-TB methods.


Subject(s)
Neoplasms , Humans , Reproducibility of Results , Neoplasms/genetics , Entropy , Gene Expression
2.
Pak J Pharm Sci ; 32(3 Special): 1395-1408, 2019 May.
Article in English | MEDLINE | ID: mdl-31551221

ABSTRACT

Numerous cancer studies have combined different datasets for the prognosis of patients. This study incorporated four networks into significant directed random walk (sDRW) to predict cancerous genes and risk pathways, and investigated the feasibility of cancer prediction via different networks. Multiple microarray datasets were analysed in the experiment: six gene expression datasets were applied to four networks to study the effectiveness of each network in sDRW in terms of cancer prediction. The experimental results showed that one of the proposed networks outperformed the others. This network is therefore proposed as the walker network for sDRW. This study provides a foundation for further studies and research on other networks. We hope these findings will improve prognostic methods for cancer patients.


Subject(s)
Computational Biology/methods , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , Algorithms , Biomarkers, Tumor/genetics , Databases, Genetic , Humans , Microarray Analysis , Protein Interaction Maps/genetics , Random Allocation , Reproducibility of Results , Transcriptome
3.
Genes (Basel) ; 14(3), 2023 Feb 24.
Article in English | MEDLINE | ID: mdl-36980844

ABSTRACT

The integration of microarray technologies and machine learning methods has become popular in predicting the pathological condition of diseases and discovering risk genes. Traditional microarray analysis considers a pathway as a simple gene set, treating all genes in the pathway identically while ignoring the structural information of the pathway network. This study proposed an entropy-based directed random walk (e-DRW) method to infer pathway activities. Two enhancements to the conventional DRW were made: (1) increasing the coverage of human pathway information by constructing two input networks for pathway activity inference, and (2) enhancing the gene-weighting method in DRW by incorporating correlation coefficient values and t-test statistic scores. To test the objectives, gene expression datasets were used as input datasets while the pathway datasets were used as reference datasets to build two directed graphs. The within-dataset experiments indicated that the e-DRW method demonstrated superior performance in terms of classification accuracy and robustness of the predicted risk-active pathways compared to the other methods. In conclusion, the results revealed that e-DRW not only improved the prediction performance, but also effectively extracted topologically important pathways and genes that were specifically related to the corresponding cancer types.


Subject(s)
Neoplasms , Humans , Entropy , Neoplasms/genetics , Neoplasms/metabolism , Genetic Techniques , Gene Expression
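
Editorial sketch: the directed-random-walk idea described in the abstract above can be illustrated with a generic random walk with restart on a directed gene graph. This is a minimal sketch under stated assumptions, not the e-DRW implementation: the function name `directed_random_walk`, the `restart` parameter, and the choice to return the mass of dangling nodes to the seed distribution are all hypothetical, and the seed weights stand in for whatever gene-weighting scheme is used (for example, absolute t-statistics).

```python
def directed_random_walk(edges, seed_weights, restart=0.7, tol=1e-10, max_iter=1000):
    """Random walk with restart on a directed graph (illustrative only).

    edges: iterable of (source, target) gene pairs.
    seed_weights: dict mapping gene -> non-negative initial weight.
    Returns a dict mapping gene -> steady-state weight (weights sum to 1).
    """
    edges = list(edges)
    nodes = sorted(set(seed_weights) | {gene for edge in edges for gene in edge})
    out_neighbors = {gene: [] for gene in nodes}
    for source, target in edges:
        out_neighbors[source].append(target)

    total = sum(seed_weights.get(gene, 0.0) for gene in nodes)
    if total <= 0:
        raise ValueError("seed weights must sum to a positive value")
    # Normalised seed distribution used for every restart.
    w0 = {gene: seed_weights.get(gene, 0.0) / total for gene in nodes}

    w = dict(w0)
    for _ in range(max_iter):
        # Assumption: weight sitting on nodes without outgoing edges is
        # returned to the seed distribution so the total stays at 1.
        dangling = sum(w[gene] for gene in nodes if not out_neighbors[gene])
        nxt = {gene: (restart + (1.0 - restart) * dangling) * w0[gene] for gene in nodes}
        for source in nodes:
            targets = out_neighbors[source]
            if not targets:
                continue
            # Spread the non-restarting share evenly along outgoing edges.
            share = (1.0 - restart) * w[source] / len(targets)
            for target in targets:
                nxt[target] += share
        change = sum(abs(nxt[gene] - w[gene]) for gene in nodes)
        w = nxt
        if change < tol:
            break
    return w
```

Genes that receive a high steady-state weight are those that are both strongly seeded and well connected downstream of other seeded genes, which is the intuition behind using graph topology rather than a flat gene set.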
4.
Saudi J Biol Sci ; 24(8): 1828-1841, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29551932

ABSTRACT

Microarray technology has become one of the elementary tools for researchers to study the genome of organisms. As the complexity and heterogeneity of cancer are increasingly appreciated through genomic analysis, cancer classification is an important emerging trend. Significant directed random walk is proposed as a cancer classification approach with higher sensitivity of risk gene prediction and higher accuracy of cancer classification. In this paper, the methodology and materials used for the experiment are presented. A tuning parameter selection method and the use of weight as a parameter are applied in the proposed approach. Gene expression datasets are used as the input datasets, while a pathway dataset is used as the reference dataset to build a directed graph that completes the bias process in the random walk approach. In addition, we demonstrate that our approach can improve sensitive predictions with higher accuracy and biologically meaningful classification results. Significant directed random walk is compared with directed random walk to show the improvement in terms of sensitivity of prediction and accuracy of cancer classification.

5.
Comput Biol Med ; 77: 102-15, 2016 Oct 01.
Article in English | MEDLINE | ID: mdl-27522238

ABSTRACT

Incorporation of pathway knowledge into microarray analysis has brought better biological interpretation of the analysis outcome. However, most pathway data are manually curated without a specific biological context. Non-informative genes could be included when the pathway data are used for the analysis of context-specific data such as cancer microarray data. Therefore, efficient identification of informative genes is essential. Embedded methods such as penalized classifiers have been used for microarray analysis because of their built-in gene selection. This paper proposes an improved penalized support vector machine with an absolute t-test weighting scheme to identify informative genes and pathways. Experiments are done on four microarray data sets. The results are compared with previous methods using 10-fold cross validation in terms of accuracy, sensitivity, specificity and F-score. Our method shows consistent improvement over the previous methods, and biological validation has been done to elucidate the relation of the selected genes and pathways with the phenotype under study.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks/genetics , Support Vector Machine , Transcriptome/genetics , Animals , Apoptosis/genetics , Cell Cycle/genetics , Gene Expression Profiling , Humans , Mice , Microarray Analysis , Neoplasms/genetics , Neoplasms/metabolism
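
Editorial sketch: the abstract above names an absolute t-test weighting scheme and four evaluation measures without defining them. The following is a minimal illustration, assuming a Welch (unequal-variance) t-statistic for the gene weight and the standard confusion-matrix definitions for the measures; the function names `absolute_t_weight` and `binary_metrics` are hypothetical, and the paper's exact weighting formula may differ.

```python
import math
import statistics


def absolute_t_weight(group_a, group_b):
    """Absolute Welch t-statistic for one gene across two sample groups.

    Assumption: Welch's unequal-variance form; each group needs >= 2 values.
    """
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    if standard_error == 0.0:
        # Both groups are constant: perfectly separating or uninformative.
        return math.inf if diff != 0 else 0.0
    return abs(diff / standard_error)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F-score from paired labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    def ratio(numerator, denominator):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on the positive class
    specificity = ratio(tn, tn + fp)   # recall on the negative class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

In a 10-fold cross-validation setting, `binary_metrics` would typically be computed on each held-out fold and averaged; the weight from `absolute_t_weight` would be computed on the training folds only, to avoid leaking test information into gene selection.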
6.
Comput Biol Med ; 43(9): 1120-33, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23930805

ABSTRACT

A drastic improvement in the analysis of gene expression has led to new discoveries in bioinformatics research. Fuzzy clustering algorithms are widely used to analyse gene expression data. However, the resulting analyses from these specific types of algorithms may lead to confusion in hypotheses with regard to the suggestion of a dominant function for genes of interest. In addition, current fuzzy clustering algorithms do not conduct a thorough analysis of genes with low membership values. Therefore, we present a novel computational framework called the "multi-stage filtering-Clustering Functional Annotation" (msf-CluFA) for clustering gene expression data. The framework consists of four components: fuzzy c-means clustering (msf-CluFA-0), achieving dominant cluster (msf-CluFA-1), improving confidence level (msf-CluFA-2) and a combination of msf-CluFA-0, msf-CluFA-1 and msf-CluFA-2 (msf-CluFA-3). By employing double filtering in msf-CluFA-1 and apriori algorithms in msf-CluFA-2, our new framework is capable of determining the dominant clusters and improving the confidence level of genes with lower membership values, by means of which the unknown genes can be predicted.


Subject(s)
Algorithms , Gene Expression Profiling/methods , Gene Expression Regulation, Fungal/physiology , Genes, Fungal/physiology , Saccharomyces cerevisiae/metabolism , Software
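
Editorial sketch: the "membership values" mentioned in the abstract above come from fuzzy c-means clustering, where each gene belongs to every cluster to some degree. The snippet below is a minimal illustration of the standard fuzzy c-means membership update for fixed cluster centers; it is not the msf-CluFA framework, and the function name `fcm_memberships` and the default fuzzifier `m=2.0` are assumptions.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Fuzzy c-means membership of each point in each cluster.

    points, centers: sequences of equal-length numeric tuples.
    m: fuzzifier (> 1); larger values give softer memberships.
    Returns one row per point; each row sums to 1.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for point in points:
        distances = [math.dist(point, center) for center in centers]
        zero_hits = [i for i, d in enumerate(distances) if d == 0.0]
        if zero_hits:
            # The point coincides with one or more centers: split the
            # membership evenly among them and give the rest zero.
            row = [1.0 / len(zero_hits) if i in zero_hits else 0.0
                   for i in range(len(centers))]
        else:
            # u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))
            row = [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
                   for d_i in distances]
        memberships.append(row)
    return memberships
```

A gene whose largest membership is only slightly above the others is the kind of low-confidence case the abstract refers to; its dominant cluster is the index of the largest value in its row.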
7.
Comput Biol Med ; 40(6): 555-64, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20417930

ABSTRACT

Protein-protein interactions (PPIs) play a significant role in many crucial cellular operations such as metabolism, signaling and regulation. Computational methods for predicting PPIs have shown tremendous growth in recent years, but problems such as high false positive rates have contributed to the lack of solid PPI information. We aimed to enhance the overlap between computational predictions and experimental results in an effort to partially remove falsely predicted PPIs. A protein function predictor named PFP(), which is based on shared interacting domain patterns, is introduced in this study to complement the Gene Ontology Annotations (GOA). We used GOA and PFP() as agents in a filtering process to reduce false positive pairs in the computationally predicted PPI datasets. The functions predicted by PFP() were extracted from cross-species PPI data in order to assign novel functional annotations to uncharacterized proteins and additional functions to those already characterized by the GO (Gene Ontology). The implementation of PFP() increased the chances of finding a matching function annotation for the first rule in the filtration process by as much as 20%. To assess the capability of the proposed framework in filtering false PPIs, we applied it to the available S. cerevisiae PPIs and measured the performance in two aspects: the improvement made, indicated as Signal-to-Noise Ratio (SNR), and the strength of improvement. The proposed filtering framework achieved significantly better performance in both metrics than the unfiltered predictions.


Subject(s)
Computational Biology/methods , Models, Statistical , Protein Interaction Domains and Motifs , Protein Interaction Mapping/methods , Proteins/chemistry , Proteins/physiology , Algorithms , Animals , Caenorhabditis elegans Proteins , Cluster Analysis , Databases, Genetic , Drosophila Proteins , Humans , Saccharomyces cerevisiae Proteins , Terminology as Topic
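
Editorial sketch: the abstract above describes filtering predicted PPI pairs by matching function annotations, with predicted functions supplementing curated ones. The snippet below illustrates that general idea only. The rule used here (keep a pair when the two proteins share at least one annotation term) is a hypothetical simplification, not the paper's filtration rules, and the function name `filter_ppi_pairs` and its arguments are assumptions.

```python
def filter_ppi_pairs(pairs, curated, predicted=None):
    """Keep predicted PPI pairs whose proteins share an annotation term.

    pairs: iterable of (protein_a, protein_b) tuples.
    curated: dict mapping protein -> iterable of curated annotation terms.
    predicted: optional dict mapping protein -> iterable of predicted terms,
        merged with the curated terms before matching.
    Returns the list of retained pairs in input order.
    """
    predicted = predicted or {}

    def terms(protein):
        # Union of curated and predicted annotations for one protein.
        return set(curated.get(protein, ())) | set(predicted.get(protein, ()))

    kept = []
    for protein_a, protein_b in pairs:
        # Hypothetical rule: at least one shared term is required.
        if terms(protein_a) & terms(protein_b):
            kept.append((protein_a, protein_b))
    return kept
```

Supplying `predicted` annotations can only increase the number of retained pairs under this rule, which mirrors the abstract's point that predicted functions raise the chance of finding a matching annotation for otherwise uncharacterized proteins.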