Results 1 - 4 of 4
1.
BMC Bioinformatics ; 20(1): 608, 2019 Nov 27.
Article in English | MEDLINE | ID: mdl-31775613

ABSTRACT

BACKGROUND: Microarray datasets are complex and high-dimensional, and the number of samples is generally much smaller than the number of genes. Because of this imbalance, gene selection is a demanding task in microarray expression data analysis. RESULTS: The gene set selected by DGS showed superior performance in cancer classification. DGS substantially reduces the number of genes in the original microarray datasets. Experimental comparisons with other representative and state-of-the-art gene selection methods also showed that DGS achieved the best performance in terms of the number of selected genes, classification accuracy, and computational cost. CONCLUSIONS: We provide an efficient gene selection algorithm that can select relevant genes which are significantly sensitive to the samples' classes. With few discriminative genes and less computation time, the proposed algorithm achieved high prediction accuracy on several public microarray datasets, which in turn verifies the efficiency and effectiveness of the proposed gene selection method.


Subject(s)
Genetic Techniques , Neoplasms/genetics , Algorithms , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Humans , Microarray Analysis , Research Design
2.
BMC Med Genomics ; 12(1): 10, 2019 Jan 15.
Article in English | MEDLINE | ID: mdl-30646919

ABSTRACT

BACKGROUND: Microarray datasets are an important medical diagnostic tool because they represent the state of a cell at the molecular level. Available microarray datasets for classifying cancer types generally have a fairly small sample size compared with the large number of genes involved. This is known as the curse of dimensionality, and it is a challenging problem. Gene selection is a promising approach to this problem and plays an important role in the development of efficient cancer classification, because only a small number of genes are related to the classification problem. Gene selection addresses many problems in microarray datasets, such as reducing the number of irrelevant and noisy genes and selecting the most relevant genes to improve the classification results. METHODS: An innovative Gene Selection Programming (GSP) method is proposed to select relevant genes for effective and efficient cancer classification. GSP is based on the Gene Expression Programming (GEP) method, with a newly defined population initialization algorithm, a new fitness function definition, and improved mutation and recombination operators. A Support Vector Machine (SVM) with a linear kernel serves as the classifier of the GSP. RESULTS: Experimental results on ten microarray cancer datasets demonstrate that GSP is effective and efficient in eliminating irrelevant and redundant genes/features from microarray datasets. Comprehensive evaluations and comparisons with other methods show that GSP gives a better compromise across all three evaluation criteria, i.e., classification accuracy, number of selected genes, and computational cost. The gene set selected by GSP showed superior performance in cancer classification compared with those selected by up-to-date representative gene selection methods. CONCLUSION: The gene subset selected by GSP can achieve higher classification accuracy with less processing time.


Subject(s)
Computational Biology/methods , Genes, Neoplasm/genetics , Neoplasms/classification , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis , Support Vector Machine , Gene Expression Profiling , Mutation
3.
IET Syst Biol ; 13(3): 129-135, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31170692

ABSTRACT

Non-small cell lung cancer (NSCLC) is the most common and dangerous type of lung cancer. Adjuvant chemotherapy (ACT) is the main treatment after surgical resection to prevent cancer recurrence. However, ACT can be toxic and unhelpful in some cases. Therefore, predicting the treatment outcomes of chemotherapy is highly desirable in clinical applications. Conventional methods of predicting cancer treatment outcomes rely solely on histopathology, and the results are not reliable in some cases. This study aims to build a predictive model to identify who needs ACT and who should avoid it. To this end, the authors propose an innovative method to identify NSCLC-related prognostic genes from microarray gene-expression datasets. They also propose a new model using a gene-expression programming algorithm for ACT classification. The proposed model was evaluated on integrated microarray datasets from four institutes and compared with four representative methods: general regression neural network, decision tree, support vector machine and naive Bayes. Evaluation results demonstrated the effectiveness of the proposed model, with an accuracy of 89.8%, which is higher than that of the other representative models. The authors obtained four probes (four genes) that give good prediction results: 204891_s_at (LCK), 208893_s_at (DUSP6), 202454_s_at (ERBB3) and 201076_at (MMD).


Subject(s)
Carcinoma, Non-Small-Cell Lung/drug therapy , Chemotherapy, Adjuvant , Lung Neoplasms/drug therapy , Models, Statistical , Algorithms , Bayes Theorem , Carcinoma, Non-Small-Cell Lung/genetics , Gene Expression Regulation, Neoplastic/drug effects , Humans , Lung Neoplasms/genetics , Treatment Outcome
4.
IET Syst Biol ; 10(5): 168-178, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27762231

ABSTRACT

Lung cancer is a leading cause of cancer-related death worldwide. Early diagnosis of cancer has been shown to be greatly helpful for curing the disease effectively. Microarray technology provides a promising approach to exploiting gene profiles for cancer diagnosis. In this study, the authors propose a gene expression programming (GEP)-based model to predict lung cancer from microarray data. The authors use two gene selection methods to extract the significant lung cancer-related genes, and accordingly propose different GEP-based prediction models. Prediction performance evaluations and comparisons between the authors' GEP models and three representative machine learning methods (support vector machine, multi-layer perceptron and radial basis function neural network) were conducted thoroughly on real microarray lung cancer datasets. Reliability was assessed by cross-dataset validation. The experimental results show that the GEP model using fewer feature genes outperformed the other models in terms of accuracy, sensitivity, specificity and area under the receiver operating characteristic curve. It is concluded that the GEP model is a better solution to lung cancer prediction problems.
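
The four evaluation criteria named in this abstract (accuracy, sensitivity, specificity and area under the ROC curve) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores are assumptions made for illustration only, and the AUC is computed with the standard rank-based (Mann-Whitney) estimate.

```python
# Illustrative sketch only: names, threshold and data are hypothetical,
# not taken from the cited study.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = cancer, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def sensitivity(y_true, y_pred):
    """True positive rate: fraction of actual positives predicted positive."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)

def specificity(y_true, y_pred):
    """True negative rate: fraction of actual negatives predicted negative."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: six samples with hypothetical classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), sensitivity(y_true, y_pred),
      specificity(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy, sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds, which is presumably why studies of this kind report all four together.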


Subject(s)
Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Oligonucleotide Array Sequence Analysis , Algorithms , Area Under Curve , Biomarkers, Tumor/genetics , Computational Biology , Data Interpretation, Statistical , Humans , Neural Networks, Computer , ROC Curve , Reproducibility of Results , Sensitivity and Specificity , Support Vector Machine