Results 1 - 20 of 20
1.
Sensors (Basel) ; 24(13), 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39001123

ABSTRACT

As 5G technology becomes more widespread, the significant improvement in network speed and connection density has introduced more challenges to network security. In particular, distributed denial of service (DDoS) attacks have become more frequent and complex in software-defined network (SDN) environments. The complexity and diversity of 5G networks result in a large number of unnecessary features, which may introduce noise into the detection process of an intrusion detection system (IDS) and reduce the generalization ability of the model. This paper aims to improve the performance of the IDS in 5G networks, especially in terms of detection speed and accuracy. It proposes an innovative feature selection (FS) method to select the most representative and distinguishing features from network traffic data to improve the robustness and detection efficiency of the IDS. To confirm the suggested method's efficacy, this paper uses four common machine learning (ML) models to evaluate it on the InSDN, CICIDS2017, and CICIDS2018 datasets and conducts real-time DDoS attack detection on the simulation platform. According to experimental results, the suggested FS technique can meet 5G network requirements for high speed and high reliability of the IDS while also drastically cutting down on detection time and preserving or improving DDoS detection accuracy.

2.
J Med Syst ; 48(1): 10, 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38193948

ABSTRACT

Gene expression datasets offer a wide range of information about various biological processes. However, it is difficult to find the important genes among high-dimensional biological data because of redundant and unimportant ones. Numerous Feature Selection (FS) techniques have been created to overcome this obstacle. Improving the efficacy and precision of FS methodologies is crucial in order to identify significant genes in complex biological data. In this work, we present a novel approach to gene selection called the Sine Cosine and Cuckoo Search Algorithm (SCACSA). This hybrid method is designed to work with the well-known machine learning classifier Support Vector Machine (SVM). Using a dataset on breast cancer, the hybrid gene selection algorithm's performance is carefully assessed and compared to other feature selection methods. To improve the quality of the feature set, we use minimum Redundancy Maximum Relevance (mRMR) as a filtering strategy in the first step. The hybrid SCACSA method is then used to enhance and optimize the gene selection procedure. Lastly, we classify the dataset according to the chosen genes by using the SVM classifier. Given the pivotal role gene selection plays in unraveling complex biological datasets, SCACSA stands out as an invaluable tool for the classification of cancer datasets. The findings help medical practitioners make well-informed decisions about cancer diagnosis and provide them with a valuable tool for navigating the complex world of gene expression data.


Subject(s)
Algorithms, Breast Neoplasms, Humans, Female, Breast Neoplasms/genetics, Health Personnel, Machine Learning, Support Vector Machine
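
The mRMR filtering step mentioned in the abstract above can be sketched as a greedy loop. This is a minimal illustration only, assuming discrete (already binned) expression values and the common "relevance minus mean redundancy" criterion; the function names `mutual_information` and `mrmr` are hypothetical and are not taken from the paper, whose exact mRMR variant is not stated.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr(features, labels, k):
    """Greedily pick up to k feature indices.

    features: list of columns, each a list of discrete values.
    Score of a candidate = relevance to the labels minus its mean
    redundancy with the features already selected (an assumed criterion).
    """
    relevance = [mutual_information(col, labels) for col in features]
    selected = []
    remaining = set(range(len(features)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                mutual_information(features[j], features[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data (not from the paper): column 1 duplicates column 0,
    # column 2 carries complementary information about the label.
    labels = [0, 0, 1, 1, 2, 2, 3, 3]
    features = [
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 1, 1, 0, 0, 1, 1],
    ]
    print(mrmr(features, labels, 2))
```

In the toy data the duplicated column adds no new information once its twin is selected, so the redundancy term steers the second pick toward the complementary column. A wrapper stage such as the hybrid optimizer described in the abstract would then search within the filtered subset.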
3.
Entropy (Basel) ; 23(9), 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-34573818

ABSTRACT

With the widespread use of intelligent information systems, massive amounts of data with many irrelevant, noisy, and redundant features are collected, and many features must be handled. Therefore, introducing an efficient feature selection (FS) approach becomes a challenging aim. In the recent decade, various artificial methods and swarm models inspired by biological and social systems have been proposed to solve different problems, including FS. In this paper, an innovative approach is proposed based on a hybrid integration of two intelligent algorithms, electric fish optimization (EFO) and the arithmetic optimization algorithm (AOA), to boost the exploration stage of EFO so that it can handle high-dimensional FS problems with a remarkable convergence speed. The proposed EFOAOA is examined on eighteen datasets from different real-life applications. The EFOAOA results are compared with a set of recent state-of-the-art optimizers using a set of statistical metrics and the Friedman test. The comparisons show the positive impact of integrating the AOA operator into EFO, as the proposed EFOAOA can identify the most important features with high accuracy and efficiency. Compared with the other FS methods, it obtained the lowest number of features and the highest accuracy in 50% and 67% of the datasets, respectively.

4.
Comput Biol Med ; 182: 109175, 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39321584

ABSTRACT

Bladder cancer (BC) diagnosis presents a critical challenge in biomedical research, necessitating accurate tumor classification from diverse datasets for effective treatment planning. This paper introduces a novel wrapper feature selection (FS) method that leverages a hybrid optimization algorithm combining Orthogonal Learning (OL) with a rime optimization algorithm (RIME), termed mRIME. The mRIME algorithm is designed to avoid local optima, streamline the search process, and select the most relevant features without compromising classifier performance. The paper also introduces mRIME-SVM, a novel hybrid model integrating mRIME for FS with a Support Vector Machine (SVM) for classification. The mRIME algorithm is employed as an FS method and is also used to fine-tune the hyperparameters of the SVM, enhancing the overall classification accuracy. Specifically, mRIME navigates complex search spaces to optimize FS without compromising classifier performance. Evaluated on eight diverse BC datasets, mRIME-SVM outperforms popular metaheuristic algorithms, ensuring precise and reliable diagnostic outcomes. Moreover, the proposed mRIME was employed for tackling global optimization problems. It has been thoroughly assessed using the IEEE Congress on Evolutionary Computation 2022 (CEC'2022) test suite. Comparative analyses with Gray wolf optimization (GWO), Whale optimization algorithm (WOA), Harris hawks optimization (HHO), Golden Jackal Optimization (GJO), Hunger Game optimization algorithm (HGS), Sinh Cosh Optimizer (SCHO), and the original RIME highlight mRIME's competitiveness and efficacy across diverse optimization tasks. Leveraging mRIME's success, mRIME-SVM achieves high classification accuracy on nine BC datasets, surpassing existing models. Results underscore mRIME's competitiveness and applicability across diverse optimization tasks, extending its utility to enhance BC classification.
This study contributes to advancing BC diagnostics with a robust computational framework, promising broader applications in bioinformatics and AI-driven medical research.

5.
Comput Biol Med ; 180: 108984, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39128177

ABSTRACT

The identification of tumors through gene analysis in microarray data is a pivotal area of research in artificial intelligence and bioinformatics. This task is challenging due to the large number of genes relative to the limited number of observations, making feature selection a critical step. This paper introduces a novel wrapper feature selection method that leverages a hybrid optimization algorithm combining a genetic operator with a Sinh Cosh Optimizer (SCHO), termed SCHO-GO. The SCHO-GO algorithm is designed to avoid local optima, streamline the search process, and select the most relevant features without compromising classifier performance. Traditional methods often falter with extensive search spaces, necessitating hybrid approaches. Our method aims to reduce the dimensionality and improve the classification accuracy, which is essential in pattern recognition and data analysis. The SCHO-GO algorithm, integrated with a support vector machine (SVM) classifier, significantly enhances cancer classification accuracy. We evaluated the performance of SCHO-GO using the CEC'2022 benchmark function and compared it with seven well-known metaheuristic algorithms. Statistical analyses indicate that SCHO-GO consistently outperforms these algorithms. Experimental tests on eight microarray gene expression datasets, particularly the Gene Expression Cancer RNA-Seq dataset, demonstrate an impressive accuracy of 99.01% with the SCHO-GO-SVM model, highlighting its robustness and precision in handling complex datasets. Furthermore, the SCHO-GO algorithm excels in feature selection and solving mathematical benchmark problems, presenting a promising approach for tumor identification and classification in microarray data analysis.


Subject(s)
Neoplasms, Support Vector Machine, Humans, Neoplasms/genetics, Algorithms, Computational Biology/methods, Gene Expression Profiling/methods
6.
Diagnostics (Basel) ; 14(11), 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38893707

ABSTRACT

This study, utilizing high-throughput technologies and Machine Learning (ML), has identified gene biomarkers and molecular signatures in Inflammatory Bowel Disease (IBD). We could identify significant upregulated or downregulated genes in IBD patients by comparing gene expression levels in colonic specimens from 172 IBD patients and 22 healthy individuals using the GSE75214 microarray dataset. Our ML techniques and feature selection methods revealed six Differentially Expressed Gene (DEG) biomarkers (VWF, IL1RL1, DENND2B, MMP14, NAAA, and PANK1) with strong diagnostic potential for IBD. The Random Forest (RF) model demonstrated exceptional performance, with accuracy, F1-score, and AUC values exceeding 0.98. Our findings were rigorously validated with independent datasets (GSE36807 and GSE10616), further bolstering their credibility and showing favorable performance metrics (accuracy: 0.841, F1-score: 0.734, AUC: 0.887). Our functional annotation and pathway enrichment analysis provided insights into crucial pathways associated with these dysregulated genes. DENND2B and PANK1 were identified as novel IBD biomarkers, advancing our understanding of the disease. The validation in independent cohorts enhances the reliability of these findings and underscores their potential for early detection and personalized treatment of IBD. Further exploration of these genes is necessary to fully comprehend their roles in IBD pathogenesis and develop improved diagnostic tools and therapies. This study significantly contributes to IBD research with valuable insights, potentially greatly enhancing patient care.
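
For readers less familiar with the metrics quoted in the abstract above, the sketch below shows how accuracy, precision, recall, and F1-score follow from confusion-matrix counts. It is illustrative only: the function name `classification_metrics` is hypothetical, and the single call uses only the class sizes given in the abstract (172 IBD patients, 22 healthy individuals), not any result reported by the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives for the class treated as "positive".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical baseline: label every one of the 194 specimens as IBD.
    baseline = classification_metrics(tp=172, fp=22, fn=0, tn=0)
    print(round(baseline["accuracy"], 3))  # 172 / 194
```

With classes this imbalanced, a trivial all-positive labelling already reaches an accuracy of 172/194, roughly 0.887. That is one reason studies on such data usually report accuracy together with F1-score and AUC, as this abstract does.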

7.
Neural Comput Appl ; 35(7): 5251-5275, 2023.
Article in English | MEDLINE | ID: mdl-36340595

ABSTRACT

Feature selection (FS) is one of the basic data preprocessing steps in data mining and machine learning. It is used to reduce feature size and increase model generalization. In addition to minimizing feature dimensionality, it also enhances classification accuracy and reduces model complexity, which are essential in several applications. Traditional methods for feature selection often fail to find the globally optimal solution because of the large search space. Many hybrid techniques have been proposed that merge several search strategies which had previously been used individually to solve the FS problem. This study proposes a modified hunger games search algorithm (mHGS) for solving optimization and FS problems. The main advantage of the proposed mHGS is that it resolves the following drawbacks of the original HGS by (1) avoiding the local search, (2) solving the problem of premature convergence, and (3) balancing the exploitation and exploration phases. The mHGS has been evaluated on the IEEE Congress on Evolutionary Computation 2020 (CEC'20) test suite for optimization and on ten medical and chemical datasets. The data have dimensions of up to 20000 features or more. The results of the proposed algorithm have been compared to a variety of well-known optimization methods, including the improved multi-operator differential evolution algorithm (IMODE), gravitational search algorithm, grey wolf optimization, Harris Hawks optimization, whale optimization algorithm, slime mould algorithm, and hunger games search. The experimental results suggest that the proposed mHGS can generate effective search results without increasing the computational cost while improving the convergence speed. It has also improved the SVM classification performance.
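
The two goals named in the abstract above (fewer features, better accuracy) are often folded into a single fitness value that a metaheuristic then minimizes. The sketch below shows one widely used weighted-sum form; it is an assumption for illustration, since the abstract does not state the fitness function used by mHGS. The name `wrapper_fitness` and the default weight `alpha=0.99` are likewise hypothetical.

```python
def wrapper_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Weighted-sum fitness for wrapper feature selection (lower is better).

    error_rate: classification error of a model trained on the subset.
    n_selected / n_total: size of the subset relative to all features.
    alpha: assumed trade-off weight between error and subset size.
    """
    if n_selected == 0:
        # An empty subset cannot be classified; treat it as the worst case.
        return float("inf")
    return alpha * error_rate + (1.0 - alpha) * (n_selected / n_total)


if __name__ == "__main__":
    # Two hypothetical candidate subsets out of 20000 features.
    small = wrapper_fitness(error_rate=0.10, n_selected=50, n_total=20000)
    large = wrapper_fitness(error_rate=0.10, n_selected=5000, n_total=20000)
    print(small < large)  # same error, so the smaller subset scores better
```

With a weight close to 1, the error term dominates and the subset-size term mainly breaks ties between subsets of similar accuracy.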

8.
Comput Biol Med ; 165: 107389, 2023 10.
Article in English | MEDLINE | ID: mdl-37678138

ABSTRACT

This paper introduces a new bio-inspired optimization algorithm named the Liver Cancer Algorithm (LCA), which mimics the liver tumor growth and takeover process. It uses an evolutionary search approach that simulates the behavior of liver tumors when taking over the liver organ. The tumor's ability to replicate and spread to other organs inspires the algorithm. LCA algorithm is developed using genetic operators and a Random Opposition-Based Learning (ROBL) strategy to efficiently balance local and global searches and explore the search space. The algorithm's efficiency is tested on the IEEE Congress of Evolutionary Computation in 2020 (CEC'2020) benchmark functions and compared to seven widely used metaheuristic algorithms, including Genetic Algorithm (GA), particle swarm optimization (PSO), Differential Evolution (DE), Adaptive Guided Differential Evolution Algorithm (AGDE), Improved Multi-Operator Differential Evolution (IMODE), Harris Hawks Optimization (HHO), Runge-Kutta Optimization Algorithm (RUN), weIghted meaN oF vectOrs (INFO), and Coronavirus Herd Immunity Optimizer (CHIO). The statistical results of the convergence curve, boxplot, parameter space, and qualitative metrics show that the LCA algorithm performs competitively compared to well-known algorithms. Moreover, the versatility of the LCA algorithm extends beyond mathematical benchmark problems. It was also successfully applied to tackle the feature selection problem and optimize the support vector machine for various biomedical data classifications, resulting in the creation of the LCA-SVM model. The LCA-SVM model was evaluated in a total of twelve datasets, among which the MonoAmine Oxidase (MAO) dataset stood out, showing the highest performance compared to the other datasets. In particular, the LCA-SVM model achieved an impressive accuracy of 98.704% on the MAO dataset. 
This outstanding result demonstrates the efficacy and potential of the LCA-SVM approach in handling complex datasets and producing highly accurate predictions. The experimental results indicate that the LCA algorithm surpasses other methods to solve mathematical benchmark problems and feature selection.


Subject(s)
Liver Neoplasms, Humans, Algorithms, Benchmarking, Monoamine Oxidase
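
The Random Opposition-Based Learning (ROBL) strategy named in the abstract above can be sketched as follows. This uses one formulation common in the metaheuristics literature, x_opp = lower + upper - r * x with r drawn uniformly from [0, 1); whether LCA uses exactly this form is an assumption, as the abstract gives no formula. The helper names, the clamping to the bounds, and the greedy acceptance step are also illustrative choices.

```python
import random


def random_opposition(x, lower, upper, rng):
    """Return a randomly 'opposite' point of x inside [lower, upper]."""
    candidate = []
    for xi, lo, hi in zip(x, lower, upper):
        opp = lo + hi - rng.random() * xi
        # Clamp, because the formula can leave the box for some bounds.
        candidate.append(min(max(opp, lo), hi))
    return candidate


def keep_better(x, candidate, objective):
    """Greedy acceptance: keep the candidate only if it lowers the objective."""
    return candidate if objective(candidate) < objective(x) else x


def sphere(v):
    """Simple test objective with its minimum at the origin."""
    return sum(c * c for c in v)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [4.0, -3.0]
    cand = random_opposition(x, [-5.0, -5.0], [5.0, 5.0], rng)
    x = keep_better(x, cand, sphere)
    print(x)
```

The idea is that evaluating a point on the "other side" of the search box gives the population a cheap chance to escape a poor region, which is how such strategies help balance local and global search.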
9.
Neural Comput Appl ; 35(5): 3903-3923, 2023.
Article in English | MEDLINE | ID: mdl-36267472

ABSTRACT

Due to technical advancements and the proliferation of mobile applications, facial analysis (FA) of humans has recently become an important area for computer vision research. FA investigates a variety of difficulties, including gender recognition, facial expression recognition, age and race recognition, with the goal of automatically comprehending social interactions. Due to the dimensional challenge posed by pre-trained CNN networks, the scientific community has developed numerous techniques inspired by biology, swarm intelligence theory, physics, and mathematical rules. This article presents a gender recognition system based on scAOA, that is a modified version of the Archimedes optimization algorithm (AOA). The latest variant (scAOA) enhances the exploitation stage by using trigonometric operators inspired by the sine cosine algorithm (SCA) in order to prevent local optima and to accelerate the convergence. The main purpose of this paper is to apply scAOA to select the relevant deep features provided by two pretrained models of CNN (AlexNet & ResNet) to recognize the gender of a human person categorized into two classes (men and women). Two datasets are used to evaluate the proposed approach (scAOA): the Brazilian FEI dataset and the Georgia Tech Face dataset (GT). In terms of accuracy, Fscore and statistical test, the comparison analysis demonstrates that scAOA outperforms other modern and competitive optimizers such as AOA, SCA, Ant lion optimizer (ALO), Salp swarm algorithm (SSA), Grey wolf optimizer (GWO), Simple genetic algorithm (SGA), Grasshopper optimization algorithm (GOA) and Particle swarm optimizer (PSO).
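
The trigonometric operators that the abstract above borrows from the sine cosine algorithm (SCA) can be illustrated with the standard SCA position update. This sketch shows plain SCA only, not scAOA itself, whose exact combination with the Archimedes optimization algorithm is not described in the abstract. The function name `sca_step` and the default `a=2.0` are assumptions for illustration.

```python
import math
import random


def sca_step(position, best, t, max_iter, a=2.0, rng=random):
    """One standard SCA update of a single solution toward the best one.

    position, best: lists of floats of equal length.
    t, max_iter: current iteration and iteration budget; the step
    amplitude r1 shrinks linearly from a to 0 as t approaches max_iter.
    """
    r1 = a - t * (a / max_iter)
    new_position = []
    for x, p in zip(position, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)  # direction of the move
        r3 = rng.uniform(0.0, 2.0)            # random weight on the best point
        r4 = rng.random()                     # switch between sine and cosine
        trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        new_position.append(x + r1 * trig * abs(r3 * p - x))
    return new_position


if __name__ == "__main__":
    rng = random.Random(42)
    print(sca_step([1.0, -2.0, 0.5], [0.0, 0.0, 0.0], t=1, max_iter=100, rng=rng))
```

Early iterations (large r1) take wide oscillating steps around the best solution, while later iterations shrink the step to refine it, which is the exploitation behaviour the abstract refers to.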

10.
Front Genet ; 13: 844542, 2022.
Article in English | MEDLINE | ID: mdl-35664298

ABSTRACT

The standard therapy administered to patients with advanced esophageal cancer remains uniform, even though its two main histological subtypes, namely esophageal squamous cell carcinoma (SCC) and esophageal adenocarcinoma (AC), are increasingly considered to be different. The identification of potential drug target genes that differ between SCC and AC is crucial for more effective treatment of these diseases, given the high toxicity of chemotherapy and resistance to administered medications. Herein we attempted to identify and rank differentially expressed genes (DEGs) in SCC vs. AC using ensemble feature selection methods. RNA-seq data from The Cancer Genome Atlas and the Fudan-Taizhou Institute of Health Sciences (China) were used. Six feature filter algorithms were used to identify DEGs. We built robust predictive models for histological subtypes with the random forest (RF) classification algorithm. Pathway analysis was also performed to investigate the functional role of genes. A total of 294 informative DEGs (87 of them newly discovered) were identified. The areas under the receiver operating characteristic curve (AUC) were higher than 99.5% for all feature selection (FS) methods. Nine genes (i.e., ERBB3, ATP7B, ABCC3, GALNT14, CLDN18, GUCY2C, FGFR4, KCNQ5, and CACNA1B) may play a key role in the development of more directed anticancer therapy for SCC and AC patients. The first four of them are drug targets for chemotherapy and immunotherapy of esophageal cancer and are involved in pharmacokinetics and pharmacodynamics pathways. This research identified novel DEGs in SCC and AC and detected four potential drug target genes (ERBB3, ATP7B, ABCC3, and GALNT14) and five drug-related genes.

11.
Soft comput ; 26(19): 10435-10464, 2022.
Article in English | MEDLINE | ID: mdl-35250374

ABSTRACT

Human facial analysis (HFA) has recently become an attractive topic for computer vision research due to technological progress and mobile applications. HFA explores several issues, such as gender recognition (GR), facial expression, age, and race recognition, for automatically understanding social life. This study explores HFA from the angle of recognizing a person's gender from their face. Several hard challenges arise, such as illumination, occlusion, facial emotions, image quality, and camera capture angle, making gender recognition more difficult for machines. The Archimedes optimization algorithm (AOA) was recently designed as a population-based metaheuristic optimization method, inspired by the physical notion behind Archimedes' theory. Compared to other swarm algorithms in the realm of optimization, this method promotes a good balance between exploration and exploitation. The convergence area is increased by incorporating extra data into the solution, such as volume and density. Because of these benefits of AOA and the fact that it has not been used to choose the best area of the face, we propose utilizing it in a wrapper feature selection technique, which is a real motivation in the field of computer vision and machine learning. The paper's primary purpose is to automatically determine the optimal face area using AOA to recognize the gender of a person, categorized into two classes (men and women). In this paper, the facial image is divided into several subregions (blocks), where each area provides a vector of characteristics using one handcrafted technique such as the local binary pattern (LBP), histogram of oriented gradients (HOG), or gray-level co-occurrence matrix (GLCM). Two experiments assess the proposed method (AOA). The first employs two benchmarking datasets: the Georgia Tech Face dataset (GT) and the Brazilian FEI dataset. The second uses a larger and more challenging dataset, Gallagher's uncontrolled dataset.
The experimental results show the good performance of AOA compared to other recent and competitive optimizers for all datasets. In terms of accuracy, the AOA-based LBP outperforms the state-of-the-art deep convolutional neural network (CNN) with 96.08% for the Gallagher's dataset.

12.
Front Genet ; 13: 984068, 2022.
Article in English | MEDLINE | ID: mdl-36338976

ABSTRACT

SARS-COV-2 is prevalent all over the world, causing more than six million deaths and seriously affecting human health. At present, there is no specific drug against SARS-COV-2. Studying protein phosphorylation is an important way to understand the mechanism of SARS-COV-2 infection. It is often expensive and time-consuming to identify phosphorylation sites with specific modified residues through experiments. A method that uses machine learning to predict these sites is proposed. As all the methods of extracting protein sequence features are knowledge-driven, these features may not be effective for detecting phosphorylation sites without a complete understanding of the protein mechanism. Moreover, redundant features also have a great impact on how well the model fits. To solve these problems, we propose a feature selection method based on ensemble learning, which first extracts protein sequence features based on knowledge, then quantifies the importance score of each feature based on data, and finally uses the subset of important features as the final features to predict phosphorylation sites.

13.
Front Bioinform ; 2: 927312, 2022.
Article in English | MEDLINE | ID: mdl-36304293

ABSTRACT

Machine learning has shown utility in detecting patterns within large, unstructured, and complex datasets. One of the promising applications of machine learning is in precision medicine, where disease risk is predicted using patient genetic data. However, creating an accurate prediction model based on genotype data remains challenging due to the so-called "curse of dimensionality" (i.e., extensively larger number of features compared to the number of samples). Therefore, the generalizability of machine learning models benefits from feature selection, which aims to extract only the most "informative" features and remove noisy "non-informative," irrelevant and redundant features. In this article, we provide a general overview of the different feature selection methods, their advantages, disadvantages, and use cases, focusing on the detection of relevant features (i.e., SNPs) for disease risk prediction.

14.
Math Biosci Eng ; 18(4): 3813-3854, 2021 Apr 30.
Article in English | MEDLINE | ID: mdl-34198414

ABSTRACT

Feature selection (FS) is a classic and challenging optimization task in the field of machine learning and data mining. The gradient-based optimizer (GBO) is a recently developed population-based metaheuristic inspired by the gradient-based Newton's method. It uses two main operators, the gradient search rule (GSR) and the local escape operator (LEO), together with a set of vectors to explore the search space for solving continuous problems. This article presents a binary GBO (BGBO) algorithm for feature selection problems. Eight independent GBO variants are proposed, and eight transfer functions, divided into two families of S-shaped and V-shaped functions, are evaluated to map the continuous search space to a discrete one. To verify the performance of the proposed binary GBO algorithm, 18 well-known UCI datasets and 10 high-dimensional datasets are tested and compared with other advanced FS methods. The experimental results show that one of the proposed binary GBO algorithms has the best comprehensive performance and performs better than other well-known metaheuristic algorithms in terms of the performance measures.
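
The role of S-shaped and V-shaped transfer functions in the abstract above can be sketched with one representative of each family. The sigmoid and |tanh| used here are common textbook choices and are assumptions for illustration; the abstract does not list the eight functions actually evaluated. The function names are hypothetical.

```python
import math
import random


def s_shaped(v):
    """Sigmoid transfer function: probability that the bit becomes 1."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)  # numerically safe branch for very negative v
    return e / (1.0 + e)


def v_shaped(v):
    """|tanh| transfer function: probability that the bit is flipped."""
    return abs(math.tanh(v))


def binarize_s(position, rng):
    """S-shaped rule: set each bit to 1 with probability s_shaped(v)."""
    return [1 if rng.random() < s_shaped(v) else 0 for v in position]


def binarize_v(position, bits, rng):
    """V-shaped rule: flip the current bit with probability v_shaped(v)."""
    return [1 - b if rng.random() < v_shaped(v) else b
            for v, b in zip(position, bits)]


if __name__ == "__main__":
    rng = random.Random(7)
    continuous = [2.5, -0.3, 0.0, -4.0]
    current_mask = [1, 0, 1, 0]  # 1 means the feature is selected
    print(binarize_s(continuous, rng))
    print(binarize_v(continuous, current_mask, rng))
```

The two families differ in what the continuous value means: an S-shaped function turns it into the probability of selecting a feature, whereas a V-shaped function turns its magnitude into the probability of changing the current selection.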

15.
Front Genet ; 12: 793629, 2021.
Article in English | MEDLINE | ID: mdl-35350819

ABSTRACT

OMIC datasets have high dimensions, and the connection among OMIC features is very complicated. It is difficult to establish linkages among these features and certain biological traits of significance. The proposed ensemble swarm intelligence-based approaches can identify key biomarkers and reduce feature dimension efficiently. It is an end-to-end method that only relies on the rules of the algorithm itself, without presets such as the number of filtering features. Additionally, this method achieves good classification accuracy without excessive consumption of computing resources.

16.
Interdiscip Sci ; 13(3): 463-475, 2021 Sep.
Article in English | MEDLINE | ID: mdl-32533456

ABSTRACT

In the vast field of bioinformatics research, an enormous volume of genetic information has been produced. High-throughput devices available at lower cost have ushered in the age of big data. In a time of growing data complexity and volume, feature selection has a key role to play in reducing high dimensionality in AI problems. With such large volumes of data, selecting the right features in large medical databases has become extremely challenging. Large clinical datasets frequently comprise a large number of disease identifiers. When data mining is applied to clinical data to identify diseases, some identifiers are not very useful and may sometimes even have a negative impact. Consequently, applying FS is vital, as it can remove those insignificant disease identifiers. It also increases the effectiveness of a physician's decision support system by reducing the system's learning time. In this paper, a unique approach to feature selection is presented that uses the Artificial Plant algorithm together with the Enhanced Support Vector Machine classifier. The resulting features are further reduced in dimension by applying the Improved Singular Value Decomposition strategy; finally, optimization is performed with the well-known BAT optimization method. The experiments are carried out on real-time large cervical cancer data, and the approach proved more effective than current methods.


Subject(s)
Algorithms, Support Vector Machine, Computational Biology, Data Mining, Humans
17.
Brain Sci ; 10(11), 2020 Nov 17.
Article in English | MEDLINE | ID: mdl-33212777

ABSTRACT

Motor deficiencies constitute a significant problem affecting millions of people worldwide. Such people suffer from a debility in daily functioning, which may lead to reduced and inconsistent daily routines and a deterioration in their quality of life (QoL). Thus, there is an essential need for assistive systems to help those people perform their daily actions and enhance their overall QoL. This study proposes a novel brain-computer interface (BCI) system for assisting people with limb motor disabilities in performing their daily life activities by using their brain signals to control assistive devices. The extraction of useful features is vital for an efficient BCI system. Therefore, the proposed system consists of a hybrid feature set that feeds into three machine-learning (ML) classifiers to classify motor imagery (MI) tasks. This hybrid feature selection (FS) system is a practical, real-time, and efficient BCI with low computational cost. We investigate different combinations of channels to select the combination that has the highest impact on performance. The results indicate that the highest achieved accuracies using a support vector machine (SVM) classifier are 93.46% and 86.0% for the BCI competition III-IVa dataset and the autocalibration and recurrent adaptation dataset, respectively. These datasets are used to test the performance of the proposed BCI. We also verify the effectiveness of the proposed BCI by comparing its performance with recent studies. We show that the proposed system is accurate and efficient. Future work can apply the proposed system to individuals with limb motor disabilities to assist them and test its capability to improve their QoL. Moreover, future work can examine the system's performance in controlling assistive devices such as wheelchairs or artificial limbs.

18.
Comput Biol Chem ; 71: 161-169, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29096382

ABSTRACT

This paper proposes a new hybrid search technique for feature (gene) selection (FS) using Independent Component Analysis (ICA) and Artificial Bee Colony (ABC), called ICA+ABC, to select informative genes based on a Naïve Bayes (NB) algorithm. An important trait of this technique is the optimization of the ICA feature vector using ABC. ICA+ABC is a hybrid search algorithm that combines the benefits of the extraction approach, which reduces the size of the data, and the wrapper approach, which optimizes the reduced feature vectors. The technique is assessed by evaluating the performance of ICA+ABC on six standard gene expression classification datasets. Extensive experiments were conducted to compare the performance of ICA+ABC with the results obtained from the recently published Minimum Redundancy Maximum Relevance (mRMR)+ABC algorithm for the NB classifier. In addition, to check how well ICA+ABC works as a feature selection method with the NB classifier, ICA was combined with popular filter techniques and with other similar bio-inspired algorithms such as Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), and the results were compared. The results show that ICA+ABC has a significant ability to generate small subsets of genes from the ICA feature vector that significantly improve the classification accuracy of the NB classifier compared to other previously suggested methods.


Subject(s)
Algorithms, Bayes Theorem, Oligonucleotide Array Sequence Analysis, Gene Expression, Humans
19.
Comput Methods Programs Biomed ; 113(1): 175-85, 2014.
Article in English | MEDLINE | ID: mdl-24210167

ABSTRACT

Medical datasets are often characterized by a large number of disease measurements and a relatively small number of patient records. Not all of these measurements (features) are important; some are irrelevant or noisy. Such features may be especially harmful in the case of relatively small training sets, where this irrelevancy and redundancy is harder to evaluate. In addition, this extreme number of features increases the memory needed to represent the dataset. Feature Selection (FS) is a solution that involves finding a subset of prominent features to improve predictive accuracy and to remove the redundant features. Thus, the learning model receives a concise structure, built using only the selected prominent features, without forfeiting predictive accuracy. FS is therefore now an essential part of knowledge discovery. In this study, new supervised feature selection methods based on hybridization of Particle Swarm Optimization (PSO), namely PSO-based Relative Reduct (PSO-RR) and PSO-based Quick Reduct (PSO-QR), are presented for disease diagnosis. The experimental results on several standard medical datasets demonstrate the efficiency of the proposed technique as well as enhancements over existing feature selection techniques.


Subject(s)
Diagnosis, Theoretical Models
20.
Comput Methods Programs Biomed ; 113(2): 465-73, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24290902

ABSTRACT

Machine learning-based classification techniques provide support for the decision-making process in many areas of health care, including diagnosis, prognosis, screening, etc. Feature selection (FS) is expected to improve classification performance, particularly in situations characterized by the high data dimensionality problem caused by relatively few training examples compared to a large number of measured features. In this paper, a random forest classifier (RFC) approach is proposed to diagnose lymph diseases. Focusing on feature selection, the first stage of the proposed system aims at constructing diverse feature selection algorithms such as genetic algorithm (GA), Principal Component Analysis (PCA), Relief-F, Fisher, Sequential Forward Floating Search (SFFS) and the Sequential Backward Floating Search (SBFS) for reducing the dimension of lymph diseases dataset. Switching from feature selection to model construction, in the second stage, the obtained feature subsets are fed into the RFC for efficient classification. It was observed that GA-RFC achieved the highest classification accuracy of 92.2%. The dimension of input feature space is reduced from eighteen to six features by using GA.


Subject(s)
Algorithms, Lymphatic Diseases/classification, Artificial Intelligence, Humans, Principal Component Analysis
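
Several of the selectors listed in the abstract above are sequential searches. The sketch below shows plain sequential forward selection, a simplified relative of SFFS that omits the "floating" backward step. The `score` callback stands in for whatever subset evaluation is used (for example, cross-validated classifier accuracy); it and the toy weights are hypothetical and not taken from the paper.

```python
def sequential_forward_selection(n_features, score, k):
    """Greedily grow a feature subset of size k.

    n_features: total number of candidate features (indices 0..n-1).
    score: callable taking a list of feature indices and returning a
           number where higher is better (assumed, user supplied).
    k: target subset size.
    """
    selected = []
    while len(selected) < k:
        candidates = [j for j in range(n_features) if j not in selected]
        if not candidates:
            break
        # Add the single feature that most improves the subset score.
        best = max(candidates, key=lambda j: score(selected + [j]))
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Toy additive score: each feature has a fixed usefulness weight.
    weights = [0.1, 0.9, 0.5, 0.3]

    def toy_score(subset):
        return sum(weights[j] for j in subset)

    print(sequential_forward_selection(len(weights), toy_score, k=2))
```

A floating variant would, after each addition, also try removing previously chosen features whenever that improves the score, which lets the search undo early greedy choices.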