Results 1 - 20 of 22
1.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35062018

ABSTRACT

Combination therapy has shown an obvious curative effect on complex diseases, whereas the search space of drug combinations is too large to be validated experimentally even with high-throughput screens. With the increase of the number of drugs, artificial intelligence techniques, especially machine learning methods, have become applicable for the discovery of synergistic drug combinations to significantly reduce the experimental workload. In this study, in order to predict novel synergistic drug combinations in various cancer cell lines, the cell line-specific drug-induced gene expression profile (GP) is added as a new feature type to capture the cellular response of drugs and reveal the biological mechanism of synergistic effect. Then, an enhanced cascade-based deep forest regressor (EC-DFR) is innovatively presented to apply the new small-scale drug combination dataset involving chemical, physical and biological (GP) properties of drugs and cells. Verified by the dataset, EC-DFR outperforms two state-of-the-art deep neural network-based methods and several advanced classical machine learning algorithms. Biological experimental validation performed subsequently on a set of previously untested drug combinations further confirms the performance of EC-DFR. What is more prominent is that EC-DFR can distinguish the most important features, making it more interpretable. By evaluating the contribution of each feature type, GP feature contributes 82.40%, showing the cellular responses of drugs may play crucial roles in synergism prediction. The analysis based on the top contributing genes in GP further demonstrates some potential relationships between the transcriptomic levels of key genes under drug regulation and the synergism of drug combinations.


Subjects
Artificial Intelligence; Computational Biology; Computational Biology/methods; Drug Combinations; Machine Learning; Neural Networks, Computer
2.
BMC Bioinformatics ; 24(1): 325, 2023 Aug 29.
Article in English | MEDLINE | ID: mdl-37644423

ABSTRACT

INTRODUCTION: There are countless possible drug combinations, which makes it expensive and time-consuming to rely solely on clinical trials to determine the effect of each one. To screen out the most effective drug combinations more quickly, researchers have begun to apply machine learning to drug combination prediction. However, most of these models have low interpretability. Consequently, even though they can sometimes achieve high prediction accuracy, experts in the medical and biological fields still cannot fully rely on their judgments because the decision-making process is not known. RELATED WORK: Decision trees and their ensemble algorithms are considered suitable for pharmaceutical applications because of their excellent performance and good interpretability. We review existing decision tree and decision tree ensemble algorithms in the medical field and point out their shortcomings. METHOD: This study proposes a decision stump (DS)-based solution to extract interpretable knowledge from data sets. In this method, a set of DSs is first generated and then selectively assembled into a decision tree (DST). Unlike a traditional decision tree, our algorithm not only enables a partial exchange of information between base classifiers by introducing a stump exchange method but also uses a modified Gini index to evaluate stump performance, so that the generation of each node is evaluated from a global view to maintain high generalization ability. Furthermore, these trees are combined to construct an ensemble of DSTs (EDST). EXPERIMENT: Two-drug combination data sets collected from two cell lines, with three classes (additive, antagonistic and synergistic effects), are used to test our method. Experimental results show that both DST and EDST perform better than other methods. In addition, the rules generated by our methods are more compact and more accurate than those of other rule-based algorithms. Finally, we analyze the knowledge extracted by the model from a bioinformatics perspective. CONCLUSION: The novel decision tree ensemble model can effectively predict the effects of drug combinations and makes the decision-making process easy to obtain.


Subjects
Algorithms; Computational Biology; Cell Line; Drug Combinations; Knowledge
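
As a rough illustration of the decision stump idea in entry 2, the sketch below picks a single threshold on one feature by minimizing the weighted Gini impurity of the two resulting groups. It uses the standard Gini impurity, not the modified Gini index or the stump exchange method described in the abstract, whose details are not given here; the function names `gini` and `best_stump` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Standard Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_stump(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    If no split improves on the unsplit node, threshold is None.
    """
    n = len(values)
    best = (None, gini(labels))  # baseline: impurity without any split
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Toy example: four dose-like values with two effect classes.
threshold, impurity = best_stump([1, 2, 3, 4], ["a", "a", "s", "s"])
print(threshold, impurity)
```

A full decision tree would apply this search recursively over many features; the stump is the one-level special case.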
3.
J Chem Inf Model ; 63(12): 3941-3954, 2023 06 26.
Article in English | MEDLINE | ID: mdl-37303117

ABSTRACT

Combination therapy is a promising clinical treatment strategy for cancer and other complex diseases. Multiple drugs can target multiple proteins and pathways, greatly improving the therapeutic effect and slowing down drug resistance. To narrow the search space of synergistic drug combinations, many prediction models have been developed. However, drug combination datasets always have the characteristics of class imbalance. Synergistic drug combinations receive the most attention in clinical application but are in small numbers. To predict synergistic drug combinations in different cancer cell lines, in this study, we propose a genetic algorithm-based ensemble learning framework, GA-DRUG, to address the problems of class imbalance and high dimensionality of input data. The cell-line-specific gene expression profiles under drug perturbations are used to train GA-DRUG, which contains imbalanced data processing and the search of global optimal solutions. Compared to 11 state-of-the-art algorithms, GA-DRUG achieves the best performance and significantly improves the prediction performance in the minority class (Synergy). The ensemble framework can effectively correct the classification results of a single classifier. In addition, the cellular proliferation experiment performed on several previously unexplored drug combinations further confirms the predictive ability of GA-DRUG.


Subjects
Algorithms; Neoplasms; Humans; Drug Combinations; Neoplasms/drug therapy; Proteins; Machine Learning
4.
Methods ; 208: 48-58, 2022 12.
Article in English | MEDLINE | ID: mdl-36283656

ABSTRACT

Automatic whole heart segmentation plays an important role in the treatment and research of cardiovascular diseases. In this paper, we propose an improved Deep Forest framework, named Multi-Resolution Deep Forest Framework (MRDFF), which accomplishes whole heart segmentation in two stages. We extract the heart region by binary classification in the first stage, thus avoiding the class imbalance problem caused by too much background. The results of the first stage are then subdivided in the second stage to obtain accurate cardiac substructures. In addition, we also propose hybrid feature fusion, multi-resolution fusion and multi-scale fusion to further improve the segmentation accuracy. Experiments on the public dataset MM-WHS show that our model can achieve comparable accuracy in about half the training time of neural network models.


Subjects
Image Processing, Computer-Assisted; Tomography, X-Ray Computed; Image Processing, Computer-Assisted/methods; Tomography, X-Ray Computed/methods; Neural Networks, Computer; Heart/diagnostic imaging; Forests
5.
Molecules ; 27(10)2022 May 12.
Article in English | MEDLINE | ID: mdl-35630587

ABSTRACT

In drug discovery, drug-induced liver injury (DILI) remains an active research field and is one of the most common and important issues in toxicity evaluation. It directly leads to high drug attrition. At present, a variety of computer algorithms based on molecular representations are available to predict DILI. A single molecular representation method has been found insufficient for toxicity prediction, and fusions of multiple molecular fingerprints have been used as model input. To address the high dimensionality and imbalance of DILI prediction data, this paper integrates existing datasets and designs a new algorithm framework, Rotation-Ensemble-GA (R-E-GA). The main idea is to find a feature subset with better predictive performance after rotating the fusion vector of high-dimensional molecular representations in the feature space. Then, an AdaBoost-type ensemble learning method is integrated into R-E-GA to improve the prediction accuracy. The experimental results show that R-E-GA performs better than other state-of-the-art algorithms, including ensemble learning-based and graph neural network-based methods. Under five-fold cross-validation, R-E-GA obtains an ACC of 0.77, an F1 score of 0.769, and an AUC of 0.842.


Subjects
Algorithms; Chemical and Drug Induced Liver Injury; Chemical and Drug Induced Liver Injury/diagnosis; Chemical and Drug Induced Liver Injury/etiology; Humans; Machine Learning; Neural Networks, Computer
6.
Small ; 17(4): e2006374, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33377273

ABSTRACT

Heterostructures are attracting increasing attention in the field of sodium-ion batteries. However, it is still unclear whether any two monophase components can be used to construct a high-performance heterostructure for sodium-ion batteries, and which kinds of heterostructures can boost electrochemical performance. In this study, attempts are made to answer these questions based on classical semiconductor theories of antiblocking and blocking interfaces. For this purpose, NiTe2-ZnTe antiblocking and CoTe2-ZnTe blocking heterostructures are synthesized through a bimetal-hexamine framework-derived strategy. The NiTe2-ZnTe antiblocking heterostructure exhibits excellent high-rate and cycling performance, while the CoTe2-ZnTe blocking heterostructure performs poorly, even compared to its monophase components. Further, kinetic measurements and theoretical calculations confirm that antiblocking heterointerfaces can boost Na-ion diffusion efficiency and decrease the diffusion barrier, which can be attributed to the highly conductive antiblocking heterointerfaces generated by electron transfer from NiTe2 to ZnTe. Therefore, this study provides a new perspective for designing heterostructures more efficiently, with significantly better Na-ion storage performance.

7.
Rapid Commun Mass Spectrom ; 32(13): 1068-1074, 2018 Jul 15.
Article in English | MEDLINE | ID: mdl-29504640

ABSTRACT

RATIONALE: A liquid chromatography/tandem mass spectrometry (LC/MS/MS) method for quantification of caspofungin in dried blood spots (DBS) was developed and validated. METHODS: The DBS samples were prepared by spotting whole blood onto Whatman 903 filter paper, drying at room temperature and extracting with 50% methanol, followed by further clean-up by protein precipitation with acetonitrile. Roxithromycin was selected as internal standard, and the separation of the analytes from endogenous components was accomplished on a Hypersil GOLD aQ column with a mobile phase composed of 0.1% formic acid (v/v) and methanol in gradient mode. The detection of the analytes was performed on a triple quadrupole mass spectrometer in positive electrospray ionization mode, and the following selected reaction monitoring (SRM) transitions were monitored: m/z 547.6 → 538.7 and 837.4 → 679.4 for quantification of caspofungin and the internal standard, respectively. RESULTS: The total analytical time was 8 min for each run. The calibration curve exhibited good linearity over the range from 0.2 to 20 µg/mL, and the lower limit of quantification (LLOQ) was 0.2 µg/mL for caspofungin in DBS. The recoveries of caspofungin ranged from 62.64% to 76.69%, and no obvious matrix effect was observed. The intra- and inter-day precision and accuracy were within acceptable limits, and caspofungin in DBS was stable after storage at room temperature for 24 h and at -80°C for 30 days. There was no evident effect of the hematocrit value on the analysis of caspofungin. CONCLUSIONS: The proposed method presents an alternative to the conventional venous sampling method, and was successfully used for a pharmacokinetic study of caspofungin in ICU patients.


Subjects
Antifungal Agents/blood; Chromatography, High Pressure Liquid/methods; Dried Blood Spot Testing/methods; Echinocandins/blood; Lipopeptides/blood; Tandem Mass Spectrometry/methods; Acetonitriles/chemistry; Antifungal Agents/isolation & purification; Caspofungin; Chemical Precipitation; Echinocandins/isolation & purification; Humans; Limit of Detection; Lipopeptides/isolation & purification; Methanol/chemistry; Reproducibility of Results
8.
Biomed Eng Online ; 13: 169, 2014 Dec 16.
Article in English | MEDLINE | ID: mdl-25514966

ABSTRACT

BACKGROUND: Intensity inhomogeneity occurs in many medical images, especially in vessel images. Overcoming the difficulty caused by image inhomogeneity is crucial for the segmentation of vessel images. METHODS: This paper proposes a localized hybrid level-set method for the segmentation of 3D vessel images. The proposed method integrates both local region information and boundary information for vessel segmentation, which is essential for the accurate extraction of tiny vessel structures. The local intensity information is first embedded into a region-based contour model, and then incorporated into the level-set formulation of the geodesic active contour model. Compared with a method based on a preset global threshold, the use of automatically calculated local thresholds enables the extraction of local image information, which is essential for the segmentation of vessel images. RESULTS: Experiments carried out on the segmentation of 3D vessel images demonstrate the strengths of using locally specified dynamic thresholds in our level-set method. Furthermore, both qualitative comparisons and quantitative validations have been performed to evaluate the effectiveness of the proposed model. CONCLUSIONS: Experimental results and validations demonstrate that the proposed model achieves better segmentation results than the original hybrid method.


Subjects
Blood Vessels/pathology; Diagnostic Imaging/methods; Image Processing, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Algorithms; Automation; Humans; Models, Theoretical; Pattern Recognition, Automated/methods; Software
9.
ACS Appl Mater Interfaces ; 16(7): 8403-8416, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38334116

ABSTRACT

Cancer immunotherapy is expected to achieve tumor treatment mainly by stimulating the patient's own immune system to kill tumor cells. However, the low immunogenicity of the tumor and the poor efficiency of tumor antigen presentation result in a variety of solid tumors that do not respond to immunotherapy. Herein, we designed a proton-gradient-driven porphyrin-based liposome (PBL) with highly efficient Toll-like receptor 7 (TLR7) agonist (imiquimod, R837) encapsulation (R837@PBL). R837@PBL rapidly released R837 in the acid microenvironment to activate the TLR in the endosome inner membrane to promote bone-marrow-derived dendritic cell maturation and enhance antigen presentation. R837@PBL upon laser irradiation triggered immunogenic cell death of tumor cells and tumor-associated antigen release after subcutaneous injection, activated TLR7, formed in situ tumor nanoadjuvants, and enhanced the antigen presentation efficiency. Photoimmunotherapy promoted the infiltration of cytotoxic T lymphocytes into tumor tissues, inhibited the growth of the treated and abscopal tumors, and exerted highly effective photoimmunotherapeutic effects. Hence, our designed in situ tumor nanoadjuvants are expected to be an effective treatment for treated and abscopal tumors, providing a novel approach for synergistic photoimmunotherapy of tumors.


Subjects
Neoplasms; Porphyrins; Humans; Imiquimod/pharmacology; Liposomes/pharmacology; Toll-Like Receptor 7/agonists; Protons; Porphyrins/pharmacology; Neoplasms/therapy; Immunotherapy; Adjuvants, Immunologic/pharmacology; Antigens, Neoplasm; Tumor Microenvironment; Cell Line, Tumor
10.
Article in English | MEDLINE | ID: mdl-37285251

ABSTRACT

Detecting pneumonia, especially coronavirus disease 2019 (COVID-19), from chest X-ray (CXR) images is one of the most effective ways for disease diagnosis and patient triage. The application of deep neural networks (DNNs) to CXR image classification is limited by the small sample size of well-curated data. To tackle this problem, this article proposes a distance transformation-based deep forest framework with hybrid-feature fusion (DTDF-HFF) for accurate CXR image classification. In the proposed method, hybrid features of CXR images are extracted in two ways: hand-crafted feature extraction and multigrained scanning. Different types of features are fed into different classifiers in the same layer of the deep forest (DF), and the prediction vector obtained at each layer is transformed into a distance vector based on a self-adaptive scheme. The distance vectors obtained by different classifiers are fused and concatenated with the original features, then input into the corresponding classifier at the next layer. The cascade grows until DTDF-HFF can no longer gain benefits from a new layer. We compare the proposed method with other methods on public CXR datasets, and the experimental results show that the proposed method can achieve state-of-the-art (SOTA) performance. The code will be made publicly available at https://github.com/hongqq/DTDF-HFF.
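
The cascade growth rule in entry 10 (concatenate each layer's output with the original features, and stop when a new layer no longer helps) can be sketched generically as below. This is only an illustration under stated assumptions: `grow_cascade` and the caller-supplied `fit_layer` callback are hypothetical names, and the distance transformation and feature fusion of DTDF-HFF are not reproduced.

```python
def grow_cascade(features, fit_layer, max_layers=10):
    """Grow cascade layers until the validation score stops improving.

    `fit_layer(inputs)` is a caller-supplied (hypothetical) function that
    trains one layer and returns `(score, outputs)`: a validation score and
    one output vector per sample.
    Returns `(number_of_layers_kept, best_score)`.
    """
    inputs = features
    best_score = float("-inf")
    n_layers = 0
    for _ in range(max_layers):
        score, outputs = fit_layer(inputs)
        if score <= best_score:
            break  # the new layer brings no benefit: discard it and stop
        best_score = score
        n_layers += 1
        # The next layer sees the original features plus this layer's outputs.
        inputs = [list(x) + list(o) for x, o in zip(features, outputs)]
    return n_layers, best_score
```

In practice `fit_layer` would wrap the training of the forests in one layer and their evaluation on held-out data; here it is left abstract so that only the stopping logic is shown.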

11.
Cell Rep Methods ; 3(2): 100411, 2023 02 27.
Article in English | MEDLINE | ID: mdl-36936075

ABSTRACT

Combination therapy is a promising approach in treating multiple complex diseases. However, the large search space of available drug combinations exacerbates the challenge of experimental screening. To predict synergistic drug combinations in different cancer cell lines, we propose an improved deep forest-based method, ForSyn, and design two forest types embedded in ForSyn. ForSyn handles imbalanced and high-dimensional data in medium-/small-scale datasets, which are inherent characteristics of drug combination datasets. Compared with 12 state-of-the-art methods, ForSyn ranks first on four metrics for eight datasets with different feature combinations. We conduct a systematic analysis to identify the most appropriate configuration parameters. We validate the predictive value of ForSyn with cell-based experiments on several previously unexplored drug combinations. Finally, a systematic analysis of feature importance is performed on the top contributing features extracted by ForSyn. The resulting key genes may play key roles in the corresponding cancers.


Subjects
Computational Biology; Neoplasms; Humans; Computational Biology/methods; Neoplasms/drug therapy; Drug Combinations; Cell Line
12.
World J Surg Oncol ; 10: 252, 2012 Nov 21.
Article in English | MEDLINE | ID: mdl-23170979

ABSTRACT

BACKGROUND: The histopathological and molecular heterogeneity of normal tissue adjacent to cancerous tissue (NTAC) and normal tissue adjacent to benign tissue (NTAB), together with the limited availability of specimens, makes deciphering the mechanisms of carcinogenesis challenging. Our goal was to identify histogenetic biomarkers that could be reliably used to define a transforming fingerprint using RNA in situ hybridization. METHODS: We evaluated 15 tumor-related RNA in situ hybridization biomarkers using tumor microarray and samples of seven tumor-adjacent normal tissues from 314 patients. Biomarkers were determined using comprehensive statistical methods (significance of support vector machine-based artificial intelligence and area under curve scoring of classification distribution). RESULTS: TP53 was found to be the most reliable index (P < 10^-7; area under curve > 87%) for distinguishing NTAC from NTAB, according to the results of a significance panel (BCL10, BECN1, BRCA2, FITH, PTCH11 and TP53). CONCLUSIONS: The genetic alterations in TP53 between NTAC and NTAB may provide new insight into field cancerization and tumor transformation.


Subjects
Biomarkers, Tumor/analysis; Tumor Suppressor Protein p53/analysis; Cell Transformation, Neoplastic; Genes, p53; Humans; In Situ Hybridization
13.
Comput Biol Med ; 135: 104534, 2021 08.
Article in English | MEDLINE | ID: mdl-34246156

ABSTRACT

In conventional medical image printing methods, volumetric medical data needs to be converted into STereo Lithography (STL) format, the most commonly used format for representing geometric models for 3D printing. However, this STL conversion process is not only time-consuming but, more importantly, often leads to a loss of accuracy. It has become a critical factor hindering the printing efficiency and precision of organ models. By examining the key characteristics of discrete medical volume data, this paper proposes a direct slicing technique for printing implicitly represented 3D medical models. The proposed method mainly consists of three algorithms: (1) a layer-based contour extraction algorithm for discrete volume data; (2) an inner shell construction algorithm based on discrete point differential indentation; (3) an infill generation algorithm based on the constructed virtual contour and scan lines. The proposed method has been applied to the slicing of several organ models, and the ratios of time cost and memory cost between the conventional method and the proposed method are about 4 to 100 and 1.1 to 1.4, respectively, which demonstrates that the proposed method greatly improves both time and space performance compared with the conventional STL-based method. Our technique extends the direct input format of geometric models for additive manufacturing. That is, discrete volume data can be used as a direct input for additive manufacturing without conversion to STL format.


Subjects
Algorithms; Printing, Three-Dimensional
14.
Bioinformatics ; 25(3): 331-7, 2009 Feb 01.
Article in English | MEDLINE | ID: mdl-19088122

ABSTRACT

MOTIVATION: Feature selection approaches have been widely applied to deal with the small sample size problem in the analysis of micro-array datasets. For the multiclass problem, the proposed methods are based on the idea of selecting a gene subset to distinguish all classes. However, it will be more effective to solve a multiclass problem by splitting it into a set of two-class problems and solving each problem with a respective classification system. RESULTS: We propose a genetic programming (GP)-based approach to analyze multiclass microarray datasets. Unlike the traditional GP, the individual proposed in this article consists of a set of small-scale ensembles, named as sub-ensemble (denoted by SE). Each SE consists of a set of trees. In application, a multiclass problem is divided into a set of two-class problems, each of which is tackled by a SE first. The SEs tackling the respective two-class problems are combined to construct a GP individual, so each individual can deal with a multiclass problem directly. Effective methods are proposed to solve the problems arising in the fusion of SEs, and a greedy algorithm is designed to keep high diversity in SEs. This GP is tested in five datasets. The results show that the proposed method effectively implements the feature selection and classification tasks.


Subjects
Algorithms; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Classification/methods; Pattern Recognition, Automated/methods; Reproducibility of Results; Sample Size
15.
Comput Methods Programs Biomed ; 196: 105598, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32599337

ABSTRACT

BACKGROUND AND OBJECTIVE: High-quality vascular modeling is crucial for blood flow simulations, i.e., computational fluid dynamics (CFD), because without an accurate geometric representation of the smooth vascular surface it is impossible to run meaningful blood flow simulations. The purpose of this work is to develop a high-quality vascular modeling and modification method for blood flow computations. METHODS: We develop a new technique for the accurate geometric modeling and modification of vasculatures using implicit extrusion surfaces (IES). In the proposed method, the skeleton of the vascular structure is subdivided into short curve segments, each of which is then represented implicitly and locally as the intersection of two mutually orthogonal implicit surfaces defined by distance functions. A set of contour points is extracted and fitted with an implicit curve to accurately specify the vessel cross-section profile, which is then extruded locally along the skeleton to fill the gaps between two vascular tube cross sections. We also present a new implicit geometric editing technique to modify the constructed vascular model with pathology for virtual stenting. RESULTS: Experimental results and validations show that accurate vascular models with highly smooth surfaces can be generated by the proposed method. In addition, we conduct blood flow simulations to demonstrate the effectiveness of the proposed method for hemodynamic simulations. CONCLUSIONS: The proposed technique can achieve precise geometric models of vasculatures with any required degree of smoothness for reliable blood flow simulations.


Subjects
Hemodynamics; Models, Cardiovascular; Computer Simulation
16.
Environ Pollut ; 267: 115500, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33254722

ABSTRACT

In predicting palm oil mill effluent (POME) degradation efficiency, a previously developed quadratic model quantitatively evaluated the effects of O2 flowrate, TiO2 loading and initial POME concentration in a lab-scale photocatalytic system, but it generalized poorly because of overfitting. Its high RMSE (131.61) and negative R2 (-630.49) indicate that it cannot describe POME degradation at unseen factor ranges, confirming the poor generalization. To overcome this issue, several models were developed with machine learning techniques, namely Gaussian Process Regression (GPR), Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM) and Regression Tree Ensemble (RTE), and were then assessed systematically. To achieve high generalization, all models were subjected to a 'train-all-test-all' strategy and to 5-fold and 10-fold cross validation. Specifically, the GPR model achieved high accuracy under the 'train-all-test-all' strategy, judging from its low RMSE (1.0394) and high R2 (0.9962), but this result carries a risk of overfitting. In contrast, despite a relatively poorer RMSE and R2 (1.7964 and 0.9886) in 5-fold cross validation, the GPR model showed the highest generalization while sufficiently preserving its accuracy. The SVM and RTE models also showed promising R2 values (0.9372 and 0.9208), which were however overshadowed by their high RMSEs (4.2174 and 4.7366). Furthermore, the strong generalization of the GPR model was also confirmed in 10-fold cross validation. The lowest RMSE (2.1624) and highest R2 (0.9835), obtained with a feature number of 36, support its adequacy in terms of both generalization and accuracy. The other models all yielded slightly lower R2 (> 0.9), plausibly due to their higher RMSE (> 4.0). According to the GPR model, optimized POME degradation (52.52%) can be obtained at 70 mL/min of O2, 70.0 g/L of TiO2 and 250 ppm of POME concentration, with only ∼3% error compared to the actual data.


Subjects
Industrial Waste; Waste Disposal, Fluid; Industrial Waste/analysis; Machine Learning; Palm Oil; Plant Oils
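
The R2 of -630.49 quoted in entry 16 may look odd, since R2 is often assumed to lie between 0 and 1. With the usual definition R2 = 1 - SS_res / SS_tot, the value becomes negative whenever a model predicts unseen data worse than simply predicting the mean. The sketch below shows RMSE, R2 and a plain k-fold index split in pure Python; the function names are hypothetical and the numbers are toy values, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Predictions that reverse the trend are worse than predicting the mean,
# so R2 drops below zero.
print(r2([1, 2, 3], [3, 2, 1]), rmse([1, 2, 3], [3, 2, 1]))
```

Cross validation, as used in the abstract, repeats the fit on each training split and reports the metrics on the held-out split, which is why it gives a more honest picture than a 'train-all-test-all' evaluation.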
17.
Protein Pept Lett ; 15(5): 488-93, 2008.
Article in English | MEDLINE | ID: mdl-18537739

ABSTRACT

This paper proposes an efficient ensemble system to tackle the protein secondary structure prediction problem, with neural networks as base classifiers. The experimental results show that the multi-layer system can lead to better results. Deploying more accurate base classifiers yields higher accuracy for the ensemble system.


Subjects
Computational Biology/methods; Neural Networks, Computer; Protein Structure, Secondary; Proteins/chemistry; Protein Conformation; Protein Folding
18.
Comput Biol Med ; 38(5): 601-10, 2008 May.
Article in English | MEDLINE | ID: mdl-18394595

ABSTRACT

We address the microarray dataset based cancer classification using a newly proposed multiple classifier system (MCS), referred to as Rotation Forest. To the best of our knowledge, it is the first time that Rotation Forest has been applied to the microarray dataset classification. In the framework of Rotation Forest, a linear transformation method is required to project data into new feature space for each classifier, and then the base classifiers are trained in different new spaces so as to enhance both the accuracies of base classifiers and the diversity in the ensemble system. Principal component analysis (PCA), non-parametric discriminant analysis (NDA) and random projections (RP) were applied to feature transformation in the original Rotation Forest. In this paper, we use independent component analysis (ICA) as a new transformation method since it can better describe the property of microarray data. The breast cancer dataset and prostate dataset are deployed to validate the efficiency of Rotation Forest. In all the experiments, it can be found that Rotation Forest outperforms other MCSs, such as Bagging and Boosting. In addition, the experimental results also revealed that ICA can further improve the performance of Rotation Forest compared with the original transformation methods.


Subjects
Artificial Intelligence; Breast Neoplasms/classification; Models, Statistical; Pattern Recognition, Automated/methods; Prostatic Neoplasms/classification; Algorithms; Breast Neoplasms/genetics; Female; Humans; Male; Prostatic Neoplasms/genetics
19.
Comput Math Methods Med ; 2015: 193406, 2015.
Article in English | MEDLINE | ID: mdl-25810748

ABSTRACT

Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.


Subjects
Gene Expression Regulation, Neoplastic; Neoplasms/diagnosis; Oligonucleotide Array Sequence Analysis/methods; Algorithms; Area Under Curve; Artificial Intelligence; Computational Biology/methods; Gene Expression Profiling/methods; Humans; Machine Learning; Models, Statistical; Neoplasms/pathology; Pattern Recognition, Automated; Reproducibility of Results
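
Entry 19 names three operators, Min, Max and Average, for combining base classifiers. One common reading, assumed here, is that each base classifier outputs a class-probability vector and the operator is applied per class before the highest combined score is chosen. The sketch below follows that reading; `combine` is a hypothetical helper, not the GPES implementation.

```python
def combine(prob_vectors, rule="average"):
    """Fuse per-classifier class-probability vectors with a fixed rule.

    prob_vectors: one list of class probabilities per base classifier.
    Returns (predicted_class_index, combined_scores).
    """
    columns = list(zip(*prob_vectors))  # group the scores by class
    if rule == "min":
        scores = [min(col) for col in columns]
    elif rule == "max":
        scores = [max(col) for col in columns]
    elif rule == "average":
        scores = [sum(col) / len(col) for col in columns]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return scores.index(max(scores)), scores


# Three base classifiers voting on two classes (toy numbers).
votes = [[0.9, 0.1], [0.3, 0.7], [0.2, 0.8]]
for rule in ("min", "max", "average"):
    print(rule, combine(votes, rule))
```

On these toy votes the rules disagree: Min and Max both pick class 0 because of the single confident classifier, while Average picks class 1, which is why the choice of operator matters for an ensemble.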
20.
Comput Biol Med ; 43(6): 729-37, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23668348

ABSTRACT

In this paper, a genetic algorithm (GA) based ensemble support vector machine (SVM) classifier built on gene pairs (GA-ESP) is proposed. The SVMs (base classifiers of the ensemble system) are trained on different informative gene pairs. These gene pairs are selected by the top scoring pair (TSP) criterion. Each of these pairs projects the original microarray expression onto a 2-D space. Extensive permutation of gene pairs may reveal more useful information and potentially lead to an ensemble classifier with satisfactory accuracy and interpretability. GA is further applied to select an optimized combination of base classifiers. The effectiveness of the GA-ESP classifier is evaluated on both binary-class and multi-class datasets.


Subjects
Gene Expression Profiling/methods; Gene Expression Regulation; Genes; Oligonucleotide Array Sequence Analysis/methods; Support Vector Machine; Transcriptome
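
The top scoring pair (TSP) criterion mentioned in entry 20 ranks a gene pair (i, j) by how differently the ordering "expression of gene i is lower than that of gene j" occurs in the two classes. A minimal sketch for binary labels follows; it covers only the pair scoring, not the SVM training or the genetic algorithm of GA-ESP, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def tsp_score(samples, labels, i, j):
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| for labels in {0, 1}.

    Ties (x_i == x_j) count as "not lower". Both classes must be present.
    """
    freqs = []
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        freqs.append(sum(1 for r in rows if r[i] < r[j]) / len(rows))
    return abs(freqs[0] - freqs[1])


def top_scoring_pairs(samples, labels, k=1):
    """Return the k best (score, (i, j)) gene pairs, highest score first."""
    n_genes = len(samples[0])
    scored = [
        (tsp_score(samples, labels, i, j), (i, j))
        for i, j in combinations(range(n_genes), 2)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Toy expression matrix: rows are samples, columns are genes.
samples = [[1, 5, 0], [2, 6, 9], [5, 1, 0], [6, 2, 9]]
labels = [0, 0, 1, 1]
print(top_scoring_pairs(samples, labels, k=1))
```

Each selected pair defines a 2-D projection of the expression data, which is the space in which the abstract says the base SVMs are trained.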