Search | BVS Aleitamento Materno

Sequential selection of variables using short permutation procedures and multiple adjustments: An application to genomic data.

Azevedo Costa, Marcelo; de Souza Rodrigues, Thiago; da Costa, André Gabriel Fc; Natowicz, René; Pádua Braga, Antônio.

Stat Methods Med Res ; 26(2): 997-1020, 2017 Apr.

Article in English | MEDLINE | ID: mdl-25575544

ABSTRACT

This work proposes a sequential methodology for selecting variables in classification problems in which the number of predictors is much larger than the sample size. The methodology includes a Monte Carlo permutation procedure that conditionally tests the null hypothesis of no association between the outcomes and the available predictors. To improve computational efficiency, we propose a new parametric distribution, the Truncated and Zero Inflated Gumbel Distribution. The final application is to find compact classification models with improved performance for genomic data. Results on real data sets show that the proposed methodology selects compact models with optimized classification performance.

Subjects

Genomics/statistics & numerical data , Algorithms , Biostatistics/methods , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Computer Simulation , Statistical Data Interpretation , Factual Databases/statistics & numerical data , Female , Gene Expression Profiling/statistics & numerical data , Humans , Statistical Models , Monte Carlo Method , Multivariate Analysis , Sample Size

A Ranking Approach for Probe Selection and Classification of Microarray Data with Artificial Neural Networks.

Faria, Alexandre Wagner Chagas; da Silva, Alisson Marques; de Souza Rodrigues, Thiago; Costa, Marcelo Azevedo; Braga, Antonio Padua.

J Comput Biol ; 22(10): 953-61, 2015 Oct.

Article in English | MEDLINE | ID: mdl-26418055

ABSTRACT

Classification of acute leukemia into its myeloid and lymphoblastic subtypes is usually based on the morphology of the tumor. Nevertheless, the subtypes may have a similar histopathological appearance, making screening procedures difficult. In addition, approximately one-third of acute myeloid leukemias are characterized by aberrant cytoplasmic localization of nucleophosmin (NPMc(+)), the majority of which have a normal karyotype. This work uses two publicly available DNA microarray datasets to differentiate leukemia subtypes. The datasets were split into training and test sets, and feature selection methods were applied. Artificial neural network classifiers were developed to compare the feature selection methods. For the first dataset, 50 genes selected using the best classifier were able to classify all patients in the test set. For the second dataset, five genes yielded 97.5% accuracy in the test set.

Subjects

Gene Expression Profiling/methods , Myeloid Leukemia/genetics , Oligonucleotide Array Sequence Analysis/methods , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Algorithms , Differential Diagnosis , Neoplastic Gene Expression Regulation , Humans , Myeloid Leukemia/classification , Computer Neural Networks , Precursor Cell Lymphoblastic Leukemia-Lymphoma/classification , Sensitivity and Specificity
