Search | BVS Regional Portal

Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression.

Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula.

IEEE/ACM Trans Comput Biol Bioinform ; 8(6): 1580-91, 2011.

Article in English | MEDLINE | ID: mdl-21383421

ABSTRACT

Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets come the inherent challenges of new methods of statistical analysis and modeling. Because a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.

Subjects

Bayes Theorem , Computational Biology/methods , Logistic Models , Single Nucleotide Polymorphism , Genotype , Monte Carlo Method

Detecting high-order interactions of single nucleotide polymorphisms using genetic programming.

Nunkesser, Robin; Bernholt, Thorsten; Schwender, Holger; Ickstadt, Katja; Wegener, Ingo.

Bioinformatics ; 23(24): 3280-8, 2007 Dec 15.

Article in English | MEDLINE | ID: mdl-18006552

ABSTRACT

MOTIVATION: High-order interactions of single nucleotide polymorphisms (SNPs), rather than individual SNPs, are assumed to be responsible for complex diseases such as cancer. Therefore, one of the major goals of genetic association studies concerned with such genotype data is the identification of these high-order interactions. This search is additionally impeded by the fact that these interactions are often explanatory for only a relatively small subgroup of patients. Most of the feature selection methods proposed in the literature, unfortunately, fail at this task, since they can either identify only individual variables or low-order interactions, or try to find rules that are explanatory for a high percentage of the observations. In this article, we present a procedure based on genetic programming and multi-valued logic that enables the identification of high-order interactions of categorical variables such as SNPs. This method, called GPAS, can be used not only for feature selection but also for discrimination. RESULTS: In an application to the genotype data from the GENICA study, an association study concerned with sporadic breast cancer, GPAS is able to identify high-order interactions of SNPs that lead to a considerably increased breast cancer risk for different subsets of patients and that are not found by other feature selection methods. As an application to a subset of the HapMap data shows, GPAS is not restricted to association studies comprising several tens of SNPs, but can also be employed to analyze whole-genome data. AVAILABILITY: Software can be downloaded from http://ls2-www.cs.uni-dortmund.de/~nunkesser/#Software
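
As an illustration of what a high-order interaction in multi-valued logic can look like, the following is a minimal sketch and not the GPAS implementation. It assumes genotypes coded 0/1/2 and a candidate interaction written as an OR of AND-terms over literals of the form "SNP equals genotype" or "SNP does not equal genotype". The names `literal_holds`, `expression_holds` and `coverage`, and the toy data, are hypothetical. Counting how many cases and controls an expression covers reflects the point above that an interaction may explain only a small subgroup of patients.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding (an assumption, not taken from the GPAS software):
# a literal is (snp_index, genotype, negated), read as
# "SNP == genotype", or "SNP != genotype" when negated is True.
Literal = Tuple[int, int, bool]
Monomial = List[Literal]      # conjunction (AND) of literals
Expression = List[Monomial]   # disjunction (OR) of monomials


def literal_holds(row: Sequence[int], lit: Literal) -> bool:
    """Check one literal against one individual's genotype vector."""
    idx, genotype, negated = lit
    return (row[idx] != genotype) if negated else (row[idx] == genotype)


def expression_holds(row: Sequence[int], expr: Expression) -> bool:
    """True if at least one monomial has all of its literals satisfied."""
    return any(all(literal_holds(row, lit) for lit in mono) for mono in expr)


def coverage(genotypes, labels, expr: Expression) -> Tuple[int, int]:
    """Count cases (label 1) and controls (label 0) for which expr is true."""
    cases = sum(1 for row, y in zip(genotypes, labels)
                if y == 1 and expression_holds(row, expr))
    controls = sum(1 for row, y in zip(genotypes, labels)
                   if y == 0 and expression_holds(row, expr))
    return cases, controls


# Toy data: 5 individuals, 3 SNPs, genotypes coded 0/1/2 (made up for illustration).
genotypes = [
    [2, 0, 1],
    [2, 0, 0],
    [0, 1, 1],
    [1, 2, 2],
    [2, 1, 1],
]
labels = [1, 1, 0, 0, 1]  # 1 = case, 0 = control

# Candidate interaction: (SNP0 == 2) AND (SNP1 != 2)
expr = [[(0, 2, False), (1, 2, True)]]
cases_covered, controls_covered = coverage(genotypes, labels, expr)
```

A search procedure such as genetic programming would then propose and score many such expressions; the scoring criterion is not specified in the abstract and is left out of this sketch.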

Subjects

Tumor Biomarkers/genetics , Breast Neoplasms/genetics , DNA Mutational Analysis/methods , Neoplasm Proteins/genetics , Single Nucleotide Polymorphism/genetics , Base Sequence , Chromosome Mapping , Genetic Predisposition to Disease/genetics , Humans , Molecular Sequence Data , Linear Programming
