Inference on differences between classes using cluster-specific contrasts of mixed effects.
Biostatistics; 16(1): 98-112, 2015 Jan.
Article in English | MEDLINE | ID: mdl-24963011
The detection of differentially expressed (DE) genes, that is, genes whose expression levels vary between two or more classes representing different experimental conditions (say, diseases), is one of the most commonly studied problems in bioinformatics. For example, the identification of DE genes between distinct disease phenotypes is an important first step in understanding the disease and developing drugs to treat it. We present a novel approach to the problem of detecting DE genes that is based on a test statistic formed as a weighted (normalized) cluster-specific contrast in the mixed effects of the mixture model used in the first instance to cluster the gene profiles into a manageable number of clusters. The key factor in the formation of our test statistic is the use of gene-specific mixed effects in the cluster-specific contrast. Consequently, the (soft) assignment of a given gene to a cluster is not crucial: in addition to class differences between the (estimated) fixed effects terms for a cluster, gene-specific class differences also contribute to the cluster-specific contributions to the final form of the test statistic. The proposed test statistic can be used where the primary aim is to rank the genes in order of evidence against the null hypothesis of no differential expression. We also show how a P-value can be calculated for each gene for use in multiple hypothesis testing where the intent is to control the false discovery rate (FDR) at some desired level. Using publicly available and simulated datasets, we show that the proposed contrast-based approach outperforms other methods commonly used for the detection of DE genes, both in a ranking context, with a lower proportion of false discoveries, and in a multiple hypothesis testing context, with higher power for a specified level of the FDR.
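The abstract describes two uses of per-gene evidence: ranking genes, and multiple hypothesis testing with the FDR controlled at a chosen level. It does not say which FDR procedure the authors apply, so the sketch below assumes the standard Benjamini-Hochberg step-up procedure purely as an illustration. The function names `rank_genes` and `benjamini_hochberg` and the example P-values are hypothetical; the contrast-based test statistic itself is not reproduced here.

```python
def rank_genes(p_values):
    """Return gene indices ordered from strongest to weakest evidence against the null."""
    return sorted(range(len(p_values)), key=lambda i: p_values[i])


def benjamini_hochberg(p_values, fdr_level=0.05):
    """Flag genes as DE using the Benjamini-Hochberg step-up procedure.

    Assumption: BH is used here only as a common FDR-controlling procedure;
    the abstract does not state which procedure the paper uses.
    """
    m = len(p_values)
    order = rank_genes(p_values)

    # Find the largest 1-based rank k such that p_(k) <= (k / m) * fdr_level.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr_level:
            cutoff_rank = rank

    # Reject the null for every gene ranked at or above the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Made-up P-values for five genes, for illustration only.
example_p = [0.001, 0.8, 0.012, 0.04, 0.3]
ranking = rank_genes(example_p)
flags = benjamini_hochberg(example_p, fdr_level=0.05)
```

With these illustrative values, `ranking` orders the genes by P-value and `flags` marks which genes would be declared DE at a nominal FDR of 0.05.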
Full text: 1
Database: MEDLINE
Main subject: Cluster Analysis / Gene Expression / Data Interpretation, Statistical / Gene Expression Profiling / Models, Genetic
Type of study: Prognostic_studies
Limits: Female / Humans
Language: En
Journal: Biostatistics
Year of publication: 2015
Document type: Article
Country of affiliation: Australia