Clustering.

McLachlan, G J; Bean, R W; Ng, S K.

Methods Mol Biol ; 1526: 345-362, 2017.

Article in English | MEDLINE | ID: mdl-27896751

ABSTRACT

Clustering techniques are used to arrange genes in some natural way, that is, to organize genes into groups or clusters with similar behavior across relevant tissue samples (or cell lines). These techniques can also be applied to tissues rather than genes. Methods such as hierarchical agglomerative clustering, k-means clustering, the self-organizing map, and model-based methods have been used. Here we focus on mixtures of normals to provide a model-based clustering of tissue samples (gene signatures) and of gene profiles, including time-course gene expression data.

Subjects

Cluster Analysis; Computational Biology/methods; Algorithms; Animals; Gene Expression Profiling; Humans; Oligonucleotide Array Sequence Analysis; Software

A simple implementation of a normal mixture approach to differential gene expression in multiclass microarrays.

McLachlan, G J; Bean, R W; Jones, L Ben-Tovim.

Bioinformatics ; 22(13): 1608-15, 2006 Jul 01.

Article in English | MEDLINE | ID: mdl-16632494

ABSTRACT

MOTIVATION: An important problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. We provide a straightforward and easily implemented method for estimating the posterior probability that an individual gene is null. The problem can be expressed in a two-component mixture framework, using an empirical Bayes approach. Current methods of implementing this approach either have limitations due to the minimal assumptions they make or, with more specific assumptions, are computationally intensive. RESULTS: By converting to a z-score the value of the test statistic used to test the significance of each gene, we propose a simple two-component normal mixture that adequately models the distribution of this score. The usefulness of our approach is demonstrated on three real datasets.
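The two-component framework described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the null component is held fixed at N(0, 1), that only the null proportion and the non-null mean and standard deviation are estimated by EM, and that the data are simulated. The names `fit_null_mixture` and `posterior_null`, the starting values, and the simulated z-scores are all hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_null(z, pi0, mu1, sd1):
    """Posterior probability that a gene with z-score z belongs to the null component."""
    null_part = pi0 * normal_pdf(z, 0.0, 1.0)
    alt_part = (1.0 - pi0) * normal_pdf(z, mu1, sd1)
    return null_part / (null_part + alt_part)


def fit_null_mixture(z_scores, n_iter=200):
    """EM for pi0 * N(0, 1) + (1 - pi0) * N(mu1, sd1^2).

    Assumption of this sketch: the null density is the theoretical N(0, 1)
    and stays fixed; only pi0, mu1 and sd1 are updated.
    """
    pi0, mu1, sd1 = 0.9, 2.0, 1.0  # hypothetical starting values
    n = len(z_scores)
    for _ in range(n_iter):
        # E-step: posterior null probability for every gene
        tau0 = [posterior_null(z, pi0, mu1, sd1) for z in z_scores]
        # M-step: update the null proportion and the non-null parameters
        pi0 = min(max(sum(tau0) / n, 1e-6), 1.0 - 1e-6)
        weights = [1.0 - t for t in tau0]
        total = max(sum(weights), 1e-12)
        mu1 = sum(w * z for w, z in zip(weights, z_scores)) / total
        var1 = sum(w * (z - mu1) ** 2 for w, z in zip(weights, z_scores)) / total
        sd1 = max(math.sqrt(var1), 1e-3)
    return pi0, mu1, sd1


# Simulated z-scores (illustrative only): 900 null genes, 100 shifted genes.
random.seed(0)
z_scores = [random.gauss(0.0, 1.0) for _ in range(900)]
z_scores += [random.gauss(3.0, 1.0) for _ in range(100)]

pi0, mu1, sd1 = fit_null_mixture(z_scores)
tau_small = posterior_null(0.0, pi0, mu1, sd1)  # gene with z = 0
tau_large = posterior_null(5.0, pi0, mu1, sd1)  # gene with z = 5
```

Genes can then be declared differentially expressed when their posterior null probability falls below a chosen cutoff; the cutoff is a design decision that this sketch leaves open.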

Subjects

Breast Neoplasms/genetics; Colonic Neoplasms/genetics; Computational Biology/methods; Gene Expression Profiling; HIV Infections/genetics; Oligonucleotide Array Sequence Analysis/methods; Algorithms; Bayes Theorem; Breast Neoplasms/metabolism; Colonic Neoplasms/metabolism; Data Interpretation, Statistical; HIV Infections/metabolism; Humans; Models, Statistical; Reproducibility of Results

A mixture model-based approach to the clustering of microarray expression data.

McLachlan, G J; Bean, R W; Peel, D.

Bioinformatics ; 18(3): 413-22, 2002 Mar.

Article in English | MEDLINE | ID: mdl-11934740

ABSTRACT

MOTIVATION: This paper introduces the software EMMIX-GENE, which has been developed specifically for a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic, used in conjunction with a threshold on the size of a cluster, allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so mixtures of factor analyzers are used to effectively reduce the dimension of the feature space of genes. RESULTS: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes can be selected that reveal interesting clusterings of the tissues that are consistent either with the external classification of the tissues or with background and biological knowledge of these sets. AVAILABILITY: EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
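The gene-selection step described in this abstract can be sketched as follows. This is not EMMIX-GENE itself. As a simplification, the sketch fits normal mixtures with a common variance rather than the mixtures of t distributions named in the abstract. The function names, the threshold values, and the simulated expression values are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def one_component_loglik(x):
    """Maximized log-likelihood of a single normal fitted to x."""
    n = len(x)
    mean = sum(x) / n
    var = max(sum((v - mean) ** 2 for v in x) / n, 1e-8)
    return sum(log_normal_pdf(v, mean, var) for v in x)


def responsibilities(x, p, m1, m2, var):
    """E-step: membership probabilities for component 1 and the log-likelihood."""
    tau, loglik = [], 0.0
    for v in x:
        a = math.log(p) + log_normal_pdf(v, m1, var)
        b = math.log(1.0 - p) + log_normal_pdf(v, m2, var)
        top = max(a, b)
        log_mix = top + math.log(math.exp(a - top) + math.exp(b - top))
        tau.append(math.exp(a - log_mix))
        loglik += log_mix
    return tau, loglik


def two_component_fit(x, n_iter=200):
    """EM for a two-component normal mixture with common variance.

    Returns the log-likelihood and the size of the smaller hard cluster.
    """
    n = len(x)
    xs = sorted(x)
    m1 = sum(xs[: n // 2]) / (n // 2)        # start from a median split
    m2 = sum(xs[n // 2:]) / (n - n // 2)
    mean = sum(x) / n
    var = max(sum((v - mean) ** 2 for v in x) / n, 1e-8)
    p = 0.5
    for _ in range(n_iter):
        tau, _ = responsibilities(x, p, m1, m2, var)
        s1 = sum(tau)
        s2 = n - s1
        p = min(max(s1 / n, 1e-6), 1.0 - 1e-6)
        m1 = sum(t * v for t, v in zip(tau, x)) / max(s1, 1e-12)
        m2 = sum((1.0 - t) * v for t, v in zip(tau, x)) / max(s2, 1e-12)
        var = max(sum(t * (v - m1) ** 2 + (1.0 - t) * (v - m2) ** 2
                      for t, v in zip(tau, x)) / n, 1e-8)
    tau, loglik = responsibilities(x, p, m1, m2, var)
    size1 = sum(1 for t in tau if t >= 0.5)
    return loglik, min(size1, n - size1)


def gene_lrt(x):
    """Return (-2 log lambda for one versus two components, smaller cluster size)."""
    loglik2, min_size = two_component_fit(x)
    return max(0.0, 2.0 * (loglik2 - one_component_loglik(x))), min_size


def select_genes(expr, lrt_threshold, min_cluster_size):
    """Keep genes that pass both thresholds, ordered by decreasing statistic."""
    scored = {gene: gene_lrt(values) for gene, values in expr.items()}
    kept = [g for g, (lrt, size) in scored.items()
            if lrt >= lrt_threshold and size >= min_cluster_size]
    return sorted(kept, key=lambda g: scored[g][0], reverse=True)


# Simulated expression of two genes across 40 tissues (illustrative only).
random.seed(1)
expr = {
    "bimodal": [random.gauss(-2.0, 0.5) for _ in range(20)]
               + [random.gauss(2.0, 0.5) for _ in range(20)],
    "unimodal": [random.gauss(0.0, 1.0) for _ in range(40)],
}
lrt_stats = {gene: gene_lrt(values)[0] for gene, values in expr.items()}
selected = select_genes(expr, lrt_threshold=20.0, min_cluster_size=5)
```

The common-variance restriction keeps the likelihood bounded, which avoids the degenerate solutions that unrestricted component variances can produce on small samples. The abstract's use of t components serves a different purpose (robustness to outlying values) that this sketch does not reproduce.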

Subjects

Algorithms; Colonic Neoplasms/classification; Gene Expression Regulation, Neoplastic/genetics; Leukemia, Myeloid, Acute/classification; Models, Statistical; Oligonucleotide Array Sequence Analysis/methods; Precursor Cell Lymphoblastic Leukemia-Lymphoma/classification; Cluster Analysis; Colonic Neoplasms/genetics; Data Interpretation, Statistical; Databases, Genetic; Feasibility Studies; Gene Expression/genetics; Humans; Leukemia, Myeloid, Acute/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Sensitivity and Specificity; Stochastic Processes
