Pesquisa | Secretaria de Estado da Saúde

Simultaneous clustering of multiple gene expression and physical interaction datasets.

Narayanan, Manikandan; Vetta, Adrian; Schadt, Eric E; Zhu, Jun.

PLoS Comput Biol ; 6(4): e1000742, 2010 Apr 15.

Artigo em Inglês | MEDLINE | ID: mdl-20419151

RESUMO

Many genome-wide datasets are routinely generated to study different aspects of biological systems, but integrating them to obtain a coherent view of the underlying biology remains a challenge. We propose simultaneous clustering of multiple networks as a framework to integrate large-scale datasets on the interactions among and activities of cellular components. Specifically, we develop an algorithm JointCluster that finds sets of genes that cluster well in multiple networks of interest, such as coexpression networks summarizing correlations among the expression profiles of genes and physical networks describing protein-protein and protein-DNA interactions among genes or gene-products. Our algorithm provides an efficient solution to a well-defined problem of jointly clustering networks, using techniques that permit certain theoretical guarantees on the quality of the detected clustering relative to the optimal clustering. These guarantees coupled with an effective scaling heuristic and the flexibility to handle multiple heterogeneous networks make our method JointCluster an advance over earlier approaches. Simulation results showed JointCluster to be more robust than alternate methods in recovering clusters implanted in networks with high false positive rates. In systematic evaluation of JointCluster and some earlier approaches for combined analysis of the yeast physical network and two gene expression datasets under glucose and ethanol growth conditions, JointCluster discovers clusters that are more consistently enriched for various reference classes capturing different aspects of yeast biology or yield better coverage of the analysed genes. These robust clusters, which are supported across multiple genomic datasets and diverse reference classes, agree with known biology of yeast under these growth conditions, elucidate the genetic control of coordinated transcription, and enable functional predictions for a number of uncharacterized genes.

Assuntos

Algoritmos , Biologia Computacional/métodos , Bases de Dados Genéticas , Perfilação da Expressão Gênica , Família Multigênica , Simulação por Computador , Etanol/metabolismo , Redes Reguladoras de Genes , Glucose/metabolismo , Mapeamento de Interação de Proteínas , Leveduras/genética , Leveduras/metabolismo

Predicting direct protein interactions from affinity purification mass spectrometry data.

Kim, Ethan Dh; Sabharwal, Ashish; Vetta, Adrian R; Blanchette, Mathieu.

Algorithms Mol Biol ; 5: 34, 2010 Oct 29.

Artigo em Inglês | MEDLINE | ID: mdl-21034440

RESUMO

BACKGROUND: Affinity purification followed by mass spectrometry identification (AP-MS) is an increasingly popular approach to observe protein-protein interactions (PPI) in vivo. One drawback of AP-MS, however, is that it is prone to detecting indirect interactions mixed with direct physical interactions. Therefore, the ability to distinguish direct interactions from indirect ones is of much interest. RESULTS: We first propose a simple probabilistic model for the interactions captured by AP-MS experiments, under which the problem of separating direct interactions from indirect ones is formulated. Then, given idealized quantitative AP-MS data, we study the problem of identifying the most likely set of direct interactions that produced the observed data. We address this challenging graph theoretical problem by first characterizing signatures that can identify weakly connected nodes as well as dense regions of the network. The rest of the direct PPI network is then inferred using a genetic algorithm.Our algorithm shows good performance on both simulated and biological networks with very high sensitivity and specificity. Then the algorithm is used to predict direct interactions from a set of AP-MS PPI data from yeast, and its performance is measured against a high-quality interaction dataset. CONCLUSIONS: As the sensitivity of AP-MS pipeline improves, the fraction of indirect interactions detected will also increase, thereby making the ability to distinguish them even more desirable. Despite the simplicity of our model for indirect interactions, our method provides a good performance on the test networks.

RESUMO

Assuntos

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

Detalhe da pesquisa