Results 1 - 11 of 11
1.
Mol Cell ; 46(6): 884-92, 2012 Jun 29.
Article in English | MEDLINE | ID: mdl-22749401

ABSTRACT

Alternative splicing plays a key role in the expansion of proteomic and regulatory complexity, yet the functions of the vast majority of differentially spliced exons are not known. In this study, we observe that brain and other tissue-regulated exons are significantly enriched in flexible regions of proteins that likely form conserved interaction surfaces. These proteins participate in significantly more interactions in protein-protein interaction (PPI) networks than other proteins. Using LUMIER, an automated PPI assay, we observe that approximately one-third of analyzed neural-regulated exons affect PPIs. Inclusion of these exons stimulated and repressed different partner interactions at comparable frequencies. This assay further revealed functions of individual exons, including a role for a neural-specific exon in promoting an interaction between Bridging Integrator 1 (Bin1)/Amphiphysin II and Dynamin 2 (Dnm2) that facilitates endocytosis. Collectively, our results provide evidence that regulated alternative exons frequently remodel interactions to establish tissue-dependent PPI networks.


Subject(s)
Alternative Splicing; Protein Interaction Maps; Proteins/metabolism; Adaptor Proteins, Signal Transducing/genetics; Adaptor Proteins, Signal Transducing/metabolism; Binding Sites; Cells, Cultured; Dynamin II/genetics; Dynamin II/metabolism; Exons; HEK293 Cells; Humans; Luciferases, Renilla/genetics; Luciferases, Renilla/metabolism; Nuclear Proteins/genetics; Nuclear Proteins/metabolism; Proteins/genetics; Proteomics; Tumor Suppressor Proteins/genetics; Tumor Suppressor Proteins/metabolism
2.
Bioinformatics ; 32(10): 1589-91, 2016 May 15.
Article in English | MEDLINE | ID: mdl-26801957

ABSTRACT

ELASPIC is a novel ensemble machine-learning approach that predicts the effects of mutations on protein folding and protein-protein interactions. Here, we present the ELASPIC webserver, which makes the ELASPIC pipeline available through a fast and intuitive interface. The webserver can be used to evaluate the effect of mutations on any protein in the UniProt database, and allows all predicted results, including modeled wild-type and mutated structures, to be managed and viewed online and downloaded if needed. It is backed by a database that contains improved structural domain definitions and a list of curated domain-domain interactions for all known proteins, as well as homology models of domains and domain-domain interactions for the human proteome. Homology models for proteins of other organisms are calculated on the fly, and mutations are evaluated within minutes once the homology model is available. AVAILABILITY AND IMPLEMENTATION: The ELASPIC webserver is available online at http://elaspic.kimlab.org CONTACT: pm.kim@utoronto.ca or pi@kimlab.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteome; Humans; Mutation; Protein Binding; Protein Folding; Protein Stability; Software
3.
Bioinformatics ; 32(2): 203-10, 2016 Jan 15.
Article in English | MEDLINE | ID: mdl-26411870

ABSTRACT

MOTIVATION: Rapid advances in genotyping and genome-wide association studies have enabled the discovery of many new genotype-phenotype associations at the resolution of individual markers. However, these associations explain only a small proportion of the theoretically estimated heritability of most diseases. In this work, we propose an integrative mixture model called JBASE: joint Bayesian analysis of subphenotypes and epistasis. JBASE explores two major reasons for missing heritability: interactions between genetic variants, a phenomenon known as epistasis, and phenotypic heterogeneity, addressed via subphenotyping. RESULTS: Our extensive simulations in a wide range of scenarios repeatedly demonstrate that JBASE can identify true underlying subphenotypes, including their associated variants and their interactions, with high precision. In the presence of phenotypic heterogeneity, JBASE has higher power and lower type I error than five state-of-the-art approaches. We applied our method to a sample of individuals from Mexico with Type 2 diabetes and discovered two novel epistatic modules, including two loci each, that define two subphenotypes characterized by differences in body mass index and waist-to-hip ratio. We successfully replicated these subphenotypes and epistatic modules in an independent dataset from Mexico genotyped with a different platform. AVAILABILITY AND IMPLEMENTATION: JBASE is implemented in C++, supported on Linux and is available at http://www.cs.toronto.edu/~goldenberg/JBASE/jbase.tar.gz. The genotype data underlying this study are available upon approval by the ethics review board of the Medical Centre Siglo XXI. Please contact Dr Miguel Cruz at mcruzl@yahoo.com for assistance with the application. CONTACT: anna.goldenberg@utoronto.ca SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms; Epistasis, Genetic; Phenotype; Bayes Theorem; Body Mass Index; Diabetes Mellitus, Type 2/genetics; Genome-Wide Association Study; Genotype; Genotyping Techniques; Humans; Mexico; Waist-Hip Ratio
4.
PLoS Comput Biol ; 9(4): e1003030, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23633940

ABSTRACT

Intrinsically disordered regions have been associated with various cellular processes and are implicated in several human diseases, but their exact roles remain unclear. We previously defined two classes of conserved disordered regions in budding yeast, referred to as "flexible" and "constrained" conserved disorder. In flexible disorder, the property of disorder has been positionally conserved during evolution, whereas in constrained disorder, both the amino acid sequence and the property of disorder have been conserved. Here, we show that flexible and constrained disorder are widespread in the human proteome, and are particularly common in proteins with regulatory functions. Both classes of disordered sequences are highly enriched in regions of proteins that undergo tissue-specific (TS) alternative splicing (AS), but not in regions of proteins that undergo general (i.e., not tissue-regulated) AS. Flexible disorder is more highly enriched in TS alternative exons, whereas constrained disorder is more highly enriched in exons that flank TS alternative exons. These latter regions are also significantly more enriched in potential phosphosites and other short linear motifs associated with cell signaling. We further show that cancer driver mutations are significantly enriched in regions of proteins associated with TS and general AS. Collectively, our results point to distinct roles for TS alternative exons and flanking exons in the dynamic regulation of protein interaction networks in response to signaling activity, and they further suggest that alternatively spliced regions of proteins are often functionally altered by mutations responsible for cancer.


Subject(s)
Alternative Splicing; Proteomics/methods; Algorithms; Amino Acid Motifs; Computational Biology/methods; Evolution, Molecular; Exons; Humans; Muscles/metabolism; Mutation; Neoplasms/metabolism; Phosphorylation; Protein Folding; Protein Interaction Mapping/methods; Protein Interaction Maps; Proteome; Signal Transduction
5.
Bioinformatics ; 26(18): i625-31, 2010 Sep 15.
Article in English | MEDLINE | ID: mdl-20823331

ABSTRACT

MOTIVATION: Recent genomic studies have confirmed that cancer is of utmost phenotypical complexity, varying greatly in terms of subtypes and evolutionary stages. When classifying cancer tissue samples, subnetwork marker approaches have proven to be superior to single gene marker approaches, most importantly in cross-platform evaluation schemes. However, prior subnetwork-based approaches do not explicitly address the great phenotypical complexity of cancer. RESULTS: We explicitly address this and employ density-constrained biclustering to compute subnetwork markers, which reflect pathways being dysregulated in many, but not necessarily all, samples under consideration. In breast cancer we achieve substantial improvements over all cross-platform applicable approaches when predicting TP53 mutation status in a well-established non-cross-platform setting. In colon cancer, we raise prediction accuracy in the most difficult instances from 87% to 93% for cancer versus non-cancer and from 83% to an astonishing 92% for with versus without liver metastasis, in well-established cross-platform evaluation schemes. AVAILABILITY: Software is available on request.


Subject(s)
Biomarkers, Tumor; Computational Biology/methods; Gene Regulatory Networks; Neoplasms/genetics; Algorithms; Benchmarking; Breast Neoplasms/genetics; Colonic Neoplasms/genetics; Female; Gene Expression Profiling; Genes, p53; Humans; Neoplasms/classification; Software
6.
Proteomics ; 8(11): 2196-8, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18452226

ABSTRACT

High-throughput experiments, most significantly DNA microarrays, provide us with system-scale profiles. Connecting these data with existing biological networks poses a formidable challenge to uncover facts about a cell's proteome. Studies and tools with this purpose are limited to networks with simple structure, such as protein-protein interaction graphs, or do not go much beyond simply displaying values on the network. We have built a microarray data analysis tool, named PATIKAmad, which can be used to associate microarray data with the pathway models in mechanistic detail, and provides facilities for visualization, clustering, querying, and navigation of biological graphs related to loaded microarray experiments. PATIKAmad is freely available to noncommercial users as a new module of PATIKAweb at http://web.patika.org.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods; Algorithms; Cluster Analysis; Computational Biology/methods; Data Interpretation, Statistical; Gene Expression Regulation; Internet; MAP Kinase Signaling System; Oligonucleotide Array Sequence Analysis/instrumentation; Pattern Recognition, Automated; Protein Interaction Mapping; Proteome; Proteomics/methods; Software; User-Computer Interface
7.
Cell Rep ; 12(2): 183-9, 2015 Jul 14.
Article in English | MEDLINE | ID: mdl-26146086

ABSTRACT

Alternative splicing acts on transcripts from almost all human multi-exon genes. Notwithstanding its ubiquity, fundamental ramifications of splicing on protein expression remain unresolved. The number and identity of spliced transcripts that form stably folded proteins remain the sources of considerable debate, due largely to low coverage of experimental methods and the resulting absence of negative data. We circumvent this issue by developing a semi-supervised learning algorithm, positive unlabeled learning for splicing elucidation (PULSE; http://www.kimlab.org/software/pulse), which uses 48 features spanning various categories. We validated its accuracy on sets of bona fide protein isoforms and directly on mass spectrometry (MS) spectra for an overall AU-ROC of 0.85. We predict that around 32% of "exon skipping" alternative splicing events produce stable proteins, suggesting that the process engenders a significant number of previously uncharacterized proteins. We also provide insights into the distribution of positive isoforms in various functional classes and into the structural effects of alternative splicing.
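
For readers unfamiliar with the AU-ROC figure quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores and binary labels. It uses the standard pairwise (Mann-Whitney) identity and is not taken from the PULSE software; the function name `au_roc` and the toy scores are illustrative assumptions.

```python
def au_roc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Illustrative helper only (hypothetical name, not part of PULSE).
    `labels` holds 1 for positive examples and 0 for negative examples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs in which the positive example scores
    # higher than the negative one; tied pairs count as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: three of the four positive/negative pairs are ranked correctly.
print(au_roc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.85 means that a randomly chosen positive isoform would outscore a randomly chosen negative one about 85% of the time.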


Subject(s)
Alternative Splicing; Proteins/metabolism; Supervised Machine Learning; Area Under Curve; Exons; Humans; Protein Isoforms/chemistry; Protein Isoforms/genetics; Protein Isoforms/metabolism; Protein Structure, Tertiary; Proteins/chemistry; Proteins/genetics; ROC Curve
8.
PLoS One ; 9(9): e107353, 2014.
Article in English | MEDLINE | ID: mdl-25243403

ABSTRACT

Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing protein instability, and help us better understand the molecular causes of diseases.


Subject(s)
Mutation; Protein Folding; Protein Stability; Proteins/metabolism; Artificial Intelligence; Databases, Protein; Humans; Models, Molecular; Protein Binding; Protein Conformation; Sequence Analysis, Protein; Software
9.
Mol Biosyst ; 8(1): 185-93, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22101230

ABSTRACT

Disordered regions within proteins have increasingly been associated with various cellular functions. Identifying the specific roles played by disorder in these functions has proved difficult. However, the development of reliable prediction algorithms has expanded the study of disorder from a few anecdotal examples to a proteome-wide scale. Moreover, the recent omics revolution has provided the sequences of numerous organisms as well as thousands of genome-wide data sets including several types of interactomes. Here, we review the literature regarding genome-wide studies of disorder and examine how these studies give rise to new characterizations and categories of this elusive phenomenon.


Subject(s)
Protein Folding; Proteins/chemistry; Proteins/metabolism; Proteomics; Animals; Evolution, Molecular; Gene Regulatory Networks; Humans; Protein Binding
10.
Science ; 338(6114): 1587-93, 2012 Dec 21.
Article in English | MEDLINE | ID: mdl-23258890

ABSTRACT

How species with similar repertoires of protein-coding genes differ so markedly at the phenotypic level is poorly understood. By comparing organ transcriptomes from vertebrate species spanning ~350 million years of evolution, we observed significant differences in alternative splicing complexity between vertebrate lineages, with the highest complexity in primates. Within 6 million years, the splicing profiles of physiologically equivalent organs diverged such that they are more strongly related to the identity of a species than they are to organ type. Most vertebrate species-specific splicing patterns are cis-directed. However, a subset of pronounced splicing changes are predicted to remodel protein interactions involving trans-acting regulators. These events likely further contributed to the diversification of splicing and other transcriptomic changes that underlie phenotypic differences among vertebrate species.


Subject(s)
Alternative Splicing; Evolution, Molecular; Transcriptome; Vertebrates/genetics; Animals; Biological Evolution; Chickens/genetics; Exons; Introns; Lizards/genetics; Mice/genetics; Mice, Inbred C57BL/genetics; Opossums/genetics; Phenotype; Platypus/genetics; Primates/genetics; RNA Splice Sites; Regulatory Sequences, Ribonucleic Acid; Species Specificity; Xenopus/genetics
11.
PLoS One ; 5(10): e13348, 2010 Oct 25.
Article in English | MEDLINE | ID: mdl-21049092

ABSTRACT

BACKGROUND: Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established that, when these two data sources are analyzed jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented. METHODOLOGY/PRINCIPAL FINDINGS: We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automated filtering procedure reduces our output, which results in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples. CONCLUSION/SIGNIFICANCE: We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets. AVAILABILITY: Software and data sets are available at http://www.sfu.ca/~ester/software/DECOB.zip.
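
The two properties named in this abstract, a dense network region and co-expression of its genes, can be made concrete with a small sketch. This is not the DECOB algorithm, which searches exhaustively for such regions. It only shows one plausible way to test a single candidate gene set, and the function names, the `delta` threshold and the max-minus-min definition of co-expression are all assumptions made for illustration.

```python
from itertools import combinations


def edge_density(genes, edges):
    """Fraction of all possible gene pairs in `genes` that are linked in `edges`."""
    genes = sorted(set(genes))
    if len(genes) < 2:
        return 0.0
    linked = {frozenset(e) for e in edges}  # undirected interactions
    pairs = list(combinations(genes, 2))
    present = sum(1 for pair in pairs if frozenset(pair) in linked)
    return present / len(pairs)


def is_coexpressed(genes, expression, conditions, delta):
    """True if, in every chosen condition, expression values span at most `delta`.

    Assumed definition for illustration; `expression[gene][condition]` is a number.
    """
    for c in conditions:
        values = [expression[g][c] for g in genes]
        if max(values) - min(values) > delta:
            return False
    return True


# Toy network: a, b and c form a triangle; d hangs off c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
expression = {"a": [1.0, 5.0], "b": [1.2, 2.0], "c": [0.9, 5.1]}
print(edge_density(["a", "b", "c"], edges))                 # 1.0
print(is_coexpressed(["a", "b", "c"], expression, [0], 0.5))  # True
```

Checking one candidate is cheap; the difficulty the abstract refers to lies in enumerating all gene sets and condition subsets that pass both tests.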


Subject(s)
Computational Biology; Gene Regulatory Networks