Results 1 - 20 of 174
1.
PLoS Comput Biol ; 15(8): e1007239, 2019 08.
Article in English | MEDLINE | ID: mdl-31437145

ABSTRACT

Tailored therapy aims to cure cancer patients effectively and safely, based on the complex interactions between patients' genomic features, disease pathology and drug metabolism. Thus, the continual increase in scientific literature drives the need for efficient methods of data mining to improve the extraction of useful information from texts based on patients' genomic features. An important application of text mining to tailored therapy in cancer encompasses the use of mutations and cancer fusion genes as moieties that change patients' cellular networks to develop cancer, and also affect drug metabolism. Fusion proteins, which are derived from the slippage of two parental genes, are produced in cancer by chromosomal aberrations and trans-splicing. Given that the two parental proteins for predicted fusion proteins are known, we used our previously developed method for identifying chimeric protein-protein interactions (ChiPPIs) associated with the fusion proteins. Here, we present a validation approach that receives fusion proteins of interest, predicts their cellular network alterations by ChiPPI and validates them by our new method, ProtFus, using an online literature search. This process resulted in a set of 358 fusion proteins and their corresponding protein interactions, as a training set for a Naïve Bayes classifier, to identify predicted fusion proteins that have reliable evidence in the literature and that were confirmed experimentally. Next, for a test group of 1817 fusion proteins, we were able to identify from the literature 2908 PPIs in total, across 18 cancer types. The described method, ProtFus, can be used for screening the literature to identify unique cases of fusion proteins and their PPIs, as means of studying alterations of protein networks in cancers. Availability: http://protfus.md.biu.ac.il/.


Subject(s)
Data Mining/methods , Oncogene Proteins, Fusion/genetics , Protein Interaction Mapping/methods , Algorithms , Bayes Theorem , Big Data , Computational Biology , Data Mining/statistics & numerical data , Databases, Genetic , Humans , Mutation , Neoplasms/genetics , Neoplasms/therapy , Oncogene Proteins, Fusion/chemistry , Oncogene Proteins, Fusion/metabolism , Precision Medicine , Protein Interaction Mapping/statistics & numerical data , Protein Interaction Maps
2.
PLoS Comput Biol ; 15(4): e1006888, 2019 04.
Article in English | MEDLINE | ID: mdl-30995217

ABSTRACT

In response to a need for improved treatments, a number of promising novel targeted cancer therapies are being developed that exploit human synthetic lethal interactions. This is facilitating personalised medicine strategies in cancers where specific tumour suppressors have become inactivated. Mainly due to the constraints of the experimental procedures, relatively few human synthetic lethal interactions have been identified. Here we describe SLant (Synthetic Lethal analysis via Network topology), a computational systems approach to predicting human synthetic lethal interactions that works by identifying and exploiting conserved patterns in protein interaction network topology both within and across species. SLant outperforms previous attempts to classify human SSL interactions, and experimental validation of the model's predictions suggests it may provide useful guidance for future SSL screenings and ultimately aid targeted cancer therapy development.


Subject(s)
Protein Interaction Maps/genetics , Synthetic Lethal Mutations , Algorithms , Animals , Artificial Intelligence , Computational Biology , Drug Discovery , Gene Ontology , Genes, Essential , Humans , Models, Biological , Molecular Targeted Therapy , Multigene Family , Neoplasms/genetics , Neoplasms/metabolism , Neoplasms/therapy , Protein Interaction Mapping/statistics & numerical data , Protein Interaction Maps/drug effects , Synthetic Biology , Synthetic Lethal Mutations/genetics , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism
3.
Nat Commun ; 10(1): 1118, 2019 03 08.
Article in English | MEDLINE | ID: mdl-30850613

ABSTRACT

It remains a significant challenge to define individual protein associations within networks where an individual protein can directly interact with other proteins and/or be part of large complexes, which contain functional modules. Here we demonstrate the topological scoring (TopS) algorithm for the analysis of quantitative proteomic datasets from affinity purifications. Data is analyzed in a parallel fashion where a prey protein is scored in an individual affinity purification by aggregating information from the entire dataset. Topological scores span a broad range of values indicating the enrichment of an individual protein in every bait protein purification. TopS is applied to interaction networks derived from human DNA repair proteins and yeast chromatin remodeling complexes. TopS highlights potential direct protein interactions and modules within complexes. TopS is a rapid method for the efficient and informative computational analysis of datasets, is complementary to existing analysis pipelines, and provides important insights into protein interaction networks.


Subject(s)
Algorithms , Protein Interaction Mapping/statistics & numerical data , Protein Interaction Maps , Chromatin Assembly and Disassembly , DNA Repair , Databases, Protein/statistics & numerical data , Humans , Likelihood Functions , Proteomics/statistics & numerical data , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism
4.
Brief Bioinform ; 20(1): 274-287, 2019 01 18.
Article in English | MEDLINE | ID: mdl-29028906

ABSTRACT

The identification of plant-pathogen protein-protein interactions (PPIs) is an attractive and challenging research topic for deciphering the complex molecular mechanism of plant immunity and pathogen infection. Considering that the experimental identification of plant-pathogen PPIs is time-consuming and labor-intensive, computational methods are emerging as an important strategy to complement the experimental methods. In this work, we first evaluated the performance of traditional computational methods such as interolog, domain-domain interaction and domain-motif interaction in predicting known plant-pathogen PPIs. Owing to the low sensitivity of the traditional methods, we utilized Random Forest to build an inter-species PPI prediction model based on multiple sequence encodings and novel network attributes in the established plant PPI network. Critical assessment of the features demonstrated that the integration of sequence information and network attributes resulted in significant and robust performance improvement. Additionally, we also discussed the influence of Gene Ontology and gene expression information on the prediction performance. The Web server implementing the integrated prediction method, named InterSPPI, has been made freely available at http://systbio.cau.edu.cn/intersppi/index.php. InterSPPI could achieve a reasonably high accuracy with a precision of 73.8% and a recall of 76.6% in the independent test. To examine the applicability of InterSPPI, we also conducted cross-species and proteome-wide plant-pathogen PPI prediction tests. Taken together, we hope this work can provide a comprehensive understanding of the current status of plant-pathogen PPI predictions, and the proposed InterSPPI can become a useful tool to accelerate the exploration of plant-pathogen interactions.


Subject(s)
Plant Proteins/metabolism , Plants/metabolism , Plants/microbiology , Protein Interaction Mapping/methods , Algorithms , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis/microbiology , Arabidopsis Proteins/genetics , Arabidopsis Proteins/immunology , Arabidopsis Proteins/metabolism , Computational Biology/methods , Databases, Protein/statistics & numerical data , Gene Expression Profiling/statistics & numerical data , Gene Ontology , Host-Pathogen Interactions/genetics , Host-Pathogen Interactions/immunology , Machine Learning , Models, Biological , Plant Diseases/genetics , Plant Diseases/immunology , Plant Diseases/microbiology , Plant Immunity/genetics , Plant Proteins/genetics , Plant Proteins/immunology , Plants/genetics , Protein Interaction Mapping/statistics & numerical data
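
The InterSPPI abstract above reports a precision of 73.8% and a recall of 76.6% on the independent test. As a small worked example, the two figures can be combined into an F1 score (their harmonic mean). The F1 value is not stated in the abstract; it is derived here only for illustration, and the function name `f1_score` is an assumption of this sketch.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (independent test).
f1 = f1_score(0.738, 0.766)
print(f"F1 = {f1:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two inputs, a balanced pair such as this one yields an F1 close to both values.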
5.
J Proteome Res ; 17(11): 3740-3748, 2018 11 02.
Article in English | MEDLINE | ID: mdl-30265007

ABSTRACT

Metabolic labeling with heavy water followed by LC-MS is a high throughput approach to study proteostasis in vivo. Advances in mass spectrometry and sample processing have allowed consistent detection of thousands of proteins at multiple time points. However, freely available automated bioinformatics tools to analyze and extract protein decay rate constants are lacking. Here, we describe d2ome, a robust, automated software solution for in vivo protein turnover analysis. d2ome is highly scalable, uses innovative approaches to nonlinear fitting, implements Grubbs' outlier detection and removal, uses weighted averaging of replicates, applies a data-dependent elution time windowing, and uses mass accuracy in peak detection. Here, we discuss the application of d2ome in a comparative study of protein turnover in the livers of normal vs Western diet-fed LDLR-/- mice (mouse model of nonalcoholic fatty liver disease), which contained 256 LC-MS experiments. The study revealed reduced stability of 40S ribosomal protein subunits in the Western diet-fed mice.


Subject(s)
Deuterium Oxide/metabolism , Liver/metabolism , Non-alcoholic Fatty Liver Disease/metabolism , Proteome/metabolism , Ribosomal Proteins/metabolism , Software , Animals , Chromatography, Liquid , Deuterium Oxide/chemistry , Diet, Western/adverse effects , Disease Models, Animal , Gene Expression , Half-Life , Isotope Labeling/methods , Liver/chemistry , Liver/pathology , Mice , Mice, Inbred C57BL , Mice, Knockout , Non-alcoholic Fatty Liver Disease/etiology , Non-alcoholic Fatty Liver Disease/genetics , Non-alcoholic Fatty Liver Disease/pathology , Protein Interaction Mapping/statistics & numerical data , Proteolysis , Proteome/chemistry , Proteome/genetics , Proteome/isolation & purification , Proteostasis/genetics , Receptors, LDL/deficiency , Receptors, LDL/genetics , Ribosomal Proteins/chemistry , Ribosomal Proteins/genetics , Ribosomal Proteins/isolation & purification , Tandem Mass Spectrometry
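
The d2ome abstract above mentions Grubbs' outlier detection and removal. The abstract does not describe how d2ome implements it, so the following is only a minimal sketch of a single Grubbs-style pass under stated assumptions: the function names, the sample rate constants, and the threshold of 1.9 are all hypothetical, and a real analysis would take the critical value from a Grubbs table (or the t-distribution) for the actual sample size and significance level.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) where G = max|x - mean| / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    if sd == 0:
        return 0.0, None
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[idx] - mean) / sd, idx


def remove_outlier_once(values, critical_value):
    """Drop the most extreme value if its G exceeds the supplied critical value."""
    g, idx = grubbs_statistic(values)
    if idx is not None and g > critical_value:
        return values[:idx] + values[idx + 1:]
    return list(values)


# Hypothetical replicate rate constants; the last one is visibly off.
rates = [0.10, 0.11, 0.09, 0.10, 0.12, 0.45]
# Illustrative threshold only; look up the real critical value for n and alpha.
cleaned = remove_outlier_once(rates, critical_value=1.9)
print(cleaned)
```

The test is normally repeated until no value exceeds the critical value, recomputing the mean and standard deviation after each removal.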
6.
Methods Enzymol ; 603: 221-235, 2018.
Article in English | MEDLINE | ID: mdl-29673528

ABSTRACT

Although general anesthesia induced by inhaled anesthetics produces definitive phenotypes (e.g., loss of mobility, amnesia, analgesia), the underlying targets of these drugs are still not clear. Genomics and proteomic techniques are discussed for measurement of global transcriptional and translational changes after inhaled anesthetic exposures. The current discussion focuses primarily on the genomic and proteomic technical methodology. We also include a discussion of network and pathway analyses for data interpretation after identification of the targets.


Subject(s)
Anesthetics, Inhalation/pharmacokinetics , Gene Regulatory Networks , Protein Biosynthesis , Protein Interaction Mapping/statistics & numerical data , Proteogenomics/methods , Transcription, Genetic , Anesthesia, General , Anesthetics, Inhalation/pharmacology , Animals , Carbon Radioisotopes , Cerebral Cortex/cytology , Cerebral Cortex/drug effects , Cerebral Cortex/metabolism , Electrophoresis, Gel, Two-Dimensional/methods , Fetus , Halothane/pharmacokinetics , Humans , Isoflurane/pharmacokinetics , Mice , Neurons/cytology , Neurons/drug effects , Neurons/metabolism , Primary Cell Culture , Protein Binding , Proteogenomics/instrumentation , Rats , Real-Time Polymerase Chain Reaction/methods , Sevoflurane/pharmacokinetics , Staining and Labeling/methods
7.
Pac Symp Biocomput ; 23: 92-103, 2018.
Article in English | MEDLINE | ID: mdl-29218872

ABSTRACT

The emergence of drug resistance to traditional chemotherapy and newer targeted therapies in cancer patients is a major clinical challenge. Reactivation of the same or compensatory signaling pathways is a common class of drug resistance mechanisms. Employing drug combinations that inhibit multiple modules of reactivated signaling pathways is a promising strategy to overcome and prevent the onset of drug resistance. However, with thousands of available FDA-approved and investigational compounds, it is infeasible to experimentally screen millions of possible drug combinations with limited resources. Therefore, computational approaches are needed to constrain the search space and prioritize synergistic drug combinations for preclinical studies. In this study, we propose a novel approach for predicting drug combinations through investigating potential effects of drug targets on a disease signaling network. We first construct a disease signaling network by integrating gene expression data with disease-associated driver genes. Individual drugs that can partially perturb the disease signaling network are then selected based on a drug-disease network "impact matrix", which is calculated using network diffusion distance from drug targets to signaling network elements. The selected drugs are subsequently clustered into communities (subgroups), which are proposed to share similar mechanisms of action. Finally, drug combinations are ranked according to maximal impact on signaling sub-networks from distinct mechanism-based communities. Our method is advantageous compared to other approaches in that it does not require large amounts of drug dose-response data, drug-induced "omics" profiles or clinical efficacy data, which are not often readily available. We validate our approach using a BRAF-mutant melanoma signaling network and combinatorial in vitro drug screening data, and report drug combinations with diverse mechanisms of action and opportunities for drug repositioning.


Subject(s)
Drug Therapy, Combination/methods , Signal Transduction/drug effects , Antineoplastic Combined Chemotherapy Protocols , Computational Biology/methods , Drug Combinations , Drug Repositioning , Drug Resistance , Drug Resistance, Neoplasm , Gene Expression Profiling/statistics & numerical data , Humans , Melanoma/drug therapy , Melanoma/genetics , Mutation , Neoplasms/drug therapy , Protein Interaction Mapping/statistics & numerical data , Proto-Oncogene Proteins B-raf/antagonists & inhibitors , Proto-Oncogene Proteins B-raf/genetics
8.
Pac Symp Biocomput ; 23: 111-122, 2018.
Article in English | MEDLINE | ID: mdl-29218874

ABSTRACT

Discovering disease pathways, which can be defined as sets of proteins associated with a given disease, is an important problem that has the potential to provide clinically actionable insights for disease diagnosis, prognosis, and treatment. Computational methods aid the discovery by relying on protein-protein interaction (PPI) networks. They start with a few known disease-associated proteins and aim to find the rest of the pathway by exploring the PPI network around the known disease proteins. However, the success of such methods has been limited, and failure cases have not been well understood. Here we study the PPI network structure of 519 disease pathways. We find that 90% of pathways do not correspond to single well-connected components in the PPI network. Instead, proteins associated with a single disease tend to form many separate connected components/regions in the network. We then evaluate state-of-the-art disease pathway discovery methods and show that their performance is especially poor on diseases with disconnected pathways. Thus, we conclude that network connectivity structure alone may not be sufficient for disease pathway discovery. However, we show that higher-order network structures, such as small subgraphs of the pathway, provide a promising direction for the development of new methods.


Subject(s)
Disease/etiology , Protein Interaction Maps , Algorithms , Computational Biology/methods , Humans , Protein Interaction Mapping/methods , Protein Interaction Mapping/statistics & numerical data , Proteome , Proteomics/methods , Proteomics/statistics & numerical data , Signal Transduction
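
The disease-pathway abstract above observes that proteins associated with one disease often form several separate connected components in the PPI network. A minimal sketch of that kind of check follows: it restricts a network to the pathway proteins and counts the connected components of the induced subgraph. The toy edge list, the protein labels, and the function name `pathway_components` are hypothetical; the abstract does not specify the authors' implementation.

```python
from collections import defaultdict


def pathway_components(edges, pathway_proteins):
    """Connected components of the PPI subgraph induced by the pathway proteins."""
    pathway = set(pathway_proteins)
    adjacency = defaultdict(set)
    for a, b in edges:
        # Keep only interactions whose two endpoints both belong to the pathway.
        if a in pathway and b in pathway:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    components = []
    for start in sorted(pathway):
        if start in seen:
            continue
        # Depth-first search from each unvisited pathway protein.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy network: D reaches the other pathway proteins only through X,
# which is not a pathway member, so D ends up isolated.
toy_edges = [("A", "B"), ("B", "C"), ("C", "X"), ("X", "D"), ("E", "F")]
toy_pathway = {"A", "B", "D", "E", "F"}
components = pathway_components(toy_edges, toy_pathway)
print(len(components), max(len(c) for c in components))
```

A pathway counts as a single well-connected component in this sketch only when the function returns exactly one set containing every pathway protein.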
9.
Brief Bioinform ; 19(5): 995-1007, 2018 09 28.
Article in English | MEDLINE | ID: mdl-28369159

ABSTRACT

Various techniques have been developed for identifying the most probable interactants of a protein under a given biological context. In this article, we dissect the effects of the choice of the protein-protein interaction network (PPI) and the manipulation of PPI settings on the network neighborhood of the influenza A virus (IAV) network, as well as hits in genome-wide small interfering RNA screen results for IAV host factors. We investigate the potential of context filtering, which uses text mining evidence linked to PPI edges, as a complement to the edge confidence scores typically provided in PPIs for filtering, for obtaining more biologically relevant network neighborhoods. Here, we estimate the maximum performance of context filtering to isolate a Kyoto Encyclopedia of Genes and Genomes (KEGG) network Ki from a union of KEGG networks and its network neighborhood. The work gives insights on the use of human PPIs in network neighborhood approaches for functional inference.


Subject(s)
Protein Interaction Maps , Algorithms , Computational Biology/methods , Data Mining , Gene Regulatory Networks , Genome-Wide Association Study/statistics & numerical data , Host Microbial Interactions/genetics , Host Microbial Interactions/physiology , Humans , Influenza A virus/genetics , Influenza A virus/pathogenicity , Influenza A virus/physiology , Protein Interaction Mapping/statistics & numerical data , Protein Interaction Maps/genetics , RNA, Small Interfering/genetics
10.
Curr Protoc Bioinformatics ; 60: 8.2.1-8.2.14, 2017 12 08.
Article in English | MEDLINE | ID: mdl-29220074

ABSTRACT

The authors provide an overview of physical protein-protein interaction prediction, covering the main strategies for predicting interactions, approaches for assessing predictions, and online resources for accessing predictions. This unit focuses on the main advancements in each of these areas over the last decade. The methods and resources that are presented here are not an exhaustive set, but characterize the current state of the field, highlighting key challenges and achievements. © 2017 by John Wiley & Sons, Inc.


Subject(s)
Protein Interaction Maps , Animals , Computational Biology , Genomics , Humans , Machine Learning , Protein Interaction Domains and Motifs , Protein Interaction Mapping/methods , Protein Interaction Mapping/statistics & numerical data
11.
PLoS Comput Biol ; 13(12): e1005905, 2017 12.
Article in English | MEDLINE | ID: mdl-29281622

ABSTRACT

Peptide-protein interactions contribute a significant fraction of the protein-protein interactome. Accurate modeling of these interactions is challenging due to the vast conformational space associated with interactions of highly flexible peptides with large receptor surfaces. To address this challenge we developed a fragment based high-resolution peptide-protein docking protocol. By streamlining the Rosetta fragment picker for accurate peptide fragment ensemble generation, the PIPER docking algorithm for exhaustive fragment-receptor rigid-body docking and Rosetta FlexPepDock for flexible full-atom refinement of PIPER docked models, we successfully addressed the challenge of accurate and efficient global peptide-protein docking at high-resolution with remarkable accuracy, as validated on a small but representative set of peptide-protein complex structures well resolved by X-ray crystallography. Our approach opens up the way to high-resolution modeling of many more peptide-protein interactions and to the detailed study of peptide-protein association in general. PIPER-FlexPepDock is freely available to the academic community as a server at http://piperfpd.furmanlab.cs.huji.ac.il.


Subject(s)
Algorithms , Protein Interaction Mapping/statistics & numerical data , Computational Biology , Computer Simulation , Crystallography, X-Ray , Models, Molecular , Molecular Docking Simulation , Molecular Dynamics Simulation , Peptide Fragments/chemistry , Protein Conformation , Software
13.
BMC Genomics ; 18(Suppl 2): 209, 2017 03 14.
Article in English | MEDLINE | ID: mdl-28361692

ABSTRACT

BACKGROUND: Active modules are connected regions in a biological network which show significant changes in expression over particular conditions. The identification of such modules is important since it may reveal the regulatory and signaling mechanisms that associate with a given cellular response. RESULTS: In this paper, we propose a novel active module identification algorithm based on a memetic algorithm. We propose a novel encoding/decoding scheme to ensure the connectedness of the identified active modules. Based on the scheme, we also design and incorporate a local search operator into the memetic algorithm to improve its performance. CONCLUSION: The effectiveness of the proposed algorithm is validated on both small and large protein interaction networks.


Subject(s)
Algorithms , Gene Regulatory Networks , Protein Interaction Maps , Humans , Protein Interaction Mapping/statistics & numerical data , Signal Transduction
14.
Biochemistry ; 56(11): 1573-1584, 2017 03 21.
Article in English | MEDLINE | ID: mdl-28267310

ABSTRACT

A major biochemical goal is the ability to mimic nature in engineering highly specific protein-protein interactions (PPIs). We previously devised a computational interactome screen to identify eight peptides that form four heterospecific dimers despite 32 potential off-targets. To expand the speed and utility of our approach and the PPI toolkit, we have developed new software to derive much larger heterospecific sets (≥24 peptides) while directing against antiparallel off-targets. It works by predicting Tm values for every dimer on the basis of core, electrostatic, and helical propensity components. These guide interaction specificity, allowing heterospecific coiled coil (CC) sets to be incrementally assembled. Prediction accuracy is experimentally validated using circular dichroism and size exclusion chromatography. Thermal denaturation data from a 22-CC training set were used to improve software prediction accuracy and verified using a 136-CC test set consisting of eight predicted heterospecific dimers and 128 off-targets. The resulting software, qCIPA, now individually weights core a-a' (II/NN/NI) and electrostatic g-e'+1 (EE/EK/KK) components. The expanded data set has resulted in emerging sequence context rules for otherwise energetically equivalent CCs; for example, introducing intrahelical electrostatic charge blocks generated increased stability for designed CCs while concomitantly decreasing the stability of off-target CCs. Coupled with increased prediction accuracy and speed, the approach can be applied to a wide range of downstream chemical and synthetic biology applications and, more generally, to impose specificity on structurally unrelated PPIs.


Subject(s)
Models, Statistical , Peptides/chemistry , Protein Interaction Mapping/statistics & numerical data , Software , Peptide Library , Peptides/metabolism , Protein Interaction Domains and Motifs , Protein Multimerization , Protein Structure, Secondary , Static Electricity , Thermodynamics
15.
J Biosci ; 42(3): 383-396, 2017 Sep.
Article in English | MEDLINE | ID: mdl-29358552

ABSTRACT

Protein complexes are known to play a major role in controlling cellular activity in a living being. Identifying complexes from raw protein-protein interactions (PPIs) is an important area of research. Earlier work has been limited mostly to yeast and a few other model organisms. Such protein complex identification methods, when applied to large human PPIs often give poor performance. We introduce a novel method called ComFiR to detect such protein complexes and further rank diseased complexes based on a query disease. We have shown that it has better performance in identifying protein complexes from human PPI data. This method is evaluated in terms of positive predictive value, sensitivity and accuracy. We have introduced a ranking approach and showed its application on Alzheimer's disease.


Subject(s)
Algorithms , Alzheimer Disease/metabolism , Computational Biology/methods , Protein Interaction Mapping/statistics & numerical data , Alzheimer Disease/diagnosis , Alzheimer Disease/pathology , Databases, Protein , Humans , Protein Binding
16.
J Comput Biol ; 24(2): 172-182, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27508455

ABSTRACT

The selection of relevant genes for breast cancer metastasis is critical for the treatment and prognosis of cancer patients. Although much effort has been devoted to the gene selection procedures by use of different statistical analysis methods or computational techniques, the interpretation of the variables in the resulting survival models has been limited so far. This article proposes a new Random Forest (RF)-based algorithm to identify important variables highly related to breast cancer metastasis, which is based on the importance scores of two variable selection algorithms, including the mean decrease Gini (MDG) criteria of Random Forest and the GeneRank algorithm with protein-protein interaction (PPI) information. The new gene selection algorithm is called PPIRF. The improved prediction accuracy fully illustrated the reliability and high interpretability of the gene list selected by the PPIRF approach.


Subject(s)
Algorithms , Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Neoplasm Proteins/genetics , Protein Interaction Mapping/statistics & numerical data , Breast Neoplasms/diagnosis , Breast Neoplasms/mortality , Breast Neoplasms/pathology , Datasets as Topic , Female , Humans , Kaplan-Meier Estimate , Neoplasm Metastasis , Neoplasm Proteins/metabolism , ROC Curve
17.
Proteins ; 85(3): 378-390, 2017 03.
Article in English | MEDLINE | ID: mdl-27701780

ABSTRACT

Computational protein-protein docking is of great importance for understanding protein interactions at the structural level. Critical assessment of prediction of interactions (CAPRI) experiments provide the protein docking community with a unique opportunity to blindly test methods based on real-life cases and help accelerate methodology development. For CAPRI Rounds 28-35, we used an automatic docking pipeline integrating the coarse-grained co-evolution-based potential InterEvScore. This score was developed to exploit the information contained in the multiple sequence alignments of binding partners and selectively recognize co-evolved interfaces. Together with Zdock/Frodock for rigid-body docking, SOAP-PP for atomic potential and Rosetta applications for structural refinement, this pipeline reached high performance on a majority of targets. For protein-peptide docking and interfacial water position predictions, we also explored different means of taking evolutionary information into account. Overall, our group ranked 1st by correctly predicting 10 targets, composed of 1 High, 7 Medium and 2 Acceptable predictions. Excellent and Outstanding levels of accuracy were reached for each of the two water prediction targets, respectively. Altogether, in 15 out of 18 targets in total, evolutionary information, either through co-evolution or conservation analyses, could provide key constraints to guide modeling towards the most likely assemblies. These results open promising perspectives regarding the way evolutionary information can be valuable to improve docking prediction accuracy. Proteins 2017; 85:378-390. © 2016 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/methods , Molecular Docking Simulation , Peptides/chemistry , Protein Interaction Mapping/methods , Proteins/chemistry , Water/chemistry , Algorithms , Amino Acid Sequence , Benchmarking , Binding Sites , Protein Binding , Protein Conformation , Protein Interaction Mapping/statistics & numerical data , Research Design , Sequence Alignment , Software
18.
J Comput Biol ; 24(2): 183-192, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27529135

ABSTRACT

BACKGROUND: There are many computational approaches to predict protein-protein interactions using support vector machines (SVMs) with high performance. In fact, the performance of currently reported methods is significantly overestimated and is affected by the object repetitiveness in the datasets used. OBJECTIVE: To study the effect of object repetitiveness of datasets on prediction results. METHOD: We present novel methods to construct different positive datasets with or without repeating proteins using graph maximum matching in the protein-protein interaction datasets, and a corresponding series of negative datasets with different protein repetitiveness is constructed using the graph adjacency matrix. The relationship between the SVM prediction results and the repeated proteins (repeat numbers and repeat rates) and the distributions of repeated proteins in the datasets are analyzed. RESULTS: Protein repetitiveness of positive and negative datasets can affect the prediction result: high protein repetitiveness of positive or negative datasets yields high-performance prediction results. CONCLUSION: This indicates that dealing with object repetitiveness of datasets is a key issue in protein-protein interaction prediction using SVMs, since real-world data contain certain degrees of repeated proteins.


Subject(s)
Protein Interaction Mapping/statistics & numerical data , Proteins/chemistry , Support Vector Machine , Datasets as Topic , Humans , Saccharomyces cerevisiae/genetics
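
The SVM abstract above builds positive datasets without repeated proteins by graph maximum matching: each protein is a node, each interaction an edge, and a matching is a set of edges that share no node. The sketch below is not the authors' method. It swaps in a simpler greedy maximal matching, which also guarantees that no protein appears twice but may select fewer pairs than a true maximum matching would. The function name and the protein labels are hypothetical.

```python
def non_repeating_pairs(interactions):
    """Greedily select interaction pairs so that no protein appears twice."""
    used = set()
    selected = []
    for a, b in interactions:
        if a == b:
            continue  # skip self-interactions
        if a not in used and b not in used:
            selected.append((a, b))
            used.update((a, b))
    return selected


# Hypothetical interaction list forming a four-protein cycle.
pairs = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P4")]
matched = non_repeating_pairs(pairs)
print(matched)
```

The greedy result depends on the input order, so a maximum-cardinality matching routine from a graph library would be the closer analogue to what the abstract describes when the largest possible non-repeating set is required.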
19.
Pac Symp Biocomput ; 22: 27-38, 2017.
Article in English | MEDLINE | ID: mdl-27896959

ABSTRACT

Automated annotation of protein function has become a critical task in the post-genomic era. Network-based approaches and homology-based approaches have been widely used and recently tested in large-scale community-wide assessment experiments. It is natural to integrate network data with homology information to further improve the predictive performance. However, integrating these two heterogeneous, high-dimensional and noisy datasets is non-trivial. In this work, we introduce a novel protein function prediction algorithm ProSNet. An integrated heterogeneous network is first built to include molecular networks of multiple species and link together homologous proteins across multiple species. Based on this integrated network, a dimensionality reduction algorithm is introduced to obtain compact low-dimensional vectors to encode proteins in the network. Finally, we develop machine learning classification algorithms that take the vectors as input and make predictions by transferring annotations both within each species and across different species. Extensive experiments on five major species demonstrate that our integration of homology with molecular networks substantially improves the predictive performance over existing approaches.


Subject(s)
Algorithms , Molecular Sequence Annotation/statistics & numerical data , Protein Interaction Mapping/statistics & numerical data , Animals , Computational Biology , Humans , Machine Learning , Mice , Sequence Homology, Amino Acid
20.
Proteins ; 85(1): 137-154, 2017 01.
Article in English | MEDLINE | ID: mdl-27802579

ABSTRACT

Cells are interactive living systems where protein movements, interactions and regulation are substantially free from centralized management. How protein physico-chemical and geometrical properties determine who interacts with whom remains far from fully understood. We show that characterizing how a protein behaves with many potential interactors in a complete cross-docking study leads to a sharp identification of its cellular/true/native partner(s). We define a sociability index, or S-index, reflecting whether or not a protein likes to pair with other proteins. Formally, we propose a suitable normalization function that accounts for protein sociability and we combine it with a simple interface-based (ranking) score to discriminate partners from non-interactors. We show that sociability is an important factor and that the normalization makes it possible to reach a much higher discriminative power than shape complementarity docking scores. The social effect is also observed with more sophisticated docking algorithms. Docking conformations are evaluated using experimental binding sites. The latter approximate in the best possible way binding site predictions, which have reached high accuracy in recent years. This makes our analysis helpful for a global understanding of partner identification and for suggesting discriminating strategies. These results contradict previous findings claiming that the partner identification problem is solvable solely with geometrical docking. Proteins 2016; 85:137-154. © 2016 Wiley Periodicals, Inc.


Subject(s)
Models, Statistical , Protein Interaction Mapping/statistics & numerical data , Proteins/chemistry , Binding Sites , Escherichia coli/chemistry , Humans , Molecular Docking Simulation , Protein Binding , Protein Interaction Domains and Motifs , Protein Interaction Mapping/methods , Protein Structure, Secondary , Software , Viruses/chemistry