Results 1 - 19 of 19
1.
Bioinformatics ; 37(6): 885-887, 2021 05 05.
Article in English | MEDLINE | ID: mdl-32871004

ABSTRACT

MOTIVATION: Causal biological interaction networks represent cellular regulatory pathways. Their fusion with other biological data enables insights into disease mechanisms and novel opportunities for drug discovery. RESULTS: We developed Causal Network of Diseases (CaNDis), a web server for the exploration of a human causal interaction network, which we expanded with data on diseases and FDA-approved drugs, on the basis of which we constructed a disease-disease network in which the links represent the similarity between diseases. We show how CaNDis can be used to identify candidate genes with known and novel roles in disease co-occurrence and drug-drug interactions. AVAILABILITY AND IMPLEMENTATION: CaNDis is freely available to academic users at http://candis.ijs.si and http://candis.insilab.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Pharmaceutical Preparations , Software , Computational Biology , Computers , Humans , Internet
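
The disease-disease network in the abstract above links diseases by similarity. The abstract does not say which similarity measure is used, so the following is only a minimal sketch that assumes a Jaccard index over disease-associated gene sets. The function names and the toy gene sets are hypothetical and are not taken from CaNDis.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disease_network(disease_genes: dict, threshold: float = 0.0) -> dict:
    """Return {(disease_a, disease_b): similarity} for pairs scoring above threshold."""
    edges = {}
    for d1, d2 in combinations(sorted(disease_genes), 2):
        score = jaccard(disease_genes[d1], disease_genes[d2])
        if score > threshold:
            edges[(d1, d2)] = score
    return edges


# Hypothetical toy associations (assumption, not CaNDis data).
toy = {
    "disease_A": {"G1", "G2", "G3"},
    "disease_B": {"G2", "G3", "G4"},
    "disease_C": {"G5"},
}
network = disease_network(toy)
```

Genes in the intersection of two linked diseases would be natural candidates to inspect for a role in co-occurrence, under the same assumption.
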
2.
J Chem Inf Model ; 62(6): 1573-1584, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35289616

ABSTRACT

The Protein Data Bank (PDB) is a rich source of protein-ligand structures, but ligands are not explicitly used in current docking algorithms. We have developed ProBiS-Dock, a docking algorithm complementary to the ProBiS-Dock Database (J. Chem. Inf. Model. 2021, 61, 4097-4107) that treats small molecules and proteins as fully flexible entities and allows conformational changes in both after ligand binding. A new scoring function is described that consists of a binding site-specific scoring function (ProBiS-Score) and a general statistical scoring function. ProBiS-Dock enables rapid docking of small molecules to proteins and has been successfully validated in silico against standard benchmarks. It enables rapid search for new active ligands by leveraging existing knowledge in the PDB. The potential of the software for drug development has been confirmed in vitro by the discovery of new inhibitors of human indoleamine 2,3-dioxygenase 1, an enzyme that is an attractive target for cancer therapy and catalyzes the first rate-determining step of L-tryptophan metabolism via the kynurenine pathway. The software is freely available to academic users at http://insilab.org/probisdock.


Subjects
Algorithms , Proteins , Binding Sites , Humans , Ligands , Protein Binding , Protein Conformation , Proteins/chemistry , Software
3.
J Chem Inf Model ; 61(8): 4097-4107, 2021 08 23.
Article in English | MEDLINE | ID: mdl-34319727

ABSTRACT

We have developed a new system, ProBiS-Dock, which can be used to determine the different types of protein binding sites for small ligands. The binding sites identified this way are then used to construct a new binding site database, the ProBiS-Dock Database, that allows for the ranking of binding sites according to their utility for drug development. The newly constructed database currently has more than 1.4 million binding sites and offers the possibility to investigate potential drug targets originating from different biological species. The interactive ProBiS-Dock Database, a web server and repository that consists of all small-molecule ligand binding sites in all of the protein structures in the Protein Data Bank, is freely available at http://probis-dock-database.insilab.org. The ProBiS-Dock Database will be regularly updated to keep pace with the growth of the Protein Data Bank, and our anticipation is that it will be useful in drug discovery.


Subjects
Drug Design , Proteins , Binding Sites , Databases, Protein , Ligands , Protein Binding , Proteins/metabolism , Software
4.
Int J Mol Sci ; 22(7), 2021 Mar 29.
Article in English | MEDLINE | ID: mdl-33805429

ABSTRACT

Bois noir is the most widespread phytoplasma grapevine disease in Europe. It is associated with 'Candidatus Phytoplasma solani', but molecular interactions between the causal pathogen and its host plant are not well understood. In this work, we combined the analysis of high-throughput RNA-Seq and sRNA-Seq data with interaction network analysis for finding new cross-talks among pathways involved in infection of grapevine cv. Zweigelt with 'Ca. P. solani' in early and late growing seasons. While the early growing season was very dynamic at the transcriptional level in asymptomatic grapevines, the regulation at the level of small RNAs was more pronounced later in the season when symptoms developed in infected grapevines. Most differentially expressed small RNAs were associated with biotic stress. Our study also exposes the less-studied role of hormones in disease development and shows that hormonal balance was already perturbed before symptoms development in infected grapevines. Analysis at the level of communities of genes and mRNA-microRNA interaction networks revealed several new genes (e.g., expansins and cryptdin) that have not been associated with phytoplasma pathogenicity previously. These novel actors may present a new reference framework for research and diagnostics of phytoplasma diseases of grapevine.


Subjects
Host-Pathogen Interactions/genetics , Phytoplasma/pathogenicity , RNA, Messenger/genetics , Vitis/genetics , Vitis/microbiology , Cell Wall/genetics , Cell Wall/microbiology , Gene Expression Profiling , Gene Expression Regulation, Plant , Gene Regulatory Networks , MicroRNAs , Plant Diseases/microbiology , Plant Growth Regulators/genetics , Plant Growth Regulators/metabolism , RNA, Plant , Sequence Analysis, RNA , Stress, Physiological/genetics , Vitis/growth & development
5.
Molecules ; 26(10), 2021 May 18.
Article in English | MEDLINE | ID: mdl-34070140

ABSTRACT

COVID-19 represents a new potentially life-threatening illness caused by severe acute respiratory syndrome coronavirus 2 or SARS-CoV-2 pathogen. In 2021, new variants of the virus with multiple key mutations have emerged, such as B.1.1.7, B.1.351, P.1 and B.1.617, and are threatening to render available vaccines or potential drugs ineffective. In this regard, we highlight 3CLpro, the main viral protease, as a valuable therapeutic target that possesses no mutations in the described pandemically relevant variants. 3CLpro could therefore provide trans-variant effectiveness that is supported by structural studies and possesses readily available biological evaluation experiments. With this in mind, we performed a high throughput virtual screening experiment using CmDock and the "In-Stock" chemical library to prepare prioritisation lists of compounds for further studies. We coupled the virtual screening experiment to a machine learning-supported classification and activity regression study to bring maximal enrichment and available structural data on known 3CLpro inhibitors to the prepared focused libraries. All virtual screening hits are classified according to 3CLpro inhibitor, viral cysteine protease or remaining chemical space based on the calculated set of 208 chemical descriptors. Last but not least, we analysed whether the current set of 3CLpro inhibitors could be used in activity prediction and observed that the field of 3CLpro inhibitors is drastically under-represented compared to the chemical space of viral cysteine protease inhibitors. We postulate that this methodology of 3CLpro inhibitor library preparation and compound prioritisation far surpasses the selection of compounds from available commercial "corona focused libraries".


Subjects
Antiviral Agents/chemistry , Coronavirus 3C Proteases , Cysteine Proteinase Inhibitors/chemistry , SARS-CoV-2/enzymology , Small Molecule Libraries , Coronavirus 3C Proteases/antagonists & inhibitors , Coronavirus 3C Proteases/chemistry , Humans
6.
Nucleic Acids Res ; 45(W1): W253-W259, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28498966

ABSTRACT

Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic missense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding site regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org.


Subjects
Genetic Variation , Proteins/chemistry , Proteins/genetics , Software , Binding Sites , Brain Neoplasms/genetics , Databases, Protein , Enzyme Inhibitors/chemistry , Genes, p53 , Genome-Wide Association Study , Glioblastoma/genetics , Humans , Indoleamine-Pyrrole 2,3,-Dioxygenase/chemistry , Indoleamine-Pyrrole 2,3,-Dioxygenase/genetics , Internet , Ligands , Mutation, Missense , Polymorphism, Single Nucleotide , Proteins/metabolism
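
The abstract above describes binding site surfaces as three-dimensional grids encompassing the space occupied by predicted ligands. As a rough illustration only, the sketch below collects regular lattice points lying within a cutoff distance of any ligand atom. The grid spacing, the cutoff, and the function name are assumptions made for this example and are not GenProBiS parameters.

```python
import math
from itertools import product


def binding_site_grid(atoms, spacing=1.0, cutoff=1.5):
    """Return lattice points (multiples of spacing) within cutoff of any atom."""
    # Bounding box of the atoms, padded by the cutoff, in lattice index units.
    lo = [math.floor((min(a[k] for a in atoms) - cutoff) / spacing) for k in range(3)]
    hi = [math.ceil((max(a[k] for a in atoms) + cutoff) / spacing) for k in range(3)]
    points = []
    for idx in product(*(range(lo[k], hi[k] + 1) for k in range(3))):
        point = tuple(i * spacing for i in idx)
        if any(math.dist(point, atom) <= cutoff for atom in atoms):
            points.append(point)
    return points


# Hypothetical single-atom "ligand" at the origin (assumption).
grid = binding_site_grid([(0.0, 0.0, 0.0)], spacing=1.0, cutoff=1.0)
```

A variant residue could then be flagged as lying in the site when one of its atoms falls close to any grid point, again under these assumed parameters.
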
7.
Front Psychiatry ; 14: 1205119, 2023.
Article in English | MEDLINE | ID: mdl-37817830

ABSTRACT

Introduction: Patients with schizophrenia typically exhibit deficits in working memory (WM) associated with abnormalities in brain activity. Alterations in the encoding, maintenance and retrieval phases of sequential WM tasks are well established. However, due to the heterogeneity of symptoms and complexity of its neurophysiological underpinnings, differential diagnosis remains a challenge. We conducted an electroencephalographic (EEG) study during a visual WM task in fifteen schizophrenia patients and fifteen healthy controls. We hypothesized that EEG abnormalities during the task could be identified, and patients successfully classified by an interpretable machine learning algorithm. Methods: We tested a custom dense attention network (DAN) machine learning model to discriminate patients from control subjects and compared its performance with simpler and more commonly used machine learning models. Additionally, we analyzed behavioral performance, event-related EEG potentials, and time-frequency representations of the evoked responses to further characterize abnormalities in patients during WM. Results: The DAN model was significantly accurate in discriminating patients from healthy controls, ACC = 0.69, SD = 0.05. There were no significant differences between groups, conditions, or their interaction in behavioral performance or event-related potentials. However, patients showed significantly lower alpha suppression in the task preparation, memory encoding, maintenance, and retrieval phases, F(1,28) = 5.93, p = 0.022, η² = 0.149. Further analysis revealed that the two highest peaks in the attention value vector of the DAN model overlapped in time with the preparation and memory retrieval phases, as well as with two of the four significant time-frequency ROIs. Discussion: These results highlight the potential utility of interpretable machine learning algorithms as an aid in diagnosis of schizophrenia and other psychiatric disorders presenting oscillatory abnormalities.

8.
Mach Learn ; 110(5): 989-1028, 2021.
Article in English | MEDLINE | ID: mdl-34720391

ABSTRACT

Learning from texts has been widely adopted throughout industry and science. While state-of-the-art neural language models have shown very promising results for text classification, they are expensive to (pre-)train, require large amounts of data and tuning of hundreds of millions or more parameters. This paper explores how automatically evolved text representations can serve as a basis for explainable, low-resource branch of models with competitive performance that are subject to automated hyperparameter tuning. We present autoBOT (automatic Bags-Of-Tokens), an autoML approach suitable for low resource learning scenarios, where both the hardware and the amount of data required for training are limited. The proposed approach consists of an evolutionary algorithm that jointly optimizes various sparse representations of a given text (including word, subword, POS tag, keyword-based, knowledge graph-based and relational features) and two types of document embeddings (non-sparse representations). The key idea of autoBOT is that, instead of evolving at the learner level, evolution is conducted at the representation level. The proposed method offers competitive classification performance on fourteen real-world classification tasks when compared against a competitive autoML approach that evolves ensemble models, as well as state-of-the-art neural language models such as BERT and RoBERTa. Moreover, the approach is explainable, as the importance of the parts of the input space is part of the final solution yielded by the proposed optimization procedure, offering potential for meta-transfer learning.
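
The key idea stated above, evolving at the representation level rather than at the learner level, can be illustrated with a toy example. The sketch below builds two sparse bag-of-tokens feature spaces (words and character trigrams), scales each by a weight, and tunes the weights with a very simple elitist (1+1) evolutionary loop. This is not the autoBOT implementation: the mutation scheme, the function names, and the toy fitness function are all assumptions. In a real setting the fitness would be something like a cross-validated classification score.

```python
import random
from collections import Counter


def word_counts(text):
    """Sparse word-level bag of tokens."""
    return Counter(text.lower().split())


def char_ngrams(text, n=3):
    """Sparse character n-gram bag of tokens."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def weighted_features(text, weights):
    """Merge both feature spaces, scaling each space by its own weight."""
    features = {}
    for token, count in word_counts(text).items():
        features[("word", token)] = weights[0] * count
    for gram, count in char_ngrams(text).items():
        features[("char", gram)] = weights[1] * count
    return features


def evolve(fitness, n_weights=2, generations=50, seed=0):
    """Elitist (1+1) loop: keep a mutated weight vector only if it scores at least as well."""
    rng = random.Random(seed)
    best = [rng.random() for _ in range(n_weights)]
    best_score = fitness(best)
    for _ in range(generations):
        child = [min(1.0, max(0.0, w + rng.gauss(0, 0.1))) for w in best]
        score = fitness(child)
        if score >= best_score:
            best, best_score = child, score
    return best, best_score


def toy_fitness(w):
    """Hypothetical stand-in for a validation score, peaking at weights (0.8, 0.2)."""
    return -((w[0] - 0.8) ** 2 + (w[1] - 0.2) ** 2)


weights, score = evolve(toy_fitness)
```

Because the final weights are part of the solution, they can be read directly as the relative importance of each representation type, which is the sense in which such an approach is explainable.
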

9.
Front Res Metr Anal ; 6: 644614, 2021.
Article in English | MEDLINE | ID: mdl-33928210

ABSTRACT

PubMed is the largest resource of curated biomedical knowledge to date, entailing more than 25 million documents. Large quantities of novel literature prevent a single expert from keeping track of all potentially relevant papers, resulting in knowledge gaps. In this article, we present CHEMMESHNET, a newly developed PubMed-based network comprising more than 10,000,000 associations, constructed from expert-curated MeSH annotations of chemicals based on all currently available PubMed articles. By learning latent representations of concepts in the obtained network, we demonstrate in a proof of concept study that purely literature-based representations are sufficient for the reconstruction of a large part of the currently known network of physical, empirically determined protein-protein interactions. We demonstrate that simple linear embeddings of node pairs, when coupled with a neural network-based classifier, reliably reconstruct the existing collection of empirically confirmed protein-protein interactions. Furthermore, we demonstrate how pairs of learned representations can be used to prioritize potentially interesting novel interactions based on the common chemical context. Highly ranked interactions are qualitatively inspected in terms of potential complex formation at the structural level and represent potentially interesting new knowledge. We demonstrate that two protein-protein interactions, prioritized by structure-based approaches, also emerge as probable with regard to the trained machine-learning model.
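
The abstract mentions simple linear embeddings of node pairs that are passed to a classifier. It does not name the operators used, so the sketch below only shows three commonly used ways of turning two node vectors into one pair vector. The operator choice, the function name, and the toy vectors are assumptions.

```python
def pair_features(u, v, how="hadamard"):
    """Combine two node embeddings of equal length into one pair representation."""
    if len(u) != len(v):
        raise ValueError("node embeddings must have the same dimension")
    if how == "concat":
        return list(u) + list(v)
    if how == "average":
        return [(a + b) / 2 for a, b in zip(u, v)]
    if how == "hadamard":
        return [a * b for a, b in zip(u, v)]
    raise ValueError(f"unknown operator: {how}")


# Hypothetical two-dimensional embeddings of two proteins (assumption).
protein_a = [1.0, 2.0]
protein_b = [3.0, 4.0]
pair_vector = pair_features(protein_a, protein_b, how="average")
```

The resulting pair vectors would serve as classifier inputs, with known interacting pairs as positive examples.
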

10.
Comput Biol Med ; 130: 104197, 2021 03.
Article in English | MEDLINE | ID: mdl-33429140

ABSTRACT

Machine learning methods are commonly used for predicting molecular properties to accelerate material and drug design. An important part of this process is deciding how to represent the molecules. Typically, machine learning methods expect examples represented by vectors of values, and many methods for calculating molecular feature representations have been proposed. In this paper, we perform a comprehensive comparison of different molecular features, including traditional methods such as fingerprints and molecular descriptors, and recently proposed learnable representations based on neural networks. Feature representations are evaluated on 11 benchmark datasets, used for predicting properties and measures such as mutagenicity, melting points, activity, solubility, and IC50. Our experiments show that several molecular features work similarly well over all benchmark datasets. The ones that stand out most are Spectrophores, which give significantly worse performance than other features on most datasets. Molecular descriptors from the PaDEL library seem very well suited for predicting physical properties of molecules. Despite their simplicity, MACCS fingerprints performed very well overall. The results show that learnable representations achieve competitive performance compared to expert based representations. However, task-specific representations (graph convolutions and Weave methods) rarely offer any benefits, even though they are computationally more demanding. Lastly, combining different molecular feature representations typically does not give a noticeable improvement in performance compared to individual feature representations.


Subjects
Machine Learning , Neural Networks, Computer , Drug Design
11.
PeerJ Comput Sci ; 7: e559, 2021.
Article in English | MEDLINE | ID: mdl-34239970

ABSTRACT

Platforms that feature user-generated content (social media, online forums, newspaper comment sections etc.) have to detect and filter offensive speech within large, fast-changing datasets. While many automatic methods have been proposed and achieve good accuracies, most of these focus on the English language, and are hard to apply directly to languages in which few labeled datasets exist. Recent work has therefore investigated the use of cross-lingual transfer learning to solve this problem, training a model in a well-resourced language and transferring to a less-resourced target language; but performance has so far been significantly less impressive. In this paper, we investigate the reasons for this performance drop, via a systematic comparison of pre-trained models and intermediate training regimes on five different languages. We show that using a better pre-trained language model results in a large gain in overall performance and in zero-shot transfer, and that intermediate training on other languages is effective when little target-language data is available. We then use multiple analyses of classifier confidence and language model vocabulary to shed light on exactly where these gains come from and gain insight into the sources of the most typical mistakes.

12.
Microb Biotechnol ; 14(4): 1269-1281, 2021 07.
Article in English | MEDLINE | ID: mdl-34106516

ABSTRACT

Listeria monocytogenes is a highly pathogenic foodborne bacterium that is ubiquitous in the natural environment and capable of forming persistent biofilms in food processing environments. This species has a rich repertoire of surface structures that enable it to survive, adapt and persist in various environments and promote biofilm formation. We review current understanding and advances on how L. monocytogenes organizes its surface for biofilm formation on surfaces associated with food processing settings, because they may be an important target for development of novel antibiofilm compounds. A synthesis of the current knowledge on the role of Listeria surfactome, comprising peptidoglycan, teichoic acids and cell wall proteins, during biofilm formation on abiotic surfaces is provided. We consider indications gained from genome-wide studies and discuss surfactome structures with established mechanistic aspects in biofilm formation. Additionally, we look at the analogies to the species L. innocua, which is closely related to L. monocytogenes and often used as its model (surrogate) organism.


Subjects
Listeria monocytogenes , Biofilms , Cell Wall , Food Handling , Listeria monocytogenes/genetics
13.
Plants (Basel) ; 10(4), 2021 Mar 29.
Article in English | MEDLINE | ID: mdl-33805409

ABSTRACT

Understanding temporal biological phenomena is a challenging task that can be approached using network analysis. Here, we explored whether network reconstruction can be used to better understand the temporal dynamics of bois noir, which is associated with 'Candidatus Phytoplasma solani', and is one of the most widespread phytoplasma diseases of grapevine in Europe. We proposed a methodology that explores the temporal network dynamics at the community level, i.e., densely connected subnetworks. The methodology offers both insights into the functional dynamics via enrichment analysis at the community level, and analyses of the community dissipation, as a measure that accounts for community degradation. We validated this methodology with cases on experimental temporal expression data of uninfected grapevines and grapevines infected with 'Ca. P. solani'. These data confirm some known gene communities involved in this infection. They also reveal several new gene communities and their potential regulatory networks that have not been linked to 'Ca. P. solani' to date. To confirm the capabilities of the proposed method, selected predictions were empirically evaluated.

14.
Mach Learn ; 109(11): 2161-2193, 2020.
Article in English | MEDLINE | ID: mdl-33191975

ABSTRACT

Mining complex data in the form of networks is of increasing interest in many scientific disciplines. Network communities correspond to densely connected subnetworks, and often represent key functional parts of real-world systems. This paper proposes the embedding-based Silhouette community detection (SCD), an approach for detecting communities, based on clustering of network node embeddings, i.e. real valued representations of nodes derived from their neighborhoods. We investigate the performance of the proposed SCD approach on 234 synthetic networks, as well as on a real-life social network. Even though SCD is not based on any form of modularity optimization, it performs comparably or better than state-of-the-art community detection algorithms, such as the InfoMap and Louvain. Further, we demonstrate that SCD's outputs can be used along with domain ontologies in semantic subgroup discovery, yielding human-understandable explanations of communities detected in a real-life protein interaction network. Being embedding-based, SCD is widely applicable and can be tested out-of-the-box as part of many existing network learning and exploration pipelines.
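
Since the approach above clusters node embeddings and is named after the Silhouette score, a small sketch of that selection criterion may help. The code computes the mean silhouette coefficient for a given clustering of embedding vectors and picks the best of several candidate clusterings. It covers only this scoring step: how the embeddings and the candidate clusterings are produced is assumed to happen elsewhere, and the function names and toy data are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def best_partition(points, candidates):
    """Pick the candidate labeling with the highest silhouette score."""
    return max(candidates, key=lambda labels: silhouette_score(points, labels))


# Hypothetical 2-D node embeddings forming two tight groups (assumption).
embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]
bad = [0, 1, 0, 1]
chosen = best_partition(embeddings, [bad, good])
```

Nothing in this criterion refers to network modularity, which matches the abstract's remark that the approach is not based on modularity optimization.
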

15.
Mach Learn ; 109(7): 1465-1507, 2020.
Article in English | MEDLINE | ID: mdl-32704202

ABSTRACT

Data preprocessing is an important component of machine learning pipelines, which requires ample time and resources. An integral part of preprocessing is data transformation into the format required by a given learning algorithm. This paper outlines some of the modern data processing techniques used in relational learning that enable data fusion from different input data types and formats into a single table data representation, focusing on the propositionalization and embedding data transformation approaches. While both approaches aim at transforming data into tabular data format, they use different terminology and task definitions, are perceived to address different goals, and are used in different contexts. This paper contributes a unifying framework that allows for improved understanding of these two data transformation techniques by presenting their unified definitions, and by explaining the similarities and differences between the two approaches as variants of a unified complex data transformation task. In addition to the unifying framework, the novelty of this paper is a unifying methodology combining propositionalization and embeddings, which benefits from the advantages of both in solving complex data transformation and learning tasks. We present two efficient implementations of the unifying methodology: an instance-based PropDRM approach, and a feature-based PropStar approach to data transformation and learning, together with their empirical evaluation on several relational problems. The results show that the new algorithms can outperform existing relational learners and can solve much larger problems.
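
Propositionalization, as described above, turns relational data into a single table. The sketch below shows one very simple form of it: a one-to-many relation is flattened into one boolean row per entity, with one column per attribute-value pair observed in the related table. It is not the PropDRM or PropStar algorithm, and the function name and the toy relation are assumptions.

```python
def propositionalize(entities, related, key, attributes):
    """Flatten a one-to-many relation into one boolean feature row per entity."""
    # One column per attribute=value pair that occurs anywhere in the relation.
    feature_names = sorted(
        {f"{attr}={row[attr]}" for row in related for attr in attributes}
    )
    table = {}
    for entity in entities:
        present = {
            f"{attr}={row[attr]}"
            for row in related
            if row[key] == entity
            for attr in attributes
        }
        table[entity] = [name in present for name in feature_names]
    return feature_names, table


# Hypothetical relation: molecules and the atoms they contain (assumption).
molecules = ["m1", "m2"]
atoms = [
    {"mol": "m1", "atom": "C"},
    {"mol": "m1", "atom": "O"},
    {"mol": "m2", "atom": "C"},
]
columns, rows = propositionalize(molecules, atoms, key="mol", attributes=["atom"])
```

The resulting table can be fed to any standard learner, or used as input for learning dense embeddings, which is where the two families of approaches discussed in the abstract meet.
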

16.
Mol Inform ; 37(6-7): e1700144, 2018 07.
Article in English | MEDLINE | ID: mdl-29418080

ABSTRACT

Many biological phenomena can be represented as complex networks. Using a protein binding site comparison approach, we generated a network of ion binding sites on the scale of all known protein structures from the Protein Data Bank. We found that this ion binding site similarity network is scale-free, indicating a network in which a few ion binding site scaffolds are the network hubs, and these are connected to hundreds of nodes, whereas the vast majority of nodes have only a few neighbors. Enrichment and statistical analysis of the network components and communities yielded insights into underlying processes from the functional and the structural perspective. Largest components and communities were observed to be closely related to basic metabolic processes and some of the most common structural folds, which, from the evolutionary point of view, indicates that they may be the oldest ones. Further, we derived the first comprehensive map of ion interchangeability, based on binding site similarity. Several highly interchangeable protein-ion binding site pairs emerged (e.g., Ca2+ and Mg2+ ), as well as structurally distinct ones. The constructed network of ion binding site similarities will aid in understanding the general principles of protein-ion binding sites structure, function and evolution. We demonstrate potential uses of the network on proteins involved in cancer development and immune response, where individual ions play prominent roles in disease development.


Subjects
Ions/pharmacology , Molecular Docking Simulation/methods , Proteome/chemistry , Sequence Analysis, Protein/methods , Animals , Binding Sites , Calgranulin B/chemistry , Calgranulin B/genetics , Calgranulin B/metabolism , Evolution, Molecular , Humans , Protein Binding , Proteome/genetics , Proteome/metabolism
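
The scale-free property described in the abstract above (a few hubs with very many neighbors, most nodes with only a few) is a statement about the degree distribution of the network. A minimal sketch of how such a distribution can be computed from an edge list follows. The function name and the toy star-shaped network are assumptions, and a network this small cannot demonstrate a power law.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree value to the number of nodes that have that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


# Hypothetical toy network: one hub site linked to four other sites (assumption).
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
distribution = degree_distribution(toy_edges)
```

On a real similarity network one would plot these counts on log-log axes and check for an approximately straight, heavy-tailed trend.
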
17.
Mol Inform ; 36(9), 2017 09.
Article in English | MEDLINE | ID: mdl-28452122

ABSTRACT

Protein interactions (PI) underlie complex biological processes. Protein interaction partners include DNA, RNA, ions, small chemical compounds, and proteins (protein-protein interactions; PPI). Analysis of sequence variants within regions corresponding to experimentally validated PI sites presents novel opportunities for understanding of complex diseases. Such information has not been systematically collected due to the fact that datasets are dispersed throughout databases and publications. Sequence variants and PI regions were obtained from the UniProt database. The location of the variants was compared to start and end positions of each PPI. Associations of sequence variants with phenotype were obtained from databases including COSMIC, GAD, PharmGKB, and dbSNP. We developed a catalogue of 603 sequence variants located within regions corresponding to experimentally validated PI sites, mostly PPI regions. These sequence variants were previously associated with risk for cancer, reproduction, ageing, renal, and immune system diseases. The developed catalogue connects information from different research papers and databases, represents a new layer of information and enables designing new hypotheses. It provides a baseline for prioritization of sequence variants, which may affect protein function and binding sites. The study contributes to the development of the proteogenomics field and provides new insights for understanding molecular mechanisms underlying disease development.


Subjects
Molecular Docking Simulation/methods , Polymorphism, Genetic , Protein Interaction Mapping/methods , Proteome/chemistry , Sequence Analysis, Protein/methods , Binding Sites , Genetic Predisposition to Disease , Humans , Protein Binding , Proteome/genetics , Proteome/metabolism
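
The comparison of variant locations with the start and end positions of each interaction region, as described in the abstract above, amounts to an interval containment check. A minimal sketch follows, assuming inclusive residue coordinates on the same protein. The record layout, the identifiers, and the function name are hypothetical and do not reflect the UniProt data format.

```python
def variants_in_regions(variants, regions):
    """Return (variant_id, region_id) pairs where the variant lies inside the region.

    variants: iterable of (variant_id, protein, position)
    regions:  iterable of (region_id, protein, start, end), bounds inclusive
    """
    hits = []
    for var_id, var_protein, position in variants:
        for region_id, region_protein, start, end in regions:
            if var_protein == region_protein and start <= position <= end:
                hits.append((var_id, region_id))
    return hits


# Hypothetical toy records (assumption, not UniProt data).
toy_variants = [("v1", "P1", 15), ("v2", "P1", 40), ("v3", "P2", 15)]
toy_regions = [("r1", "P1", 10, 20), ("r2", "P2", 30, 50)]
hits = variants_in_regions(toy_variants, toy_regions)
```

For large catalogues an interval tree or a sorted sweep would be preferable to this nested loop, but the containment test itself stays the same.
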
18.
J Cheminform ; 9(1): 62, 2017 Dec 12.
Article in English | MEDLINE | ID: mdl-29234984

ABSTRACT

We describe a novel, freely available web server, Base of Bioisosterically Exchangeable Replacements (BoBER), which implements an interface to a database of bioisosteric and scaffold hopping replacements. Bioisosterism and scaffold hopping are key concepts in drug design and optimization, and can be defined as replacements of fragments of a biologically active compound with other fragments to improve activity, reduce toxicity, change bioavailability or diversify the scaffold space. Our web server enables fast and user-friendly searches for bioisosteric and scaffold replacements, which were obtained by mining the whole Protein Data Bank. The operation of the web server is demonstrated using an existing MurF inhibitor as an example. The BoBER web server enables medicinal chemists to quickly search for new and unique ideas about possible bioisosteric or scaffold hopping replacements that could be used to improve hit or lead drug-like compounds.

19.
Comput Biol Med ; 79: 30-35, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27744178

ABSTRACT

BACKGROUND: Protein-protein interactions (PPI) play an important role in the function of all organisms and enable understanding of underlying metabolic processes. Computational predictions of PPIs are an important aspect of proteomics, as experimental methods may produce a high degree of false positive results and are more expensive. Although there are many databases collecting predicted PPIs, the genetic information underlying PPIs has not been investigated thoroughly. The aim of the present study was to identify genomic locations corresponding to regions involved in predicted PPIs and to collect non-synonymous polymorphisms (nsSNPs) located within those regions, which we termed PPI-SNPs. METHODS: Predicted PPIs were obtained from the PiSITE database (http://pisite.hgc.jp). Non-synonymous SNPs mapped on protein structural data (PDBs) were obtained from the UCSC server. Polymorphism locations on protein structures were mapped to predicted PPI regions. The DAVID tool was used for pathway enrichment and gene cluster analysis (https://david.ncifcrf.gov/). RESULTS: We collected 544 polymorphisms located within predicted PPI sites that map to 197 genes. We identified 9 SNPs, previously associated with diseases, but not yet associated with PPI sites. We also found examples in which polymorphisms located within predicted PPI regions also occur within previously experimentally validated PPIs and within experimentally determined functional domains. CONCLUSIONS: Our study provides the first catalog of nsSNPs located within predicted PPIs. These prioritized SNPs present the basis for planning experimental validation of SNPs that cause gain or loss of PPIs. Our implementation is expandable, as datasets used are constantly updated.


Subjects
Genomics/methods , Polymorphism, Single Nucleotide/genetics , Protein Interaction Mapping/methods , Proteins/chemistry , Proteins/genetics , Amino Acid Sequence , Databases, Protein , Humans , Models, Molecular