Results 1 - 20 of 28
1.
BMC Bioinformatics ; 25(1): 129, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38532339

ABSTRACT

BACKGROUND: The RNA-Recognition motif (RRM) is a protein domain that binds single-stranded RNA (ssRNA) and is present in as much as 2% of the human genome. Despite this important role in biology, RRM-ssRNA interactions are very challenging to study on the structural level because of the remarkable flexibility of ssRNA. In the absence of atomic-level experimental data, the only method able to predict the 3D structure of protein-ssRNA complexes with any degree of accuracy is ssRNA'TTRACT, an ssRNA fragment-based docking approach using ATTRACT. However, since ATTRACT parameters are not ssRNA-specific and were determined in 2010, there is substantial opportunity for enhancement. RESULTS: Here we present HIPPO, a composite RRM-ssRNA scoring potential derived analytically from contact frequencies in near-native versus non-native docking models. HIPPO consists of a consensus of four distinct potentials, each extracted from a distinct reference pool of protein-trinucleotide docking decoys. To score a docking pose with one potential, for each pair of RNA-protein coarse-grained bead types, each contact is awarded or penalised according to the relative frequencies of this contact distance range among the correct and incorrect poses of the reference pool. Validated on a fragment-based docking benchmark of 57 experimentally solved RRM-ssRNA complexes, HIPPO achieved a threefold or higher enrichment for half of the fragments, versus only a quarter with the ATTRACT scoring function. In particular, HIPPO drastically improved the chance of very high enrichment (12-fold or higher), a scenario where the incremental modelling of entire ssRNA chains from fragments becomes viable. However, for the latter result, more research is needed to make it directly practically applicable. Regardless, our approach already improves upon the state of the art in RRM-ssRNA modelling and is in principle extendable to other types of protein-nucleic acid interactions.
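
The scoring idea in the abstract above, rewarding or penalising a contact according to how often it appears in correct versus incorrect poses, can be illustrated with a minimal sketch. The log-odds form, the pseudocount, and the names `build_potential` and `score_pose` are assumptions made for illustration; the abstract does not give HIPPO's actual formula, and the consensus over four potentials is not shown.

```python
import math
from collections import Counter


def build_potential(near_native, non_native, pseudocount=1.0):
    """Derive a per-contact weight from two reference pools of poses.

    Each pose is a list of contact keys, here assumed to be tuples of
    (protein_bead_type, rna_bead_type, distance_bin).
    The weight is a log-odds ratio of smoothed relative frequencies
    (an illustrative choice, not the published HIPPO formula).
    """
    pos = Counter(c for pose in near_native for c in pose)
    neg = Counter(c for pose in non_native for c in pose)
    keys = set(pos) | set(neg)
    n_pos = sum(pos.values()) + pseudocount * len(keys)
    n_neg = sum(neg.values()) + pseudocount * len(keys)
    potential = {}
    for key in keys:
        f_pos = (pos[key] + pseudocount) / n_pos
        f_neg = (neg[key] + pseudocount) / n_neg
        potential[key] = math.log(f_pos / f_neg)
    return potential


def score_pose(pose, potential):
    """Sum the weights of all contacts in a pose; unseen contacts score 0."""
    return sum(potential.get(contact, 0.0) for contact in pose)


# Hypothetical toy pools: bead types "A"/"B" and distance bins 0/1 are made up.
near_native = [[("A", "U", 0), ("A", "U", 0)], [("A", "U", 0)]]
non_native = [[("B", "U", 1)], [("B", "U", 1), ("A", "U", 0)]]
potential = build_potential(near_native, non_native)
```

Under this sketch, contacts enriched among near-native poses receive positive weights and contacts enriched among non-native poses receive negative ones, so a higher total score indicates a more native-like pose.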


Subjects
Proteins , RNA , Humans , Protein Binding , Proteins/chemistry , RNA/chemistry , Molecular Docking Simulation , Protein Conformation
2.
Open Res Eur ; 3: 97, 2023.
Article in English | MEDLINE | ID: mdl-37645489

ABSTRACT

Background: Data management is fast becoming an essential part of scientific practice, driven by open science and FAIR (findable, accessible, interoperable, and reusable) data sharing requirements. Whilst data management plans (DMPs) are clear to data management experts and data stewards, understandings of their purpose and creation are often obscure to the producers of the data, which in academic environments are often PhD students. Methods: Within the RNAct EU Horizon 2020 ITN project, we engaged the 10 RNAct early-stage researchers (ESRs) in a training project aimed at formulating a DMP. To do so, we used the Data Stewardship Wizard (DSW) framework and modified the existing Life Sciences Knowledge Model into a simplified version aimed at training young scientists, with computational or experimental backgrounds, in core data management principles. We collected feedback from the ESRs during this exercise. Results: Here, we introduce our new life-sciences training DMP template for young scientists. We report and discuss our experiences as principal investigators (PIs) and ESRs during this project and address the typical difficulties that are encountered in developing and understanding a DMP. Conclusions: We found that the DS-Wizard can also be an appropriate tool for DMP training, to get terminology and concepts across to researchers. In addition, a full training requires an upstream step to present basic DMP concepts and a downstream step to publish a dataset in a (public) repository. Overall, the DS-Wizard tool was essential for our DMP training and we hope our efforts can be used in other projects.

3.
Sci Rep ; 13(1): 3643, 2023 03 04.
Article in English | MEDLINE | ID: mdl-36871056

ABSTRACT

The search for an effective drug is still urgent for COVID-19 as no drug with proven clinical efficacy is available. Finding the new purpose of an approved or investigational drug, known as drug repurposing, has become increasingly popular in recent years. We propose here a new drug repurposing approach for COVID-19, based on knowledge graph (KG) embeddings. Our approach learns "ensemble embeddings" of entities and relations in a COVID-19 centric KG, in order to get a better latent representation of the graph elements. Ensemble KG-embeddings are subsequently used in a deep neural network trained for discovering potential drugs for COVID-19. Compared to related works, we retrieve more in-trial drugs among our top-ranked predictions, thus giving greater confidence in our prediction for out-of-trial drugs. For the first time to our knowledge, molecular docking is then used to evaluate the predictions obtained from drug repurposing using KG embedding. We show that Fosinopril is a potential ligand for the SARS-CoV-2 nsp13 target. We also provide explanations of our predictions thanks to rules extracted from the KG and instantiated by KG-derived explanatory paths. Molecular evaluation and explanatory paths bring reliability to our results and constitute new complementary and reusable methods for assessing KG-based drug repurposing.


Subjects
COVID-19 , Humans , SARS-CoV-2 , Drug Repositioning , Molecular Docking Simulation , Pattern Recognition, Automated , Reproducibility of Results , Learning
4.
J Biomed Inform ; 135: 104212, 2022 11.
Article in English | MEDLINE | ID: mdl-36182054

ABSTRACT

Machine learning is now an essential part of any biomedical study but its integration into real effective Learning Health Systems, including the whole process of Knowledge Discovery from Data (KDD), is not yet realised. We propose an original extension of the KDD process model that involves an inductive database. We designed for the first time a generic model of Inductive Clinical DataBase (ICDB) aimed at hosting both patient data and learned models. We report experiments conducted on patient data in the frame of a project dedicated to fighting heart failure. The results show how the ICDB approach makes it possible to identify biomarker combinations, specific and predictive of heart fibrosis phenotype, that put forward hypotheses relative to underlying mechanisms. Two main scenarios were considered, a local-to-global KDD scenario and a trans-cohort alignment scenario. This promising proof of concept enables us to draw the contours of a next-generation Knowledge Discovery Environment (KDE).


Subjects
Data Mining , Knowledge Discovery , Databases, Factual
5.
JACC Cardiovasc Imaging ; 15(2): 193-208, 2022 02.
Article in English | MEDLINE | ID: mdl-34538625

ABSTRACT

OBJECTIVES: This study sought to identify homogenous echocardiographic phenotypes in community-based cohorts and assess their association with outcomes. BACKGROUND: Asymptomatic cardiac dysfunction leads to a high risk of long-term cardiovascular morbidity and mortality; however, better echocardiographic classification of asymptomatic individuals remains a challenge. METHODS: Echocardiographic phenotypes were identified using K-means clustering in the first generation of the STANISLAS (Yearly non-invasive follow-up of Health status of Lorraine insured inhabitants) cohort (N = 827; mean age: 60 ± 5 years; men: 48%), and their associations with vascular function and circulating biomarkers were also assessed. These phenotypes were externally validated in the Malmö Preventive Project cohort (N = 1,394; mean age: 67 ± 6 years; men: 70%), and their associations with the composite of cardiovascular mortality (CVM) or heart failure hospitalization (HFH) were assessed as well. RESULTS: Three echocardiographic phenotypes were identified as "mostly normal (MN)" (n = 334), "diastolic changes (D)" (n = 323), and "diastolic changes with structural remodeling (D/S)" (n = 170). The D and D/S phenotypes had similar ages, body mass indices, cardiovascular risk factors, vascular impairments, and diastolic function changes. The D phenotype consisted mainly of women and featured increased levels of inflammatory biomarkers, whereas the D/S phenotype consisted predominantly of men and displayed the highest values of left ventricular mass, volume, and remodeling biomarkers. The phenotypes were predicted based on a simple algorithm including e', left ventricular mass and volume (e'VM algorithm). In the Malmö cohort, subgroups derived from the e'VM algorithm were significantly associated with a higher risk of CVM and HFH (adjusted HR in the D phenotype = 1.87; 95% CI: 1.04 to 3.37; adjusted HR in the D/S phenotype = 3.02; 95% CI: 1.71 to 5.34). CONCLUSIONS: Among asymptomatic, middle-aged individuals, echocardiographic data-driven classification based on the simple e'VM algorithm identified profiles with different long-term HF risk. (4th Visit at 17 Years of Cohort STANISLAS-Stanislas Ancillary Study ESCIF [STANISLASV4]; NCT01391442).
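
The abstract above names K-means clustering but gives no implementation details. As a minimal sketch of the general technique (Lloyd's algorithm), assuming caller-supplied starting centroids and made-up two-dimensional feature values, the procedure alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its cluster. The function name `kmeans` and the toy data are hypothetical; real echocardiographic variables would normally be standardised first.

```python
import math


def kmeans(points, centroids, n_iter=100):
    """Plain Lloyd's algorithm on tuples of floats.

    `centroids` are the starting centres (chosen by the caller here,
    which is a simplification of the usual random initialisation).
    Returns the final centroids and one cluster label per point.
    """
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        # Index of the centroid closest to p in Euclidean distance.
        return min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))

    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # Recompute each centre as the mean of its members; keep empty clusters as-is.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids

    labels = [nearest(p) for p in points]
    return centroids, labels


# Hypothetical toy data: two well-separated groups in a 2D feature space.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the starting centroids, which is why K-means is usually run from several initialisations and the best partition kept.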


Subjects
Echocardiography , Heart Failure , Aged , Female , Heart Failure/diagnostic imaging , Heart Failure/epidemiology , Humans , Incidence , Machine Learning , Male , Middle Aged , Phenotype , Predictive Value of Tests , Prognosis , Stroke Volume , Ventricular Function, Left
6.
Cells ; 10(12)2021 12 07.
Article in English | MEDLINE | ID: mdl-34943948

ABSTRACT

Glioblastoma (GBM) is the most common brain tumor in adults. It is very aggressive, has a very poor prognosis, and affects men twice as much as women, suggesting that female hormones (estrogen) play a protective role. With an in silico approach, we highlighted that the expression of the membrane G-protein-coupled estrogen receptor (GPER) had an impact on GBM female patient survival. In this context, we explored for the first time the role of the GPER agonist G-1 on GBM cell proliferation. Our results suggested that G-1 exposure had a cytostatic effect, leading to reversible G2/M arrest, due to tubulin polymerization blockade during mitosis. However, the observed effect was independent of GPER. Interestingly, G-1 potentiated the efficacy of temozolomide, the current standard chemotherapy treatment, since the combination of both treatments led to prolonged mitotic arrest, even in a temozolomide less-sensitive cell line. In conclusion, our results suggested that G-1, in combination with standard chemotherapy, might be a promising way to limit the progression and aggressiveness of GBM.


Subjects
Cyclopentanes/pharmacology , Glioblastoma/drug therapy , Quinolines/pharmacology , Receptors, Estrogen/genetics , Receptors, G-Protein-Coupled/genetics , Temozolomide/pharmacology , Tubulin/genetics , Animals , Apoptosis/drug effects , Cell Proliferation/drug effects , Gene Expression Regulation, Neoplastic/drug effects , Glioblastoma/genetics , Glioblastoma/pathology , Humans , Mice , Mitosis/drug effects , Receptors, G-Protein-Coupled/agonists , Xenograft Model Antitumor Assays
7.
BMC Med Inform Decis Mak ; 21(1): 171, 2021 05 26.
Article in English | MEDLINE | ID: mdl-34039343

ABSTRACT

BACKGROUND: Adverse drug reactions (ADRs) are statistically characterized within randomized clinical trials and postmarketing pharmacovigilance, but their molecular mechanism remains unknown in most cases. This is true even for hepatic or skin toxicities, which are classically monitored during drug design. Aside from clinical trials, many elements of knowledge about drug ingredients are available in open-access knowledge graphs, such as their properties, interactions, or involvements in pathways. In addition, drug classifications that label drugs as either causative or not for several ADRs have been established. METHODS: We propose in this paper to mine knowledge graphs for identifying biomolecular features that may enable automatically reproducing expert classifications that distinguish drugs causative or not for a given type of ADR. From an Explainable AI perspective, we explore simple classification techniques such as Decision Trees and Classification Rules because they provide human-readable models, which explain the classification itself, but may also provide elements of explanation for molecular mechanisms behind ADRs. In summary, (1) we mine a knowledge graph for features; (2) we train classifiers to distinguish, on the basis of extracted features, drugs associated or not with two commonly monitored ADRs: drug-induced liver injuries (DILI) and severe cutaneous adverse reactions (SCAR); (3) we isolate features that are both efficient in reproducing expert classifications and interpretable by experts (i.e., Gene Ontology terms, drug targets, or pathway names); and (4) we manually evaluate in a mini-study how they may be explanatory. RESULTS: Extracted features reproduce with good fidelity classifications of drugs causative or not for DILI and SCAR (Accuracy = 0.74 and 0.81, respectively). Experts fully agreed that 73% and 38% of the most discriminative features are possibly explanatory for DILI and SCAR, respectively; and partially agreed (2/3) for 90% and 77% of them. CONCLUSION: Knowledge graphs provide sufficiently diverse features to enable simple and explainable models to distinguish between drugs that are causative or not for ADRs. In addition to explaining classifications, most discriminative features appear to be good candidates for investigating ADR mechanisms further.


Subjects
Drug-Related Side Effects and Adverse Reactions , Pattern Recognition, Automated , Adverse Drug Reaction Reporting Systems , Artificial Intelligence , Feasibility Studies , Humans , Pharmacovigilance
8.
Sci Rep ; 11(1): 4202, 2021 02 18.
Article in English | MEDLINE | ID: mdl-33603019

ABSTRACT

The choice of the most appropriate unsupervised machine-learning method for "heterogeneous" or "mixed" data, i.e. with both continuous and categorical variables, can be challenging. Our aim was to examine the performance of various clustering strategies for mixed data using both simulated and real-life data. We conducted a benchmark analysis of "ready-to-use" tools in R comparing 4 model-based (Kamila algorithm, Latent Class Analysis, Latent Class Model [LCM] and Clustering by Mixture Modeling) and 5 distance/dissimilarity-based (Gower distance or Unsupervised Extra Trees dissimilarity followed by hierarchical clustering or Partitioning Around Medoids, K-prototypes) clustering methods. Clustering performances were assessed by Adjusted Rand Index (ARI) on 1000 generated virtual populations consisting of mixed variables using 7 scenarios with varying population sizes, number of clusters, number of continuous and categorical variables, proportions of relevant (non-noisy) variables and degree of variable relevance (low, mild, high). Clustering methods were then applied on the EPHESUS randomized clinical trial data (a heart failure trial evaluating the effect of eplerenone) to illustrate the differences between different clustering techniques. The simulations revealed the dominance of K-prototypes, Kamila and LCM models over all other methods. Overall, methods using dissimilarity matrices in classical algorithms such as Partitioning Around Medoids and Hierarchical Clustering had a lower ARI compared to model-based methods in all scenarios. When applying clustering methods to a real-life clinical dataset, LCM showed promising results with regard to differences in (1) clinical profiles across clusters, (2) prognostic performance (highest C-index) and (3) identification of patient subgroups with substantial treatment benefit. The present findings suggest key differences in clustering performance between the tested algorithms (limited to tools readily available in R). In most of the tested scenarios, model-based methods (in particular the Kamila and LCM packages) and K-prototypes typically performed best in the setting of heterogeneous data.
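
The Adjusted Rand Index used above as the performance measure compares two partitions of the same items while correcting for chance agreement. The sketch below computes it from the standard pair-counting formula in plain Python. It is only an illustration of the metric, not the R tooling used in the study, and the function name `adjusted_rand_index` is an assumption.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items.

    1.0 means identical partitions (up to label renaming); values near 0
    mean agreement no better than chance; negative values are possible.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions trivially agree

    # Pairs of items placed together in both partitions, in each partition alone.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in its own cluster in both labelings
    return (together_both - expected) / (maximum - expected)


# Label names do not matter, only the grouping.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index is invariant to label renaming, it suits benchmarks like the one described, where simulated populations have known cluster membership but each algorithm assigns its own arbitrary cluster numbers.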

9.
Yearb Med Inform ; 29(1): 188-192, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32823315

ABSTRACT

OBJECTIVES: Summarize recent research and select the best papers published in 2019 in the field of Bioinformatics and Translational Informatics (BTI) for the corresponding section of the International Medical Informatics Association Yearbook. METHODS: A literature review was performed for retrieving from PubMed papers indexed with keywords and free terms related to BTI. Independent review allowed the section editors to select a list of 15 candidate best papers which were subsequently peer-reviewed. A final consensus meeting gathering the whole Yearbook editorial committee was organized to finally decide on the selection of the best papers. RESULTS: Among the 931 retrieved papers covering the various subareas of BTI, the review process selected four best papers. The first paper presents a logical modeling of cancer pathways. Using their tools, the authors are able to identify two known behaviours of tumors. The second paper describes a deep-learning approach to predicting resistance to antibiotics in Mycobacterium tuberculosis. The authors of the third paper introduce a Genomic Global Positioning System (GPS) enabling comparison of genomic data with other individuals or genomics databases while preserving privacy. The fourth paper presents a multi-omics and temporal sequence-based approach to provide a better understanding of the sequence of events leading to Alzheimer's Disease. CONCLUSIONS: Thanks to the normalization of open data and open science practices, research in BTI continues to develop and mature. Noteworthy achievements are sophisticated applications of leading edge machine-learning methods dedicated to personalized medicine.


Subjects
Computational Biology , Genomics , Computational Biology/ethics , Humans , Machine Learning , Medical Informatics , Translational Research, Biomedical
10.
Sci Data ; 7(1): 3, 2020 01 02.
Article in English | MEDLINE | ID: mdl-31896797

ABSTRACT

Pharmacogenomics (PGx) studies how individual gene variations impact drug response phenotypes, which makes PGx-related knowledge a key component towards precision medicine. A significant part of the state-of-the-art knowledge in PGx is accumulated in scientific publications, where it is hardly reusable by humans or software. Natural language processing techniques have been developed to guide experts who curate this amount of knowledge. But existing works are limited by the absence of a high-quality annotated corpus focusing on the PGx domain. In particular, this absence restricts the use of supervised machine learning. This article introduces PGxCorpus, a manually annotated corpus, designed to fill this gap and to enable the automatic extraction of PGx relationships from text. It comprises 945 sentences from 911 PubMed abstracts, annotated with PGx entities of interest (mainly gene variations, genes, drugs and phenotypes), and relationships between those. In this article, we present the corpus itself, its construction and a baseline experiment that illustrates how it may be leveraged to synthesize and summarize PGx knowledge.


Subjects
Data Curation , Pharmacogenetics , Supervised Machine Learning , Humans , PubMed
11.
Gastroenterology ; 158(1): 76-94.e2, 2020 01.
Article in English | MEDLINE | ID: mdl-31593701

ABSTRACT

Since 2010, substantial progress has been made in artificial intelligence (AI) and its application to medicine. AI is explored in gastroenterology for endoscopic analysis of lesions, in detection of cancer, and to facilitate the analysis of inflammatory lesions or gastrointestinal bleeding during wireless capsule endoscopy. AI is also tested to assess liver fibrosis and to differentiate patients with pancreatic cancer from those with pancreatitis. AI might also be used to establish prognoses of patients or predict their response to treatments, based on multiple factors. We review the ways in which AI may help physicians make a diagnosis or establish a prognosis and discuss its limitations, knowing that further randomized controlled studies will be required before the approval of AI techniques by the health authorities.


Subjects
Artificial Intelligence , Diagnosis, Computer-Assisted/methods , Gastroenterology/methods , Gastrointestinal Diseases/diagnosis , Liver Diseases/diagnosis , Clinical Decision-Making/methods , Decision Support Systems, Clinical , Decision Trees , Gastrointestinal Diseases/mortality , Gastrointestinal Diseases/therapy , Humans , Liver Diseases/mortality , Liver Diseases/therapy , Prognosis , Treatment Outcome
12.
Int J Lab Hematol ; 41(6): 726-730, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31523903

ABSTRACT

INTRODUCTION: The confirmation time interval for the presence of antiphospholipid antibodies (aPL) has been extended to 12 weeks as epiphenomenal antibodies may disappear after 6 weeks. Our aim was to analyse extended persistence of aPL positivity beyond the 12-week interval. METHODS: We retrospectively analysed our database of 23 856 aPL test samples collected between 2005 and 2017 from 17 367 consecutive patients. Two groups of patients were identified among aPL-positive patients, confirmed at 12 weeks: with or without extended persistence beyond confirmatory testing. Percentages of extended persistence are given according to the initial aPL positivity profiles, and baseline laboratory variables are compared between the two groups. RESULTS: Three hundred and twenty-seven patients confirmed aPL-positive had subsequent testing. The vast majority of them displayed extended persistence in the long term: 89.6% and up to 97.9% for patients with initial triple positivity. In extended persistent positive patients, there were more LA-positive initial samples, and baseline LA test values and IgG aCL titres were higher than in nonpersistent positive patients. CONCLUSION: Data from a large database of an aPL referral laboratory showed that the time interval of 12 weeks defining persistence of aPL positivity was appropriate for the majority of patients. Furthermore, we found baseline features associated with extended persistence.


Subjects
Antibodies, Antiphospholipid/blood , Adult , Antiphospholipid Syndrome/blood , Antiphospholipid Syndrome/immunology , Female , Humans , Lupus Coagulation Inhibitor/blood , Male , Middle Aged , Retrospective Studies , Time Factors
13.
Yearb Med Inform ; 28(1): 190-193, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31419831

ABSTRACT

OBJECTIVES: To summarize recent research and select the best papers published in 2018 in the field of Bioinformatics and Translational Informatics (BTI) for the corresponding section of the International Medical Informatics Association (IMIA) Yearbook. METHODS: A literature review was performed for retrieving from PubMed papers indexed with keywords and free terms related to BTI. Independent review allowed the two section editors to select a list of 14 candidate best papers which were subsequently peer-reviewed. A final consensus meeting gathering the whole IMIA Yearbook editorial committee was organized to finally decide on the selection of the best papers. RESULTS: Among the 636 retrieved papers published in 2018 in the various subareas of BTI, the review process selected four best papers. The first paper presents a computational method to identify molecular markers for targeted treatment of acute myeloid leukemia using multi-omics data (genome-wide gene expression profiles) and in vitro sensitivity to 160 chemotherapy drugs. The second paper describes a deep neural network approach to predict the survival of patients suffering from glioma on the basis of digitalised pathology images and genomics biomarkers. The authors of the third paper adopt a pan-cancer approach to take advantage of multi-omics data for drug repurposing. The fourth paper presents a graph-based semi-supervised method for accurate phenotype classification applied to ovarian cancer. CONCLUSIONS: Thanks to the normalization of open data and open science practices, research in BTI continues to develop and mature. Noteworthy achievements are sophisticated applications of leading edge machine-learning methods dedicated to personalized medicine.


Subjects
Artificial Intelligence , Computational Biology , Translational Research, Biomedical , Computational Biology/ethics , Humans , Machine Learning , Medical Informatics , Neoplasms/genetics , Neoplasms/pathology , Prognosis
14.
Article in English | MEDLINE | ID: mdl-29109696

ABSTRACT

Fetal and neonatal exposure to long-chain alkylphenols has been suspected to promote breast developmental disorders and consequently to increase breast cancer risk. However, disease predisposition from developmental exposures remains unclear. In this work, human MCF-10A mammary epithelial cells were exposed in vitro to a low dose of a realistic (4-nonylphenol + 4-tert-octylphenol) mixture. Transcriptome and cell-phenotype analyses combined with functional and signaling network modeling indicated that long-chain alkylphenols triggered enhanced proliferation, migration ability, and apoptosis resistance and shed light on the underlying molecular mechanisms which involved the human estrogen receptor alpha 36 (ERα36) variant. A male mouse-inherited transgenerational model of exposure to three environmentally relevant doses of the alkylphenol mix was set up in order to determine whether and how it would impact mammary gland architecture. Mammary glands from F3 progeny obtained after intrabuccal chronic exposure of C57BL/6J P0 pregnant mice followed by F1-F3 male inheritance displayed an altered histology which correlated with the phenotypes observed in vitro in human mammary epithelial cells. Since cellular phenotypes are similar in vivo and in vitro and involve the unique ERα36 human variant, such consequences of alkylphenol exposure could be extrapolated from mouse model to human. However, transient alkylphenol treatments combined with ERα36 overexpression in mammary epithelial cells were not sufficient to trigger tumorigenesis in xenografted Nude mice. Therefore, it remains to be determined if low-dose alkylphenol transgenerational exposure and subsequent abnormal mammary gland development could account for an increased breast cancer susceptibility.

15.
J Biomed Semantics ; 8(1): 29, 2017 Aug 22.
Article in English | MEDLINE | ID: mdl-28830518

ABSTRACT

BACKGROUND: Patient data, such as electronic health records or adverse event reporting systems, constitute an essential resource for studying Adverse Drug Events (ADEs). We explore an original approach to identify frequently associated ADEs in subgroups of patients. RESULTS: Because ADEs have complex manifestations, we use formal concept analysis and its pattern structures, a mathematical framework that allows generalization using domain knowledge formalized in medical ontologies. Results obtained with three different settings and two different datasets show that this approach is flexible and allows extraction of association rules at various levels of generalization. CONCLUSIONS: The chosen approach permits an expressive representation of a patient's ADEs. Extracted association rules point to distinct ADEs that occur in the same group of patients, and could serve as a basis for a recommendation system. The proposed representation is flexible and can be extended to make use of additional ontologies and various patient records.
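
Association rules of the kind mentioned above are usually judged by their support and confidence. The sketch below computes both over plain sets of ADE labels. It deliberately leaves out the formal concept analysis, pattern structures, and ontology-based generalization that the abstract describes, and the patient records and function names are made up for illustration.

```python
def support(itemset, records):
    """Fraction of patient records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def confidence(antecedent, consequent, records):
    """Among records containing `antecedent`, the fraction that also contain `consequent`."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    n_antecedent = sum(antecedent <= record for record in records)
    if n_antecedent == 0:
        return 0.0  # the rule never applies, so report no confidence
    return sum(both <= record for record in records) / n_antecedent


# Hypothetical records: each set lists the ADEs observed for one patient.
patients = [
    {"nausea", "rash"},
    {"nausea", "rash", "headache"},
    {"nausea"},
    {"headache"},
]
rule_support = support({"nausea", "rash"}, patients)
rule_confidence = confidence({"nausea"}, {"rash"}, patients)
```

In this toy data the rule "nausea implies rash" holds for two of the three patients with nausea, and both ADEs co-occur in half of all records.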


Subjects
Biological Ontologies , Drug-Related Side Effects and Adverse Reactions , Pattern Recognition, Automated , Electronic Health Records , Humans , Phenotype
16.
Methods Mol Biol ; 1415: 91-105, 2016.
Article in English | MEDLINE | ID: mdl-27115629

ABSTRACT

Comparing and classifying protein domain interactions according to their three-dimensional (3D) structures can help to understand protein structure-function and evolutionary relationships. Additionally, structural knowledge of existing domain-domain interactions can provide a useful way to find structural templates with which to model the 3D structures of unsolved protein complexes. Here we present a straightforward guide to using the "Kbdock" protein domain structure database and its associated web site for exploring and comparing protein domain-domain interactions (DDIs) and domain-peptide interactions (DPIs) at the Pfam domain family level. We also briefly explain how the Kbdock web site works, and we provide some notes and suggestions which should help to avoid some common pitfalls when working with 3D protein domain structures.


Subjects
Databases, Protein , Proteins/chemistry , Proteins/metabolism , Internet , Models, Molecular , Molecular Docking Simulation , Protein Binding , Protein Domains , Protein Interaction Domains and Motifs , Protein Interaction Maps , Protein Structure, Secondary , Protein Structure, Tertiary
17.
Biology (Basel) ; 4(2): 327-43, 2015 Apr 09.
Article in English | MEDLINE | ID: mdl-25860777

ABSTRACT

While the number of solved 3D protein structures continues to grow rapidly, the structural rules that distinguish protein-protein interactions between different structural families are still not clear. Here, we classify and analyse the secondary structural features and promiscuity of a comprehensive non-redundant set of domain family binding sites (DFBSs) and hetero domain-domain interactions (DDIs) extracted from our updated KBDOCK resource. We have partitioned 4001 DFBSs into five classes using their propensities for three types of secondary structural elements ("α" for helices, "β" for strands, and "γ" for irregular structure) and we have analysed how frequently these classes occur in DDIs. Our results show that β elements are not highly represented in DFBSs compared to α and γ elements. At the DDI level, all classes of binding sites tend to preferentially bind to the same class of binding sites and α/β contacts are significantly disfavored. Very few DFBSs are promiscuous: 80% of them interact with just one Pfam domain. About 50% of our Pfam domains bear only one single-partner DFBS and are therefore monogamous in their interactions with other domains. Conversely, promiscuous Pfam domains bear several DFBSs among which one or two are promiscuous, thereby multiplying the promiscuity of the concerned protein.

18.
PLoS One ; 9(1): e85667, 2014.
Article in English | MEDLINE | ID: mdl-24465643

ABSTRACT

Nonribosomal peptides represent a large variety of natural active compounds produced by microorganisms. Due to their specific biosynthesis pathway through large assembly lines called NonRibosomal Peptide Synthetases (NRPSs), they often display complex structures with cycles and branches. Moreover, they often contain non-proteinogenic or modified monomers, such as the D-monomers produced by epimerization. We investigate here some sequence specificities of the condensation (C) and epimerization (E) domains of NRPS that can be used to predict the possible isomeric state (D or L) of each monomer in a putative peptide. We show that C- and E-domains can be divided into 2 sub-regions called Up-Seq and Down-Seq. The Up-Seq region corresponds to an InterPro domain (IPR001242) and is shared by C- and E-domains. The Down-Seq region is specific to the enzymatic activity of the domain. Amino-acid signatures (represented as sequence logos) previously described for complete C- and E-domains have been restricted to the Down-Seq region and amplified thanks to additional sequences. Moreover, a new Down-Seq signature has been found for Ct-domains found in fungi and responsible for terminal cyclization of the peptides. The identification of these signatures has been included in a workflow named Florine, which aims to predict nonribosomal peptides from NRPS sequence analyses. In some cases, the prediction of isomerism is guided by genus-specific rules. Florine was used on a Pseudomonas genome to allow the determination of the type of pyoverdin produced, the update of syringafactin structure and the identification of novel putative products.


Subjects
Bacterial Proteins/chemistry , DNA, Bacterial/chemistry , Peptide Synthases/chemistry , Peptides/chemistry , Pseudomonas/chemistry , Software , Amino Acid Sequence , Bacterial Proteins/genetics , DNA, Bacterial/genetics , Molecular Sequence Annotation , Molecular Sequence Data , Oligopeptides/chemistry , Oligopeptides/genetics , Peptide Biosynthesis, Nucleic Acid-Independent/genetics , Peptide Synthases/genetics , Peptides/genetics , Protein Multimerization , Protein Structure, Tertiary , Pseudomonas/genetics
19.
Nucleic Acids Res ; 42(Database issue): D389-95, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24271397

ABSTRACT

Comparing, classifying and modelling protein structural interactions can enrich our understanding of many biomolecular processes. This contribution describes Kbdock (http://kbdock.loria.fr/), a database system that combines the Pfam domain classification with coordinate data from the PDB to analyse and model 3D domain-domain interactions (DDIs). Kbdock can be queried using Pfam domain identifiers, protein sequences or 3D protein structures. For a given query domain or pair of domains, Kbdock retrieves and displays a non-redundant list of homologous DDIs or domain-peptide interactions in a common coordinate frame. Kbdock may also be used to search for and visualize interactions involving different, but structurally similar, Pfam families. Thus, structural DDI templates may be proposed even when there is little or no sequence similarity to the query domains.


Subjects
Databases, Protein , Protein Interaction Domains and Motifs , Binding Sites , Internet , Models, Molecular , Molecular Docking Simulation , Protein Interaction Mapping , Protein Interaction Maps , Proteins/classification , Sequence Alignment , Sequence Analysis, Protein
20.
Proteins ; 81(12): 2150-8, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24123156

ABSTRACT

Protein docking algorithms aim to calculate the three-dimensional (3D) structure of a protein complex starting from its unbound components. Although ab initio docking algorithms are improving, there is a growing need to use homology modeling techniques to exploit the rapidly increasing volumes of structural information that now exist. However, most current homology modeling approaches involve finding a pair of complete single-chain structures in a homologous protein complex to use as a 3D template, despite the fact that protein complexes are often formed from one or more domain-domain interactions (DDIs). To model 3D protein complexes by domain-domain homology, we have developed a case-based reasoning approach called KBDOCK which systematically identifies and reuses domain family binding sites from our database of nonredundant DDIs. When tested on 54 protein complexes from the Protein Docking Benchmark, our approach provides a near-perfect way to model single-domain protein complexes when full-homology templates are available, and it extends our ability to model more difficult cases when only partial or incomplete templates exist. These promising early results highlight the need for a new and diverse docking benchmark set, specifically designed to assess homology docking approaches.


Subjects
Computational Biology/methods , Protein Interaction Mapping/methods , Proteins/chemistry , Algorithms , Binding Sites , Databases, Protein , Models, Molecular , Molecular Docking Simulation , Programming Languages , Protein Conformation , Software