1.
Front Mol Biosci ; 10: 1204157, 2023.
Article in English | MEDLINE | ID: mdl-37475887

ABSTRACT

Predicting the pathogenicity of missense variants in molecular diagnostics remains a challenge despite the wealth of available data, such as evolutionary information, and the wealth of tools to integrate those data. We describe DeepRank-Mut, a configurable framework designed to extract and learn from physicochemically relevant features of amino acids surrounding missense variants in 3D space. For each variant, various atomic and residue-level features are extracted from its structural environment, including sequence conservation scores of the surrounding amino acids, and stored in multi-channel 3D voxel grids, which are then used to train a 3D convolutional neural network (3D-CNN). The resultant model gives a probabilistic estimate of whether a given input variant is disease-causing or benign. We find that the performance of our 3D-CNN model, on independent test datasets, is comparable to other widely used resources which also combine sequence and structural features. Based on the 10-fold cross-validation experiments, we achieve an average accuracy of 0.77 on the independent test datasets. We discuss the contribution of the variant neighborhood to the model's predictive power, in addition to the impact of individual features on the model's performance. Two key features, the evolutionary information of residues in the variant neighborhood and their solvent accessibilities, were observed to influence the predictions. We also highlight how predictions are impacted by the underlying disease mechanisms of missense mutations and offer insights into understanding these to improve pathogenicity predictions. Our study presents aspects to take into consideration when adopting deep learning approaches for protein structure-guided pathogenicity predictions.

2.
Protein Sci ; 29(1): 330-344, 2020 01.
Article in English | MEDLINE | ID: mdl-31724231

ABSTRACT

We describe a series of databases and tools that directly or indirectly support biomedical research on macromolecules, with a focus on their applicability in protein structure bioinformatics research. DSSP, which determines secondary structures of proteins, has been updated to work well with extremely large structures in multiple formats. The PDBREPORT database, which lists anomalies in protein structures, has been remade to remove many small problems. These reports are now available as PDF-formatted files with a computer-readable summary. The VASE software has been added to analyze and visualize HSSP multiple sequence alignments for protein structures. The Lists collection of databases has been extended with a series of databases, most notably with a database that gives each protein structure a grade for usefulness in protein structure bioinformatics projects. The PDB-REDO collection of reanalyzed and re-refined protein structures that were solved by X-ray crystallography has been improved by dealing better with sugar residues and with hydrogen bonds, and by adding many missing surface loops. All academic software underlying these protein structure bioinformatics applications and databases is now publicly accessible, either directly from the authors or from the GitHub software repository.


Subject(s)
Computational Biology/methods; Data Collection/methods; Proteins/chemistry; Databases, Protein; Models, Molecular; Protein Structure, Secondary; Software
3.
Hum Mutat ; 40(8): 1030-1038, 2019 08.
Article in English | MEDLINE | ID: mdl-31116477

ABSTRACT

The growing availability of human genetic variation data has given rise to novel methods of measuring genetic tolerance that better interpret variants of unknown significance. We recently developed a concept based on protein domain homology in the human genome to improve variant interpretation. For this purpose, we mapped population variation from the Exome Aggregation Consortium (ExAC) and pathogenic mutations from the Human Gene Mutation Database (HGMD) onto Pfam protein domains. The aggregation of these variation data across homologous domains into meta-domains allowed us to generate genetic intolerance profiles at amino acid resolution for human protein domains. Here, we developed MetaDome, a fast and easy-to-use web server that visualizes meta-domain information and gene-wide profiles of genetic tolerance. We updated the underlying data of MetaDome to contain information from 56,319 human transcripts, 71,419 protein domains, 12,164,292 genetic variants from gnomAD, and 34,076 pathogenic mutations from ClinVar. MetaDome allows researchers to easily investigate their variants of interest for the presence or absence of variation at corresponding positions within homologous domains. We illustrate the added value of MetaDome with an example that highlights how it may help in the interpretation of variants of unknown significance. The MetaDome web server is freely accessible at https://stuart.radboudumc.nl/metadome.


Subject(s)
Computational Biology/methods; Genetic Variation; Proteins/chemistry; Proteins/genetics; Databases, Genetic; Genetic Predisposition to Disease; Genome, Human; Humans; Internet; Protein Domains; Software; Structural Homology, Protein
4.
Protein Eng Des Sel ; 30(6): 441-447, 2017 06 01.
Article in English | MEDLINE | ID: mdl-28475759

ABSTRACT

The NewProt protein engineering portal is a one-stop shop for in silico protein engineering. It gives access to a large number of servers that compute a wide variety of protein structure characteristics supporting work on the modification of proteins through the introduction of (multiple) point mutations. The results can be inspected through multiple visualizers. The HOPE software is included to indicate mutations with possible undesired side effects. The Hotspot Wizard software is embedded for the design of mutations that modify a protein's activity, specificity, or stability. The NewProt portal is freely accessible at http://newprot.cmbi.umcn.nl/ and http://newprot.fluidops.net/.


Subject(s)
Databases, Protein; Internet; Protein Engineering/methods; Proteins; Software; Models, Molecular; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; User-Computer Interface
5.
Nucleic Acids Res ; 43(Database issue): D364-8, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25352545

ABSTRACT

We present a series of databanks (http://swift.cmbi.ru.nl/gv/facilities/) that hold information that is computationally derived from Protein Data Bank (PDB) entries and that might augment macromolecular structure studies. These derived databanks run parallel to the PDB, i.e. they have one entry per PDB entry. Several of the well-established databanks such as HSSP, PDBREPORT and PDB_REDO have been updated and/or improved. The software that creates the DSSP databank, for example, has been rewritten to better cope with π-helices. A large number of databanks have been added to aid computational structural biology; some examples are lists of residues that make crystal contacts, lists of contacting residues using a series of contact definitions or lists of residue accessibilities. PDB files are not the optimal presentation of the underlying data for many studies. We therefore made a series of databanks that hold PDB files in an easier to use or more consistent representation. The BDB databank holds X-ray PDB files with consistently represented B-factors. We also added several visualization tools to aid the users of our databanks.


Subject(s)
Databases, Protein; Proteins/chemistry; Computational Biology; Protein Conformation; Software
6.
Nucleic Acids Res ; 39(Database issue): D309-19, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21045054

ABSTRACT

The GPCRDB is a Molecular Class-Specific Information System (MCSIS) that collects, combines, validates and disseminates large amounts of heterogeneous data on G protein-coupled receptors (GPCRs). The GPCRDB contains experimental data on sequences, ligand-binding constants, mutations and oligomers, as well as many different types of computationally derived data such as multiple sequence alignments and homology models. The GPCRDB provides access to the data via a number of different access methods. It offers visualization and analysis tools, and a number of query systems. The data is updated automatically on a monthly basis. The GPCRDB can be found online at http://www.gpcr.org/7tm/.


Subject(s)
Databases, Protein; Receptors, G-Protein-Coupled/chemistry; Ligands; Mutation; Receptors, G-Protein-Coupled/genetics; Receptors, G-Protein-Coupled/metabolism; Sequence Alignment; Sequence Analysis, Protein; Structural Homology, Protein; User-Computer Interface
7.
Eur Biophys J ; 39(4): 551-63, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19718498

ABSTRACT

Homology modelling is normally the technique of choice when experimental structure data are not available but three-dimensional coordinates are needed, for example, to aid with detailed interpretation of results of spectroscopic studies. Herein, the state of the art of homology modelling will be described in the light of a series of recent developments, and an overview will be given of the problems and opportunities encountered in this field. The major topic, the accuracy and precision of homology models, will be discussed extensively due to its influence on the reliability of conclusions drawn from the combination of homology models and spectroscopic data. Three real-world examples will illustrate how both homology modelling and spectroscopy can be beneficial for (bio)medical research.


Subject(s)
Models, Molecular; Sequence Homology; Spectrum Analysis/methods; Amino Acid Sequence; Humans; Molecular Sequence Data; Protein Structure, Tertiary; Proteins/chemistry; Proteins/metabolism; Spin Labels