1.
Genome Biol ; 24(1): 182, 2023 08 07.
Article in English | MEDLINE | ID: mdl-37550700

ABSTRACT

BACKGROUND: Genetic variation in the human genome is a major determinant of individual disease risk, but the vast majority of missense variants have unknown etiological effects. Here, we present a robust learning framework for leveraging saturation mutagenesis experiments to construct accurate computational predictors of proteome-wide missense variant pathogenicity.

RESULTS: We train cross-protein transfer (CPT) models using deep mutational scanning (DMS) data from only five proteins and achieve state-of-the-art performance on clinical variant interpretation for unseen proteins across the human proteome. We also improve predictive accuracy on DMS data from held-out proteins. High sensitivity is crucial for clinical applications and our model CPT-1 particularly excels in this regime. For instance, at 95% sensitivity of detecting human disease variants annotated in ClinVar, CPT-1 improves specificity to 68%, from 27% for ESM-1v and 55% for EVE. Furthermore, for genes not used to train REVEL, a supervised method widely used by clinicians, we show that CPT-1 compares favorably with REVEL. Our framework combines predictive features derived from general protein sequence models, vertebrate sequence alignments, and AlphaFold structures, and it is adaptable to the future inclusion of other sources of information. We find that vertebrate alignments, albeit rather shallow with only 100 genomes, provide a strong signal for variant pathogenicity prediction that is complementary to recent deep learning-based models trained on massive amounts of protein sequence data. We release predictions for all possible missense variants in 90% of human genes.

CONCLUSIONS: Our results demonstrate the utility of mutational scanning data for learning properties of variants that transfer to unseen proteins.


Subject(s)
Machine Learning , Proteome , Humans , Proteome/genetics , Amino Acid Sequence , Mutation , Mutation, Missense , Computational Biology/methods
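
The abstract above reports specificity at a fixed 95% sensitivity. As a minimal sketch of how such a figure can be computed from predictor scores, the following assumes that higher scores mean "more pathogenic" and that labels are 1 for disease variants and 0 for benign ones. The function name `specificity_at_sensitivity` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def specificity_at_sensitivity(scores, labels, target_sensitivity=0.95):
    """Return (threshold, specificity) at the given sensitivity.

    Assumptions (illustrative only): higher score = more pathogenic,
    label 1 = pathogenic, label 0 = benign, and a variant is called
    pathogenic when its score is >= threshold.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one pathogenic and one benign variant")

    # Number of top-scoring pathogenic variants that must be detected.
    # The small epsilon guards against floating-point overshoot in ceil().
    k = max(1, math.ceil(target_sensitivity * len(positives) - 1e-9))
    threshold = positives[k - 1]

    # Specificity = fraction of benign variants scored below the threshold.
    true_negatives = sum(1 for s in negatives if s < threshold)
    return threshold, true_negatives / len(negatives)


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
    print(specificity_at_sensitivity(toy_scores, toy_labels, 1.0))
```

Fixing the sensitivity first and then reading off the specificity makes predictors with different score scales directly comparable, which is how the comparison in the abstract is framed.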
2.
Pac Symp Biocomput ; 27: 22-33, 2022.
Article in English | MEDLINE | ID: mdl-34890133

ABSTRACT

There is significant interest in developing machine learning methods to model protein-ligand interactions but a scarcity of experimentally resolved protein-ligand structures to learn from. Protein self-contacts are a much larger source of structural data that could be leveraged, but currently it is not well understood how this data source differs from the target domain. Here, we characterize the 3D geometric patterns of protein self-contacts as probability distributions. We then present a flexible statistical framework to assess the transferability of these patterns to protein-ligand contacts. We observe that the level of transferability from protein self-contacts to protein-ligand contacts depends on contact type, with many contact types exhibiting high transferability. We then demonstrate the potential of leveraging information from these geometric patterns to aid in ligand pose-selection problems in protein-ligand docking. We publicly release our extracted data on geometric interaction patterns to enable further exploration of this problem.


Subject(s)
Computational Biology , Proteins , Humans , Ligands , Machine Learning , Protein Binding , Proteins/metabolism
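
The abstract above describes contact geometries as probability distributions and asks how well they transfer between domains. The statistic used by the authors is not stated here, so the sketch below only illustrates one common way to compare two empirical distributions: binning a geometric quantity (for example a contact distance) and computing the Jensen-Shannon divergence. The function names, the bin edges, and the choice of divergence are all assumptions.

```python
import math


def histogram(values, bin_edges):
    """Normalized histogram of values over half-open bins [edge_i, edge_i+1)."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bins")
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 = identical, 1 = disjoint support."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Hypothetical contact distances (in angstroms) for one contact type.
    self_contacts = [3.1, 3.3, 3.4, 3.6, 3.9, 4.2]
    ligand_contacts = [3.0, 3.2, 3.5, 3.8, 4.4, 4.6]
    edges = [3.0, 3.5, 4.0, 4.5, 5.0]
    p = histogram(self_contacts, edges)
    q = histogram(ligand_contacts, edges)
    print(js_divergence(p, q))
```

Under this assumed measure, a low divergence for a given contact type would suggest that the self-contact pattern is a reasonable stand-in for the protein-ligand pattern.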
3.
Ultramicroscopy ; 227: 113302, 2021 08.
Article in English | MEDLINE | ID: mdl-34062386

ABSTRACT

A computational method was developed to recover the three-dimensional coordinates of gold nanoparticles specifically attached to a protein complex from tilt-pair images collected by electron microscopy. The program was tested on a simulated dataset and applied to a real dataset comprising tilt-pair images recorded by cryo electron microscopy of RNA polymerase II in a complex with four gold-labeled single-chain antibody fragments. The positions of the gold nanoparticles were determined, and comparison of the coordinates among the tetrameric particles revealed the range of motion within the protein complexes.


Subject(s)
Gold/chemistry , Image Processing, Computer-Assisted/methods , Immunoglobulin Fragments , Metal Nanoparticles/chemistry , RNA Polymerase II , Cryoelectron Microscopy/methods , Immunoglobulin Fragments/chemistry , Immunoglobulin Fragments/metabolism , Models, Molecular , Protein Binding , RNA Polymerase II/chemistry , RNA Polymerase II/metabolism
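
The abstract above concerns recovering 3D marker coordinates from tilt-pair images. The authors' program is not described in detail, so the following is only a geometric sketch under strong simplifying assumptions: orthographic projection, a tilt axis along the image y axis, a known tilt angle, a shared origin on the tilt axis, and markers already matched between the two images. The sign convention and the function names `project` and `recover_xyz` are hypothetical.

```python
import math


def project(point, tilt_deg):
    """Orthographic projection after rotating the specimen about the y axis.

    Assumed convention: x' = x*cos(t) + z*sin(t), y' = y.
    """
    x, y, z = point
    t = math.radians(tilt_deg)
    return (x * math.cos(t) + z * math.sin(t), y)


def recover_xyz(untilted_xy, tilted_xy, tilt_deg):
    """Recover (x, y, z) of one marker from its untilted and tilted positions."""
    t = math.radians(tilt_deg)
    if abs(math.sin(t)) < 1e-9:
        raise ValueError("tilt angle too small to recover depth")
    x, y = untilted_xy
    x_tilted, _ = tilted_xy
    # Invert x' = x*cos(t) + z*sin(t) for the unknown depth z.
    z = (x_tilted - x * math.cos(t)) / math.sin(t)
    return (x, y, z)


if __name__ == "__main__":
    marker = (10.0, -4.0, 7.0)  # hypothetical gold-particle position
    tilt = 30.0
    print(recover_xyz(project(marker, 0.0), project(marker, tilt), tilt))
```

In this simplified picture the depth estimate is divided by sin(t), so small tilt angles amplify localization error in the images.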
4.
Proteins ; 89(5): 493-501, 2021 05.
Article in English | MEDLINE | ID: mdl-33289162

ABSTRACT

Predicting the structure of multi-protein complexes is a grand challenge in biochemistry, with major implications for basic science and drug discovery. Computational structure prediction methods generally leverage predefined structural features to distinguish accurate structural models from less accurate ones. This raises the question of whether it is possible to learn characteristics of accurate models directly from atomic coordinates of protein complexes, with no prior assumptions. Here we introduce a machine learning method that learns directly from the 3D positions of all atoms to identify accurate models of protein complexes, without using any precomputed physics-inspired or statistical terms. Our neural network architecture combines multiple ingredients that together enable end-to-end learning from molecular structures containing tens of thousands of atoms: a point-based representation of atoms, equivariance with respect to rotation and translation, local convolutions, and hierarchical subsampling operations. When used in combination with previously developed scoring functions, our network substantially improves the identification of accurate structural models among a large set of possible models. Our network can also be used to predict the accuracy of a given structural model in absolute terms. The architecture we present is readily applicable to other tasks involving learning on 3D structures of large atomic systems.


Subject(s)
Machine Learning , Neural Networks, Computer , Proteins/chemistry , Ligands , Models, Molecular , Protein Conformation , Proteins/ultrastructure , Rotation
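
The abstract above lists equivariance with respect to rotation and translation as one ingredient of the architecture. The snippet below does not reproduce the network; it only illustrates what the property means, using the centroid of a point cloud as a deliberately trivial equivariant function: transforming the atoms and then applying the function gives the same result as applying the function and then transforming its output. All names are hypothetical, and rotation is restricted to the z axis for brevity.

```python
import math


def rotate_z(p, angle_deg):
    """Rotate a 3D point about the z axis."""
    a = math.radians(angle_deg)
    x, y, z = p
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


def transform(p, angle_deg, shift):
    """Rigid motion: rotation about z followed by a translation."""
    rotated = rotate_z(p, angle_deg)
    return tuple(c + s for c, s in zip(rotated, shift))


def centroid(points):
    """A trivially equivariant function of a point cloud."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def is_equivariant(points, angle_deg, shift, tol=1e-9):
    """Check f(T(points)) == T(f(points)) for f = centroid."""
    moved_then_f = centroid([transform(p, angle_deg, shift) for p in points])
    f_then_moved = transform(centroid(points), angle_deg, shift)
    return all(abs(a - b) < tol for a, b in zip(moved_then_f, f_then_moved))


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3), (0.2, 2.0, -1.0)]
    print(is_equivariant(atoms, 40.0, (3.0, -2.0, 5.0)))
```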
5.
Phys Rev E ; 101(1-1): 012304, 2020 Jan.
Article in English | MEDLINE | ID: mdl-32069576

ABSTRACT

Films made from random nanowire arrays are an attractive choice for electronics requiring flexible transparent conductive films. However, thus far there has been no unified theory for predicting their electrical conductivity. In particular, the effects of orientation distribution on network conductivity remain poorly understood. We present a simplified analytical model for random nanowire network electrical conductivity that accurately captures the effects of arbitrary nanowire orientation distributions on conductivity. Our model is an upper bound and converges to the true conductivity as nanowire density grows. The model replaces Monte Carlo sampling with an asymptotically faster computation and in practice can be computed much more quickly than standard computational models. The success of our approximation provides theoretical insight into how nanowire orientation affects electrical conductivity.
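
The abstract above contrasts Monte Carlo sampling with a faster analytical computation. The paper's conductivity model is not given here, so the sketch below illustrates the same idea on a much simpler quantity only. By a standard geometric argument (ignoring edge effects), two straight wires of equal length whose centers are placed uniformly at random cross with probability proportional to |sin(θ1 − θ2)|. For orientations drawn uniformly from [−a, a] with 0 < a ≤ π/2, the mean of that factor has the closed form (2a − sin 2a) / (2a²). The function names and the uniform orientation distribution are assumptions.

```python
import math
import random


def mean_abs_sin_closed_form(max_angle_rad):
    """E|sin(t1 - t2)| for t1, t2 independent and uniform on [-a, a]."""
    a = max_angle_rad
    if not 0 < a <= math.pi / 2:
        raise ValueError("max_angle_rad must lie in (0, pi/2]")
    return (2 * a - math.sin(2 * a)) / (2 * a * a)


def mean_abs_sin_monte_carlo(max_angle_rad, samples, rng):
    """Sampling estimate of the same quantity, for comparison."""
    total = 0.0
    for _ in range(samples):
        t1 = rng.uniform(-max_angle_rad, max_angle_rad)
        t2 = rng.uniform(-max_angle_rad, max_angle_rad)
        total += abs(math.sin(t1 - t2))
    return total / samples


if __name__ == "__main__":
    rng = random.Random(0)
    for degrees in (15, 45, 90):
        a = math.radians(degrees)
        exact = mean_abs_sin_closed_form(a)
        sampled = mean_abs_sin_monte_carlo(a, 100_000, rng)
        print(degrees, round(exact, 4), round(sampled, 4))
```

For the fully isotropic case (a = π/2) the closed form reduces to 2/π, and narrowing the orientation range lowers the expected number of wire crossings. This is one ingredient, not the whole story, of how orientation affects conductivity.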

6.
Sci Rep ; 5: 10219, 2015 May 15.
Article in English | MEDLINE | ID: mdl-25976936

ABSTRACT

A computational model was developed to analyze electrical conductivity of random metal nanowire networks. It was demonstrated for the first time through use of this model that a performance gain in random metal nanowire networks can be achieved by slightly restricting nanowire orientation. It was furthermore shown that heavily ordered configurations do not outperform configurations with some degree of randomness; randomness in the case of metal nanowire orientations acts to increase conductivity.
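
The abstract above describes a computational model of random nanowire networks with restricted orientations. The following is not that model: it does not solve for conductivity at all. It is a simplified Monte Carlo sketch of a related and easier question, namely whether a random set of straight wires forms a connected path between two electrodes at x = 0 and x = size. Wire angles are drawn uniformly from [−max_angle, +max_angle] relative to the electrode-to-electrode direction. All names and parameters are hypothetical.

```python
import math
import random


def make_wires(n, length, max_angle_deg, rng, size=1.0):
    """Random straight wires with centers in a size-by-size square."""
    a = math.radians(max_angle_deg)
    wires = []
    for _ in range(n):
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        theta = rng.uniform(-a, a)
        dx, dy = 0.5 * length * math.cos(theta), 0.5 * length * math.sin(theta)
        wires.append(((cx - dx, cy - dy), (cx + dx, cy + dy)))
    return wires


def segments_cross(s1, s2):
    """True for proper crossings; collinear or touching cases are ignored."""
    (p, q), (r, s) = s1, s2

    def orient(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    return (orient(p, q, r) * orient(p, q, s) < 0
            and orient(r, s, p) * orient(r, s, q) < 0)


def spans(wires, size=1.0):
    """Union-find check for a wire path from the left to the right electrode."""
    n = len(wires)
    parent = list(range(n + 2))  # index n = left electrode, n + 1 = right

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (a, b) in enumerate(wires):
        if min(a[0], b[0]) <= 0:
            parent[find(i)] = find(n)
        if max(a[0], b[0]) >= size:
            parent[find(i)] = find(n + 1)
        for j in range(i):
            if segments_cross(wires[i], wires[j]):
                parent[find(i)] = find(j)
    return find(n) == find(n + 1)


def spanning_probability(n, length, max_angle_deg, trials, seed=0):
    """Fraction of random networks that connect the two electrodes."""
    rng = random.Random(seed)
    hits = sum(spans(make_wires(n, length, max_angle_deg, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for max_angle in (90, 60, 30):
        print(max_angle, spanning_probability(60, 0.3, max_angle, trials=50))
```

Connectivity is only a necessary condition for conduction. Estimating conductivity itself would additionally require assigning junction and wire resistances and solving the resulting resistor network.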
