1.
BMC Bioinformatics ; 24(1): 433, 2023 Nov 14.
Article in English | MEDLINE | ID: mdl-37964216

ABSTRACT

BACKGROUND: Determining a protein's quaternary state, i.e. the number of monomers in a functional unit, is a critical step in protein characterization. Many proteins form multimers for their activity, and over 50% are estimated to naturally form homomultimers. Experimental quaternary state determination can be challenging and require extensive work. To complement these efforts, a number of computational tools have been developed for quaternary state prediction, often utilizing experimentally validated structural information. Recently, dramatic advances have been made in the field of deep learning for predicting protein structure and other characteristics. Protein language models, such as ESM-2, that apply computational natural-language models to proteins successfully capture secondary structure, protein cell localization and other characteristics, from a single sequence. Here we hypothesize that information about the protein quaternary state may be contained within protein sequences as well, allowing us to benefit from these novel approaches in the context of quaternary state prediction.

RESULTS: We generated ESM-2 embeddings for a large dataset of proteins with quaternary state labels from the curated QSbio dataset. We trained a model for quaternary state classification and assessed it on a non-overlapping set of distinct folds (ECOD family level). Our model, named QUEEN (QUaternary state prediction using dEEp learNing), performs worse than approaches that include information from solved crystal structures. However, it successfully learned to distinguish multimers from monomers, and predicts the specific quaternary state with moderate success, better than simple sequence similarity-based annotation transfer. Our results demonstrate that complex, quaternary state related information is included in such embeddings.

CONCLUSIONS: QUEEN is the first to investigate the power of embeddings for the prediction of the quaternary state of proteins. As such, it lays out strengths as well as limitations of a sequence-based protein language model approach, compared to structure-based approaches. Since it does not require any structural information and is fast, we anticipate that it will be of wide use both for in-depth investigation of specific systems and for studies of large sets of protein sequences. A simple colab implementation is available at: https://colab.research.google.com/github/Furman-Lab/QUEEN/blob/main/QUEEN_prediction_notebook.ipynb .


Subject(s)
Language; Proteins; Proteins/chemistry; Amino Acid Sequence; Protein Structure, Secondary; Protein Transport
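
The abstract above describes classifying fixed-length sequence embeddings into quaternary-state labels but does not name the classifier here. As a rough illustration only, the sketch below uses a nearest-centroid rule, which is an assumption and not the QUEEN model. The function names and the three-dimensional toy vectors are hypothetical; real ESM-2 embeddings have hundreds to thousands of dimensions.

```python
from collections import defaultdict
from math import dist


def fit_centroids(embeddings, labels):
    """Average the embedding vectors that share a quaternary-state label."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def predict_state(centroids, embedding):
    """Return the label whose centroid is closest (Euclidean) to the embedding."""
    return min(centroids, key=lambda label: dist(centroids[label], embedding))


# Toy stand-ins for precomputed embeddings (made-up values, not ESM-2 output).
train_vectors = [[0.1, 0.0, 0.2], [0.0, 0.1, 0.1], [0.9, 1.0, 0.8], [1.0, 0.9, 1.1]]
train_states = [1, 1, 2, 2]  # label = number of monomers in the functional unit

centroids = fit_centroids(train_vectors, train_states)
predicted = predict_state(centroids, [0.8, 0.9, 1.0])
print(predicted)
```

A nearest-centroid rule is chosen here only because it needs no external libraries; any supervised classifier could take its place.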
2.
Proc Natl Acad Sci U S A ; 119(18): e2121153119, 2022 May 03.
Article in English | MEDLINE | ID: mdl-35482919

ABSTRACT

Peptide docking can be perceived as a subproblem of protein-protein docking. However, due to the short length and flexible nature of peptides, many do not adopt one defined conformation prior to binding. Therefore, to tackle a peptide docking problem, not only the relative orientation, but also the bound conformation of the peptide needs to be modeled. Traditional peptide-centered approaches use information about peptide sequences to generate representative conformer ensembles, which can then be rigid-body docked to the receptor. Alternatively, one may look at this problem from the viewpoint of the receptor, namely, that the protein surface defines the peptide-bound conformation. Here, we present PatchMAN (Patch-Motif AligNments), a global peptide-docking approach that uses structural motifs to map the receptor surface with backbone scaffolds extracted from protein structures. On a nonredundant set of protein-peptide complexes, starting from free receptor structures, PatchMAN successfully models and identifies near-native peptide-protein complexes in 58%/84% of the cases (within 2.5 Å/5 Å interface backbone RMSD), with corresponding sampling in 81%/100% of the cases, outperforming other approaches. PatchMAN leverages the observation that structural units of peptides with their binding pocket can be found not only within interfaces, but also within monomers. We show that the bound peptide conformation is sampled based on the structural context of the receptor only, without taking into account any sequence information. Beyond peptide docking, this approach opens exciting new avenues for studying principles of peptide-protein association and for designing new peptide binders. PatchMAN is available as a server at https://furmanlab.cs.huji.ac.il/patchman/.


Subject(s)
Membrane Proteins; Peptides; Biophysical Phenomena; Membrane Proteins/metabolism; Peptides/chemistry; Protein Binding; Protein Conformation
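
The success rates in the abstract above are reported against RMSD cutoffs of 2.5 Å and 5 Å. The sketch below shows how a root-mean-square deviation over matched atoms can be computed and compared with such cutoffs. It assumes the model has already been superimposed onto the reference and that the caller has selected the relevant backbone atoms; the function names, the toy coordinates, and the use of "less than or equal" for the cutoff are illustrative assumptions, not the PatchMAN evaluation code.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    No superposition is performed; the two sets are compared as given.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return sqrt(squared / len(coords_a))


def within_cutoff(value, cutoff):
    """True if an RMSD value (in Å) falls within the given cutoff (assumed inclusive)."""
    return value <= cutoff


# Made-up coordinates: the model is the reference shifted by 3 Å along y.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 3.0, 0.0), (1.5, 3.0, 0.0), (3.0, 3.0, 0.0)]

value = rmsd(reference, model)
print(value, within_cutoff(value, 2.5), within_cutoff(value, 5.0))
```

With these toy points the model would fail the stricter 2.5 Å cutoff but pass the 5 Å one.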
3.
Nat Commun ; 13(1): 176, 2022 Jan 10.
Article in English | MEDLINE | ID: mdl-35013344

ABSTRACT

Highly accurate protein structure predictions by deep neural networks such as AlphaFold2 and RoseTTAFold have had a tremendous impact on structural biology and beyond. Here, we show that, although these deep learning approaches were originally developed for the in silico folding of protein monomers, AlphaFold2 also enables quick and accurate modeling of peptide-protein interactions. Our simple implementation of AlphaFold2 generates peptide-protein complex models without requiring multiple sequence alignment information for the peptide partner, and can handle binding-induced conformational changes of the receptor. We explore what AlphaFold2 has memorized and learned, and describe specific examples that highlight differences compared to the state-of-the-art peptide docking protocol PIPER-FlexPepDock. These results show that AlphaFold2 holds great promise for providing structural insight into a wide range of peptide-protein complexes, serving as a starting point for the detailed characterization and manipulation of these interactions.


Subject(s)
Neural Networks, Computer; Peptides/chemistry; Protein Folding; Proteins/chemistry; Software; Amino Acid Sequence; Binding Sites; Models, Molecular; Molecular Docking Simulation; Peptides/metabolism; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Proteins/metabolism
4.
Nat Commun ; 12(1): 6947, 2021 Nov 29.
Article in English | MEDLINE | ID: mdl-34845212

ABSTRACT

Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.


Subject(s)
Macromolecular Substances/chemistry; Molecular Docking Simulation; Proteins/chemistry; Software/standards; Benchmarking; Binding Sites; Humans; Ligands; Macromolecular Substances/metabolism; Protein Binding; Proteins/metabolism; Reproducibility of Results
5.
Structure ; 26(11): 1546-1554.e2, 2018 Nov 06.
Article in English | MEDLINE | ID: mdl-30293812

ABSTRACT

At resolutions worse than 3.5 Å, the electron density is weak or nonexistent at the locations of the side chains. Consequently, the assignment of the protein sequences to their correct positions along the backbone is a difficult problem. In this work, we propose a fully automated computational approach to assign sequence at low resolution. It is based on our surprising observation that standard reciprocal-space indicators, such as the initial unrefined R value, are sensitive enough to detect an erroneous sequence assignment of even a single backbone position. Our approach correctly determines the amino acid type for 15%, 13%, and 9% of the backbone positions in crystallographic datasets with resolutions of 4.0 Å, 4.5 Å, and 5.0 Å, respectively. We implement these findings in an application for threading a sequence onto a backbone structure. For the three resolution ranges, the application threads 83%, 81%, and 64% of the sequences exactly as in the deposited PDB structures.


Subject(s)
Computational Biology/methods; Proteins/chemistry; Proteins/genetics; Amino Acid Sequence; Crystallography, X-Ray; Models, Molecular; Protein Conformation
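
The abstract above relies on the crystallographic R value as an indicator of a wrong sequence assignment. As a minimal sketch, the standard definition R = Σ| |F_obs| − k|F_calc| | / Σ|F_obs| can be written as below, and candidate assignments could then be compared by their R values. The amplitudes are made-up numbers, the function name and the `scale` parameter default are assumptions, and computing F_calc from an atomic model is outside the scope of this example; this is not the authors' implementation.

```python
def r_value(f_obs, f_calc, scale=1.0):
    """Crystallographic R value from observed and calculated structure-factor amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - scale * abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for one dataset and two candidate residue assignments.
observed = [100.0, 80.0, 60.0, 40.0]
calc_candidate_a = [98.0, 82.0, 59.0, 41.0]   # agrees closely with the data
calc_candidate_b = [90.0, 95.0, 50.0, 55.0]   # agrees poorly with the data

r_a = r_value(observed, calc_candidate_a)
r_b = r_value(observed, calc_candidate_b)
best = "a" if r_a < r_b else "b"
print(round(r_a, 4), round(r_b, 4), best)
```

A lower R value indicates better agreement between the model and the diffraction data, so in this toy comparison candidate `a` would be preferred.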