Results 1 - 20 of 59
1.
Proc Natl Acad Sci U S A ; 119(41): e2210249119, 2022 10 11.
Article in English | MEDLINE | ID: mdl-36191203

ABSTRACT

Computational methodologies are increasingly addressing modeling of the whole cell at the molecular level. Proteins and their interactions are the key component of cellular processes. Techniques for modeling protein interactions, thus far, have included protein docking and molecular simulation. The latter approaches account for the dynamics of the interactions but are relatively slow, if carried out at all-atom resolution, or are significantly coarse grained. Protein docking algorithms are far more efficient in sampling spatial coordinates. However, they do not account for the kinetics of the association (i.e., they do not involve the time coordinate). Our proof-of-concept study bridges the two modeling approaches, developing an approach that can reach unprecedented simulation timescales at all-atom resolution. The global intermolecular energy landscape of a large system of proteins was mapped by the pairwise fast Fourier transform docking and sampled in space and time by Monte Carlo simulations. The simulation protocol was parametrized on existing data and validated on a number of observations from experiments and molecular dynamics simulations. The simulation protocol performed consistently across very different systems of proteins at different protein concentrations. It recapitulated data on the previously observed protein diffusion rates and aggregation. The speed of calculation allows reaching second-long trajectories of protein systems that approach the size of the cells, at atomic resolution.
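
The Monte Carlo sampling step described in this abstract can be illustrated with a minimal sketch. It assumes a precomputed list of state energies (a hypothetical stand-in for the docking-derived energy landscape) and applies the standard Metropolis acceptance rule. The function names, the uniform proposal scheme, and the kT value are illustrative assumptions, not the authors' protocol.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng):
    """Standard Metropolis rule: always accept downhill moves,
    accept uphill moves with probability exp(-delta_e / kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def sample(energies, n_steps, kT=1.0, seed=0):
    """Random walk over a list of precomputed state energies.
    `energies` is a hypothetical, already-mapped landscape."""
    rng = random.Random(seed)
    state = 0
    visits = [0] * len(energies)
    for _ in range(n_steps):
        proposal = rng.randrange(len(energies))  # uniform, symmetric proposal
        if metropolis_accept(energies[proposal] - energies[state], kT, rng):
            state = proposal
        visits[state] += 1
    return visits

if __name__ == "__main__":
    print(sample([0.0, 2.0, 5.0], 20000))
```

With enough steps, the visit counts should approach Boltzmann populations, so the lowest-energy state is visited most often.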


Subjects
Molecular Dynamics Simulation , Proteins , Algorithms , Biophysical Phenomena , Kinetics , Monte Carlo Method
2.
Bioinformatics ; 37(7): 943-950, 2021 05 17.
Article in English | MEDLINE | ID: mdl-32840574

ABSTRACT

MOTIVATION: Despite the progress made in studying protein-ligand interactions and the widespread application of docking and affinity prediction tools, improving their precision and efficiency remains a challenge. Computational approaches based on the scoring of docking conformations with statistical potentials constitute a popular alternative to more accurate but costly physics-based thermodynamic sampling methods. In this context, a minimalist and fast sidechain-free knowledge-based potential with a high docking and screening power can be very useful when screening a large number of putative docking conformations. RESULTS: Here, we present a novel coarse-grained potential defined by a 3D joint probability distribution function that only depends on the pairwise orientation and position between protein backbone and ligand atoms. Despite its extreme simplicity, our approach yields results that are very competitive with state-of-the-art scoring functions, especially in docking and screening tasks. For example, we observed a twofold improvement in the median 5% enrichment factor on the DUD-E benchmark compared to Autodock Vina results. Moreover, our results prove that a coarse sidechain-free potential is sufficient for a very successful docking pose prediction. AVAILABILITY AND IMPLEMENTATION: The standalone version of KORP-PL with the corresponding tests and benchmarks is available at https://team.inria.fr/nano-d/korp-pl/ and https://chaconlab.org/modeling/korp-pl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
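
The 5% enrichment factor cited in this abstract can be computed as sketched below. This is a generic formulation, not code from KORP-PL: it assumes that lower scores rank better and that the top fraction is rounded to a whole number of compounds. The function name and both conventions are assumptions.

```python
def enrichment_factor(scores, is_active, fraction=0.05):
    """Enrichment factor at a given fraction of the ranked library.

    EF = (actives in top fraction / size of top fraction)
         / (total actives / library size)
    Assumes lower score = better rank (a convention, not from the source).
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    total_hits = sum(1 for a in is_active if a)
    if total_hits == 0:
        raise ValueError("library contains no actives")
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_top / n_top) / (total_hits / len(ranked))

if __name__ == "__main__":
    # Toy library: 100 compounds, 10 actives, all actives ranked first.
    scores = list(range(100))
    labels = [i < 10 for i in range(100)]
    print(enrichment_factor(scores, labels))
```

An enrichment factor of 1 corresponds to random ranking, so a twofold improvement means twice as many actives recovered in the top 5% of the list.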


Subjects
Proteins , Software , Knowledge Bases , Ligands , Molecular Docking Simulation , Protein Binding , Proteins/metabolism
3.
Bioinformatics ; 37(16): 2332-2339, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-33620450

ABSTRACT

MOTIVATION: Effective use of evolutionary information has recently led to tremendous progress in computational prediction of three-dimensional (3D) structures of proteins and their complexes. Despite the progress, the accuracy of predicted structures tends to vary considerably from case to case. Since the utility of computational models depends on their accuracy, reliable estimates of deviation between predicted and native structures are of utmost importance. RESULTS: For the first time, we present a deep convolutional neural network (CNN) constructed on a Voronoi tessellation of 3D molecular structures. Despite the irregular data domain, our data representation allows us to efficiently introduce both convolution and pooling operations and train the network in an end-to-end fashion without precomputed descriptors. The resultant model, VoroCNN, predicts local qualities of 3D protein folds. The prediction results are competitive to state of the art and superior to the previous 3D CNN architectures built for the same task. We also discuss practical applications of VoroCNN, for example, in recognition of protein binding interfaces. AVAILABILITY AND IMPLEMENTATION: The model, data and evaluation tests are available at https://team.inria.fr/nano-d/software/vorocnn/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

4.
Appl Opt ; 61(12): 3337-3348, 2022 Apr 20.
Article in English | MEDLINE | ID: mdl-35471429

ABSTRACT

We present a compact 3D diffractive microscope that can be inserted directly in a cell incubator for long-term observation of developing organisms. Our setup is particularly simple and robust, since it does not include any moving parts and is compatible with commercial cell culture containers. It has been designed to image large specimens (>100×100×100 µm³) with subcellular resolution. The sample's optical properties [refractive index (RI) and absorption] are reconstructed in 3D from intensity-only images recorded with different illumination angles produced by an LED array. The reconstruction is performed using the beam propagation method embedded inside a deep-learning network where the layers encode the optical properties of the object. This deep neural network is trained for a given multiangle intensity acquisition. After training, the weights of the neural network deliver the 3D distribution of the optical properties of the sample. The effect of spherical aberrations due to the sample holder/air interfaces is taken into account in the forward model. Using this approach, we performed time-lapse 3D imaging of preimplantation mouse embryos over six days. Images of embryos from a single cell (low-scattering regime) to the blastocyst stage (highly scattering regime) were successfully reconstructed. Due to its subcellular resolution, our system can provide quantitative information on the embryos' development and viability. Hence, this technology opens what we believe to be novel opportunities for 3D label-free live-cell imaging of whole embryos or organoids over long observation times.


Subjects
Deep Learning , Animals , Mice , Refractometry , Time-Lapse Imaging , Tomography , Tomography, X-Ray Computed
5.
Proteins ; 89(12): 1770-1786, 2021 12.
Article in English | MEDLINE | ID: mdl-34519095

ABSTRACT

The potential of deep learning has been recognized in the protein structure prediction community for some time, and became indisputable after CASP13. In CASP14, deep learning has boosted the field to unanticipated levels reaching near-experimental accuracy. This success comes from advances transferred from other machine learning areas, as well as methods specifically designed to deal with protein sequences and structures, and their abstractions. Novel emerging approaches include (i) geometric learning, that is, learning on representations such as graphs, three-dimensional (3D) Voronoi tessellations, and point clouds; (ii) pretrained protein language models leveraging attention; (iii) equivariant architectures preserving the symmetry of 3D space; (iv) use of large meta-genome databases; (v) combinations of protein representations; and (vi) finally truly end-to-end architectures, that is, differentiable models starting from a sequence and returning a 3D structure. Here, we provide an overview and our opinion of the novel deep learning approaches developed in the last 2 years and widely used in CASP14.


Subjects
Amino Acid Sequence , Protein Conformation , Proteins , Software , Computational Biology , Databases, Protein , Deep Learning , Proteins/chemistry , Proteins/metabolism , Sequence Analysis, Protein
6.
PLoS Comput Biol ; 16(4): e1007870, 2020 04.
Article in English | MEDLINE | ID: mdl-32339173

ABSTRACT

Many proteins contain multiple folded domains separated by flexible linkers, and the ability to describe the structure and conformational heterogeneity of such flexible systems pushes the limits of structural biology. Using the three-domain protein TIA-1 as an example, we here combine coarse-grained molecular dynamics simulations with previously measured small-angle scattering data to study the conformation of TIA-1 in solution. We show that while the coarse-grained potential (Martini) in itself leads to too compact conformations, increasing the strength of protein-water interactions results in ensembles that are in very good agreement with experiments. We show how these ensembles can be refined further using a Bayesian/Maximum Entropy approach, and examine the robustness to errors in the energy function. In particular we find that as long as the initial simulation is relatively good, reweighting against experiments is very robust. We also study the relative information in X-ray and neutron scattering experiments and find that refining against the SAXS experiments leads to improvement in the SANS data. Our results suggest a general strategy for studying the conformation of multi-domain proteins in solution that combines coarse-grained simulations with small-angle X-ray scattering data that are generally most easy to obtain. These results may in turn be used to design further small-angle neutron scattering experiments that exploit contrast variation through 1H/2H isotope substitutions.
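
The reweighting idea in this abstract can be illustrated with a simplified maximum-entropy sketch. It adjusts the weights of simulation frames so that the ensemble average of a single observable matches a target value, while staying as close as possible to uniform weights. This is a reduced illustration under stated assumptions: the published Bayesian/Maximum Entropy approach also accounts for experimental errors and fits full scattering curves, and all names, the bisection bounds, and the toy values here are hypothetical.

```python
import math

def maxent_weights(values, lam):
    """Weights proportional to exp(-lam * f_i), starting from uniform weights."""
    exponents = [-lam * v for v in values]
    shift = max(exponents)  # subtract the maximum for numerical stability
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_mean(values, weights):
    """Ensemble average of an observable under the given weights."""
    return sum(v * w for v, w in zip(values, weights))

def reweight_to_target(values, target, lo=-50.0, hi=50.0, n_iter=200):
    """Bisect on the Lagrange multiplier until the average matches `target`.
    The weighted mean decreases monotonically as the multiplier grows."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if weighted_mean(values, maxent_weights(values, mid)) > target:
            lo = mid  # average too high: a larger multiplier is needed
        else:
            hi = mid
    return maxent_weights(values, 0.5 * (lo + hi))

if __name__ == "__main__":
    # Hypothetical per-frame observable, e.g. a radius of gyration in nm.
    frames = [1.0, 2.0, 3.0, 4.0]
    print(reweight_to_target(frames, 2.0))
```

When the target already equals the unweighted average, the multiplier converges to zero and the weights stay uniform, which mirrors the observation that a good initial simulation needs little correction.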


Subjects
Molecular Dynamics Simulation , Proteins , Scattering, Small Angle , X-Ray Diffraction , Algorithms , Computational Biology , Neutrons , Protein Conformation , Protein Domains , Proteins/analysis , Proteins/chemistry
7.
Biophys J ; 118(10): 2513-2525, 2020 05 19.
Article in English | MEDLINE | ID: mdl-32330413

ABSTRACT

Large macromolecules, including proteins and their complexes, very often adopt multiple conformations. Some of them can be seen experimentally, for example with x-ray crystallography or cryo-electron microscopy. This structural heterogeneity is not occasional and is frequently linked with specific biological function. Thus, the accurate description of macromolecular conformational transitions is crucial for understanding fundamental mechanisms of life's machinery. We report on a real-time method to predict such transitions by extrapolating from instantaneous eigen motions, computed using the normal mode analysis, to a series of twists. We demonstrate the applicability of our approach to the prediction of a wide range of motions, including large collective opening-closing transitions and conformational changes induced by partner binding. We also highlight particularly difficult cases of very small transitions between crystal and solution structures. Our method guarantees preservation of the protein structure during the transition and allows accessing conformations that are unreachable with classical normal mode analysis. We provide practical solutions to describe localized motions with a few low-frequency modes and to relax some geometrical constraints along the predicted transitions. This work opens the way to the systematic description of protein motions, whatever their degree of collectivity. Our method is freely available as a part of the NOn-Linear rigid Block (NOLB) package.


Subjects
Proteins , Cryoelectron Microscopy , Crystallography, X-Ray , Models, Molecular , Protein Conformation
8.
Biophys J ; 119(3): 605-618, 2020 08 04.
Article in English | MEDLINE | ID: mdl-32668232

ABSTRACT

Small angle neutron scattering (SANS) provides a method to obtain important low-resolution information for integral membrane proteins (IMPs), challenging targets for structural determination. Specific deuteration furnishes a "stealth" carrier for the solubilized IMP. We used SANS to determine a structural envelope of SpNOX, the Streptococcus pneumoniae NADPH oxidase (NOX), a prokaryotic model system for exploring structure and function of eukaryotic NOXes. SpNOX was solubilized in the detergent lauryl maltose neopentyl glycol, which provides optimal SpNOX stability and activity. Using deuterated solvent and protein, the lauryl maltose neopentyl glycol was experimentally undetected in SANS. This affords a cost-effective SANS approach for obtaining novel structural information on IMPs. Combining SANS data with molecular modeling provided a first, to our knowledge, structural characterization of an entire NOX enzyme. It revealed a distinctly less compact structure than that predicted from the docking of homologous crystal structures of the separate transmembrane and dehydrogenase domains, consistent with a flexible linker connecting the two domains.


Subjects
NADPH Oxidases , Neutron Diffraction , Membrane Proteins , Oxidation-Reduction , Scattering, Small Angle
9.
Bioinformatics ; 35(24): 5113-5120, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31161198

ABSTRACT

MOTIVATION: Thanks to the recent advances in structural biology, nowadays 3D structures of various proteins are solved on a routine basis. A large portion of these structures contain structural repetitions or internal symmetries. To understand the evolution mechanisms of these proteins and how structural repetitions affect the protein function, we need to be able to detect such proteins very robustly. As deep learning is particularly suited to deal with spatially organized data, we applied it to the detection of proteins with structural repetitions. RESULTS: We present DeepSymmetry, a versatile method based on 3D convolutional networks that detects structural repetitions in proteins and their density maps. Our method is designed to identify tandem repeat proteins, proteins with internal symmetries, symmetries in the raw density maps, their symmetry order and also the corresponding symmetry axes. Detection of symmetry axes is based on learning 6D Veronese mappings of 3D vectors, and the median angular error of axis determination is less than one degree. We demonstrate the capabilities of our method on benchmarks with tandem-repeated proteins and also with symmetrical assemblies. For example, we have discovered about 7800 putative tandem repeat proteins in the PDB. AVAILABILITY AND IMPLEMENTATION: The method is available at https://team.inria.fr/nano-d/software/deepsymmetry. It consists of a C++ executable that transforms molecular structures into volumetric density maps, and a Python code based on the TensorFlow framework for applying the DeepSymmetry model to these maps. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
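
The 6D Veronese mapping of 3D vectors mentioned in this abstract can be sketched as follows. The map sends a vector to its quadratic monomials, so a vector and its negative map to the same point, which suits a symmetry axis that has no preferred direction. The square-root-of-two scaling is one common convention and is an assumption here; the abstract does not state which normalization DeepSymmetry uses.

```python
import math

def veronese(v):
    """Map a 3D vector (x, y, z) to the 6D vector of its quadratic monomials.
    The sqrt(2) factors make dot(veronese(u), veronese(v)) == dot(u, v) ** 2."""
    x, y, z = v
    r2 = math.sqrt(2.0)
    return (x * x, y * y, z * z, r2 * x * y, r2 * x * z, r2 * y * z)

def dot(a, b):
    """Euclidean dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

if __name__ == "__main__":
    axis = (0.6, 0.0, 0.8)
    flipped = (-0.6, 0.0, -0.8)
    # Both orientations of the axis give the same 6D representation.
    print(veronese(axis))
    print(veronese(flipped))
```

Because the image is sign-invariant, a network can regress the 6D target without having to choose between the two equivalent orientations of an axis.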


Subjects
Algorithms , Proteins , Software , Tandem Repeat Sequences
10.
Bioinformatics ; 35(16): 2801-2808, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30590384

ABSTRACT

MOTIVATION: Protein quality assessment (QA) is a crucial element of protein structure prediction, a fundamental and yet open problem in structural bioinformatics. QA aims at ranking predicted protein models to select the best candidates. The assessment can be performed based either on a single model or on a consensus derived from an ensemble of models. The latter strategy can yield very high performance but substantially depends on the pool of available candidate models, which limits its applicability. Hence, single-model QA methods remain an important research target, also because they can assist the sampling of candidate models. RESULTS: We present a novel single-model QA method called SBROD. The SBROD (Smooth Backbone-Reliant Orientation-Dependent) method uses only the backbone protein conformation, and hence it can be applied to scoring coarse-grained protein models. The proposed method deduces its scoring function from a training set of protein models. The SBROD scoring function is composed of four terms related to different structural features: residue-residue orientations, contacts between backbone atoms, hydrogen bonding and solvent-solute interactions. It is smooth with respect to atomic coordinates and thus is potentially applicable to continuous gradient-based optimization of protein conformations. Furthermore, it can also be used for coarse-grained protein modeling and computational protein design. SBROD proved to achieve similar performance to state-of-the-art single-model QA methods on diverse datasets (CASP11, CASP12 and MOULDER). AVAILABILITY AND IMPLEMENTATION: The standalone application implemented in C++ and Python is freely available at https://gitlab.inria.fr/grudinin/sbrod and supported on Linux, MacOS and Windows. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Grain Proteins/chemistry , Models, Molecular , Protein Conformation
11.
Bioinformatics ; 35(18): 3313-3319, 2019 09 15.
Article in English | MEDLINE | ID: mdl-30874723

ABSTRACT

MOTIVATION: Protein model quality assessment (QA) is a crucial and yet open problem in structural bioinformatics. The current best methods for single-model QA typically combine results from different approaches, each based on different input features constructed by experts in the field. Then, the prediction model is trained using a machine-learning algorithm. Recently, with the development of convolutional neural networks (CNN), the training paradigm has changed. In computer vision, the expert-developed features have been significantly surpassed by automatically trained convolutional filters. This motivated us to apply a three-dimensional (3D) CNN to the problem of protein model QA. RESULTS: We developed Ornate (Oriented Routed Neural network with Automatic Typing), a novel method for single-model QA. Ornate is a residue-wise scoring function that takes as input 3D density maps. It predicts the local (residue-wise) and the global model quality through a deep 3D CNN. Specifically, Ornate aligns the input density map, corresponding to each residue and its neighborhood, with the backbone topology of this residue. This circumvents the problem of ambiguous orientations of the initial models. Also, Ornate includes automatic identification of atom types and dynamic routing of the data in the network. Established benchmarks (CASP 11 and CASP 12) demonstrate the state-of-the-art performance of our approach among single-model QA methods. AVAILABILITY AND IMPLEMENTATION: The method is available at https://team.inria.fr/nano-d/software/Ornate/. It consists of a C++ executable that transforms molecular structures into volumetric density maps, and a Python code based on the TensorFlow framework for applying the Ornate model to these maps. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning , Neural Networks, Computer , Proteins
12.
J Comput Aided Mol Des ; 34(2): 191-200, 2020 02.
Article in English | MEDLINE | ID: mdl-31784861

ABSTRACT

The D3R Grand Challenge 4 provided a brilliant opportunity to test macrocyclic docking protocols on diverse, high-quality experimental data. We participated in both pose and affinity prediction exercises. Overall, we aimed to use an automated structure-based docking pipeline built around a set of tools developed in our team. This exercise again demonstrated the crucial importance of the correct local ligand geometry for the overall success of docking. Starting from the second part of the pose prediction stage, we developed a stable pipeline for sampling macrocycle conformers. This resulted in subangstrom average precision of our pose predictions. In the affinity prediction exercise we obtained average results. However, we could improve these when using docking poses submitted by the best predictors. Our docking tools, including the Convex-PL scoring function, are available at https://team.inria.fr/nano-d/software/.


Subjects
Drug Design , Macrocyclic Compounds/pharmacology , Molecular Docking Simulation , Proteins/metabolism , Binding Sites , Databases, Protein , Humans , Ligands , Macrocyclic Compounds/chemistry , Protein Binding , Protein Conformation , Proteins/chemistry , Software
13.
Int J Mol Sci ; 21(20)2020 Oct 16.
Article in English | MEDLINE | ID: mdl-33081390

ABSTRACT

The spread of multidrug-resistant (MDR) strains of Mycobacterium tuberculosis (Mtb), one of the most harmful pathogens, creates the need for new effective drugs. SQ109 showed activity against resistant Mtb and has already advanced to Phase II/III clinical trials. Fast SQ109 degradation is attributed to the human liver Cytochrome P450s (CYPs). However, no information is available about interactions of the drug with Mtb CYPs. Here, we show that Mtb CYP124, previously assigned as a methyl-branched lipid monooxygenase, binds and hydroxylates SQ109 in vitro. A 1.25 Å-resolution crystal structure of the CYP124-SQ109 complex unambiguously shows two conformations of the drug, both positioned for hydroxylation of the ω-methyl group in the trans position. The hydroxylated SQ109 presumably forms stabilizing H-bonds with its target, Mycobacterial membrane protein Large 3 (MmpL3). We anticipate that Mtb CYPs could function as analogs of drug-metabolizing human CYPs affecting pharmacokinetics and pharmacodynamics of antitubercular (anti-TB) drugs.


Subjects
Adamantane/analogs & derivatives , Antitubercular Agents/chemistry , Cytochrome P-450 Enzyme System/chemistry , Ethylenediamines/chemistry , Molecular Docking Simulation , Mycobacterium tuberculosis/enzymology , Adamantane/chemistry , Adamantane/pharmacology , Antitubercular Agents/pharmacology , Binding Sites , Cytochrome P-450 Enzyme System/metabolism , Ethylenediamines/pharmacology , Hydroxylation , Protein Binding
14.
Proteins ; 87(12): 1298-1314, 2019 12.
Article in English | MEDLINE | ID: mdl-31589784

ABSTRACT

Small angle X-ray scattering (SAXS) measures comprehensive distance information on a protein's structure, which can constrain and guide computational structure prediction algorithms. Here, we evaluate structure predictions of 11 monomeric and oligomeric proteins for which SAXS data were collected and provided to predictors in the 13th round of the Critical Assessment of protein Structure Prediction (CASP13). The category for SAXS-assisted predictions made gains in certain areas for CASP13 compared to CASP12. Improvements included higher quality data with size exclusion chromatography-SAXS (SEC-SAXS) and better selection of targets and communication of results by CASP organizers. In several cases, we can track improvements in model accuracy with use of SAXS data. For hard multimeric targets where regular folding algorithms were unsuccessful, SAXS data helped predictors to build models better resembling the global shape of the target. For most models, however, no significant improvement in model accuracy at the domain level was registered from use of SAXS data, when rigorously comparing SAXS-assisted models to the best regular server predictions. To promote future progress in this category, we identify successes, challenges, and opportunities for improved strategies in prediction, assessment, and communication of SAXS data to predictors. An important observation is that, for many targets, SAXS data were inconsistent with crystal structures, suggesting that these proteins adopt different conformation(s) in solution. This CASP13 result, if representative of PDB structures and future CASP targets, may have substantive implications for the structure training databases used for machine learning, CASP, and use of prediction models for biology.


Subjects
Computational Biology , Protein Conformation , Proteins/ultrastructure , Algorithms , Models, Molecular , Protein Folding , Proteins/chemistry , Proteins/genetics , Scattering, Small Angle , Solutions/chemistry , X-Ray Diffraction
15.
Proteins ; 87(12): 1283-1297, 2019 12.
Article in English | MEDLINE | ID: mdl-31569265

ABSTRACT

With the advance of experimental procedures, obtaining chemical crosslinking information is becoming a fast and routine practice. Information on crosslinks can greatly enhance the accuracy of protein structure modeling. Here, we review the current state of the art in modeling protein structures with the assistance of experimentally determined chemical crosslinks within the framework of the 13th meeting of Critical Assessment of Structure Prediction approaches. This largest-to-date blind assessment reveals benefits of using data assistance in difficult-to-model protein structure prediction cases. However, in a broader context, it also suggests that with the unprecedented advance in accuracy to predict contacts in recent years, experimental crosslinks will be useful only if their specificity and accuracy are further improved and they are better integrated into computational workflows.


Subjects
Computational Biology/methods , Cross-Linking Reagents/chemistry , Models, Molecular , Protein Conformation , Proteins/chemistry , Algorithms , Chromatography, Liquid , Models, Chemical , Reproducibility of Results , Tandem Mass Spectrometry
16.
Proteins ; 87(12): 1200-1221, 2019 12.
Article in English | MEDLINE | ID: mdl-31612567

ABSTRACT

We present the results for CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 heterocomplexes. Eight of the homo-oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and two higher-order assemblies. These were more difficult to model, as their prediction mainly involved "ab-initio" docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictors groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance "gap" was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.


Subjects
Computational Biology , Protein Conformation , Proteins/ultrastructure , Software , Algorithms , Binding Sites/genetics , Databases, Protein , Models, Molecular , Protein Binding/genetics , Protein Interaction Mapping , Proteins/chemistry , Proteins/genetics , Structural Homology, Protein
17.
J Comput Chem ; 40(27): 2391-2399, 2019 10 15.
Article in English | MEDLINE | ID: mdl-31254466

ABSTRACT

In this study, we propose a novel optimization algorithm, with application to the refinement of molecular complexes. Particularly, we consider the optimization problem as the calculation of quasi-static trajectories of rigid bodies influenced by the inverse-inertia-weighted energy gradient and introduce the concept of an advancement region that guarantees displacement of a molecule strictly within a relevant region of conformational space. The advancement region helps to avoid typical energy minimization pitfalls; thus, the algorithm is suitable to work with arbitrary energy functions and arbitrary types of molecular complexes without the need to tune its hyperparameters. Our method, called controlled-advancement rigid-body optimization of nanosystems (Carbon), is particularly useful for large-scale molecular refinement, for example, of the putative binding candidates obtained with protein-protein docking pipelines. An implementation of Carbon with a user-friendly interface is available in the SAMSON platform for molecular modeling at https://www.samson-connect.net. © 2019 Wiley Periodicals, Inc.

18.
Bioinformatics ; 34(23): 4046-4053, 2018 12 01.
Article in English | MEDLINE | ID: mdl-29931128

ABSTRACT

Motivation: The computational prediction of a protein structure from its sequence generally relies on a method to assess the quality of protein models. Most assessment methods rank candidate models using heavily engineered structural features, defined as complex functions of the atomic coordinates. However, very few methods have attempted to learn these features directly from the data. Results: We show that deep convolutional networks can be used to predict the ranking of model structures solely on the basis of their raw three-dimensional atomic densities, without any feature tuning. We develop a deep neural network that performs on par with state-of-the-art algorithms from the literature. The network is trained on decoys from the CASP7 to CASP10 datasets and its performance is tested on the CASP11 dataset. Additional testing on decoys from the CASP12, CAMEO and 3DRobot datasets confirms that the network performs consistently well across a variety of protein structures. While the network learns to assess structural decoys globally and does not rely on any predefined features, it can be analyzed to show that it implicitly identifies regions that deviate from the native structure. Availability and implementation: The code and the datasets are available at https://github.com/lamoureux-lab/3DCNN_MQA. Supplementary information: Supplementary data are available at Bioinformatics online.
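
The raw three-dimensional atomic density input mentioned in this abstract can be approximated by smearing each atom onto a regular grid. The sketch below uses one Gaussian per atom on a small cubic grid. Grid size, spacing, Gaussian width, and the single-channel layout are illustrative assumptions; the abstract does not give these settings, and practical pipelines typically use one density channel per atom type.

```python
import math

def voxelize(atoms, grid_size=8, spacing=1.0, sigma=1.0):
    """Return a grid_size^3 nested list of Gaussian-smeared atomic density.
    `atoms` is a list of (x, y, z) coordinates in the same units as `spacing`."""
    grid = [[[0.0] * grid_size for _ in range(grid_size)]
            for _ in range(grid_size)]
    for ax, ay, az in atoms:
        for i in range(grid_size):
            for j in range(grid_size):
                for k in range(grid_size):
                    d2 = ((i * spacing - ax) ** 2
                          + (j * spacing - ay) ** 2
                          + (k * spacing - az) ** 2)
                    grid[i][j][k] += math.exp(-d2 / (2.0 * sigma ** 2))
    return grid

if __name__ == "__main__":
    density = voxelize([(2.0, 3.0, 4.0)])
    print(density[2][3][4])  # voxel centered on the atom
```

A grid like this can be fed to a 3D convolutional network in place of hand-engineered structural features.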


Subjects
Computational Biology , Neural Networks, Computer , Protein Folding , Proteins/chemistry , Algorithms
19.
Bioinformatics ; 34(16): 2757-2765, 2018 08 15.
Article in English | MEDLINE | ID: mdl-29554205

ABSTRACT

Motivation: The root mean square deviation (RMSD) is one of the most used similarity criteria in structural biology and bioinformatics. Standard computation of the RMSD has a linear complexity with respect to the number of atoms in a molecule, making RMSD calculations time-consuming for the large-scale modeling applications, such as assessment of molecular docking predictions or clustering of spatially proximate molecular conformations. Previously, we introduced the RigidRMSD algorithm to compute the RMSD corresponding to the rigid-body motion of a molecule. In this study, we go beyond the limits of the rigid-body approximation by taking into account conformational flexibility of the molecule. We model the flexibility with a reduced set of collective motions computed with e.g. normal modes or principal component analysis. Results: The initialization of our algorithm is linear in the number of atoms and all the subsequent evaluations of RMSD values between flexible molecular conformations depend only on the number of collective motions that are selected to model the flexibility. Therefore, our algorithm is much faster compared to the standard RMSD computation for large-scale modeling applications. We demonstrate the efficiency of our method on several clustering examples, including clustering of flexible docking results and molecular dynamics (MD) trajectories. We also demonstrate how to use the presented formalism to generate pseudo-random constant-RMSD structural molecular ensembles and how to use these in cross-docking. Availability and implementation: We provide the algorithm written in C++ as the open-source RapidRMSD library governed by the BSD-compatible license, which is available at http://team.inria.fr/nano-d/software/RapidRMSD/. The constant-RMSD structural ensemble application and clustering of MD trajectories is available at http://team.inria.fr/nano-d/software/nolb-normal-modes/. Supplementary information: Supplementary data are available at Bioinformatics online.
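
The core observation of this abstract, that RMSD between flexible conformations can depend only on the number of collective motions, can be checked in a simplified setting. The sketch assumes purely linear displacements along orthonormal modes and no rigid-body re-superposition; under those assumptions the RMSD reduces to a distance between amplitude vectors. It is not the RapidRMSD algorithm, which also handles rigid-body motion, and all function names are hypothetical.

```python
import math

def rmsd_direct(x, y):
    """Standard RMSD between two flat coordinate lists [x1, y1, z1, x2, ...].
    Cost is linear in the number of atoms."""
    n_atoms = len(x) // 3
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / n_atoms)

def displace(x0, modes, amplitudes):
    """Move a reference structure linearly along collective modes."""
    out = list(x0)
    for mode, amp in zip(modes, amplitudes):
        for i, component in enumerate(mode):
            out[i] += amp * component
    return out

def rmsd_from_amplitudes(a, b, n_atoms):
    """RMSD from mode amplitudes only; valid when the modes are orthonormal.
    Cost is linear in the number of modes, not atoms."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / n_atoms)

if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    x0 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]          # two atoms
    modes = [[s, 0.0, 0.0, -s, 0.0, 0.0],        # orthonormal toy modes
             [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]]
    a, b = [0.3, -1.2], [1.0, 0.4]
    print(rmsd_direct(displace(x0, modes, a), displace(x0, modes, b)))
    print(rmsd_from_amplitudes(a, b, n_atoms=2))
```

Both calls should agree to rounding error, while the second never touches atomic coordinates, which is what makes clustering of many conformations cheap.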


Subjects
Computational Biology/methods , Molecular Dynamics Simulation , Motion , Pliability , Proteins/chemistry , Software , Algorithms , Principal Component Analysis , Protein Conformation , Proteins/metabolism
20.
J Struct Biol ; 203(3): 185-194, 2018 09.
Article in English | MEDLINE | ID: mdl-29902523

ABSTRACT

Protein assemblies are often symmetric, as this organization has many advantages compared to individual proteins. Complex protein structures thus very often possess high-order symmetries. Detection and analysis of these symmetries has been a challenging problem and no efficient algorithms have been developed so far. This paper presents the extension of our cyclic symmetry detection method for higher-order symmetries with multiple symmetry axes. These include dihedral and cubic, i.e., tetrahedral, octahedral, and icosahedral, groups. Our method assesses the quality of a particular symmetry group and also determines all of its symmetry axes with a machine precision. The method comprises discrete and continuous optimization steps and is applicable to assemblies with multiple chains in the asymmetric subunits or to those with pseudo-symmetry. We implemented the method in C++ and exhaustively tested it on all 51,358 symmetric assemblies from the Protein Data Bank (PDB). It allowed us to study structural organization of symmetric assemblies solved by X-ray crystallography, and also to assess the symmetry annotation in the PDB. For example, in 1.6% of the cases we detected a higher symmetry group compared to the PDB annotation, and we also detected several cases with incorrect annotation. The method is available at http://team.inria.fr/nano-d/software/ananas. The graphical user interface of the method built for the SAMSON platform is available at http://samson-connect.net.


Subjects
Protein Conformation , Proteins/chemistry , Software , Algorithms , Crystallography, X-Ray , Databases, Protein