Results 1 - 20 of 210
1.
Nat Methods ; 21(1): 122-131, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38066344

ABSTRACT

Three-dimensional structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy. Although the resolution of determined cryogenic electron microscopy maps has generally improved, there are still many cases where tracing protein main chains is difficult, even in maps determined at a near-atomic resolution. Here we developed a protein structure modeling method, DeepMainmast, which employs deep learning to capture the local map features of amino acids and atoms to assist main-chain tracing. Moreover, we integrated AlphaFold2 with the de novo density tracing protocol to combine their complementary strengths and achieved even higher accuracy than each method alone. Additionally, the protocol is able to accurately assign the chain identity to the structure models of homo-multimers, which is not a trivial task for existing methods.


Subject(s)
Deep Learning , Cryoelectron Microscopy/methods , Molecular Models , Proteins/chemistry , Electron Microscopy , Protein Conformation
2.
Nat Methods ; 21(7): 1340-1348, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38918604

ABSTRACT

The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein-nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms was analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.


Subject(s)
Cryoelectron Microscopy , Molecular Models , Cryoelectron Microscopy/methods , Ligands , SARS-CoV-2 , COVID-19/virology , Escherichia coli , beta-Galactosidase/chemistry , beta-Galactosidase/metabolism , Protein Conformation , Reproducibility of Results
3.
Nat Methods ; 20(11): 1739-1747, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37783885

ABSTRACT

DNA and RNA play fundamental roles in various cellular processes, where their three-dimensional structures provide information critical to understanding the molecular mechanisms of their functions. Although an increasing number of nucleic acid structures and their complexes with proteins are determined by cryogenic electron microscopy (cryo-EM), structure modeling for DNA and RNA remains challenging particularly when the map is determined at a resolution coarser than atomic level. Moreover, computational methods for nucleic acid structure modeling are relatively scarce. Here, we present CryoREAD, a fully automated de novo DNA/RNA atomic structure modeling method using deep learning. CryoREAD identifies phosphate, sugar and base positions in a cryo-EM map using deep learning, which are traced and modeled into a three-dimensional structure. When tested on cryo-EM maps determined at 2.0 to 5.0 Å resolution, CryoREAD built substantially more accurate models than existing methods. We also applied the method to cryo-EM maps of biomolecular complexes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).


Subject(s)
Deep Learning , Nucleic Acids , Cryoelectron Microscopy/methods , Molecular Models , RNA , DNA , Protein Conformation
4.
Mol Biol Evol ; 41(3)2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38376487

ABSTRACT

The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed, making it an important case study in longevity and resistance to cancer. To further this and other blue whale-related research, we report a reference-quality, long-read-based genome assembly of this fascinating species. We assembled the genome from PacBio long reads and utilized Illumina/10×, optical maps, and Hi-C data for scaffolding, polishing, and manual curation. We also provided long read RNA-seq data to facilitate the annotation of the assembly by NCBI and Ensembl. Additionally, we annotated both haplotypes using TOGA and measured the genome size by flow cytometry. We then compared the blue whale genome with other cetaceans and artiodactyls, including vaquita (Phocoena sinus), the world's smallest cetacean, to investigate blue whale's unique biological traits. We found a dramatic amplification of several genes in the blue whale genome resulting from a recent burst in segmental duplications, though the possible connection between this amplification and giant body size requires further study. We also discovered sites in the insulin-like growth factor-1 gene correlated with body size in cetaceans. Finally, using our assembly to examine the heterozygosity and historical demography of Pacific and Atlantic blue whale populations, we found that the genomes of both populations are highly heterozygous and that their genetic isolation dates to the last interglacial period. Taken together, these results indicate how a high-quality, annotated blue whale genome will serve as an important resource for biology, evolution, and conservation research.


Subject(s)
Balaenoptera , Neoplasms , Animals , Balaenoptera/genetics , Genomic Segmental Duplications , Genome , Demography , Neoplasms/genetics
5.
Nat Methods ; 19(9): 1116-1125, 2022 09.
Article in English | MEDLINE | ID: mdl-35953671

ABSTRACT

An increasing number of protein structures are being determined by cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM density maps is improving in general, there are still many cases where amino acids of a protein are assigned with different levels of confidence. Here we developed a method that identifies potential misassignment of residues in the map, including residue shifts along an otherwise correct main-chain trace. The score, named DAQ, computes the likelihood that the local density corresponds to different amino acids, atoms, and secondary structures, estimated via deep learning, and assesses the consistency of the amino acid assignment in the protein structure model with that likelihood. When DAQ was applied to different versions of model structures in the Protein Data Bank that were derived from the same density maps, a clear improvement in the DAQ score was observed in the newer versions of the models. DAQ also found potential misassignment errors in a substantial number of deposited protein structure models built into cryo-EM maps.


Subject(s)
Amino Acids , Proteins , Cryoelectron Microscopy , Molecular Models , Protein Conformation , Protein Secondary Structure , Proteins/chemistry
6.
Nat Methods ; 18(2): 156-164, 2021 02.
Article in English | MEDLINE | ID: mdl-33542514

ABSTRACT

This paper describes outcomes of the 2019 Cryo-EM Model Challenge. The goals were to (1) assess the quality of models that can be produced from cryogenic electron microscopy (cryo-EM) maps using current modeling software, (2) evaluate reproducibility of modeling results from different software developers and users and (3) compare performance of current metrics used for model evaluation, particularly Fit-to-Map metrics, with focus on near-atomic resolution. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived by 13 participating teams from four benchmark maps, including three forming a resolution series (1.8 to 3.1 Å). The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual experiments and structure data archives such as the Protein Data Bank. We recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed cryo-EM map density.


Subject(s)
Cryoelectron Microscopy/methods , Molecular Models , X-Ray Crystallography , Protein Conformation , Proteins/chemistry
7.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37549063

ABSTRACT

MOTIVATION: The tertiary structures of an increasing number of biological macromolecules have been determined using cryo-electron microscopy (cryo-EM). However, there are still many cases where the resolution is not high enough to model the molecular structures with standard computational tools. If the resolution obtained is near the empirical borderline (3-4.5 Å), improvement in the map quality facilitates structure modeling. RESULTS: We report EM-GAN, a novel approach that modifies an input cryo-EM map to assist protein structure modeling. The method uses a 3D generative adversarial network (GAN) that has been trained on high- and low-resolution density maps to learn the density patterns, and modifies the input map to enhance its suitability for modeling. The method was tested extensively on a dataset of 65 EM maps in the resolution range of 3-6 Å and showed substantial improvements in structure modeling using popular protein structure modeling tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/kiharalab/EM-GAN, Google Colab: https://tinyurl.com/3ccxpttx.


Subject(s)
Proteins , Cryoelectron Microscopy , Molecular Models , Proteins/chemistry , Protein Conformation
8.
Plant Physiol ; 191(1): 142-160, 2023 01 02.
Article in English | MEDLINE | ID: mdl-36250895

ABSTRACT

The Plant-Conserved Region (P-CR) and the Class-Specific Region (CSR) are two plant-unique sequences in the catalytic core of cellulose synthases (CESAs) for which specific functions have not been established. Here, we used site-directed mutagenesis to replace amino acids and motifs within these sequences predicted to be essential for assembly and function of CESAs. We developed an in vivo method to determine the ability of mutated CesA1 transgenes to complement an Arabidopsis (Arabidopsis thaliana) temperature-sensitive root-swelling1 (rsw1) mutant. Replacement of a Cys residue in the CSR, which blocks dimerization in vitro, rendered the AtCesA1 transgene unable to complement the rsw1 mutation. Examination of the CSR sequences from 33 diverse angiosperm species showed domains of high-sequence conservation in a class-specific manner but with variation in the degrees of disorder, indicating a nonredundant role of the CSR structures in different CESA isoform classes. The Cys residue essential for dimerization was not always located in domains of intrinsic disorder. Expression of AtCesA1 transgene constructs, in which Pro417 and Arg453 were substituted for Ala or Lys in the coiled-coil of the P-CR, were also unable to complement the rsw1 mutation. Despite an expected role for Arg457 in trimerization of CESA proteins, AtCesA1 transgenes with Arg457Ala mutations were able to fully restore the wild-type phenotype in rsw1. Our data support that Cys662 within the CSR and Pro417 and Arg453 within the P-CR of Arabidopsis CESA1 are essential residues for functional synthase complex formation, but our data do not support a specific role for Arg457 in trimerization in native CESA complexes.


Subject(s)
Arabidopsis Proteins , Arabidopsis , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Essential Amino Acids/genetics , Essential Amino Acids/metabolism , Mutation , Cellulose/metabolism , Glucosyltransferases/metabolism
9.
Methods ; 213: 10-17, 2023 05.
Article in English | MEDLINE | ID: mdl-36924867

ABSTRACT

Protein-DNA interactions play an important role in various biological processes such as gene expression, replication, and transcription. Understanding the important features that dictate the binding affinity of protein-DNA complexes and predicting their affinities is important for elucidating their recognition mechanisms. In this work, we have collected the experimental binding free energy (ΔG) for a set of 391 protein-DNA complexes and derived several structure-based features such as interaction energy, contact potentials, volume and surface area of binding site residues, base step parameters of the DNA and contacts between different types of atoms. Our analysis of the relationship between binding affinity and structural features revealed that the important factors mainly depend on the number of DNA strands as well as functional and structural classes of proteins. Specifically, binding site properties such as number of atom contacts between the DNA and protein, volume of protein binding sites and interaction-based features such as interaction energies and contact potentials are important to understand the binding affinity. Further, we developed multiple regression equations for predicting the binding affinity of protein-DNA complexes belonging to different structural and functional classes. Our method showed an average correlation and mean absolute error of 0.78 and 0.98 kcal/mol, respectively, between the experimental and predicted binding affinities on a jack-knife test. We have developed a webserver, PDA-PreD (Protein-DNA Binding affinity predictor), for predicting the affinity of protein-DNA complexes and it is freely available at https://web.iitm.ac.in/bioinfo2/pdapred/.


Subject(s)
DNA , Proteins , Proteins/chemistry , Binding Sites , Protein Binding , DNA/metabolism
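Editorial illustration for record 9: the abstract reports a correlation and a mean absolute error from a jack-knife test of regression equations. The sketch below shows what such a leave-one-out evaluation could look like. It is a simplification under stated assumptions: it fits a single-feature least-squares line rather than the multiple regression described in the abstract, and every function name is hypothetical and not taken from PDA-PreD.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def jackknife_predictions(xs, ys):
    """Predict each point from a model trained on all the other points."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("both lists need non-zero variance")
    return cov / math.sqrt(var_a * var_b)


def mean_absolute_error(a, b):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

Comparing `jackknife_predictions(features, dG)` against the measured `dG` values with `pearson` and `mean_absolute_error` yields the two kinds of numbers quoted in the abstract. The inputs here are plain Python lists.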
10.
Proteomics ; 23(17): e2200322, 2023 09.
Article in English | MEDLINE | ID: mdl-36529945

ABSTRACT

Proteins and nucleic acids are key components in many processes in living cells, and interactions between proteins and nucleic acids are often crucial pathway components. In many cases, large flexibility of proteins as they interact with nucleic acids is key to their function. To understand the mechanisms of these processes, it is necessary to consider the 3D atomic structures of such protein-nucleic acid complexes. When such structures are not yet experimentally determined, protein docking can be used to computationally generate useful structure models. However, such docking has long had the limitation that the consideration of flexibility is usually limited to small movements or to small structures. We previously developed a method of flexible protein docking which could model ordered proteins which undergo large-scale conformational changes, which we also showed was compatible with nucleic acids. Here, we elaborate on the ability of that pipeline, Flex-LZerD, to model specifically interactions between proteins and nucleic acids, and demonstrate that Flex-LZerD can model more interactions and types of conformational change than previously shown.


Subject(s)
Nucleic Acids , Protein Conformation , Protein Binding , Nucleic Acids/metabolism , Proteins/metabolism
11.
Proteomics ; 23(17): e2200323, 2023 09.
Article in English | MEDLINE | ID: mdl-37365936

ABSTRACT

Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. A simple consensus score generated using the best performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier were created. Both approaches showed excellent performance, with an area under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.


Subject(s)
Proteins , Reproducibility of Results , Proteins/metabolism , Protein Binding
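Editorial illustration for record 11: the abstract evaluates a consensus of interface scores by the area under the ROC curve. The sketch below shows one way such an evaluation could be set up. The abstract does not state how the consensus was combined, so the min-max scaling and averaging here are assumptions, and all function names are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative.

    Labels are 1 (e.g. physiological) or 0 (non-physiological); ties count 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def min_max_scale(values):
    """Rescale values to [0, 1]; a constant list maps to 0.5 everywhere."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def consensus(score_lists):
    """Average several scoring functions after scaling each one to [0, 1]."""
    scaled = [min_max_scale(scores) for scores in score_lists]
    return [sum(column) / len(column) for column in zip(*scaled)]
```

Calling `roc_auc(consensus(per_group_scores), labels)` then gives a single AUC for the combined score, assuming every scoring function is oriented so that higher means more likely physiological.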
12.
Hum Mol Genet ; 30(3-4): 198-212, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33517444

ABSTRACT

Lowe Syndrome (LS) is a lethal genetic disorder caused by mutations in the OCRL1 gene which encodes the lipid 5' phosphatase Ocrl1. Patients exhibit a characteristic triad of symptoms including eye, brain and kidney abnormalities with renal failure as the most common cause of premature death. Over 200 OCRL1 mutations have been identified in LS, but their specific impact on cellular processes is unknown. Despite observations of heterogeneity in patient symptom severity, there is little understanding of the correlation between genotype and its impact on phenotype. Here, we show that different mutations had diverse effects on protein localization and on triggering LS cellular phenotypes. In addition, some mutations affecting specific domains imparted unique characteristics to the resulting mutated protein. We also propose that certain mutations conformationally affect the 5'-phosphatase domain of the protein, resulting in loss of enzymatic activity and causing common and specific phenotypes (a conformational disease scenario). This study is the first to show the differential effect of patient 5'-phosphatase mutations on cellular phenotypes and introduces a conformational disease component in LS. This work provides a framework that explains symptom heterogeneity and can help stratify patients as well as to produce a more accurate prognosis depending on the nature and location of the mutation within the OCRL1 gene.


Subject(s)
Molecular Models , Mutation , Oculocerebrorenal Syndrome/enzymology , Phosphoric Monoester Hydrolases/genetics , Phosphoric Monoester Hydrolases/metabolism , Cell Line , Computer Simulation , HEK293 Cells , Humans , Oculocerebrorenal Syndrome/genetics , Phenotype , Protein Conformation , Protein Transport
13.
Methods ; 204: 55-63, 2022 08.
Article in English | MEDLINE | ID: mdl-35609776

ABSTRACT

Intrinsically Disordered Proteins (IDPs) are a class of proteins in which at least some region of the protein does not possess any stable structure in solution in the physiological condition but may adopt an ordered structure upon binding to a globular receptor. These IDP-receptor complexes are thus subject to protein complex modeling in which computational techniques are applied to accurately reproduce the IDP ligand-receptor interactions. This often exists in the form of protein docking, in which the 3D structures of both the subunits are known, but the position of the ligand relative to the receptor is not. Here, we evaluate the performance of three IDP-receptor modeling tools with metrics that characterize the IDP-receptor interface at various resolutions. We show that all three methods are able to properly identify the general binding site, as identified by lower resolution metrics, but begin to struggle with higher resolution metrics that capture biophysical interactions.


Subject(s)
Intrinsically Disordered Proteins , Binding Sites , Intrinsically Disordered Proteins/chemistry , Ligands , Protein Binding , Protein Conformation , Protein Domains
14.
Nucleic Acids Res ; 49(W1): W359-W365, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33963854

ABSTRACT

Protein complexes are involved in many important processes in living cells. To understand the mechanisms of these processes, it is necessary to solve the 3D structures of the protein complexes. When protein complex structures have not yet been determined by experiment, protein-protein docking tools can be used to computationally model the structures of these complexes. Here, we present a webserver which provides access to LZerD and Multi-LZerD protein docking tools. The protocol provided by the server has performed consistently among the top in the CAPRI blind evaluation. LZerD docks pairs of structures, while Multi-LZerD can dock three or more structures simultaneously. LZerD uses a soft protein surface representation with 3D Zernike descriptors and explores the binding pose space using geometric hashing. Multi-LZerD performs multi-chain docking by combining pairwise solutions by LZerD. Both methods output full-atom docked models of the input proteins. Users can also input distance constraints between interacting or non-interacting residues as well as residues that are located at the interface or far from the interface. The webserver is equipped with a user-friendly panel that visualizes the distribution and structures of binding poses of top scoring models. The LZerD webserver is available at https://lzerd.kiharalab.org.


Subject(s)
Molecular Docking Simulation/methods , Multiprotein Complexes/chemistry , Software , CD Antigens/chemistry , Bacterial Proteins/chemistry , Cell Adhesion Molecules/chemistry , Enoyl-(Acyl-Carrier-Protein) Reductase (NADH)/chemistry , Humans , Internet
15.
BMC Biol ; 20(1): 245, 2022 11 08.
Article in English | MEDLINE | ID: mdl-36344967

ABSTRACT

BACKGROUND: The Nile rat (Arvicanthis niloticus) is an important animal model because of its robust diurnal rhythm, a cone-rich retina, and a propensity to develop diet-induced diabetes without chemical or genetic modifications. A closer similarity to humans in these aspects, compared to the widely used Mus musculus and Rattus norvegicus models, holds the promise of better translation of research findings to the clinic. RESULTS: We report a 2.5 Gb, chromosome-level reference genome assembly with fully resolved parental haplotypes, generated with the Vertebrate Genomes Project (VGP). The assembly is highly contiguous, with contig N50 of 11.1 Mb, scaffold N50 of 83 Mb, and 95.2% of the sequence assigned to chromosomes. We used a novel workflow to identify 3613 segmental duplications and quantify duplicated genes. Comparative analyses revealed unique genomic features of the Nile rat, including some that affect genes associated with type 2 diabetes and metabolic dysfunctions. We discuss 14 genes that are heterozygous in the Nile rat or highly diverged from the house mouse. CONCLUSIONS: Our findings reflect the exceptional level of genomic resolution present in this assembly, which will greatly expand the potential of the Nile rat as a model organism.


Subject(s)
Type 2 Diabetes Mellitus , Humans , Animals , Haplotypes , Type 2 Diabetes Mellitus/genetics , Murinae , Genome , Genomics
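Editorial illustration for record 15: the abstract quotes contig and scaffold N50 values as contiguity measures. The sketch below computes N50 under its usual definition, the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name is hypothetical and the code is not taken from the authors' workflow.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length  # first length at which half the assembly is covered
```

For the lengths 2 through 10 (total 54), the three longest sequences 10, 9 and 8 sum to 27, which is half the total, so the N50 is 8.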
16.
Semin Cancer Biol ; 68: 84-91, 2021 01.
Article in English | MEDLINE | ID: mdl-31698087

ABSTRACT

A pre-eminent subtype of lung carcinoma, non-small cell lung cancer (NSCLC) is among the leading causes of cancer-associated mortality worldwide. Despite advances in treatment strategies, the overall cure and survival rates for NSCLC remain poor, particularly in metastatic disease. Moreover, the emergence of resistance to classic anticancer drugs further worsens the situation. These demanding circumstances underline the need for extended and revamped research to establish the next generation of cancer therapeutics. Drug repositioning offers an affordable and efficient strategy to discover novel drug action, especially when integrated with recent systems biology-driven approaches. This review illustrates the trendsetting approaches in repurposing along with their numerous success stories, with an emphasis on NSCLC therapeutics. Indeed, these novel hits, in combination with conventional anticancer agents, will ideally make their way to the clinic and strengthen the therapeutic arsenal to combat drug resistance in the near future.


Subject(s)
Antineoplastic Agents/therapeutic use , Non-Small-Cell Lung Carcinoma/drug therapy , Drug Discovery , Drug Repositioning/methods , Lung Neoplasms/drug therapy , Polypharmacology/methods , Animals , Humans
17.
Proteins ; 90(1): 83-95, 2022 01.
Article in English | MEDLINE | ID: mdl-34309909

ABSTRACT

Protein structure docking is the process in which the quaternary structure of a protein complex is predicted from individual tertiary structures of the protein subunits. Protein docking is typically performed in two main steps. The subunits are first docked while keeping them rigid to form the complex, which is then followed by structure refinement. Structure refinement is crucial for practical use of computational protein docking models, as it aims to correct the conformations of interacting residues and atoms at the interface. Here, we benchmarked the performance of eight existing protein structure refinement methods in refinement of protein complex models. We show that the fraction of native contacts between subunits is by far the most straightforward metric to improve. However, backbone-dependent metrics based on the root mean square deviation (RMSD) proved more difficult to improve via refinement.


Subject(s)
Computational Biology/methods , Molecular Docking Simulation/methods , Protein Conformation , Proteins/chemistry , Algorithms , Benchmarking , Protein Databases , Proteins/genetics , Proteins/metabolism
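Editorial illustration for record 17: the abstract contrasts the fraction of native contacts with RMSD-based metrics. The sketch below shows how the two quantities are commonly computed. It assumes that inter-subunit contacts have already been extracted as sets of residue-pair labels and that the two coordinate lists are already superimposed and in matching atom order. The function names and the residue labels are hypothetical.

```python
import math


def fraction_native_contacts(native_contacts, model_contacts):
    """Share of native inter-subunit contacts that the model reproduces."""
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set must be non-empty")
    return len(native & set(model_contacts)) / len(native)


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched (x, y, z) lists.

    No superposition is performed here; the inputs are assumed aligned.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

The contact fraction depends only on which residue pairs touch, whereas RMSD depends on every matched atom position, so the two metrics need not move together when a model is refined.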
18.
J Comput Chem ; 43(17): 1140-1150, 2022 06 30.
Article in English | MEDLINE | ID: mdl-35475517

ABSTRACT

The native structures of proteins, except for notable exceptions of intrinsically disordered proteins, in general take their most stable conformation in the physiological condition to maintain their structural framework so that their biological function can be properly carried out. Experimentally, the stability of a protein can be measured by several means, among which the pulling experiment using the atomic force microscope (AFM) stands as a unique method. AFM directly measures the resistance from unfolding, which can be quantified from the observed force-extension profile. It has been shown that key features observed in an AFM pulling experiment can be well reproduced by computational molecular dynamics simulations. Here, we applied computational pulling for estimating the accuracy of computational protein structure models under the hypothesis that the structural stability would positively correlate with the accuracy, i.e. the closeness to the native, of a model. We used in total 4929 structure models for 24 target proteins from the Critical Assessment of Techniques of Structure Prediction (CASP) and investigated if the magnitude of the break force, that is, the force required to rearrange the model's structure, from the force profile was sufficient information for selecting near-native models. We found that near-native models can be successfully selected by examining their break forces, suggesting that high break force indeed indicates high stability of models. On the other hand, there were also near-native models that had relatively low peak forces. The mechanisms of the stability exhibited by the break forces were explored and discussed.


Subject(s)
Molecular Dynamics Simulation , Proteins , Protein Conformation , Proteins/chemistry , Software
19.
Nat Methods ; 16(9): 911-917, 2019 09.
Article in English | MEDLINE | ID: mdl-31358979

ABSTRACT

Although structures determined at near-atomic resolution are now routinely reported by cryo-electron microscopy (cryo-EM), many density maps are determined at an intermediate resolution, and extracting structure information from these maps is still a challenge. We report a computational method, Emap2sec, that identifies the secondary structures of proteins (α-helices, β-sheets and other structures) in EM maps at resolutions of between 5 and 10 Å. Emap2sec uses a three-dimensional deep convolutional neural network to assign secondary structure to each grid point in an EM map. We tested Emap2sec on EM maps simulated from 34 structures at resolutions of 6.0 and 10.0 Å, as well as on 43 maps determined experimentally at resolutions of between 5.0 and 9.5 Å. Emap2sec was able to clearly identify the secondary structures in many maps tested, and showed substantially better performance than existing methods.


Subject(s)
Cryoelectron Microscopy/methods , Deep Learning , Computer Neural Networks , Protein Secondary Structure , Proteins/chemistry , Software , Humans , Molecular Models
20.
Bioinformatics ; 37(19): 3168-3174, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33787852

ABSTRACT

MOTIVATION: Protein structure prediction remains one of the most important problems in computational biology and biophysics. In the past few years, protein residue-residue contact prediction has undergone substantial improvement, which has made it a critical driving force for successful protein structure prediction. Boosting the accuracy of contact predictions has, therefore, become the forefront of protein structure prediction. RESULTS: We show a novel contact map refinement method, ContactGAN, which uses Generative Adversarial Networks (GAN). ContactGAN was able to make a significant improvement over predictions made by recent contact prediction methods when tested on three datasets including protein structure modeling targets in CASP13 and CASP14. We show improvement of precision in contact prediction, which translated into improvement in the accuracy of protein tertiary structure models. On the other hand, the observed improvement over trRosetta was relatively small, reasons for which are discussed. ContactGAN will be a valuable addition to the structure prediction pipeline to achieve an extra gain in contact prediction accuracy. AVAILABILITY AND IMPLEMENTATION: https://github.com/kiharalab/ContactGAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
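Editorial illustration for record 20: the abstract reports gains in the precision of contact prediction. The sketch below shows how precision over the top-ranked predicted contacts is often computed. The data layout (a dict from residue-index pairs to predicted probabilities) and the function name are assumptions made for illustration and do not reflect the ContactGAN code.

```python
def top_k_precision(predicted, native_contacts, k):
    """Fraction of the k highest-scoring predicted pairs that are true contacts.

    predicted: dict mapping (i, j) residue-index pairs to a probability.
    native_contacts: set of (i, j) pairs observed in the native structure.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    ranked = sorted(predicted, key=predicted.get, reverse=True)[:k]
    if not ranked:
        raise ValueError("no predictions supplied")
    hits = sum(1 for pair in ranked if pair in native_contacts)
    return hits / len(ranked)
```

A common choice is to set k relative to the sequence length L (for example L, L/2 or L/5), though the cutoffs used in the paper are not stated in this abstract.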
