Search | BVS Enfermería Search Portal

1.

Accurate flexible refinement for atomic-level protein structure using cryo-EM density maps and deep learning.

Zhang, Biao; Liu, Dong; Zhang, Yang; Shen, Hong-Bin; Zhang, Gui-Jun.

Brief Bioinform ; 23(2)2022 03 10.

Article in English | MEDLINE | ID: mdl-35152277

ABSTRACT

With the rapid progress of deep learning in cryo-electron microscopy and protein structure prediction, improving the accuracy of the protein structure model by using a density map and predicted contact/distance map through deep learning has become an urgent need for robust methods. Thus, designing an effective protein structure optimization strategy based on the density map and predicted contact/distance map is critical to improving the accuracy of structure refinement. In this article, a protein structure optimization method based on the density map and predicted contact/distance map by deep-learning technology was proposed in accordance with the result of matching between the density map and the initial model. Physics- and knowledge-based energy functions, integrated with Cryo-EM density map data and deep-learning data, were used to optimize the protein structure in the simulation. The dynamic confidence score was introduced to the iterative process for choosing whether it is a density map or a contact/distance map to dominate the movement in the simulation to improve the accuracy of refinement. The protocol was tested on a large set of 224 non-homologous membrane proteins and generated 214 structural models with correct folds, where 4.5% of structural models were generated from structural models with incorrect folds. Compared with other state-of-the-art methods, the major advantage of the proposed methods lies in the skills for using density map and contact/distance map in the simulation, as well as the new energy function in the re-assembly simulations. Overall, the results demonstrated that this strategy is a valuable approach and ready to use for atomic-level structure refinement using cryo-EM density map and predicted contact/distance map.

Subject(s)

Deep Learning , Cryoelectron Microscopy/methods , Membrane Proteins , Molecular Models , Protein Conformation

2.

DeepUMQA3: a web server for accurate assessment of interface residue accuracy in protein complexes.

Liu, Jun; Liu, Dong; Zhang, Gui-Jun.

Bioinformatics ; 39(10)2023 Oct 03.

Article in English | MEDLINE | ID: mdl-37740296

ABSTRACT

MOTIVATION: Model quality assessment is a crucial part of protein structure prediction and a gateway to proper usage of models in biomedical applications. Many methods have been proposed for assessing the quality of structural models of protein monomers, but few methods for evaluating protein complex models. As protein complex structure prediction becomes a new challenge, there is an urgent need for model quality assessment methods that can accurately assess the accuracy of interface residues of complex structures. RESULTS: Here, we present DeepUMQA3, a web server for evaluating the accuracy of interface residues of protein complex structures using deep neural networks. For an input complex structure, features are extracted from three levels of overall complex, intra-monomer, and inter-monomer, and an improved deep residual neural network is used to predict per-residue lDDT and interface residue accuracy. DeepUMQA3 ranks first in the blind test of interface residue accuracy estimation in CASP15, with Pearson, Spearman, and AUC of 0.564, 0.535, and 0.755 under the lDDT measurement, which are 17.6%, 23.6%, and 10.9% higher than the second-best method, respectively. DeepUMQA3 can also assess the accuracy of all residues in the entire complex and distinguish high- and low-precision residues. AVAILABILITY AND IMPLEMENTATION: The web server of DeepUMQA3 is freely available at http://zhanglab-bioinf.com/DeepUMQA_server/.

3.

Recent Advances and Challenges in Protein Structure Prediction.

Peng, Chun-Xiang; Liang, Fang; Xia, Yu-Hao; Zhao, Kai-Long; Hou, Ming-Hua; Zhang, Gui-Jun.

J Chem Inf Model ; 64(1): 76-95, 2024 Jan 08.

Article in English | MEDLINE | ID: mdl-38109487

ABSTRACT

Artificial intelligence has made significant advances in the field of protein structure prediction in recent years. In particular, DeepMind's end-to-end model, AlphaFold2, has demonstrated the capability to predict three-dimensional structures of numerous unknown proteins with accuracy levels comparable to those of experimental methods. This breakthrough has opened up new possibilities for understanding protein structure and function as well as accelerating drug discovery and other applications in the field of biology and medicine. Despite the remarkable achievements of artificial intelligence in the field, there are still some challenges and limitations. In this Review, we discuss the recent progress and some of the challenges in protein structure prediction. These challenges include predicting multidomain protein structures, protein complex structures, multiple conformational states of proteins, and protein folding pathways. Furthermore, we highlight directions in which further improvements can be conducted.

Subject(s)

Artificial Intelligence , Drug Discovery , Protein Folding , Research Design

4.

DeepUMQA: ultrafast shape recognition-based protein model quality assessment using deep learning.

Guo, Sai-Sai; Liu, Jun; Zhou, Xiao-Gen; Zhang, Gui-Jun.

Bioinformatics ; 38(7): 1895-1903, 2022 03 28.

Article in English | MEDLINE | ID: mdl-35134108

ABSTRACT

MOTIVATION: Protein model quality assessment is a key component of protein structure prediction. In recent research, the voxelization feature was used to characterize the local structural information of residues, but it may be insufficient for describing residue-level topological information. Design features that can further reflect residue-level topology when combined with deep learning methods are therefore crucial to improve the performance of model quality assessment. RESULTS: We developed a deep-learning method, DeepUMQA, based on Ultrafast Shape Recognition (USR) for the residue-level single-model quality assessment. In the framework of the deep residual neural network, the residue-level USR feature was introduced to describe the topological relationship between the residue and overall structure by calculating the first moment of a set of residue distance sets and then combined with 1D, 2D and voxelization features to assess the quality of the model. Experimental results on the CASP13, CASP14 test datasets and CAMEO blind test show that USR could supplement the voxelization features to comprehensively characterize residue structure information and significantly improve model assessment accuracy. The performance of DeepUMQA ranks among the top during the state-of-the-art single-model quality assessment methods, including ProQ2, ProQ3, ProQ3D, Ornate, VoroMQA, ProteinGCN, ResNetQA, QDeep, GraphQA, ModFOLD6, ModFOLD7, ModFOLD8, QMEAN3, QMEANDisCo3 and DeepAccNet. AVAILABILITY AND IMPLEMENTATION: The DeepUMQA server is freely available at http://zhanglab-bioinf.com/DeepUMQA/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Deep Learning , Proteins/chemistry , Computer Neural Networks , Computational Biology/methods
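
The DeepUMQA abstract describes a residue-level feature built from the first moment of residue distance sets. As a rough illustration only, the sketch below computes the first moment (mean) of a single distance set per residue: the distances from that residue to every other residue. The function name, the toy coordinates, and the choice of one all-to-all distance set are assumptions; the abstract does not specify which reference points or distance sets the method actually uses.

```python
import math


def first_moment_feature(coords):
    """For each point, return the mean Euclidean distance to all other points.

    This is the first moment of one distance set per residue. It is a
    simplified stand-in, not the feature definition from the paper.
    """
    features = []
    for i, p in enumerate(coords):
        dists = [math.dist(p, q) for j, q in enumerate(coords) if j != i]
        features.append(sum(dists) / len(dists) if dists else 0.0)
    return features


# Hypothetical C-alpha coordinates (in angstroms), chosen for easy checking.
ca_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
features = first_moment_feature(ca_coords)
```

A residue near the core of a structure tends to have a smaller mean distance than one at the periphery, which is the kind of residue-to-whole-structure relationship the abstract refers to.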

5.

ATPdock: a template-based method for ATP-specific protein-ligand docking.

Rao, Liang; Jia, Ning-Xin; Hu, Jun; Yu, Dong-Jun; Zhang, Gui-Jun.

Bioinformatics ; 38(2): 556-558, 2022 01 03.

Article in English | MEDLINE | ID: mdl-34546290

ABSTRACT

MOTIVATION: Accurately identifying protein-ATP binding poses is significantly valuable for both basic structure biology and drug discovery. Although many docking methods have been designed, most of them require a user-defined binding site and are difficult to achieve a high-quality protein-ATP docking result. It is critical to develop a protein-ATP-specific blind docking method without user-defined binding sites. RESULTS: Here, we present ATPdock, a template-based method for docking ATP into protein. For each query protein, if no pocket site is given, ATPdock first identifies its most potential pocket using ATPbind, an ATP-binding site predictor; then, the template pocket, which is most similar to the given or identified pocket, is searched from the database of pocket-ligand structures using APoc, a pocket structural alignment tool; thirdly, the rough docking pose of ATP (rdATP) is generated using LS-align, a ligand structural alignment tool, to align the initial ATP pose to the template ligand corresponding to template pocket; finally, the Metropolis Monte Carlo simulation is used to fine-tune the rdATP under the guidance of AutoDock Vina energy function. Benchmark tests show that ATPdock significantly outperforms other state-of-the-art methods in docking accuracy. AVAILABILITY AND IMPLEMENTATION: https://jun-csbio.github.io/atpdock/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Adenosine Triphosphate , Proteins , Ligands , Proteins/chemistry , Binding Sites , Protein Binding , Adenosine Triphosphate/metabolism , Molecular Docking Simulation
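
The final step of the ATPdock pipeline is a Metropolis Monte Carlo refinement guided by an energy function. The sketch below shows the generic Metropolis acceptance rule on a toy one-dimensional problem. The quadratic `toy_energy`, the step size, and the temperature are hypothetical stand-ins; in ATPdock the energy is the AutoDock Vina function and the moves act on the ATP pose, neither of which is reproduced here.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Always accept downhill moves; accept uphill moves with
    probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def refine(x0, energy, step, temperature, n_steps, seed=0):
    """Random-walk Metropolis search that tracks the best state seen."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        candidate = x + rng.uniform(-step, step)
        candidate_e = energy(candidate)
        if metropolis_accept(candidate_e - e, temperature, rng):
            x, e = candidate, candidate_e
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def toy_energy(x):
    """Hypothetical 1D energy with its minimum at x = 2 (not a docking score)."""
    return (x - 2.0) ** 2


best_x, best_e = refine(0.0, toy_energy, step=0.5, temperature=0.1, n_steps=2000)
```

Accepting occasional uphill moves is what lets the search leave a poor starting pose instead of settling in the nearest local minimum.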

6.

Structural analogue-based protein structure domain assembly assisted by deep learning.

Peng, Chun-Xiang; Zhou, Xiao-Gen; Xia, Yu-Hao; Liu, Jun; Hou, Ming-Hua; Zhang, Gui-Jun.

Bioinformatics ; 38(19): 4513-4521, 2022 09 30.

Article in English | MEDLINE | ID: mdl-35962986

ABSTRACT

MOTIVATION: With the breakthrough of AlphaFold2, the protein structure prediction problem has made remarkable progress through deep learning end-to-end techniques, in which correct folds could be built for nearly all single-domain proteins. However, the full-chain modelling appears to be lower on average accuracy than that for the constituent domains and requires higher demand on computing hardware, indicating the performance of full-chain modelling still needs to be improved. In this study, we investigate whether the predicted accuracy of the full-chain model can be further improved by domain assembly assisted by deep learning. RESULTS: In this article, we developed a structural analogue-based protein structure domain assembly method assisted by deep learning, named SADA. In SADA, a multi-domain protein structure database was constructed for the full-chain analogue detection using individual domain models. Starting from the initial model constructed from the analogue, the domain assembly simulation was performed to generate the full-chain model through a two-stage differential evolution algorithm guided by the energy function with an inter-residue distance potential predicted by deep learning. SADA was compared with the state-of-the-art domain assembly methods on 356 benchmark proteins, and the average TM-score of SADA models is 8.1% and 27.0% higher than that of DEMO and AIDA, respectively. We also assembled 293 human multi-domain proteins, where the average TM-score of the full-chain model after the assembly by SADA is 1.1% higher than that of the model by AlphaFold2. To conclude, we find that the domains often interact in the similar way in the quaternary orientations if the domains have similar tertiary structures. Furthermore, homologous templates and structural analogues are complementary for multi-domain protein full-chain modelling. AVAILABILITY AND IMPLEMENTATION: http://zhanglab-bioinf.com/SADA. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Deep Learning , Humans , Software , Proteins/chemistry , Protein Databases , Protein Domains

7.

Improving protein-protein interaction site prediction using deep residual neural network.

Hu, Jun; Dong, Ming; Tang, Yu-Xuan; Zhang, Gui-Jun.

Anal Biochem ; 670: 115132, 2023 06 01.

Article in English | MEDLINE | ID: mdl-36997014

ABSTRACT

Accurate identification of protein-protein interaction (PPI) sites is significantly important for understanding the mechanism of life and developing new drugs. However, it is expensive and time-consuming to identify PPI sites using wet-lab experiments. Developing computational methods is a new road to identify PPI sites, which can accelerate the procedure of PPI-related research. In this study, we propose a novel deep learning-based method (called D-PPIsite) to improve the accuracy of sequence-based PPI site prediction. In D-PPIsite, four discriminative sequence-driven features, i.e., position specific scoring matrix, relative solvent accessibility, position information and physical properties, are employed to feed into a well-designed deep learning module, consisting of convolutional, squeeze and excitation, and fully connected layers, to learn a prediction model. To reduce the risk of a single prediction model getting stuck in local optima, multiple prediction models with different initialization parameters are selected and integrated into one final model using the mean ensemble strategy. Experimental results on five independent testing data sets demonstrate that the proposed D-PPIsite can achieve an average accuracy of 80.2% and precision of 36.9%, covering 53.5% of all PPI sites while achieving the average Matthews correlation coefficient value (0.330) that is significantly higher than most of existing state-of-the-art prediction methods. We implement a new standalone-version predictor for predicting PPI sites, which is freely available at https://github.com/MingDongup/D-PPIsite for academic use.

Subject(s)

Computer Neural Networks , Proteins
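
The D-PPIsite abstract reports accuracy, precision, coverage of true sites (recall), and the Matthews correlation coefficient (MCC). The sketch below shows how these four quantities follow from confusion-matrix counts using their standard definitions. The counts are hypothetical and chosen for easy arithmetic; they are not taken from the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }


# Hypothetical counts for an imbalanced data set (10% positives).
metrics = classification_metrics(tp=50, fp=50, tn=850, fn=50)
```

With these counts the accuracy is 0.9 while precision and recall are both 0.5, which shows why an imbalance-aware measure such as MCC is reported alongside accuracy when interaction sites are a small minority of residues.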

8.

E2EDA: Protein Domain Assembly Based on End-to-End Deep Learning.

Zhu, Hai-Tao; Xia, Yu-Hao; Zhang, Gui-Jun.

J Chem Inf Model ; 63(20): 6451-6461, 2023 10 23.

Article in English | MEDLINE | ID: mdl-37788318

ABSTRACT

With the development of deep learning, almost all single-domain proteins can be predicted at experimental resolution. However, the structure prediction of multi-domain proteins remains a challenge. Achieving end-to-end protein domain assembly and further improving the accuracy of the full-chain modeling by accurately predicting inter-domain orientation while improving the assembly efficiency will provide significant insights into structure-based drug discovery. In this work, we propose an End-to-End Domain Assembly method based on deep learning, named E2EDA. We first develop RMNet, an EfficientNetV2-based deep learning model that fuses multiple features using an attention mechanism to predict inter-domain rigid motion. Then, the predicted rigid motions are transformed into inter-domain spatial transformations to directly assemble the full-chain model. Finally, the scoring strategy RMscore is designed to select the best model from multiple assembled models. The experimental results show that the average TM-score of the model assembled by E2EDA on the benchmark set (282) is 0.827, which is better than those of other domain assembly methods SADA (0.792) and DEMO (0.730). Meanwhile, on our constructed multi-domain data set from AlphaFold DB, the model reassembled by E2EDA is 7.0% higher in TM-score compared to the full-chain model predicted by AlphaFold2, indicating that E2EDA can capture more accurate inter-domain orientations to improve the quality of the model predicted by AlphaFold2. Furthermore, compared to SADA and AlphaFold2, E2EDA reduced the average runtime on the benchmark by 64.7% and 19.2%, respectively, indicating that E2EDA can significantly improve assembly efficiency through an end-to-end approach. The online server is available at http://zhanglab-bioinf.com/E2EDA.

Subject(s)

Deep Learning , Protein Domains , Proteins/chemistry
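
The E2EDA abstract states that predicted rigid motions are turned into inter-domain spatial transformations that place one domain relative to another. The sketch below shows what applying such a transformation amounts to: a rotation followed by a translation of every coordinate. The rotation about the z axis, the translation vector, and the two-point "domain" are hypothetical; how RMNet parameterizes or predicts the motion is not covered here.

```python
import math


def rotation_z(theta):
    """3x3 rotation matrix for a counterclockwise rotation by theta about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_rigid(coords, rot, trans):
    """Apply x' = R x + t to every (x, y, z) coordinate."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rot[r][0] * x + rot[r][1] * y + rot[r][2] * z + trans[r]
            for r in range(3)
        ))
    return moved


# Hypothetical second domain, rotated by 90 degrees and shifted 10 units along x.
domain2 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = apply_rigid(domain2, rotation_z(math.pi / 2), (10.0, 0.0, 0.0))
```

Because the transformation is rigid, distances within the moved domain are unchanged; only its position and orientation relative to the other domain differ.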

9.

Improving DNA 6mA Site Prediction via Integrating Bidirectional Long Short-Term Memory, Convolutional Neural Network, and Self-Attention Mechanism.

Hu, Jun; Tang, Yu-Xuan; Zhou, Yu; Li, Zhe; Rao, Bing; Zhang, Gui-Jun.

J Chem Inf Model ; 63(17): 5689-5700, 2023 09 11.

Article in English | MEDLINE | ID: mdl-37603823

ABSTRACT

Identifying DNA N6-methyladenine (6mA) sites is significantly important to understanding the function of DNA. Many deep learning-based methods have been developed to improve the performance of 6mA site prediction. In this study, to further improve the performance of 6mA site prediction, we propose a new meta method, called Co6mA, to integrate bidirectional long short-term memory (BiLSTM), convolutional neural networks (CNNs), and self-attention mechanisms (SAM) via assembling two different deep learning-based models. The first model developed in this study is called CBi6mA, which is composed of CNN, BiLSTM, and fully connected modules. The second model is borrowed from LA6mA, which is an existing 6mA prediction method based on BiLSTM and SAM modules. Experimental results on two independent testing sets of different model organisms, i.e., Arabidopsis thaliana and Drosophila melanogaster, demonstrate that Co6mA can achieve an average accuracy of 91.8%, covering 89% of all 6mA samples while achieving an average Matthews correlation coefficient value (0.839), which is higher than the second-best method DeepM6A.

Subject(s)

Arabidopsis , Drosophila melanogaster , Animals , Short-Term Memory , DNA , Computer Neural Networks

10.

Improving DNA-Binding Protein Prediction Using Three-Part Sequence-Order Feature Extraction and a Deep Neural Network Algorithm.

Hu, Jun; Zeng, Wen-Wu; Jia, Ning-Xin; Arif, Muhammad; Yu, Dong-Jun; Zhang, Gui-Jun.

J Chem Inf Model ; 63(3): 1044-1057, 2023 02 13.

Article in English | MEDLINE | ID: mdl-36719781

ABSTRACT

Identification of the DNA-binding protein (DBP) helps dig out information embedded in the DNA-protein interaction, which is significant to understanding the mechanisms of DNA replication, transcription, and repair. Although existing computational methods for predicting the DBPs based on protein sequences have obtained great success, there is still room for improvement since the sequence-order information is not fully mined in these methods. In this study, a new three-part sequence-order feature extraction (called TPSO) strategy is developed to extract more discriminative information from protein sequences for predicting the DBPs. For each query protein, TPSO first divides its primary sequence features into N- and C-terminal fragments and then extracts the numerical pseudo features of three parts including the full sequence and these two fragments, respectively. Based on TPSO, a novel deep learning-based method, called TPSO-DBP, is proposed, which employs the sequence-based single-view features, the bidirectional long short-term memory (BiLSTM) and fully connected (FC) neural networks to learn the DBP prediction model. Empirical outcomes reveal that TPSO-DBP can achieve an accuracy of 87.01%, covering 85.30% of all DBPs, while achieving a Matthew's correlation coefficient value (0.741) that is significantly higher than most existing state-of-the-art DBP prediction methods. Detailed data analyses have indicated that the advantages of TPSO-DBP lie in the utilization of TPSO, which helps extract more concealed prominent patterns, and the deep neural network framework composed of BiLSTM and FC that learns the nonlinear relationships between input features and DBPs. The standalone package and web server of TPSO-DBP are freely available at https://jun-csbio.github.io/TPSO-DBP/.

Subject(s)

DNA-Binding Proteins , Computer Neural Networks , DNA-Binding Proteins/metabolism , Algorithms , Amino Acid Sequence
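
The TPSO strategy described above extracts features from three parts of a protein: the full sequence, an N-terminal fragment, and a C-terminal fragment. The sketch below illustrates that three-part layout only. Splitting at the midpoint and using amino-acid composition as the per-part feature are assumptions made for the example; the abstract does not state where the split falls, and the paper's pseudo features are not reproduced here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq."""
    counts = Counter(seq)
    length = len(seq) or 1  # avoid division by zero for an empty fragment
    return [counts[aa] / length for aa in AMINO_ACIDS]


def three_part_features(seq):
    """Concatenate features of the full sequence, N-terminal half and C-terminal half."""
    mid = len(seq) // 2  # hypothetical split point
    parts = [seq, seq[:mid], seq[mid:]]
    features = []
    for part in parts:
        features.extend(composition(part))
    return features


# Hypothetical 10-residue sequence.
vec = three_part_features("MKVLAAGIVK")
```

The resulting vector has 60 entries, 20 per part, so a downstream model can see both global composition and how it differs between the two termini.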

11.

Surgical management and outcome of primary intracranial Rosai-Dorfman disease: a single-institute experience and pooled analysis of individual patient data.

Zhang, Gui-Jun; Ma, Xiu-Jian; Zhang, Ya-Ping; Hao, Li-Fang; Wang, Liang; Zhang, Jun-Ting; Wu, Zhen; Li, Da.

Neurosurg Rev ; 46(1): 76, 2023 Mar 27.

Article in English | MEDLINE | ID: mdl-36967440

ABSTRACT

Primary intracranial Rosai-Dorfman disease (PIRDD) is considered a nonmalignant nonneoplastic entity, and the outcome is unclear due to its rarity. The study aimed to elaborate the clinic-radiological features, treatment strategies, and progression-free survival (PFS) in patients with PIRDD. Patients with pathologically confirmed PIRDD in our institute were reviewed. Literature of PIRDD, updated until December 2019, was systematically searched in 7 databases (Embase, PubMed, Cochrane database, Web of Science, Wanfang Data Knowledge Service Platform, the VIP Chinese Science and Technology Periodical Database (VIP), and the China National Knowledge Infrastructure (CNKI)). These prior publication data were processed and used according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Clinical-radiological characteristics and adverse factors for PFS were evaluated in the pooled cohort. The pooled cohort of 124 cases (81 male and 43 female), with a mean age of 39.7 years, included 11 cases from our cohort and 113 cases from 80 prior studies. Twenty-nine patients (23.4%) had multiple lesions. Seventy-four patients (59.7%) experienced gross total resection (GTR), 50 patients (40.3%) had non-GTR, 15 patients (12.1%) received postoperative adjuvant radiation, and 23 patients (18.5%) received postoperative steroids. A multivariate Cox regression revealed that GTR (HR = 4.52; 95% CI 1.21-16.86; p = 0.025) significantly improved PFS, and multiple lesions (p = 0.060) tended to increase the hazard of recurrence. Neither radiation (p = 0.258) nor steroids (p = 0.386) were associated with PFS. The overall PFS at 3, 5, and 10 years in the pooled cohort was 88.4%, 79.4%, and 70.6%, respectively. The PFS at 5 and 10 years in patients with GTR was 85.4% and 85.4%, respectively, which was 71.5% and 35.8%, respectively, in patients without GTR. Gross total resection significantly improved PFS and was recommended for PIRDD. 
Radiation and steroids were sometimes empirically administered for residual, multiple, or recurrent PIRDD, but the effectiveness remained arguable and required further investigation. Systematic review registration number: CRD42020151294.

Subject(s)

Sinus Histiocytosis , Humans , Male , Female , Adult , Sinus Histiocytosis/surgery , Progression-Free Survival , Adjuvant Radiotherapy , Combined Modality Therapy , Neurosurgical Procedures , Retrospective Studies

12.

A sequential niche multimodal conformational sampling algorithm for protein structure prediction.

Xia, Yu-Hao; Peng, Chun-Xiang; Zhou, Xiao-Gen; Zhang, Gui-Jun.

Bioinformatics ; 37(23): 4357-4365, 2021 12 07.

Article in English | MEDLINE | ID: mdl-34245242

ABSTRACT

MOTIVATION: Massive local minima on the protein energy landscape often cause traditional conformational sampling algorithms to be easily trapped in local basin regions, because they find it difficult to overcome high-energy barriers. Also, the lowest energy conformation may not correspond to the native structure due to the inaccuracy of energy models. This study investigates whether these two problems can be alleviated by a sequential niche technique without loss of accuracy. RESULTS: A sequential niche multimodal conformational sampling algorithm for protein structure prediction (SNfold) is proposed in this study. In SNfold, a derating function is designed based on the knowledge learned from the previous sampling and used to construct a series of sampling-guided energy functions. These functions then help the sampling algorithm overcome high-energy barriers and avoid the re-sampling of the explored regions. In inaccurate protein energy models, the high-energy conformation that may correspond to the native structure can be sampled with successively updated sampling-guided energy functions. The proposed SNfold is tested on 300 benchmark proteins, 24 CASP13 and 19 CASP14 FM targets. Results show that SNfold correctly folds (TM-score ≥ 0.5) 231 out of 300 proteins. In particular, compared with Rosetta restrained by distance (Rosetta-dist), SNfold achieves higher average TM-score and improves the sampling efficiency by more than 100 times. On several CASP FM targets, SNfold also shows good performance compared with four state-of-the-art servers in CASP. As a plug-in conformational sampling algorithm, SNfold can be extended to other protein structure prediction methods. AVAILABILITY AND IMPLEMENTATION: The source code and executable versions are freely available at https://github.com/iobio-zjut/SNfold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms , Proteins , Protein Conformation , Proteins/chemistry , Software , Benchmarking
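
The SNfold abstract describes sampling-guided energy functions that discourage re-sampling of regions already explored. The sketch below illustrates the general sequential niche idea with a simple additive penalty around previously found minima. This penalty is a stand-in, not the derating function from the paper, and the double-well energy, `radius`, and `penalty` values are hypothetical.

```python
def double_well(x):
    """Hypothetical 1D energy with two equal minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def guided_energy(x, energy, found_minima, radius=1.0, penalty=5.0):
    """Raise the energy near each previously found minimum.

    The penalty is largest at the minimum itself and falls linearly to
    zero at distance `radius`, so unexplored regions are left unchanged.
    """
    e = energy(x)
    for m in found_minima:
        d = abs(x - m)
        if d < radius:
            e += penalty * (1.0 - d / radius)
    return e


# After a first sampling round that found the basin at x = -1:
found = [-1.0]
e_explored = guided_energy(-1.0, double_well, found)
e_unexplored = guided_energy(1.0, double_well, found)
```

With the explored basin penalized, a second sampling round run on the guided energy is steered toward the other basin rather than rediscovering the first one.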

13.

A de novo protein structure prediction by iterative partition sampling, topology adjustment and residue-level distance deviation optimization.

Liu, Jun; Zhao, Kai-Long; He, Guang-Xing; Wang, Liu-Jing; Zhou, Xiao-Gen; Zhang, Gui-Jun.

Bioinformatics ; 38(1): 99-107, 2021 12 22.

Article in English | MEDLINE | ID: mdl-34459867

ABSTRACT

MOTIVATION: With the great progress of deep learning-based inter-residue contact/distance prediction, the discrete space formed by fragment assembly cannot satisfy the distance constraint well. Thus, the optimal solution of the continuous space may not be achieved. Designing an effective closed-loop continuous dihedral angle optimization strategy that complements the discrete fragment assembly is crucial to improve the performance of the distance-assisted fragment assembly method. RESULTS: In this article, we proposed a de novo protein structure prediction method called IPTDFold based on closed-loop iterative partition sampling, topology adjustment and residue-level distance deviation optimization. First, local dihedral angle crossover and mutation operators are designed to explore the conformational space extensively and achieve information exchange between the conformations in the population. Then, the dihedral angle rotation model of loop region with partial inter-residue distance constraints is constructed, and the rotation angle satisfying the constraints is obtained by differential evolution algorithm, so as to adjust the spatial position relationship between the secondary structures. Finally, the residue distance deviation is evaluated according to the difference between the conformation and the predicted distance, and the dihedral angle of the residue is optimized with biased probability. The final model is generated by iterating the above three steps. IPTDFold is tested on 462 benchmark proteins, 24 FM targets of CASP13 and 20 FM targets of CASP14. Results show that IPTDFold is significantly superior to the distance-assisted fragment assembly method Rosetta_D (Rosetta with distance). In particular, the prediction accuracy of IPTDFold does not decrease as the length of the protein increases. 
When using the same FastRelax protocol, the prediction accuracy of IPTDFold is significantly superior to that of trRosetta without orientation constraints, and is equivalent to that of the full version of trRosetta. AVAILABILITY AND IMPLEMENTATION: The source code and executable are freely available at https://github.com/iobio-zjut/IPTDFold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology , Proteins , Computational Biology/methods , Proteins/chemistry , Software , Algorithms , Secondary Protein Structure , Protein Conformation

14.

MMpred: a distance-assisted multimodal conformation sampling for de novo protein structure prediction.

Zhao, Kai-Long; Liu, Jun; Zhou, Xiao-Gen; Su, Jian-Zhong; Zhang, Yang; Zhang, Gui-Jun.

Bioinformatics ; 37(23): 4350-4356, 2021 12 07.

Article in English | MEDLINE | ID: mdl-34185079

ABSTRACT

MOTIVATION: The mathematically optimal solution in computational protein folding simulations does not always correspond to the native structure, due to the imperfection of the energy force fields. There is therefore a need to search for more diverse suboptimal solutions in order to identify the states close to the native. We propose a novel multimodal optimization protocol to improve the conformation sampling efficiency and modeling accuracy of de novo protein structure folding simulations. RESULTS: A distance-assisted multimodal optimization sampling algorithm, MMpred, is proposed for de novo protein structure prediction. The protocol consists of three stages: The first is a modal exploration stage, in which a structural similarity evaluation model DMscore is designed to control the diversity of conformations, generating a population of diverse structures in different low-energy basins. The second is a modal maintaining stage, where an adaptive clustering algorithm MNDcluster is proposed to divide the populations and merge the modal by adjusting the annealing temperature to locate the promising basins. In the last stage of modal exploitation, a greedy search strategy is used to accelerate the convergence of the modal. Distance constraint information is used to construct the conformation scoring model to guide sampling. MMpred is tested on a large set of 320 non-redundant proteins, where MMpred obtains models with TM-score≥0.5 on 291 cases, which is 28% higher than that of Rosetta guided with the same set of distance constraints. In addition, on 320 benchmark proteins, the enhanced version of MMpred (E-MMpred) has 167 targets better than trRosetta when the best of five models are evaluated. The average TM-score of the best model of E-MMpred is 0.732, which is comparable to trRosetta (0.730). AVAILABILITY AND IMPLEMENTATION: The source code and executable are freely available at https://github.com/iobio-zjut/MMpred. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology , Proteins , Protein Conformation , Computational Biology/methods , Proteins/chemistry , Software , Algorithms

15.

Predicting RNA solvent accessibility from multi-scale context feature via multi-shot neural network.

Fan, Xue-Qiang; Hu, Jun; Tang, Yu-Xuan; Jia, Ning-Xin; Yu, Dong-Jun; Zhang, Gui-Jun.

Anal Biochem ; 654: 114802, 2022 10 01.

Article in English | MEDLINE | ID: mdl-35809650

ABSTRACT

Knowledge of RNA solvent accessibility has recently become attractive due to the increasing awareness of its importance for key biological processes. Accurately predicting the solvent accessibility of RNA is crucial for understanding its 3D structure and biological function. In this study, we develop a novel computational method, termed M2pred, for accurately predicting the solvent accessibility of RNA from sequence-based multi-scale context features. In M2pred, three single-view features, i.e., base-pairing probabilities, position-specific frequency matrix, and a binary one-hot encoding, are first generated as three feature sources and immediately concatenated to form a super feature. Secondly, for the super feature, the matrix-format features of each nucleotide are extracted using an initialized sliding window technique and regularly stacked into a cube-format feature. Then, using a multi-scale context feature extraction strategy, a pyramid feature composed of contextual features at four scales related to the target nucleotides is extracted from the cube-format feature. Finally, a customized multi-shot neural network framework, which is equipped with four different scales of receptive fields and mainly integrates several residual attention blocks, is designed to extract discriminative information from the contextual pyramid feature. Experimental results demonstrate that the proposed M2pred achieves high prediction performance and outperforms existing state-of-the-art prediction methods of RNA solvent accessibility.

Subject(s)

Computer Neural Networks , RNA , Nucleotides , RNA/chemistry , Solvents/chemistry
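
The M2pred abstract mentions a one-hot encoding and a sliding window that turns per-nucleotide features into a matrix per nucleotide, stacked into a cube. The sketch below illustrates those two steps on a toy sequence. The window half-width and zero padding at the sequence ends are assumptions, and the other two feature sources (base-pairing probabilities and the position-specific frequency matrix) are omitted.

```python
NUCLEOTIDES = "ACGU"


def one_hot(seq):
    """Encode each nucleotide as a 4-element indicator vector (A, C, G, U)."""
    return [[1.0 if n == base else 0.0 for base in NUCLEOTIDES] for n in seq]


def sliding_windows(features, half_width):
    """Return one (2 * half_width + 1) x dim matrix per position.

    Positions beyond either end of the sequence are filled with zero rows.
    """
    dim = len(features[0])
    pad = [[0.0] * dim for _ in range(half_width)]
    padded = pad + features + pad
    width = 2 * half_width + 1
    return [padded[i:i + width] for i in range(len(features))]


# Hypothetical 4-nucleotide sequence with a window of 5 positions.
cube = sliding_windows(one_hot("GAUC"), half_width=2)
```

Here `cube` has shape 4 x 5 x 4 (nucleotides x window positions x feature dimension), and the centre row of each window is the feature vector of the target nucleotide itself.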

16.

Construction of a nomogram to reveal the prognostic benefit of spontaneous intracranial hemorrhage among Chinese adults: a population-based study.

Zhang, Gui-Jun; Zhao, Jie-Yi; Zhang, Tao; You, Chao; Wang, Xiao-Yu.

Neurol Sci ; 43(4): 2449-2460, 2022 Apr.

Article in English | MEDLINE | ID: mdl-34694512

ABSTRACT

BACKGROUND AND PURPOSE: We aimed to build a nomogram, based on patients with spontaneous intracerebral hemorrhage (SICH), to predict the probability of mortality and morbidity at 7 days and 90 days, respectively. METHODS: We performed a retrospective study of patients admitted within 6 h of ictus to the department of neurosurgery of a single institution from January 2011 to December 2018. A total of 1036 patients with SICH were included; 486 patients (46.9%) were 47-66 years old at diagnosis, and 711 patients (68.6%) were male. The least absolute shrinkage and selection operator method was used to identify the key adverse factors predicting outcomes in patients with SICH, a multivariate logistic regression model was built on these variables, and the results were visualized as a nomogram. The discrimination of the prognostic models was measured and compared by means of Harrell's concordance index (C-index), calibration curve, area under the curve (AUC), and decision curve analysis (DCA). RESULTS: Multivariate logistic regression analysis revealed that the factors affecting 7-day mortality were age, therapy, Glasgow Coma Scale (GCS) score at admission, location, ventricular involvement, hematoma volume, white blood cell (WBC) count, uric acid (UA), and L-lactic dehydrogenase (LDH), and that the factors affecting 90-day mortality were temperature, therapy, GCS score at admission, ventricular involvement, WBC count, international normalized ratio, UA, LDH, and systolic blood pressure. The C-index of the prediction nomogram was 0.9239 (95% CI = 0.9061-0.9416) for 7-day mortality and 0.9241 (95% CI = 0.9064-0.9418) for 90-day mortality. The AUC for 7-day mortality was 92.4, as was that for 90-day mortality. The calibration curve and DCA indicated that the nomograms in our study had good predictive ability. For 90-day morbidity, age, marital status, and GCS score at 7 days remained statistically significant in multivariate analysis. The C-index of the prediction nomogram was 0.6898 (95% CI = 0.6511-0.7285), and the calibration curve, AUC, and DCA curve indicated that the nomogram for the prediction of good outcome demonstrated good agreement in this cohort. CONCLUSIONS: The nomograms in this study revealed many novel prognostic demographic and laboratory factors, and the individualized quantitative risk estimation provided by this model would be more practical for treatment management and patient counseling.
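
For a binary outcome such as 7-day mortality, Harrell's C-index reduces to the fraction of (event, non-event) patient pairs in which the patient who had the event received the higher predicted risk, which is also the area under the ROC curve. The following is a minimal sketch of that calculation; the function name and the example data are hypothetical and are not taken from the study.

```python
def concordance_index(risks, outcomes):
    """C-index for binary outcomes (1 = event, 0 = no event).

    Counts every (event, non-event) pair; a pair is concordant when the
    event case has the higher predicted risk, and ties count as half.
    """
    concordant = 0.0
    pairs = 0
    for risk_event, y_event in zip(risks, outcomes):
        if y_event != 1:
            continue
        for risk_other, y_other in zip(risks, outcomes):
            if y_other != 0:
                continue
            pairs += 1
            if risk_event > risk_other:
                concordant += 1.0
            elif risk_event == risk_other:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Hypothetical predicted risks and observed outcomes for four patients.
example = concordance_index([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

In this made-up example three of the four comparable pairs are ordered correctly, so the index is 0.75; a value of 0.5 corresponds to chance-level discrimination.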

Subject(s)

Intracranial Hemorrhages , Nomograms , Adult , Aged , China/epidemiology , Humans , Male , Middle Aged , Prognosis , ROC Curve , Retrospective Studies

17.

Survival and treatment of cranial and spinal chordomas: a population-based study.

Zhang, Gui-Jun; Cui, Yu-Shi; Li, Huan.

Neurosurg Rev ; 45(1): 637-647, 2022 Feb.

Article in English | MEDLINE | ID: mdl-34156550

ABSTRACT

Chordomas are rare, slow-growing malignant tumors. Given the paucity of data on the disease, the treatment strategies are disputed. We collected clinical and survival information on patients with chordoma diagnosed between 1975 and 2016 from the Surveillance, Epidemiology, and End Results database. A total of 1797 patients were initially enrolled, including 762 (42.4%) with cranial and 1035 (57.6%) with spinal chordoma. A total of 1504 patients were further evaluated after screening. In the cranial group, surgery (gross total resection (GTR): p = 0.001 for overall survival (OS); p = 0.009 for cancer-specific survival (CSS)), tumor extension (distant metastasis: p = 0.001 for OS; p = 0.002 for CSS), and age (p < 0.001 for OS) were independent prognostic factors for survival. In the spinal group, age (p = 0.004), location (p < 0.001), GTR (p < 0.001), and tumor extension (distant metastasis, p < 0.001) were independent prognostic factors for OS; age (p = 0.007), histological type (p < 0.001), GTR (p < 0.001), radiation (p = 0.018), chemotherapy (p = 0.006), and tumor extension (p < 0.001) were independent prognostic factors for CSS. In this large cohort, a significant association was noted between extent of resection and outcome. Even though adjuvant radiation or chemotherapy did not benefit patients with chordoma, the effect on prognosis can be explored in a further study based on our findings.

Subject(s)

Chordoma , Skull Base Neoplasms , Spinal Neoplasms , Chordoma/diagnosis , Chordoma/surgery , Humans , Prognosis , Retrospective Studies , Skull , Spinal Neoplasms/diagnosis , Spinal Neoplasms/epidemiology , Spinal Neoplasms/surgery , Treatment Outcome

18.

Surgical outcomes and prognostic factors of parasagittal meningioma: a single-center experience 165 consecutive cases.

Wang, Bo; Zhang, Gui-Jun; Wu, Zhen; Zhang, Jun-Ting; Liu, Pi-Nan.

Br J Neurosurg ; 36(6): 756-761, 2022 Dec.

Article in English | MEDLINE | ID: mdl-33423566

ABSTRACT

PURPOSE: This study aimed to estimate the prognostic factors, long-term outcomes, and surgical strategies for parasagittal meningioma (PSM) and provide a better understanding of surgical experience. MATERIALS AND METHODS: Patients (n = 1438) who underwent surgery for meningioma between January 2012 and January 2013 were enrolled in a database. We then identified 165 patients with PSM based on this database. RESULTS: Of the 165 patients with identified PSMs, 103 were female and 62 were male, with a mean age of 49 years. Univariate analysis revealed that male sex (p = .002), non-World Health Organization (WHO) grade I meningioma (p < .001), treatment history (p = .006), surgical time of more than 232 minutes (p = .006), and intraoperative bleeding > 300 mL (p = .019) were associated with decreased progression-free survival (PFS). Multivariate analysis revealed that sex (hazard ratio [HR] = 3.836, 95% confidence interval [CI] = 1.364-10.794; p = .011), tumour grade (HR = 8.479, 95% CI = 3.234-22.230; p < .001), and surgical time (HR = 3.710, 95% CI = 1.057-13.023; p = .041) were independent factors for PFS. Patients with Simpson grade I-II resection (p = .015), no treatment history (p = .006), tumour size < 3 cm (p = .005), surgical time < 232 minutes (p = .019), intraoperative bleeding < 300 mL (p < .001), or WHO grade I meningioma (p = .002) had better follow-up conditions. CONCLUSION: Surgery was an effective treatment for PSM, and at the time of final follow-up, patients who received aggressive resection had a substantially higher Karnofsky performance scale score.

Subject(s)

Meningeal Neoplasms , Meningioma , Humans , Male , Female , Middle Aged , Meningioma/pathology , Meningeal Neoplasms/surgery , Meningeal Neoplasms/pathology , Prognosis , Retrospective Studies , Treatment Outcome , Neoplasm Recurrence, Local

19.

CGLFold: a contact-assisted de novo protein structure prediction using global exploration and loop perturbation sampling algorithm.

Liu, Jun; Zhou, Xiao-Gen; Zhang, Yang; Zhang, Gui-Jun.

Bioinformatics ; 36(8): 2443-2450, 2020 04 15.

Article in English | MEDLINE | ID: mdl-31860059

ABSTRACT

MOTIVATION: Regions that connect secondary structure elements in a protein are known as loops, in which slight changes can have a dramatic effect on the overall topology. This study investigates whether the accuracy of protein structure prediction can be improved using a loop-specific sampling strategy. RESULTS: A novel de novo protein structure prediction method that combines global exploration and loop perturbation is proposed in this study. In the global exploration phase, fragment recombination and assembly are used to explore the massive conformational space and generate native-like topologies. In the loop perturbation phase, a loop-specific local perturbation model is designed to improve the accuracy of the conformation and is solved by a differential evolution algorithm. These two phases enable cooperation between global exploration and local exploitation. The filtered contact information is used to construct the conformation selection model that guides the sampling. The proposed CGLFold is tested on 145 benchmark proteins, 14 free modeling (FM) targets of CASP13, and 29 FM targets of CASP12. The experimental results show that the loop-specific local perturbation can increase the structural diversity and the success rate of conformational updates and gradually improve conformation accuracy. CGLFold obtains models with a template modeling score ≥ 0.5 on 95 standard test proteins, 7 FM targets of CASP13, and 9 FM targets of CASP12. AVAILABILITY AND IMPLEMENTATION: The source code and executable versions are freely available at https://github.com/iobio-zjut/CGLFold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
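
The abstract states that the loop perturbation model is solved by a differential evolution algorithm but gives no further detail. As a rough illustration of the general technique only, the sketch below implements the classic DE/rand/1/bin scheme on a toy objective. The function names, parameter values, target angles, and the quadratic "energy" are all hypothetical and stand in for CGLFold's actual model and energy function.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           f=0.5, cr=0.9, seed=0):
    """Minimise `objective` with DE/rand/1/bin; requires pop_size >= 4."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scores = [objective(ind) for ind in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)  # guarantees one mutated component
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = population[a][d] + f * (population[b][d]
                                                    - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = population[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]


def toy_energy(angles):
    """Hypothetical stand-in energy: squared distance to made-up angles."""
    targets = (-60.0, -45.0, 120.0)
    return sum((x - t) ** 2 for x, t in zip(angles, targets))


bounds = [(-180.0, 180.0)] * 3
best_angles, best_energy = differential_evolution(toy_energy, bounds)
```

Because selection is greedy, the best score in the population never gets worse from one generation to the next. The toy objective ignores the periodicity of real dihedral angles, which an actual loop model would need to handle.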

Subject(s)

Algorithms , Proteins , Protein Conformation , Protein Structure, Secondary , Software

20.

Improved protein relative solvent accessibility prediction using deep multi-view feature learning framework.

Fan, Xue-Qiang; Hu, Jun; Jia, Ning-Xin; Yu, Dong-Jun; Zhang, Gui-Jun.

Anal Biochem ; 631: 114358, 2021 10 15.

Article in English | MEDLINE | ID: mdl-34478704

ABSTRACT

The accurate prediction of the relative solvent accessibility of a protein is critical to understanding its 3D structure and biological function. In this study, a novel deep multi-view feature learning (DMVFL) framework is developed to accurately predict the relative solvent accessibility of proteins. The framework integrates three different neural network units, i.e., a bidirectional long short-term memory recurrent neural network, squeeze-and-excitation, and a fully-connected hidden layer, with four sequence-based single-view features, i.e., position-specific scoring matrix, position-specific frequency matrix, predicted secondary structure, and roughly predicted three-state relative solvent accessibility probability. On the basis of this framework, a new protein relative solvent accessibility predictor, DMVFL-RSA, is proposed; it employs a customized multiple feedback mechanism that helps to extract discriminative information embedded in the four single-view features. In benchmark tests on the TEST524 and CASP14-derived (CASP14set) datasets, DMVFL-RSA outperforms other existing state-of-the-art protein relative solvent accessibility predictors when predicting two-state (exposure threshold of 25%), three-state (exposure thresholds of 9% and 36%), and four-state (exposure thresholds of 4%, 25%, and 50%) discrete values. For real-valued prediction on TEST524 and CASP14set, DMVFL-RSA also achieves high Pearson correlation coefficient values, indicating a positive correlation between the predicted and native relative solvent accessibility. Detailed analyses show that the major advantages of DMVFL-RSA lie in the high efficiency of the DMVFL framework, the applied multiple feedback mechanism, and the strong sensitivity of the sequence-based features. The web server of DMVFL-RSA is freely available at https://jun-csbio.github.io/DMVFL-RSA/ for academic use. The standalone package of DMVFL-RSA is downloadable at https://github.com/XueQiangFan/DMVFL-RSA.
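
The two-, three-, and four-state schemes quoted in this abstract amount to binning a real-valued relative solvent accessibility by fixed exposure thresholds. The sketch below shows one way to do that. The function name and the boundary convention (a value exactly equal to a threshold falls into the more exposed class) are assumptions; the abstract does not specify how boundary values are handled.

```python
from bisect import bisect_right

# Exposure thresholds quoted in the abstract, expressed as fractions.
THRESHOLDS = {
    2: (0.25,),
    3: (0.09, 0.36),
    4: (0.04, 0.25, 0.50),
}


def discretize_rsa(rsa, states):
    """Map a relative solvent accessibility in [0, 1] to a class index.

    Class 0 is the most buried state. A value equal to a threshold is
    assigned to the more exposed class (an assumed convention).
    """
    return bisect_right(THRESHOLDS[states], rsa)


labels = [discretize_rsa(0.10, n) for n in (2, 3, 4)]
```

A residue with 10% relative accessibility is therefore buried in the two-state scheme but falls into the intermediate class of the three-state scheme.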

Subject(s)

Computational Biology/methods , Deep Learning , Proteins/chemistry , Solvents/chemistry , Databases, Protein , Feedback , Internet , Neural Networks, Computer , Protein Structure, Secondary
