Results 1 - 20 of 133
1.
Article in English | MEDLINE | ID: mdl-38857126

ABSTRACT

AlphaFold2 has achieved a major breakthrough in end-to-end prediction of static protein structures. However, protein conformational change is considered a key factor in protein biological function, and predicting multiple inter-residue distances is an important step toward exploring multiple protein conformations. In this study, we propose DeepMDisPre, an inter-residue multiple-distance prediction method based on an improved network that integrates triangle update, axial attention and ResNet to predict multiple distances for residue pairs. We built a dataset containing both proteins with a single structure and proteins with multiple conformations to train the network. We tested DeepMDisPre on 114 proteins with multiple conformations. The results show that the inter-residue distance distribution predicted by DeepMDisPre is more likely to have multiple peaks for flexible residue pairs than for rigid residue pairs. For two proteins with multiple conformations, we modeled the conformations relatively accurately using the predicted inter-residue multiple distances. We also tested DeepMDisPre on 279 proteins with a single structure. Experimental results demonstrate that the average contact accuracy of DeepMDisPre is higher than that of the comparative method. In terms of static protein modeling, the average TM-score of the 3D models built by DeepMDisPre is also higher than that of the comparative method. The executable program is freely available at https://github.com/iobio-zjut/DeepMDisPre.
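
The abstract above contrasts multi-peaked distance distributions (flexible residue pairs) with single-peaked ones (rigid pairs). As a rough illustration only, and not the DeepMDisPre implementation, the sketch below counts local maxima in a binned distance distribution. The function name `count_peaks`, the `min_height` threshold, and the two example histograms are assumptions made for this example.

```python
def count_peaks(probs, min_height=0.05):
    """Count local maxima in a binned inter-residue distance distribution.

    probs      -- probabilities per distance bin (hypothetical example input)
    min_height -- bins below this probability are ignored (assumed threshold)
    """
    peaks = 0
    for i in range(1, len(probs) - 1):
        is_high_enough = probs[i] >= min_height
        # Strictly above the left neighbour, not below the right neighbour,
        # so a flat plateau is counted once.
        is_local_max = probs[i] > probs[i - 1] and probs[i] >= probs[i + 1]
        if is_high_enough and is_local_max:
            peaks += 1
    return peaks


# Made-up distributions: one dominant distance versus two alternative distances.
rigid_pair = [0.0, 0.05, 0.8, 0.1, 0.05, 0.0, 0.0, 0.0]
flexible_pair = [0.0, 0.1, 0.35, 0.05, 0.0, 0.1, 0.35, 0.05]

print(count_peaks(rigid_pair), count_peaks(flexible_pair))
```

A pair whose distribution yields more than one peak would be a candidate for carrying two or more alternative distances; how the published method selects distances from its predicted distributions is not described in the abstract.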

2.
Genome Biol ; 25(1): 152, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862984

ABSTRACT

Protein folding has become a tractable problem with the significant advances in deep learning-driven protein structure prediction. Here we propose FoldPAthreader, a protein folding pathway prediction method that uses a novel folding force field model derived by exploring the intrinsic relationship between protein evolution and folding across the known protein universe. The folding force field is then used to guide Monte Carlo conformational sampling, driving the protein chain to fold into its native state by exploring potential intermediates. On 30 example targets, FoldPAthreader predicts folding pathways consistent with biological experimental data for 70% of the proteins.


Subject(s)
Protein Folding , Proteins , Proteins/chemistry , Proteins/metabolism , Monte Carlo Method , Protein Conformation , Software , Models, Molecular , Computational Biology/methods
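
The abstract above describes Monte Carlo conformational sampling guided by a force field. The sketch below shows the general idea with a Metropolis acceptance rule on a one-dimensional toy energy. The Metropolis criterion, the toy energy function, and every name and parameter here are assumptions for illustration; the abstract does not state which acceptance rule or move set FoldPAthreader uses.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Always accept downhill moves; accept uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def sample(energy, x0, steps=1000, step_size=0.5, temperature=1.0, seed=0):
    """Random-walk Monte Carlo over a single coordinate, tracking the lowest-energy state."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)  # toy "conformational move"
        candidate_e = energy(candidate)
        if metropolis_accept(candidate_e - e, temperature, rng):
            x, e = candidate, candidate_e
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


# Toy energy with its minimum at x = 2.0, standing in for a folding force field.
best_x, best_e = sample(lambda x: (x - 2.0) ** 2, x0=10.0)
print(best_x, best_e)
```

In a real folding simulation the coordinate would be a full chain conformation and the moves would be structural perturbations; the acceptance logic is the part this sketch is meant to show.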
3.
Comput Struct Biotechnol J ; 23: 1824-1832, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38707538

ABSTRACT

Estimation of model accuracy (EMA) plays a crucial role in protein structure prediction, aiming to evaluate the quality of predicted protein structure models accurately and objectively. This process is not only key to screening candidate models that are close to the real structure, but also provides guidance for further optimization of protein structures. With the significant advancements made by AlphaFold2 in monomer structure prediction, the problem of single-domain protein structure prediction has been largely solved. Correspondingly, the importance of assessing the quality of single-domain protein models has decreased, and the research focus has shifted to estimating the model accuracy of protein complexes. In this review, our goal is to provide a comprehensive overview of the reference and statistical metrics, the representative methods, and the current challenges in complex EMA across four distinct facets: Topology Global Score, Interface Total Score, Interface Residue-Wise Score, and Tertiary Residue-Wise Score.

4.
ACS Omega ; 9(13): 15040-15051, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38585058

ABSTRACT

The photoelectric characteristics of poly(3,4-ethylenedioxythiophene):polystyrene sulfonate (PEDOT:PSS) films significantly affect the power conversion efficiency and stability of Si/PEDOT:PSS hybrid solar cells. In this paper, we investigated PEDOT:PSS modification with alcohol ether solvents (dipropylene glycol methyl ether (DPM) and propylene glycol phenyl ether (PPH)). The reduction of PSS content and the transformation of the PEDOT chain from a benzene to a quinone structure, both induced by doping with DPM or PPH, are the reasons for the improved conductivity of PEDOT:PSS films. DPM and PPH doping improves the quality of the silicon/PEDOT:PSS heterojunction and the passivation of the silicon surface, thereby reducing the surface recombination of charge carriers and improving the photovoltaic performance of Si/PEDOT:PSS solar cells. A comparison of the power conversion efficiency (PCE) and air stability of Si/PEDOT:PSS solar cells doped with DPM (13.24%), PPH (13.51%), ethylene glycol (EG, 13.07%), and dimethyl sulfoxide (DMSO, 12.62%) suggests that DPM and PPH can replace DMSO and EG as dopants to enhance the performance of Si/PEDOT:PSS solar cells. The EG and DMSO solvents are somewhat toxic to the human body and are not environmentally friendly. In comparison to DMSO and EG, DPM and PPH are more economical and environmentally friendly, which helps reduce the manufacturing cost of Si/PEDOT:PSS solar cells and makes them better suited to commercial applications.

5.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38600663

ABSTRACT

Protein sequence design can provide valuable insights into biopharmaceuticals and disease treatments. Currently, most deep learning-based protein sequence design methods focus on network architecture optimization while ignoring protein-specific physicochemical features. Inspired by the successful application of structure templates and pre-trained models in protein structure prediction, we explored whether a structural sequence profile representation can be used for protein sequence design. In this work, we propose SPDesign, a method for protein sequence design based on structural sequence profiles using ultrafast shape recognition. Given an input backbone structure, SPDesign uses ultrafast shape recognition vectors to accelerate the search for similar protein structures in our in-house PAcluster80 structure database and then extracts the sequence profile through structure alignment. The profile is combined with structural pre-trained knowledge and geometric features, and together they are fed into an enhanced graph neural network for sequence prediction. The results show that SPDesign significantly outperforms state-of-the-art methods such as ProteinMPNN, PiFold and LM-Design, with accuracy gains of 21.89%, 15.54% and 11.4% in sequence recovery rate on the CATH 4.2 benchmark, respectively. Encouraging results have also been achieved on orphan and de novo (designed) benchmarks with few homologous sequences. Furthermore, analysis conducted with the PDBench tool suggests that SPDesign performs well across subdivided structure classes. More interestingly, we found that SPDesign can reconstruct the sequences of some proteins that have similar structures but different sequences. Finally, the structural modeling verification experiment indicates that the sequences designed by SPDesign fold into the native structures more accurately.


Subject(s)
Neural Networks, Computer , Proteins , Sequence Alignment , Amino Acid Sequence , Proteins/chemistry , Sequence Analysis, Protein/methods
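
The abstract above relies on ultrafast shape recognition (USR) vectors to speed up structure search. The sketch below follows the generic USR recipe: distances from every point to four reference points (the centroid, the point closest to it, the point farthest from it, and the point farthest from that one), summarized by three moments each, giving a 12-number descriptor. The choice of moments (mean, standard deviation, cube root of the third central moment), the function names, and the example coordinates are assumptions; how SPDesign builds and compares its vectors is not stated in the abstract.

```python
import math


def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _moments(values):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    cube_root = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(var), cube_root]


def usr_descriptor(coords):
    """12-dimensional shape descriptor from a list of (x, y, z) points."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    d_centroid = [_dist(c, centroid) for c in coords]
    closest = coords[d_centroid.index(min(d_centroid))]
    farthest = coords[d_centroid.index(max(d_centroid))]
    d_farthest = [_dist(c, farthest) for c in coords]
    far_from_farthest = coords[d_farthest.index(max(d_farthest))]
    descriptor = []
    for ref in (centroid, closest, farthest, far_from_farthest):
        descriptor.extend(_moments([_dist(c, ref) for c in coords]))
    return descriptor


def usr_similarity(a, b):
    """Similarity in (0, 1]; 1.0 means identical descriptors."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))


# Made-up coordinates standing in for backbone atoms.
backbone = [(0.0, 0.0, 0.0), (1.5, 0.2, 0.0), (2.0, 1.5, 0.3),
            (3.5, 1.0, 1.2), (4.0, 2.5, 1.0)]
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in backbone]
print(usr_similarity(usr_descriptor(backbone), usr_descriptor(shifted)))
```

Because the descriptor depends only on internal distances, it is unchanged by translation and rotation, which is what makes it usable as a fast pre-filter before a full structure alignment.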
6.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38517699

ABSTRACT

The breakthrough in cryo-electron microscopy (cryo-EM) technology has led to an increasing number of density maps of biological macromolecules. However, constructing accurate protein complex atomic structures from cryo-EM maps remains a challenge. In this study, we extend our previously developed DEMO-EM to present DEMO-EM2, an automated method for constructing protein complex models from cryo-EM maps through an iterative assembly procedure intertwining chain- and domain-level matching and fitting for predicted chain models. The method was carefully evaluated on 27 cryo-electron tomography (cryo-ET) maps and 16 single-particle EM maps, where DEMO-EM2 models achieved an average TM-score of 0.92, outperforming those of state-of-the-art methods. The results demonstrate an efficient method that enables the rapid and reliable solution of challenging cryo-EM structure modeling problems.


Subject(s)
Cryoelectron Microscopy , Cryoelectron Microscopy/methods , Models, Molecular , Protein Conformation
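
Several abstracts in this listing, including the one above, report TM-scores. As a reminder of what the number measures, the sketch below evaluates the standard TM-score sum for one fixed superposition: each aligned residue pair at distance d contributes 1 / (1 + (d / d0)^2), with d0 = 1.24 * (L - 15)^(1/3) - 1.8 for a target of length L, and the total is divided by L. The full TM-score also maximizes over superpositions, which this sketch omits. The function names and the 0.5 Å floor on d0 for short chains are assumptions added here as a guard.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 in angstroms (floored at 0.5 as a guard)."""
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score term for one fixed superposition.

    distances     -- distances (angstroms) between aligned residue pairs
    target_length -- number of residues in the target (normalization length)
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect model scores 1.0; a model with every pair exactly d0 apart scores 0.5.
print(tm_score([0.0] * 100, 100), tm_score([tm_d0(100)] * 100, 100))
```

Unaligned residues simply contribute nothing to the sum, so a model covering only part of the target is penalized through the division by the full target length.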
7.
ACS Appl Mater Interfaces ; 16(11): 14263-14274, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38441548

ABSTRACT

The dynamic defect tolerance under light soaking is a crucial aspect of halide perovskites. However, the underlying physics of light soaking remains elusive and is subject to debate, as both positive and negative effects have been observed. In this investigation, we demonstrated that surface defects in perovskite films significantly impact the performance and stability of perovskite solar cells and are closely correlated with light soaking behaviors. When the top surface layer is removed with adhesive tape, the surface defect density noticeably decreases, leading to enhanced photoluminescence (PL) efficiency, prolonged carrier lifetime, and higher conductivity. Consequently, the power conversion efficiency (PCE) of solar cells improves from 17.70% to 20.5%. Furthermore, we confirmed a positive correlation between surface defects and the light soaking effect. Perovskite films with low surface defects surprisingly exhibit a 3-fold increase in PL intensity and an 85% increase in carrier lifetime under 500 s of continuous illumination at an intensity of 100 mW/cm². Beyond the conventional strategy of suppressing defect trapping, we propose increasing the capability of dynamic defect tolerance as an effective strategy to enhance the optoelectronic properties and performance of perovskite solar cells.

8.
Zhen Ci Yan Jiu ; 49(3): 221-230, 2024 Mar 25.
Article in English, Chinese | MEDLINE | ID: mdl-38500318

ABSTRACT

OBJECTIVES: To observe the effects of electroacupuncture (EA) at "Fengfu" (GV16), "Taichong" (LR3), and "Zusanli" (ST36) on mitophagy mediated by silencing regulatory protein 3 (SIRT3)/PTEN induced putative kinase 1 (PINK1)/PARK2 gene coding protein (Parkin) in the midbrain substantia nigra of Parkinson's disease (PD) mice, and to explore the potential mechanisms of EA in treating PD. METHODS: C57BL/6 mice were randomly divided into control, model, EA, and sham EA groups, with 12 mice in each group. The PD mouse model was established by intraperitoneal injection of 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP). The EA group received EA stimulation at GV16, LR3 and ST36, while the sham EA group received shallow needling 1 mm away from the above acupoints without electrical stimulation. The motor ability of mice in each group was evaluated using an open field experiment. Immunohistochemistry was used to detect the expression of tyrosine hydroxylase (TH) and α-synuclein (α-syn) in the substantia nigra of mice. The ultrastructure of neurons in the substantia nigra was observed by transmission electron microscopy (TEM). Immunofluorescence was used to detect the expression of the autophagy marker autophagy-associated protein light chain 3 (LC3). The mRNA and protein expression levels of TH, α-syn, SIRT3, PINK1, Parkin, P62, Beclin-1 and LC3Ⅱ were detected by PCR and Western blot. RESULTS: Compared with the control group, mice in the model group showed a decrease in the total exercise distance, time, movement speed and number of crossings of the central region (P<0.01); the positive expressions of TH and LC3 were decreased (P<0.01), while the positive expression of α-syn was increased (P<0.01), accompanied by mitochondrial swelling, fragmentation and loss of mitochondrial cristae, and a decreased lysosome count; the mRNA and protein expression levels of TH, SIRT3, PINK1, Parkin, Beclin-1 and LC3Ⅱ in the midbrain substantia nigra were decreased (P<0.01), while the mRNA and protein expression levels of α-syn and P62 were increased (P<0.01, P<0.05). Compared with the model group, the mice in the EA group showed a significant increase in the total exercise distance, time, movement speed and number of crossings of the central region (P<0.01, P<0.05); the positive expressions of TH and LC3 were increased (P<0.01, P<0.05), while the positive expression of α-syn was decreased (P<0.01), accompanied by an increase in mitochondrial count, the appearance of autophagic vacuoles, and a decrease in swelling; the mRNA and protein expression levels of TH, SIRT3, PINK1, Parkin, Beclin-1 and LC3Ⅱ in the midbrain substantia nigra were increased (P<0.01, P<0.05), while the mRNA and protein expression levels of α-syn and P62 were decreased (P<0.01). Also compared with the model group, the sham EA group showed an increase in the total exercise distance and time (P<0.05), with an increase in the positive expression of TH (P<0.05) and a decrease in the positive expression of α-syn (P<0.05); some mitochondria exhibited swelling, and no autophagic vacuoles were observed; the protein expression levels of TH, SIRT3, Parkin and LC3Ⅱ were increased (P<0.01, P<0.05), the expression levels of P62 mRNA and of α-syn mRNA and protein were decreased (P<0.01, P<0.05), and LC3Ⅱ mRNA expression was increased (P<0.05). In comparison to the sham EA group, the EA group showed an extension in the total exercise time (P<0.01); the positive expression and mRNA expression levels of α-syn were decreased (P<0.01, P<0.05), while the expression levels of TH, SIRT3, PINK1 and Parkin mRNA and of SIRT3 protein were increased (P<0.05). CONCLUSIONS: EA at GV16, LR3, and ST36 can exert a neuroprotective function and improve the motor ability of PD mice by activating the SIRT3/PINK1/Parkin pathway to enhance the expression of TH and reduce α-syn aggregation in the substantia nigra of PD mice.


Subject(s)
Electroacupuncture , Parkinson Disease , Sirtuin 3 , Mice , Animals , Parkinson Disease/genetics , Parkinson Disease/therapy , Sirtuin 3/genetics , Mitophagy/genetics , Protein Kinases/genetics , Beclin-1 , Mice, Inbred C57BL , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism , RNA, Messenger
9.
Interdiscip Sci ; 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38190097

ABSTRACT

The breakthrough of AlphaFold2 and the publication of AlphaFold DB represent a significant advance in the field of predicting static protein structures. However, AlphaFold2 models tend to represent a single static structure, and multiple-conformation prediction remains a challenge. In this work, we propose a method named MultiSFold, which uses a distance-based multi-objective evolutionary algorithm to predict multiple conformations. First, multiple energy landscapes are constructed using different competing constraints generated by deep learning. Subsequently, an iterative modal exploration and exploitation strategy is designed to sample conformations, incorporating multi-objective optimization, geometric optimization and structural similarity clustering. Finally, the final population is generated using a loop-specific sampling strategy to adjust the spatial orientations. MultiSFold was evaluated against state-of-the-art methods using a benchmark set containing 80 protein targets, each characterized by two representative conformational states. Based on the proposed metric, MultiSFold achieves a success ratio of 56.25% in predicting multiple conformations, while AlphaFold2 achieves only 10.00%, which may indicate that conformational sampling combined with knowledge gained through deep learning has the potential to generate conformations spanning the range between different conformational states. In addition, MultiSFold was applied to 244 human proteins with low structural accuracy in AlphaFold DB to test whether it could further improve the accuracy of static structures. The experimental results show that MultiSFold achieves a TM-score better than that of AlphaFold2 by 2.97% and that of RoseTTAFold by 7.72%. The online server is at http://zhanglab-bioinf.com/MultiSFold.

10.
J Chem Inf Model ; 64(1): 76-95, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38109487

ABSTRACT

Artificial intelligence has made significant advances in the field of protein structure prediction in recent years. In particular, DeepMind's end-to-end model, AlphaFold2, has demonstrated the capability to predict three-dimensional structures of numerous unknown proteins with accuracy levels comparable to those of experimental methods. This breakthrough has opened up new possibilities for understanding protein structure and function as well as accelerating drug discovery and other applications in biology and medicine. Despite the remarkable achievements of artificial intelligence in the field, some challenges and limitations remain. In this Review, we discuss the recent progress and some of the challenges in protein structure prediction. These challenges include predicting multidomain protein structures, protein complex structures, multiple conformational states of proteins, and protein folding pathways. Furthermore, we highlight directions in which further improvements can be made.


Subject(s)
Artificial Intelligence , Drug Discovery , Protein Folding , Research Design
11.
Commun Biol ; 6(1): 1221, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38040847

ABSTRACT

Accurately capturing domain-domain interactions is key to understanding protein function and designing structure-based drugs. Although AlphaFold2 has made a breakthrough on single-domain proteins, structure modeling for multi-domain proteins and complexes remains a challenge. In this study, we developed a multi-domain and complex structure assembly protocol, named DeepAssembly, based on domain segmentation and single-domain modeling algorithms. First, DeepAssembly uses a population-based evolutionary algorithm to assemble multi-domain proteins using inter-domain interactions inferred from a purpose-built deep learning network. Second, DeepAssembly assembles protein complexes from domains rather than chains. Experimental results show that on 219 multi-domain proteins, the average inter-domain distance precision of DeepAssembly is 22.7% higher than that of AlphaFold2. Moreover, DeepAssembly improves accuracy by 13.1% for 164 multi-domain structures with low confidence deposited in the AlphaFold database. We apply DeepAssembly to the prediction of 247 heterodimers and find that it successfully predicts the interface (DockQ ≥ 0.23) for 32.4% of the dimers, suggesting a lighter way to assemble complex structures by treating domains as assembly units and using inter-domain interactions learned from monomer structures.


Subject(s)
Deep Learning , Proteins/chemistry , Algorithms
12.
Brief Bioinform ; 25(1)2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38018909

ABSTRACT

Model quality evaluation is a crucial part of protein structural biology. Distinguishing high-quality models from low-quality models, and identifying the relatively incorrect regions of high-quality models that need improvement, remain a challenge. More importantly, the quality assessment of multimer models is a hot topic in structure prediction. In this study, we propose GraphCPLMQA, a novel approach for evaluating residue-level model quality that combines graph coupled networks and embeddings from protein language models. GraphCPLMQA consists of a graph encoding module and a transform-based convolutional decoding module. In the encoding module, the underlying relational representations of sequence and high-dimensional geometric structure are extracted by protein language models with Evolutionary Scale Modeling. In the decoding module, the mapping between structure and quality is inferred from these representations and low-dimensional features. Specifically, the triangular location and residue-level contact order features are designed to enhance the association between the local structure and the overall topology. Experimental results demonstrate that GraphCPLMQA using single-sequence embedding achieves the best performance compared with the CASP15 residue-level interface evaluation methods on 9108 models in the local residue interface test set of CASP15 multimers. In the CAMEO blind test (20 May 2022 to 13 August 2022), GraphCPLMQA ranked first compared with other servers (https://www.cameo3d.org/quality-estimation). GraphCPLMQA also outperforms state-of-the-art methods on 19,035 models in the CASP13 and CASP14 monomer test sets.


Subject(s)
Computational Biology , Neural Networks, Computer , Computational Biology/methods , Proteins/chemistry , Language
13.
Heliyon ; 9(10): e20781, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37876416

ABSTRACT

Background: Given that few reports have described the survival and risk factors of elderly patients with hypertensive intracerebral hemorrhage (HICH), we aimed to develop a valid but simple prediction nomogram for the survival of HICH patients. Methods: All elderly patients ≥65 years old who were diagnosed with HICH between January 2011 and December 2019 were identified. We applied the least absolute shrinkage and selection operator (Lasso) to the Cox regression model with the R package glmnet. A concordance index was calculated to assess the discrimination of the nomogram, and calibration curves and decision curves were evaluated graphically by plotting the observed rates against the probabilities predicted by the nomogram. Results: A total of 204 eligible patients were analyzed, and over 20% of the population was above the age of 80 (65-79 years old, n = 161; 80+ years old, n = 43). A hematoma volume ≥13.64 cm³ was associated with higher 7-day mortality (OR = 6.773, 95% CI = 2.622-19.481; p < 0.001) and higher 90-day mortality (OR = 3.955, 95% CI = 1.611-10.090; p = 0.003). A GCS score between 13 and 15 at admission was associated with a 7-day favorable outcome (OR = 0.025, 95% CI = 0.005-0.086; p < 0.001) and a 90-day favorable outcome (OR = 0.033, 95% CI = 0.010-0.099; p < 0.001). Conclusions: Our nomogram models were visualized and accurate. Neurosurgeons could use them to assess the prognostic factors and provide advice to patients and their relatives.

14.
J Chem Inf Model ; 63(20): 6451-6461, 2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37788318

ABSTRACT

With the development of deep learning, almost all single-domain proteins can be predicted at experimental resolution. However, the structure prediction of multi-domain proteins remains a challenge. Achieving end-to-end protein domain assembly and further improving the accuracy of the full-chain modeling by accurately predicting inter-domain orientation while improving the assembly efficiency will provide significant insights into structure-based drug discovery. In this work, we propose an End-to-End Domain Assembly method based on deep learning, named E2EDA. We first develop RMNet, an EfficientNetV2-based deep learning model that fuses multiple features using an attention mechanism to predict inter-domain rigid motion. Then, the predicted rigid motions are transformed into inter-domain spatial transformations to directly assemble the full-chain model. Finally, the scoring strategy RMscore is designed to select the best model from multiple assembled models. The experimental results show that the average TM-score of the model assembled by E2EDA on the benchmark set (282) is 0.827, which is better than those of other domain assembly methods SADA (0.792) and DEMO (0.730). Meanwhile, on our constructed multi-domain data set from AlphaFold DB, the model reassembled by E2EDA is 7.0% higher in TM-score compared to the full-chain model predicted by AlphaFold2, indicating that E2EDA can capture more accurate inter-domain orientations to improve the quality of the model predicted by AlphaFold2. Furthermore, compared to SADA and AlphaFold2, E2EDA reduced the average runtime on the benchmark by 64.7% and 19.2%, respectively, indicating that E2EDA can significantly improve assembly efficiency through an end-to-end approach. The online server is available at http://zhanglab-bioinf.com/E2EDA.


Subject(s)
Deep Learning , Protein Domains , Proteins/chemistry
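
The abstract above describes turning a predicted inter-domain rigid motion into a spatial transformation that places one domain relative to another. A rigid motion is a rotation followed by a translation, and the sketch below applies one to a toy second domain. The helper names, the rotation about the z axis, and the two-point "domains" are assumptions for illustration and do not reflect how RMNet parameterizes its output.

```python
import math


def rotation_z(angle_deg):
    """3x3 rotation matrix for a rotation about the z axis."""
    a = math.radians(angle_deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]


def apply_rigid_motion(coords, rotation, translation):
    """Rotate each (x, y, z) point, then translate it."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z + translation[i]
            for i in range(3)
        ))
    return moved


# Two toy domains, each given in its own local frame (made-up coordinates).
domain1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
domain2 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]

# Place domain2 relative to domain1 with a hypothetical predicted motion.
full_chain = domain1 + apply_rigid_motion(domain2, rotation_z(90.0), (7.6, 0.0, 0.0))
print(full_chain)
```

Because the motion is rigid, distances inside the moved domain are preserved; only its position and orientation relative to the first domain change.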
15.
Curr Med Chem ; 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37828669

ABSTRACT

Protein folding mechanisms are crucial to understanding the fundamental processes of life and solving many biological and medical problems. By studying the folding process, we can reveal how proteins achieve their biological functions through specific structures, providing insights into the treatment and prevention of diseases. With the advancement of AI technology in the field of protein structure prediction, computational methods have become increasingly important and promising for studying protein folding mechanisms. In this review, we survey current progress in the computational study of protein folding mechanisms from four perspectives: simulation of an inverse folding pathway from the native state to the unfolded state; prediction of early folding residues by machine learning; exploration of protein folding pathways through conformational sampling; and prediction of protein folding intermediates based on templates. Finally, the challenges and future perspectives of computational approaches to the protein folding problem are also discussed.

16.
PLoS Comput Biol ; 19(9): e1011438, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37695768

ABSTRACT

The study of the protein folding mechanism is a challenge in molecular biology and is of great significance for revealing the movement rules of biological macromolecules, understanding the pathogenic mechanism of folding diseases, and designing protein engineering materials. Based on the hypothesis that conformational sampling trajectories contain information about the folding pathway, we propose a protein folding pathway prediction algorithm named Pathfinder. First, Pathfinder performs large-scale sampling of the conformational space and clusters the decoys obtained in the sampling. The heterogeneous conformations obtained by clustering are named seed states. Then, a resampling algorithm that is not constrained by the local energy basin is designed to obtain the transition probabilities of seed states. Finally, protein folding pathways are inferred from the maximum transition probabilities of seed states. Pathfinder is tested on our test set of 34 proteins. For 11 widely studied proteins, we correctly predicted their folding pathways and specifically analyzed 5 of them. For 13 proteins, we predicted folding pathways that remain to be verified by biological experiments. For 6 proteins, we analyzed the reasons for the low prediction accuracy. For the other 4 proteins, which lack biological experiment results, potential folding pathways were predicted to provide new insights into the protein folding mechanism. The results reveal that structural analogs may have different folding pathways to express different biological functions, homologous proteins may contain common folding pathways, and α-helices may be more prone to fold early than β-strands.


Subject(s)
Algorithms , Molecular Biology , Cluster Analysis , Molecular Conformation , Protein Folding
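
The abstract above says folding pathways are inferred from the maximum transition probabilities between seed states. One simple reading of that step is a greedy walk: start from the unfolded state and repeatedly move to the not-yet-visited state with the highest transition probability until the native state is reached. The sketch below implements that reading; the greedy rule, the state names, and the probabilities are all assumptions for illustration, not Pathfinder's actual procedure or data.

```python
def infer_pathway(transitions, start, goal):
    """Greedy walk along maximum transition probabilities.

    transitions -- dict mapping state -> {next_state: probability}
    Returns the list of visited states, or None if the walk gets stuck.
    """
    path = [start]
    visited = {start}
    current = start
    while current != goal:
        candidates = {
            state: prob
            for state, prob in transitions.get(current, {}).items()
            if state not in visited
        }
        if not candidates:
            return None  # dead end: no unvisited successor
        current = max(candidates, key=candidates.get)
        visited.add(current)
        path.append(current)
    return path


# Hypothetical seed states and transition probabilities.
transitions = {
    "unfolded": {"I1": 0.6, "I2": 0.3, "native": 0.1},
    "I1": {"I2": 0.2, "native": 0.7, "unfolded": 0.1},
    "I2": {"I1": 0.5, "native": 0.5},
}
print(infer_pathway(transitions, "unfolded", "native"))
```

A greedy walk is the simplest choice; a search for the most probable complete path (for example, a shortest-path search on negative log probabilities) would be an alternative under the same inputs.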
17.
Bioinformatics ; 39(10)2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37740296

ABSTRACT

MOTIVATION: Model quality assessment is a crucial part of protein structure prediction and a gateway to proper usage of models in biomedical applications. Many methods have been proposed for assessing the quality of structural models of protein monomers, but few exist for evaluating protein complex models. As protein complex structure prediction becomes a new challenge, there is an urgent need for model quality assessment methods that can accurately assess the accuracy of interface residues of complex structures. RESULTS: Here, we present DeepUMQA3, a web server for evaluating the accuracy of interface residues of protein complex structures using deep neural networks. For an input complex structure, features are extracted at three levels (overall complex, intra-monomer, and inter-monomer), and an improved deep residual neural network is used to predict per-residue lDDT and interface residue accuracy. DeepUMQA3 ranks first in the blind test of interface residue accuracy estimation in CASP15, with Pearson, Spearman, and AUC of 0.564, 0.535, and 0.755 under the lDDT measurement, which are 17.6%, 23.6%, and 10.9% higher than those of the second-best method, respectively. DeepUMQA3 can also assess the accuracy of all residues in the entire complex and distinguish high- and low-precision residues. AVAILABILITY AND IMPLEMENTATION: The DeepUMQA3 web server is freely available at http://zhanglab-bioinf.com/DeepUMQA_server/.

18.
J Chem Inf Model ; 63(17): 5689-5700, 2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37603823

ABSTRACT

Identifying DNA N6-methyladenine (6mA) sites is important for understanding the function of DNA. Many deep learning-based methods have been developed to improve the performance of 6mA site prediction. In this study, to further improve the performance of 6mA site prediction, we propose a new meta method, called Co6mA, which integrates bidirectional long short-term memory (BiLSTM), convolutional neural networks (CNNs), and self-attention mechanisms (SAM) by assembling two different deep learning-based models. The first model, developed in this study, is called CBi6mA and is composed of CNN, BiLSTM, and fully connected modules. The second model is borrowed from LA6mA, an existing 6mA prediction method based on BiLSTM and SAM modules. Experimental results on two independent testing sets from different model organisms, i.e., Arabidopsis thaliana and Drosophila melanogaster, demonstrate that Co6mA achieves an average accuracy of 91.8%, covering 89% of all 6mA samples, with an average Matthews correlation coefficient of 0.839, which is higher than that of the second-best method, DeepM6A.


Subject(s)
Arabidopsis , Drosophila melanogaster , Animals , Memory, Short-Term , DNA , Neural Networks, Computer
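
The abstract above reports accuracy, coverage of positive samples (sensitivity), and the Matthews correlation coefficient (MCC). For reference, the sketch below computes these three metrics from a binary confusion matrix using their standard definitions. The function names and the example counts are assumptions chosen for illustration and are not the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of true positive-class samples that were recovered."""
    return tp / (tp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion matrix: 100 positive and 100 negative samples.
tp, fn, tn, fp = 90, 10, 95, 5
print(accuracy(tp, tn, fp, fn), sensitivity(tp, fn), mcc(tp, tn, fp, fn))
```

Unlike accuracy, MCC stays informative when the classes are imbalanced, because it uses all four cells of the confusion matrix.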
19.
Proteins ; 91(12): 1861-1870, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37553848

ABSTRACT

This article reports and analyzes the results of protein complex model accuracy estimation by our methods (DeepUMQA3 and GraphGPSM) in the 15th Critical Assessment of techniques for protein Structure Prediction (CASP15). The new deep learning-based multimeric complex model accuracy estimation methods are built on an ensemble of three-level features coupled with deep residual/graph neural networks. We describe the input multimeric complex model at three levels: overall complex features, intra-monomer features, and inter-monomer features. We designed an overall ultrafast shape recognition (USR) feature to characterize the relationship between local residues and the overall complex topology, and an inter-monomer USR feature to characterize the relationship between the residues of one monomer and the topology of other monomers. DeepUMQA3 (Group name: GuijunLab-RocketX) ranked first in the interface residue accuracy estimation of CASP15. The Pearson correlation between the interface residue Local Distance Difference Test (lDDT) predicted by DeepUMQA3 and the real lDDT is 0.570, making it the only method to exceed 0.5. Among the top 5 methods, DeepUMQA3 achieved the highest Pearson correlation of lDDT on 25 out of 39 targets. GraphGPSM (Group name: GuijunLab-PAthreader) has TM-score Pearson correlations greater than 0.9 on 14 targets, showing a good ability to estimate the overall fold accuracy. The DeepUMQA3 server is available at http://zhanglab-bioinf.com/DeepUMQA/ and the GraphGPSM server is available at http://zhanglab-bioinf.com/GraphGPSM/.


Subject(s)
Deep Learning , Protein Conformation , Computational Biology/methods , Proteins/chemistry , Neural Networks, Computer
20.
Brief Bioinform ; 24(4)2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37317619

ABSTRACT

The scoring models used for protein structure modeling and ranking are mainly divided into unified field and protein-specific scoring functions. Although protein structure prediction has made tremendous progress since CASP14, modeling accuracy still falls short of requirements in some cases. In particular, accurate modeling of multi-domain and orphan proteins remains a challenge. Therefore, an accurate and efficient protein scoring model is urgently needed to guide protein structure folding or ranking through deep learning. In this work, we propose a protein structure global scoring model based on an equivariant graph neural network (EGNN), named GraphGPSM, to guide protein structure modeling and ranking. We construct an EGNN architecture and design a message passing mechanism to update and transmit information between the nodes and edges of the graph. Finally, the global score of the protein model is output through a multilayer perceptron. Residue-level ultrafast shape recognition is used to describe the relationship between residues and the overall structure topology, and distance and direction encoded by Gaussian radial basis functions are designed to represent the overall topology of the protein backbone. These two features are combined with Rosetta energy terms, backbone dihedral angles, and inter-residue distances and orientations to represent the protein model, and are embedded into the nodes and edges of the graph neural network. The experimental results on the CASP13, CASP14 and CAMEO test sets show that the scores of GraphGPSM correlate strongly with the TM-score of the models, significantly better than those of the unified field score function REF2015 and of state-of-the-art local lDDT-based scoring models such as ModFOLD8, ProQ3D and DeepAccNet. The modeling experimental results on 484 test proteins demonstrate that GraphGPSM can greatly improve the modeling accuracy. GraphGPSM is further used to model 35 orphan proteins and 57 multi-domain proteins. The results show that the average TM-score of the models predicted by GraphGPSM is 13.2% and 7.1% higher, respectively, than that of the models predicted by AlphaFold2. GraphGPSM also participated in CASP15 and achieved competitive performance in global accuracy estimation.


Subject(s)
Algorithms , Proteins , Protein Conformation , Databases, Protein , Proteins/chemistry , Neural Networks, Computer
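
The abstract above mentions distances encoded by Gaussian radial basis functions as input features. A common way to do this is to spread a scalar distance over a set of evenly spaced Gaussian centers, so that the network sees a smooth vector instead of a raw number. The sketch below shows that generic encoding; the distance range, the number of centers, the width, and the function name are assumptions, and the abstract does not give GraphGPSM's actual settings.

```python
import math


def rbf_encode(distance, d_min=0.0, d_max=20.0, num_centers=16):
    """Encode a distance (angstroms) as Gaussian responses around evenly spaced centers."""
    step = (d_max - d_min) / (num_centers - 1)
    sigma = step  # width tied to center spacing (assumed choice)
    return [
        math.exp(-((distance - (d_min + k * step)) / sigma) ** 2)
        for k in range(num_centers)
    ]


# A distance of 8 angstroms responds most strongly at the center nearest to 8.
features = rbf_encode(8.0)
print(features.index(max(features)), len(features))
```

Nearby distances produce similar feature vectors, which is the property that makes this encoding convenient for edge features in a graph network.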