Results 1 - 20 of 53
1.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38341662

ABSTRACT

MOTIVATION: RNA threading aims to identify remote homologies for template-based modeling of RNA 3D structure. Existing RNA alignment methods primarily rely on secondary structure alignment. They are often time- and memory-consuming, limiting large-scale applications. In addition, the accuracy is far from satisfactory. RESULTS: Using RNA secondary structure and sequence profile, we developed a novel RNA threading algorithm, named RNAthreader. To enhance the alignment process and minimize memory usage, a novel approach has been introduced to simplify RNA secondary structures into compact diagrams. RNAthreader employs a two-step methodology. Initially, integer programming and dynamic programming are combined to create an initial alignment for the simplified diagram. Subsequently, the final alignment is obtained using dynamic programming, taking into account the initial alignment derived from the previous step. The benchmark test on 80 RNAs illustrates that RNAthreader generates more accurate alignments than other methods, especially for RNAs with pseudoknots. Another benchmark, involving 30 RNAs from the RNA-Puzzles experiments, shows that the models constructed using RNAthreader templates have a lower average RMSD than those created by alternative methods. Remarkably, RNAthreader takes less than two hours to complete alignments with ∼5000 RNAs, which is 3-40 times faster than other methods. These compelling results suggest that RNAthreader is a promising algorithm for RNA template detection. AVAILABILITY AND IMPLEMENTATION: https://yanglab.qd.sdu.edu.cn/RNAthreader.


Subjects
RNA, Software, RNA/chemistry, Sequence Alignment, Algorithms, Protein Secondary Structure
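
Illustrative note: the abstract for entry 1 describes alignment by dynamic programming. The sketch below shows only the generic idea, as a textbook Needleman-Wunsch global alignment score over plain sequences. It is not RNAthreader's two-step algorithm, which also uses integer programming, simplified secondary structure diagrams and sequence profiles. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (scoring values are hypothetical)."""
    rows, cols = len(a) + 1, len(b) + 1
    # dp[i][j] holds the best score for aligning a[:i] with b[:j].
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        dp[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                dp[i - 1][j] + gap,      # gap in b
                dp[i][j - 1] + gap,      # gap in a
            )
    return dp[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("GAUC", "GAC"))
```

The table needs memory proportional to the product of the two lengths, which is one reason large-scale alignment can become memory-consuming, as the abstract notes.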
2.
New Phytol ; 242(4): 1798-1813, 2024 May.
Article in English | MEDLINE | ID: mdl-38155454

ABSTRACT

It is well understood that agricultural management influences arbuscular mycorrhizal (AM) fungi, but there is controversy about whether farmers should manage for AM symbiosis. We assessed AM fungal communities colonizing wheat roots for three consecutive years in a long-term (> 14 yr) tillage and fertilization experiment. Relationships among mycorrhizas, crop performance, and soil ecosystem functions were quantified. Tillage, fertilizers and continuous monoculture all reduced AM fungal richness and shifted community composition toward dominance of a few ruderal taxa. Rhizophagus and Dominikia were depressed by tillage and/or fertilization, and their abundances as well as AM fungal richness correlated positively with soil aggregate stability and nutrient cycling functions across all or no-tilled samples. In the field, wheat yield was unrelated to AM fungal abundance and correlated negatively with AM fungal richness. In a complementary glasshouse study, wheat biomass was enhanced by soil inoculum from unfertilized, no-till plots while neutral to depressed growth was observed in wheat inoculated with soils from fertilized and conventionally tilled plots. This study demonstrates contrasting impacts of low-input and conventional agricultural practices on AM symbiosis and highlights the importance of considering both crop yield and soil ecosystem functions when managing mycorrhizas for more sustainable agroecosystems.


Subjects
Agricultural Crops, Ecosystem, Fertilizers, Mycorrhizae, Soil Microbiology, Soil, Triticum, Mycorrhizae/physiology, Soil/chemistry, Triticum/microbiology, Triticum/growth & development, Triticum/physiology, Agricultural Crops/microbiology, Agricultural Crops/growth & development, Agriculture/methods, Biomass, Plant Roots/microbiology, Time Factors, Biodiversity
3.
Nat Commun ; 14(1): 7266, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37945552

ABSTRACT

RNA 3D structure prediction is a long-standing challenge. Inspired by the recent breakthrough in protein structure prediction, we developed trRosettaRNA, an automated deep learning-based approach to RNA 3D structure prediction. The trRosettaRNA pipeline comprises two major steps: 1D and 2D geometries prediction by a transformer network; and 3D structure folding by energy minimization. Benchmark tests suggest that trRosettaRNA outperforms traditional automated methods. In the blind tests of the 15th Critical Assessment of Structure Prediction (CASP15) and the RNA-Puzzles experiments, the automated trRosettaRNA predictions for the natural RNAs are competitive with the top human predictions. trRosettaRNA also outperforms other deep learning-based methods in CASP15 when measured by the Z-score of the Root-Mean-Square Deviation. Nevertheless, it remains challenging to predict accurate structures for synthetic RNAs with an automated approach. We hope this work could be a good start toward solving the hard problem of RNA structure prediction with deep learning.


Subjects
Proteins, RNA, Humans, RNA/genetics, Nucleic Acid Conformation, Proteins/genetics
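
Illustrative note: the abstract for entry 3 compares methods by the Z-score of the Root-Mean-Square Deviation (RMSD). The sketch below shows one simple way such a Z-score could be computed for a single target. The function name, the group labels and the sign convention (flipped so that a lower RMSD gives a higher Z-score) are assumptions for illustration, and the official CASP15 procedure may differ in its details.

```python
from statistics import mean, pstdev


def rmsd_z_scores(rmsd_by_group):
    """Map each group to a Z-score of its RMSD; higher means better (assumed convention)."""
    values = list(rmsd_by_group.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All groups have the same RMSD, so no group stands out.
        return {group: 0.0 for group in rmsd_by_group}
    # Lower RMSD is better, so subtract the value from the mean.
    return {group: (mu - value) / sigma for group, value in rmsd_by_group.items()}


if __name__ == "__main__":
    # Hypothetical RMSD values in angstroms for three groups on one target.
    print(rmsd_z_scores({"A": 2.0, "B": 4.0, "C": 6.0}))
```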
4.
Mycorrhiza ; 33(5-6): 359-368, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37821597

ABSTRACT

Strong effects of plant identity, soil nutrient availability or mycorrhizal fungi on root traits have been well documented, but their interactive influences on root traits are still poorly understood. Here, three crop species (maize, wheat and soybean) were grown under four phosphorus (P) addition levels (0, 20, 40 and 60 mg P kg-1 dry soil), and plants were inoculated with or without five combined arbuscular mycorrhizal fungal (AMF) species. Plant biomass, nutrient contents, root traits (including total root length, average root diameter, specific root length and root tissue density) and plants' mycorrhizal responses were measured. Crop species, P level, AMF, and their interactions strongly affected plant biomass and root traits. P fertilization promoted plant growth but reduced mycorrhizal benefits on plant biomass and nutrient uptake. Root traits of maize were sensitive to P addition only under the non-mycorrhizal condition, whilst most root traits of soybean and wheat plants were responsive to mycorrhizal inoculation but not P addition. Mycorrhizal colonization reduced the root plasticity in response to P fertility for maize but not for wheat or soybean. This study highlights the importance of soil nutrient fertility and mycorrhizal symbiosis in influencing root traits.


Subjects
Mycorrhizae, Mycorrhizae/physiology, Soil, Glycine max, Triticum, Zea mays, Phosphorus, Plant Roots/microbiology
5.
Proteins ; 91(12): 1704-1711, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37565699

ABSTRACT

We present the monomer and multimer structure prediction results of our methods in CASP15. We first designed an elaborate pipeline that leverages complementary sequence databases and advanced database searching algorithms to generate high-quality multiple sequence alignments (MSAs). Top MSAs were then selected for the subsequent step of structure prediction. We utilized trRosettaX2 and AlphaFold2 for monomer structure prediction (group name Yang-Server), and AlphaFold-Multimer for multimer structure prediction (group name Yang-Multimer). Yang-Server and Yang-Multimer are ranked at the top and the fourth, respectively, for monomer and multimer structure prediction. For 94 monomers, the average TM-score of the predicted structure models by Yang-Server is 0.876, compared to 0.798 by the default AlphaFold2 (i.e., the group NBIS-AF2-standard). For 42 multimers, the average DockQ score of the predicted structure models by Yang-Multimer is 0.464, compared to 0.389 by the default AlphaFold-Multimer (i.e., the group NBIS-AF2-multimer). Detailed analysis of the results shows that several factors contribute to the improvement, including improved MSAs, iterated modeling for large targets, interplay between monomer and multimer structure prediction for intertwined structures, etc. However, the structure predictions for orphan proteins and multimers remain challenging, and breakthroughs in this area are anticipated in the future.


Subjects
Algorithms, Furylfuramide, Sequence Alignment, Nucleic Acid Databases
6.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36734597

ABSTRACT

MOTIVATION: It is fundamental to cut multi-domain proteins into individual domains, for precise domain-based structural and functional studies. In the past, sequence-based and structure-based domain parsing was carried out independently with different methodologies. The recent progress in deep learning-based protein structure prediction provides the opportunity to unify sequence-based and structure-based domain parsing. RESULTS: Based on the inter-residue distance matrix, which can be either derived from the input structure or predicted by trRosettaX, we can decode the domain boundaries under a unified framework. We name the proposed method UniDoc. The principle of UniDoc is based on the well-accepted physical concept of maximizing intra-domain interaction while minimizing inter-domain interaction. Comprehensive tests on five benchmark datasets indicate that UniDoc outperforms other state-of-the-art methods in terms of both accuracy and speed, for both sequence-based and structure-based domain parsing. The major contribution of UniDoc is providing a unified framework for structure-based and sequence-based domain parsing. We hope that UniDoc would be a convenient tool for protein domain analysis. AVAILABILITY AND IMPLEMENTATION: https://yanglab.nankai.edu.cn/UniDoc/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Computational Biology, Protein Domains, Computational Biology/methods, Proteins/chemistry
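
Illustrative note: the abstract for entry 6 states the principle of maximizing intra-domain interaction while minimizing inter-domain interaction, starting from an inter-residue distance matrix. The sketch below applies that principle in its simplest form: it tries every single cut point along the chain and keeps the one with the lowest density of contacts across the cut. This is not UniDoc's actual algorithm. The function name, the 8.0 contact threshold and the minimum domain length are assumptions for illustration.

```python
def best_two_domain_cut(dist, threshold=8.0, min_len=2):
    """Return (cut_index, inter_contact_density) for the best single cut.

    dist is a symmetric n x n list of lists of inter-residue distances.
    Residues [0, cut) form one domain and [cut, n) form the other.
    """
    n = len(dist)
    best_cut, best_density = None, float("inf")
    for cut in range(min_len, n - min_len + 1):
        # Count residue pairs in contact across the candidate boundary.
        inter = sum(
            1
            for i in range(cut)
            for j in range(cut, n)
            if dist[i][j] < threshold
        )
        # Normalize by the number of possible cross-boundary pairs.
        density = inter / (cut * (n - cut))
        if density < best_density:
            best_cut, best_density = cut, density
    return best_cut, best_density


if __name__ == "__main__":
    # Toy matrix: two compact blocks of three residues, far apart from each other.
    block = lambda i: i // 3
    toy = [[0.0 if i == j else (5.0 if block(i) == block(j) else 20.0)
            for j in range(6)] for i in range(6)]
    print(best_two_domain_cut(toy))
```

For a single cut, the total number of contacts is fixed, so minimizing contacts across the boundary is equivalent to maximizing contacts within the two domains. Handling more than two domains or discontinuous domains would need a more elaborate search.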
7.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36458437

ABSTRACT

One of the key features of intrinsically disordered regions (IDRs) is facilitation of protein-protein and protein-nucleic acid interactions. These disordered binding regions include molecular recognition features (MoRFs), short linear motifs (SLiMs) and longer binding domains. The vast majority of current predictors of disordered binding regions target MoRFs, with a handful of methods that predict SLiMs and disordered protein-binding domains. A new and broader class of disordered binding regions, linear interacting peptides (LIPs), was introduced recently and applied in the MobiDB resource. LIPs are segments in protein sequences that undergo disorder-to-order transition upon binding to a protein or a nucleic acid, and they cover MoRFs, SLiMs and disordered protein-binding domains. Although current predictors of MoRFs and disordered protein-binding regions could be used to identify some LIPs, there are no dedicated sequence-based predictors of LIPs. To this end, we introduce CLIP, a new predictor of LIPs that utilizes a robust logistic regression model to combine three complementary types of inputs: co-evolutionary information derived from multiple sequence alignments, physicochemical profiles and disorder predictions. Ablation analysis suggests that the co-evolutionary information is particularly useful for this prediction and that combining the three inputs provides substantial improvements when compared to using these inputs individually. Comparative empirical assessments using low-similarity test datasets reveal that CLIP secures an area under the receiver operating characteristic curve (AUC) of 0.8 and substantially improves over the results produced by the closest current tools that predict MoRFs and disordered protein-binding regions. The webserver of CLIP is freely available at http://biomine.cs.vcu.edu/servers/CLIP/ and the standalone code can be downloaded from http://yanglab.qd.sdu.edu.cn/download/CLIP/.


Subjects
Intrinsically Disordered Proteins, Intrinsically Disordered Proteins/chemistry, Computational Biology/methods, Amino Acid Sequence, Peptides/metabolism, Protein Domains, Protein Databases, Protein Binding
8.
Curr Opin Struct Biol ; 77: 102495, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36371845

ABSTRACT

Significant advances have been achieved in protein structure prediction, especially with the recent development of the AlphaFold2 and RoseTTAFold systems. This article reviews the progress in deep learning-based protein structure prediction methods in the past two years. First, we divide the representative methods into two categories: the two-step approach and the end-to-end approach. Then, we show that the two-step approach can achieve accuracy similar to that of the state-of-the-art end-to-end approach AlphaFold2. Compared to the end-to-end approach, the two-step approach requires fewer computing resources. We conclude that it is valuable to keep developing both approaches. Finally, a few outstanding challenges in function-orientated protein structure prediction are pointed out for future development.


Subjects
Deep Learning, Proteins/chemistry
9.
BMC Plant Biol ; 22(1): 60, 2022 Feb 03.
Article in English | MEDLINE | ID: mdl-35114932

ABSTRACT

BACKGROUND: The impacts of increasing nitrogen (N) deposition and overgrazing on terrestrial ecosystems have long been prominent research issues. Grazing exclusion, aimed at restoring grassland ecosystem function and service, has been extensively applied and is considered a rapid and effective vegetation restoration method. However, the combined effects of exclosure and N deposition on plant and community characteristics have rarely been studied. Here, a 4-year field experiment of N addition and exclusion treatment was conducted in the desert steppe dominated by Alhagi sparsifolia and Lycium ruthenicum in northwest China, and the responses of soil characteristics, plant nutrition and plant community to the treatments were analyzed. RESULTS: Grazing exclusion significantly increased total N concentration in the surface soil (0-20 cm), and increased plant height, coverage (P < 0.05) and aboveground biomass. Specifically, A. sparsifolia recovered faster than L. ruthenicum at both the individual and community levels after exclusion. There was no difference in response to N addition gradients between the two plants. CONCLUSIONS: Our findings suggest that exclusion, rather than N addition, has the greater impact on soil properties and plant community in desert steppe. The present N deposition level has no effect on the plant community of desert steppe based on short-term experimental treatments.


Subjects
Biodiversity, Ecosystem, Grassland, Herbivory, Nitrogen/metabolism, Plant Physiological Phenomena/drug effects, Soil Microbiology, China, Desert Climate
10.
Article in English | MEDLINE | ID: mdl-32750865

ABSTRACT

Many ligands simultaneously interact with multiple protein chains in quaternary structure (QS). However, a significant number of previous studies on template-based modeling of protein-ligand interactions were based on monomeric structure (MS), which may suffer from incomplete binding information. The defects of using MS rather than QS have not been systematically studied before. In this work, based on molecular docking experiments and binding free energy estimations, we performed a large-scale comparison of the protein-ligand interactions in both forms of structures. We found that (1) about 18.6 percent of biologically relevant ligands bind multiple chains in QS simultaneously; (2) for more than 95 percent of complexes with multiple chains involved in the interactions, the binding free energy is lower for the QS form than for the MS form; and (3) for over 70 percent of complexes with multi-chain binding pockets, docking with QS yields more accurate ligand conformations than docking with MS, while for about 1.82 percent of complexes, accurate docking conformations were obtained with MS. Based on this work, we encourage the use of QS rather than MS in future studies on protein-ligand interactions.


Subjects
Proteins, Binding Sites, Ligands, Molecular Docking Simulation, Protein Binding, Protein Conformation, Proteins/metabolism
11.
Bioinformatics ; 38(4): 962-969, 2022 01 27.
Article in English | MEDLINE | ID: mdl-34791040

ABSTRACT

MOTIVATION: Significant progress has been achieved in distance-based protein folding, due to improved prediction of inter-residue distance by deep learning. Many efforts have thus been made to improve distance prediction in recent years. However, the best way to objectively assess the accuracy of predicted distance remains unknown. RESULTS: A total of 19 metrics were proposed to measure the accuracy of predicted distance. These metrics were discussed and compared quantitatively on three benchmark datasets, with distance and structure models predicted by the trRosetta pipeline. The experiments show that a few metrics, such as distance precision, have a high correlation with the model accuracy measure TM-score (Pearson's correlation coefficient >0.7). In addition, the metrics are applied to rank the distance prediction groups in CASP14. The ranking by our metrics coincides largely with the official version. These data suggest that the proposed metrics are effective for measuring distance prediction. We anticipate that this study paves the way for objectively monitoring the progress of inter-residue distance prediction. A web server and a standalone package are provided to implement the proposed metrics. AVAILABILITY AND IMPLEMENTATION: http://yanglab.nankai.edu.cn/APD. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Proteins, Proteins/chemistry, Computational Biology, Protein Folding
12.
Nat Comput Sci ; 2(12): 804-814, 2022 Dec.
Article in English | MEDLINE | ID: mdl-38177395

ABSTRACT

Significant progress has been made in protein structure prediction in recent years. However, it remains challenging for AlphaFold2 and other deep learning-based methods to predict protein structure with single-sequence input. Here we introduce trRosettaX-Single, an automated algorithm for single-sequence protein structure prediction. It incorporates the sequence embedding from a supervised transformer protein language model into a multi-scale network enhanced by knowledge distillation to predict inter-residue two-dimensional geometry, which is then used to reconstruct three-dimensional structures via energy minimization. Benchmark tests show that trRosettaX-Single outperforms AlphaFold2 and RoseTTAFold on orphan proteins and works well on human-designed proteins (with an average template modeling score (TM-score) of 0.79). An experimental test shows that the full trRosettaX-Single pipeline is two times faster than AlphaFold2, using much fewer computing resources (<10%). On 2,000 designed proteins from network hallucination, trRosettaX-Single generates structure models with high confidence. As a demonstration, trRosettaX-Single is applied to missense mutation analysis. These data suggest that trRosettaX-Single may find potential applications in protein design and related studies.


Subjects
Algorithms, Benchmarking, Humans, Distillation, Electric Power Supplies, Language
13.
Nat Protoc ; 16(12): 5634-5651, 2021 12.
Article in English | MEDLINE | ID: mdl-34759384

ABSTRACT

The trRosetta (transform-restrained Rosetta) server is a web-based platform for fast and accurate protein structure prediction, powered by deep learning and Rosetta. With the input of a protein's amino acid sequence, a deep neural network is first used to predict the inter-residue geometries, including distance and orientations. The predicted geometries are then transformed as restraints to guide the structure prediction on the basis of direct energy minimization, which is implemented under the framework of Rosetta. The trRosetta server distinguishes itself from other similar structure prediction servers in terms of rapid and accurate de novo structure prediction. As an illustration, trRosetta was applied to two Pfam families with unknown structures, for which the predicted de novo models were estimated to have high accuracy. Nevertheless, to take advantage of homology modeling, homologous templates are used as additional inputs to the network automatically. In general, it takes ~1 h to predict the final structure for a typical protein with ~300 amino acids, using a maximum of 10 CPU cores in parallel in our cluster system. To enable large-scale structure modeling, a downloadable package of trRosetta with open-source codes is available as well. A detailed guidance for using the package is also available in this protocol. The server and the package are available at https://yanglab.nankai.edu.cn/trRosetta/ and https://yanglab.nankai.edu.cn/trRosetta/download/ , respectively.


Subjects
Amino Acids/chemistry, Computational Biology/methods, Proteins/chemistry, Software, Amino Acid Sequence, Internet, Molecular Dynamics Simulation, Neural Networks (Computer), alpha-Helical Protein Conformation, beta-Strand Protein Conformation, Protein Interaction Domains and Motifs, Thermodynamics
14.
Adv Sci (Weinh) ; 8(24): e2102592, 2021 12.
Article in English | MEDLINE | ID: mdl-34719864

ABSTRACT

The accuracy of de novo protein structure prediction has been improved considerably in recent years, mostly due to the introduction of deep learning techniques. In this work, trRosettaX, an improved version of trRosetta for protein structure prediction, is presented. The major improvement over trRosetta is twofold. The first is the application of a new multi-scale network, i.e., Res2Net, for improved prediction of inter-residue geometries, including distance and orientations. The second is an attention-based module to exploit multiple homologous templates to increase the accuracy further. Compared with trRosetta, trRosettaX improves the contact precision by 6% and 8% on the free modeling targets of CASP13 and CASP14, respectively. A preliminary version of trRosettaX is ranked as one of the top server groups in CASP14's blind test. An additional benchmark test on 161 targets from CAMEO (between Jun and Sep 2020) shows that trRosettaX achieves an average TM-score ≈0.8, outperforming the top groups in CAMEO. These data suggest the effectiveness of using the multi-scale network and the benefit of incorporating homologous templates into the network. The trRosettaX algorithm has been incorporated into the trRosetta server since Nov 2020. The web server and the training and inference code are available at: https://yanglab.nankai.edu.cn/trRosetta/.


Subjects
Computational Biology/methods, Deep Learning, Molecular Models, Neural Networks (Computer), Protein Conformation, Protein Sequence Analysis/methods, Datasets as Topic
15.
Biomolecules ; 11(9)2021 09 09.
Article in English | MEDLINE | ID: mdl-34572550

ABSTRACT

Non-synonymous single nucleotide polymorphisms (nsSNPs) may result in pathogenic changes that are associated with human diseases. Accurate prediction of these deleterious nsSNPs is in high demand. The existing predictors of deleterious nsSNPs secure modest levels of predictive performance, leaving room for improvements. We propose a new sequence-based predictor, DMBS, which addresses the need to improve the predictive quality. The design of DMBS relies on the observation that the deleterious mutations are likely to occur at the highly conserved and functionally important positions in the protein sequence. Correspondingly, we introduce two innovative components. First, we improve the estimates of the conservation computed from the multiple sequence profiles based on two complementary databases and two complementary alignment algorithms. Second, we utilize putative annotations of functional/binding residues produced by two state-of-the-art sequence-based methods. These inputs are processed by a random forests model that provides favorable predictive performance when empirically compared against five other machine-learning algorithms. Empirical results on four benchmark datasets reveal that DMBS achieves AUC > 0.94, outperforming current methods, including protein structure-based approaches. In particular, DMBS secures AUC = 0.97 for the SNPdbe and ExoVar datasets, compared to AUC = 0.70 and 0.88, respectively, that were obtained by the best available methods. Further tests on the independent HumVar dataset show that our method significantly outperforms the state-of-the-art method SNPdryad. We conclude that DMBS provides accurate predictions that can effectively guide wet-lab experiments in a high-throughput manner.


Subjects
Algorithms, Computational Biology/methods, Single Nucleotide Polymorphism/genetics, Proteins/chemistry, Proteins/metabolism, Area Under Curve, Base Sequence, Genetic Databases, Humans, Ligands, Machine Learning, Protein Binding, ROC Curve
16.
Bioinformatics ; 37(21): 3752-3759, 2021 11 05.
Article in English | MEDLINE | ID: mdl-34473228

ABSTRACT

MOTIVATION: Protein model quality assessment (QA) is an essential component in protein structure prediction, which aims to estimate the quality of a structure model and/or select the most accurate model from a pool of structure models, without knowing the native structure. QA remains a challenging task in protein structure prediction. RESULTS: Based on the inter-residue distance predicted by the recent deep learning-based structure prediction algorithm trRosetta, we developed QDistance, a new approach to the estimation of both global and local qualities. QDistance works for both single- and multi-model inputs. We designed several distance-based features to assess the agreement between the predicted and model-derived inter-residue distances. Together with a few widely used features, they are fed into a simple yet powerful linear regression model to infer the global QA scores. The local QA scores for each structure model are predicted based on a comparative analysis with a set of selected reference models. For multi-model input, the reference models are selected from the input based on the predicted global QA scores. For single-model input, the reference models are predicted by trRosetta. With the informative distance-based features, QDistance can predict the global quality with satisfactory accuracy. Benchmark tests on the CASP13 and the CAMEO structure models suggested that QDistance was competitive with other methods. Blind tests in the CASP14 experiments showed that QDistance was robust and ranked among the top predictors. In particular, QDistance ranked among the top 3 local QA methods and made the most accurate local QA predictions for unreliable local regions. Analysis showed that this superior performance can be attributed to the inclusion of the predicted inter-residue distance. AVAILABILITY AND IMPLEMENTATION: http://yanglab.nankai.edu.cn/QDistance. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology, Proteins, Computational Biology/methods, Proteins/chemistry, Algorithms
17.
Bioinformatics ; 37(1): 36-42, 2021 Apr 09.
Article in English | MEDLINE | ID: mdl-33416863

ABSTRACT

MOTIVATION: RNA molecules have become attractive small-molecule drug targets for treating disease in recent years. Computer-aided drug design can be facilitated by detecting the RNA sites that bind small molecules. However, very limited progress has been reported for the prediction of small molecule-RNA binding sites. RESULTS: We developed a novel method, RNAsite, to predict small molecule-RNA binding sites using sequence profile- and structure-based descriptors. RNAsite was shown to be competitive with the state-of-the-art methods on the experimental structures of two independent test sets. When predicted structure models were used, RNAsite outperformed other methods by a large margin. The possibility of improving RNAsite by geometry-based binding pocket detection was investigated. The influence of RNA structural flexibility and of ligand-induced conformational changes on RNAsite was also discussed. RNAsite is anticipated to be a useful tool for the design of RNA-targeting small molecule drugs. AVAILABILITY AND IMPLEMENTATION: http://yanglab.nankai.edu.cn/RNAsite. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

18.
Bioinformatics ; 37(8): 1093-1098, 2021 05 23.
Article in English | MEDLINE | ID: mdl-33135062

ABSTRACT

MOTIVATION: Recent years have witnessed that the inter-residue contact/distance in proteins could be accurately predicted by deep neural networks, which significantly improve the accuracy of predicted protein structure models. In contrast, fewer studies have been done for the prediction of RNA inter-nucleotide 3D closeness. RESULTS: We proposed a new algorithm named RNAcontact for the prediction of RNA inter-nucleotide 3D closeness. RNAcontact was built based on the deep residual neural networks. The covariance information from multiple sequence alignments and the predicted secondary structure were used as the input features of the networks. Experiments show that RNAcontact achieves the respective precisions of 0.8 and 0.6 for the top L/10 and L (where L is the length of an RNA) predictions on an independent test set, significantly higher than other evolutionary coupling methods. Analysis shows that about 1/3 of the correctly predicted 3D closenesses are not base pairings of secondary structure, which are critical to the determination of RNA structure. In addition, we demonstrated that the predicted 3D closeness could be used as distance restraints to guide RNA structure folding by the 3dRNA package. More accurate models could be built by using the predicted 3D closeness than the models without using 3D closeness. AVAILABILITY AND IMPLEMENTATION: The webserver and a standalone package are available at: http://yanglab.nankai.edu.cn/RNAcontact/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology, RNA, Algorithms, Neural Networks (Computer), Nucleotides, Sequence Alignment
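
Illustrative note: the abstract for entry 18 reports precision for the top L/10 and top L predictions, where L is the RNA length. The sketch below shows how such a top-L/k precision is commonly computed from scored nucleotide pairs. The function name, the input layout (a dict from index pairs to scores) and the toy values are assumptions for illustration, and the exact evaluation protocol used for RNAcontact may differ, for example in how close-in-sequence pairs are filtered.

```python
def top_precision(scores, true_pairs, length, k=10):
    """Fraction of the top length//k scored pairs that are true contacts.

    scores maps (i, j) index pairs to predicted scores.
    true_pairs is an iterable of (i, j) pairs observed in the native structure.
    """
    # Normalize pair order so (i, j) and (j, i) are treated alike.
    true_set = {tuple(sorted(pair)) for pair in true_pairs}
    n_top = max(1, length // k)
    # Rank pairs by descending score and keep the top n_top.
    ranked = sorted(scores, key=scores.get, reverse=True)[:n_top]
    hits = sum(1 for pair in ranked if tuple(sorted(pair)) in true_set)
    return hits / len(ranked) if ranked else 0.0


if __name__ == "__main__":
    # Hypothetical predictions for an RNA of length 20.
    predicted = {(0, 5): 0.9, (1, 7): 0.8, (2, 9): 0.1}
    native = {(0, 5), (2, 9)}
    print(top_precision(predicted, native, length=20, k=10))
```

Setting k=1 gives the top-L precision, and k=10 gives the top-L/10 precision quoted in the abstract.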
19.
Bioinformatics ; 36(Suppl_2): i754-i761, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381830

ABSTRACT

MOTIVATION: Disordered flexible linkers (DFLs) are abundant and functionally important intrinsically disordered regions that connect protein domains and structural elements within domains and which facilitate disorder-based allosteric regulation. Although computational estimates suggest that thousands of proteins have DFLs, they were annotated experimentally in <200 proteins. This substantial annotation gap can be reduced with the help of accurate computational predictors. The sole predictor of DFLs, DFLpred, trades off accuracy for shorter runtime by excluding relevant but computationally costly predictive inputs. Moreover, it relies on the local/window-based information while failing to consider useful protein-level characteristics. RESULTS: We conceptualize, design and test APOD (Accurate Predictor Of DFLs), the first highly accurate predictor that utilizes both local- and protein-level inputs that quantify propensity for disorder, sequence composition, sequence conservation and selected putative structural properties. Consequently, APOD offers significantly more accurate predictions when compared with its faster predecessor, DFLpred, and several other alternative ways to predict DFLs. These improvements stem from the use of a more comprehensive set of inputs that cover the protein-level information and the application of a more sophisticated predictive model, a well-parametrized support vector machine. APOD achieves area under the curve = 0.82 (28% improvement over DFLpred) and Matthews correlation coefficient = 0.42 (180% increase over DFLpred) when tested on an independent/low-similarity test dataset. Consequently, APOD is a suitable choice for accurate and small-scale prediction of DFLs. AVAILABILITY AND IMPLEMENTATION: https://yanglab.nankai.edu.cn/APOD/.


Subjects
Computational Biology, Intrinsically Disordered Proteins, Protein Databases, Protein Domains, Proteins/genetics, Support Vector Machine
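
Illustrative note: the abstract for entry 19 reports a Matthews correlation coefficient (MCC). For readers unfamiliar with the metric, the sketch below computes it from the four cells of a binary confusion matrix using the standard formula. The function name and the example counts are assumptions for illustration and are not taken from the APOD evaluation.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventionally reported as 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts: residues predicted as linker vs. non-linker.
    print(mcc(tp=30, tn=900, fp=40, fn=30))
```

MCC ranges from -1 to 1 and stays informative when one class is rare, which is typical for per-residue annotation tasks where positive residues are a small minority.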
20.
Methods Mol Biol ; 2106: 225-239, 2020.
Article in English | MEDLINE | ID: mdl-31889261

ABSTRACT

RNA chaperone activity is one of the many functions of intrinsically disordered regions (IDRs). IDRs function without the prerequisite of a stable structure. Instead, their functions arise from structural ensembles. A common theme in IDR function is molecular recognition; IDRs mediate interactions with other proteins, RNA, and DNA. Many computational methods are available to predict IDRs from protein sequence, but relatively few are available for predicting IDR functions. Available methods primarily focus on protein-protein interactions. DisoRDPbind was developed to predict several protein functions including interactions with RNA. This method is available as a user-friendly web interface, located at http://biomine.cs.vcu.edu/servers/DisoRDPbind/ . The development and architecture of DisoRDPbind is briefly presented, and its accuracy relative to other RNA-binding residue predictors is discussed. We explain usage of the web interface in detail and provide an example of prediction results and interpretation. While DisoRDPbind does not identify RNA chaperones directly, we provide a case study of an RNA chaperone, HCV core protein, as an example of the method's utility in the study of RNA chaperones.


Subjects
Intrinsically Disordered Proteins/chemistry, Molecular Chaperones/chemistry, RNA-Binding Proteins/chemistry, Protein Sequence Analysis/methods, Software, Animals, Humans, Intrinsically Disordered Proteins/metabolism, Molecular Chaperones/metabolism, Protein Domains, RNA/metabolism, RNA-Binding Proteins/metabolism