Results 1 - 12 of 12
1.
BMC Bioinformatics ; 22(1): 439, 2021 Sep 15.
Article in English | MEDLINE | ID: mdl-34525939

ABSTRACT

BACKGROUND: Accurate prediction of protein tertiary structures is highly desired as the knowledge of protein structures provides invaluable insights into protein functions. We have designed two approaches to protein structure prediction: a template-based modeling approach (called ProALIGN) and an ab initio prediction approach (called ProFOLD). Briefly, ProALIGN aligns a target protein with templates by exploiting the patterns of context-specific alignment motifs and then builds the final structure with reference to the homologous templates. In contrast, ProFOLD uses an end-to-end neural network to estimate inter-residue distances of target proteins and builds structures that satisfy these distance constraints. These two approaches emphasize different characteristics of target proteins: ProALIGN exploits structure information of homologous templates of target proteins while ProFOLD exploits the co-evolutionary information carried by homologous protein sequences. Recent progress has shown that the combination of template-based modeling and ab initio approaches is promising.
RESULTS: In this study, we present FALCON2, a web server that integrates ProALIGN and ProFOLD to provide a high-quality protein structure prediction service. For a target protein, FALCON2 executes ProALIGN and ProFOLD simultaneously to predict possible structures and selects the most likely one as the final prediction result. We evaluated FALCON2 on widely used benchmarks, including 104 CASP13 (the 13th Critical Assessment of protein Structure Prediction) targets and 91 CASP14 targets. In-depth examination suggests that when high-quality templates are available, ProALIGN is superior to ProFOLD, and in other cases ProFOLD shows better performance. By integrating these two approaches with different emphases, the FALCON2 server outperforms the two individual approaches and also achieves state-of-the-art performance compared with existing approaches.
CONCLUSIONS: By integrating template-based modeling and ab initio approaches, FALCON2 provides an easy-to-use and high-quality protein structure prediction service for the community and we expect it to enable insights into a deep understanding of protein functions.
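
The selection step described in this abstract (run both predictors, keep the most likely structure) can be sketched as follows. This is only an illustration: the abstract does not say how FALCON2 scores candidates, so the `estimated_quality` field, the class name, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateModel:
    method: str               # e.g. "ProALIGN" or "ProFOLD"
    estimated_quality: float  # hypothetical score; higher means more likely


def select_final_model(candidates):
    """Return the candidate with the highest estimated quality (assumed criterion)."""
    if not candidates:
        raise ValueError("no candidate models to choose from")
    return max(candidates, key=lambda model: model.estimated_quality)


if __name__ == "__main__":
    models = [CandidateModel("ProALIGN", 0.41), CandidateModel("ProFOLD", 0.67)]
    print(select_final_model(models).method)
```

Under this assumed scoring, a target with a strong template would give the template-based candidate the higher score, and a target without one would fall through to the ab initio candidate.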


Subjects
Neural Networks, Computer ; Proteins ; Amino Acid Sequence ; Computers ; Protein Conformation ; Software
2.
Proteins ; 86 Suppl 1: 113-121, 2018 03.
Article in English | MEDLINE | ID: mdl-28940798

ABSTRACT

We describe several notable aspects of our structure predictions using Rosetta in CASP12 in the free modeling (FM) and refinement (TR) categories. First, we had previously generated (and published) models for most large protein families lacking experimentally determined structures using Rosetta guided by co-evolution based contact predictions, and for several targets these models proved better starting points for comparative modeling than any known crystal structure; our model database thus starts to fulfill one of the goals of the original protein structure initiative. Second, while our "human" group simply submitted ROBETTA models for most targets, for six targets expert intervention improved predictions considerably; the largest improvement was for T0886 where we correctly parsed two discontinuous domains guided by predicted contact maps to accurately identify a structural homolog of the same fold. Third, Rosetta all atom refinement followed by MD simulations led to consistent but small improvements when starting models were close to the native structure, and larger but less consistent improvements when starting models were further away.


Subjects
Computational Biology/methods ; Models, Molecular ; Protein Conformation ; Protein Folding ; Proteins/chemistry ; Algorithms ; Crystallography, X-Ray ; Humans ; Sequence Analysis, Protein
3.
Int J Mod Phys B ; 32(18)2018 Jul 20.
Article in English | MEDLINE | ID: mdl-30853739

ABSTRACT

Predicting the 3D structure of a protein from its amino acid sequence is one of the most important unsolved problems in biophysics and computational biology. This paper attempts to give a comprehensive introduction to the most recent efforts and progress in protein structure prediction. Following the general flowchart of structure prediction, related concepts and methods are presented and discussed. Moreover, brief introductions are given to several widely used prediction methods and to the community-wide critical assessment of protein structure prediction (CASP) experiments.

4.
BMC Bioinformatics ; 17(Suppl 18): 474, 2016 12 15.
Article in English | MEDLINE | ID: mdl-28105918

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) are key gene expression regulators in plants and animals. Therefore, miRNAs are involved in several biological processes, making the study of these molecules one of the most relevant topics of molecular biology nowadays. However, characterizing miRNAs in vivo is still a complex task. As a consequence, in silico methods have been developed to predict miRNA loci. A common ab initio strategy to find miRNAs in genomic data is to search for sequences that can fold into the typical hairpin structure of miRNA precursors (pre-miRNAs). The current ab initio approaches, however, have selectivity issues, i.e., a high number of false positives is reported, which can lead to laborious and costly attempts to provide biological validation. This study presents an extension of the ab initio method miRNAFold, with the aim of improving selectivity through machine learning techniques, namely, random forest combined with the SMOTE procedure, which copes with imbalanced datasets.
RESULTS: By comparing our method, termed Mirnacle, with other important approaches in the literature, we demonstrate that Mirnacle substantially improves selectivity without compromising sensitivity. For the three datasets used in our experiments, our method achieved at least 97% sensitivity and could deliver a two-fold, 20-fold, and 6-fold increase in selectivity, respectively, compared with the best results of current computational tools.
CONCLUSIONS: The extension of miRNAFold by the introduction of machine learning techniques significantly increases selectivity in pre-miRNA ab initio prediction, which contributes to advanced studies on miRNAs, as the need for biological validation is diminished. Hopefully, new research, such as studies of severe diseases caused by miRNA malfunction, will benefit from the proposed computational tool.
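
SMOTE, named in this abstract as the remedy for class imbalance, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below shows only that interpolation idea in plain Python. It is not the Mirnacle implementation; the function name, the default `k`, and the toy feature vectors are assumptions for illustration.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic minority samples by neighbour interpolation (SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Find its k nearest minority-class neighbours (Euclidean distance).
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new sample at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


if __name__ == "__main__":
    # Toy 2-D feature vectors standing in for real pre-miRNA features.
    real_hairpins = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    for point in smote_oversample(real_hairpins, 3):
        print(point)
```

The synthetic samples would then be added to the minority class before training a classifier such as a random forest.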


Subjects
Computational Biology/methods ; Eukaryota/genetics ; Genomics/methods ; MicroRNAs/chemistry ; Animals ; Computational Biology/instrumentation ; Computer Simulation ; Eukaryota/chemistry ; Genome ; Genomics/instrumentation ; Humans ; Machine Learning ; MicroRNAs/genetics ; Nucleic Acid Conformation ; Plants/chemistry ; Plants/genetics
5.
Proteins ; 84 Suppl 1: 67-75, 2016 09.
Article in English | MEDLINE | ID: mdl-26677056

ABSTRACT

We describe CASP11 de novo blind structure predictions made using the Rosetta structure prediction methodology with both automatic and human assisted protocols. Model accuracy was generally improved using coevolution derived residue-residue contact information as restraints during Rosetta conformational sampling and refinement, particularly when the number of sequences in the family was more than three times the length of the protein. The highlight was the human assisted prediction of T0806, a large and topologically complex target with no homologs of known structure, which had unprecedented accuracy: <3.0 Å root-mean-square deviation (RMSD) from the crystal structure over 223 residues. For this target, we increased the amount of conformational sampling over our fully automated method by employing an iterative hybridization protocol. Our results clearly demonstrate, in a blind prediction scenario, that coevolution derived contacts can considerably increase the accuracy of template-free structure modeling. Proteins 2016; 84(Suppl 1):67-75. © 2015 Wiley Periodicals, Inc.
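
The RMSD figure quoted in this abstract is the root of the mean squared distance between matched atoms of a model and a reference structure. A minimal sketch follows, assuming the two coordinate lists are already superimposed and paired residue by residue. A real comparison would first compute an optimal superposition, which is omitted here, and the function name is illustrative.

```python
import math


def rmsd(model_coords, reference_coords):
    """Root-mean-square deviation between two pre-superimposed coordinate lists."""
    if len(model_coords) != len(reference_coords) or not model_coords:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(model_coords, reference_coords))
    return math.sqrt(squared / len(model_coords))


if __name__ == "__main__":
    # Two toy atoms, each displaced by 3 Å along z.
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(rmsd(model, reference))
```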


Subjects
Computational Biology/statistics & numerical data ; Escherichia coli Proteins/chemistry ; Models, Molecular ; Models, Statistical ; Software ; Amino Acid Sequence ; Computational Biology/methods ; Crystallography, X-Ray ; Directed Molecular Evolution ; Escherichia coli/chemistry ; Humans ; Internet ; Protein Folding ; Protein Interaction Domains and Motifs ; Protein Structure, Secondary ; Sequence Alignment
6.
Proteins ; 84(3): 332-48, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26756402

ABSTRACT

In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM, which is trained mainly on position-specific scoring matrix features, and COMSAT_MILP, which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on the contact definition of 14 Å between Cα-Cα atoms. First, using a rigorous leave-one-protein-out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews' correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and an MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state-of-the-art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM proteins increase. COMSAT is freely accessible at http://hpcc.siat.ac.cn/COMSAT/.
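
The four figures reported in this abstract can all be derived from the counts of true and false positives and negatives over residue pairs. The abstract does not give its formulas, so the sketch below assumes the definitions commonly used in contact prediction: accuracy as TP/(TP+FP), coverage as TP/(TP+FN), and specificity as TN/(TN+FP), together with the standard MCC formula. The function name and the toy counts are illustrative.

```python
import math


def contact_metrics(tp, fp, tn, fn):
    """Compute contact-prediction metrics from confusion counts (assumed definitions)."""
    accuracy = tp / (tp + fp) if tp + fp else 0.0      # fraction of predicted contacts that are real
    coverage = tp / (tp + fn) if tp + fn else 0.0      # fraction of real contacts that were predicted
    specificity = tn / (tn + fp) if tn + fp else 0.0   # fraction of non-contacts correctly rejected
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "coverage": coverage,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy counts, not taken from the COMSAT evaluation.
    for name, value in contact_metrics(tp=2, fp=2, tn=94, fn=2).items():
        print(f"{name}: {value:.3f}")
```

Because non-contacts vastly outnumber contacts, specificity stays near 100% even when coverage is low, which is the pattern seen in the reported numbers.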


Subjects
Computer Simulation ; Membrane Proteins/chemistry ; Models, Molecular ; Programming, Linear ; Algorithms ; Pattern Recognition, Automated ; Position-Specific Scoring Matrices ; Protein Structure, Secondary ; Protein Structure, Tertiary ; Reproducibility of Results ; Sequence Alignment ; Sequence Analysis, Protein ; Structural Homology, Protein ; Support Vector Machine
7.
Proteins ; 84 Suppl 1: 145-51, 2016 09.
Article in English | MEDLINE | ID: mdl-26205532

ABSTRACT

Here we present the results of residue-residue contact predictions achieved in CASP11 by the CONSIP2 server, which is based around our MetaPSICOV contact prediction method. On a set of 40 target domains with a median family size of around 40 effective sequences, our server achieved an average top-L/5 long-range contact precision of 27%. The MetaPSICOV method is based on a combination of classical contact prediction features, enhanced with three distinct covariation methods embedded in a two-stage neural network predictor. Some unique features of our approach are (1) the tuning between the classical and covariation features depending on the depth of the input alignment and (2) a hybrid approach to generate the deepest possible multiple-sequence alignments by combining jackHMMer and HHblits. We discuss the CONSIP2 pipeline and our results, and show that where the method underperformed, the major factor was relying on a fixed set of parameters for the initial sequence alignments and not attempting to perform domain splitting as a preprocessing step. Proteins 2016; 84(Suppl 1):145-151. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
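
The "top-L/5 long-range contact precision" quoted in this abstract is computed by ranking predicted contacts by confidence, keeping the L/5 highest-scoring long-range pairs (L being the sequence length), and counting how many are true contacts. A minimal sketch follows. The abstract does not define "long-range", so the sequence separation of at least 24 residues used here is an assumption based on the usual CASP convention, and the function and parameter names are illustrative.

```python
def top_l5_long_range_precision(predictions, native_contacts, length, min_separation=24):
    """Precision of the top L/5 long-range predicted contacts.

    predictions: iterable of (i, j, score) residue pairs with a confidence score.
    native_contacts: set of (i, j) pairs with i < j that are true contacts.
    length: sequence length L.
    min_separation: assumed long-range cutoff in sequence positions.
    """
    long_range = [p for p in predictions if abs(p[0] - p[1]) >= min_separation]
    long_range.sort(key=lambda p: p[2], reverse=True)
    top = long_range[: max(1, length // 5)]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top if (min(i, j), max(i, j)) in native_contacts)
    return hits / len(top)


if __name__ == "__main__":
    # Toy predictions: the (3, 5) pair is short-range and is ignored.
    predicted = [(0, 30, 0.9), (1, 40, 0.8), (2, 50, 0.7), (3, 5, 0.99)]
    native = {(0, 30), (2, 50)}
    print(top_l5_long_range_precision(predicted, native, length=60))
```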


Subjects
Computational Biology/statistics & numerical data ; Machine Learning ; Models, Molecular ; Models, Statistical ; Proteins/chemistry ; Software ; Amino Acid Sequence ; Bacteria/chemistry ; Computational Biology/methods ; Computer Simulation ; Databases, Protein ; Humans ; Internet ; Neural Networks, Computer ; Protein Folding ; Protein Interaction Domains and Motifs ; Protein Structure, Secondary ; Sequence Alignment ; Viruses/chemistry
8.
Proteins ; 82 Suppl 2: 208-18, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23900763

ABSTRACT

A number of methods have been described for identifying pairs of contacting residues in protein three-dimensional structures, but it is unclear how many contacts are required for accurate structure modeling. The CASP10 assisted contact experiment provided a blind test of contact guided protein structure modeling. We describe the models generated for these contact guided prediction challenges using the Rosetta structure modeling methodology. For nearly all cases, the submitted models had the correct overall topology, and in some cases, they had near atomic-level accuracy; for example the model of the 384 residue homo-oligomeric tetramer (Tc680o) had only 2.9 Å root-mean-square deviation (RMSD) from the crystal structure. Our results suggest that experimental and bioinformatic methods for obtaining contact information may need to generate only one correct contact for every 12 residues in the protein to allow accurate topology level modeling.


Subjects
Computational Biology/methods ; Models, Molecular ; Protein Conformation ; Proteins/chemistry ; Models, Statistical ; Sequence Alignment
9.
Proteins ; 80(2): 490-504, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22095594

ABSTRACT

In fragment-assembly techniques for protein structure prediction, models of protein structure are assembled from fragments of known protein structures. This process is typically guided by a knowledge-based energy function and uses a heuristic optimization method. The fragments play two important roles in this process: they define the set of structural parameters available, and they also assume the role of the main variation operators that are used by the optimiser. Previous analysis has typically focused on the first of these roles. In particular, the relationship between local amino acid sequence and local protein structure has been studied by a range of authors. The correlation between the two has been shown to vary with the window length considered, and the results of these analyses have informed directly the choice of fragment length in state-of-the-art prediction techniques. Here, we focus on the second role of fragments and aim to determine the effect of fragment length from an optimization perspective. We use theoretical analyses to reveal how the size and structure of the search space changes as a function of insertion length. Furthermore, empirical analyses are used to explore additional ways in which the size of the fragment insertion influences the search both in a simulation model and for the fragment-assembly technique, Rosetta.


Subjects
Models, Molecular ; Peptide Fragments/chemistry ; Proteins/chemistry ; Algorithms ; Markov Chains ; Protein Conformation
10.
Methods Mol Biol ; 2257: 167-174, 2022.
Article in English | MEDLINE | ID: mdl-34432278

ABSTRACT

MicroRNA (miRNA) studies have been one of the most popular research areas in recent years. Although thousands of miRNAs have been detected in several species, the majority remains unidentified. Thus, finding novel miRNAs is a vital element for investigating miRNA mediated posttranscriptional gene regulation machineries. Furthermore, experimental methods have challenging inadequacies in their capability to detect rare miRNAs, and are also limited to the state of the organism under examination (e.g., tissue type, developmental stage, stress-disease conditions). These issues have initiated the creation of high-level computational methodologies endeavoring to distinguish potential miRNAs in silico. On the other hand, most of these tools suffer from high numbers of false positives and/or false negatives and as a result they do not provide enough confidence for validating all their predictions experimentally. In this chapter, computational difficulties in detection of pre-miRNAs are discussed and a machine learning based approach that has been designed to address these issues is reviewed.


Subjects
Computational Biology ; MicroRNAs/genetics ; Machine Learning
11.
Methods Mol Biol ; 2257: 311-347, 2022.
Article in English | MEDLINE | ID: mdl-34432286

ABSTRACT

The critical role microRNAs play in modulating global functions is emerging, both in the maintenance of homeostatic mechanisms and in the adaptation to diverse environmental stresses. When stressed, cells must divert metabolic requirements toward immediate survival and eventual recovery, and the unique features of miRNAs, such as their relatively ATP-inexpensive biogenesis costs and the quick and reversible nature of their action, render them excellent "master controllers" for rapid responses. Many animal survival strategies for dealing with extreme environmental pressures involve prolonged retreats into states of suspended animation to extend the time that they can survive on their limited internal fuel reserves until conditions improve. The ability to retreat into such hypometabolic states is only possible by coupling the global suppression of nonessential energy-expensive functions with an activation of prosurvival networks, a process in which miRNAs are now known to play a major role. In this chapter, we discuss the activation, expression, biogenesis, and unique attributes of miRNA regulation required to facilitate profound metabolic rate depression and implement stress-specific metabolic adaptations. We examine the role of miRNA in strategies of biochemical adaptation including mammalian hibernation, freeze tolerance, freeze avoidance, anoxia and hypoxia survival, estivation, and dehydration tolerance. By comparing these seemingly different adaptive programs in traditional and exotic animal models, we highlight both unique and conserved miRNA-mediated mechanisms for survival. Additional topics discussed include transcription factor networks, temperature dependent miRNA-targeting, and novel species-specific and stress-specific miRNAs.


Subjects
MicroRNAs/genetics ; Acclimatization ; Adaptation, Physiological ; Animals ; Freezing ; Hibernation ; Hypoxia
12.
Methods Mol Biol ; 1484: 115-126, 2017.
Article in English | MEDLINE | ID: mdl-27787823

ABSTRACT

Recent successes of contact-guided protein structure prediction methods have revived interest in solving the long-standing problem of ab initio protein structure prediction. With homology modeling failing for many protein sequences that do not have templates, contact-guided structure prediction has shown promise, and consequently, contact prediction has gained a lot of interest recently. Although a few dozen contact prediction tools are currently available as web servers and downloadables, not enough research has been done towards using existing measures like precision and recall to evaluate these contacts with the goal of building three-dimensional models. Moreover, when we do not have a native structure for a set of predicted contacts, the only analysis we can perform is a simple contact map visualization of the predicted contacts. A wider and more rigorous assessment of the predicted contacts is needed in order to build tertiary structure models. This chapter discusses instructions and protocols for using tools and applying techniques in order to assess predicted contacts for building three-dimensional models.


Subjects
Computational Biology/methods ; Proteins/chemistry ; Software ; Algorithms ; Databases, Protein ; Neural Networks, Computer ; Protein Conformation ; Protein Folding ; Proteins/genetics ; Sequence Analysis, Protein