Results 1 - 5 of 5
1.
Protein Eng Des Sel ; 31(9): 345-354, 2018 09 01.
Article in English | MEDLINE | ID: mdl-30407584

ABSTRACT

A wide variety of protein and peptidomimetic design tasks require matching functional 3D motifs to potential oligomeric scaffolds. For example, during enzyme design, one aims to graft active-site patterns, typically consisting of 3-15 residues, onto new protein surfaces. Identifying protein scaffolds suitable for such active-site engraftment requires costly searches for protein folds that provide the correct side chain positioning to host the desired active site. Other examples of biodesign tasks that require similar fast exact geometric searches of potential side chain positioning include mimicking binding hotspots, design of metal binding clusters and the design of modular hydrogen-bonding networks for specificity. In these applications, the speed and scaling of geometric searches limit the scope of downstream design to small patterns. Here, we present an adaptive algorithm capable of searching for side chain take-off angles, which is compatible with an arbitrarily specified functional pattern and which enjoys substantive performance improvements over previous methods. We demonstrate this method in both genetically encoded (protein) and synthetic (peptidomimetic) design scenarios. Examples of using this method with the Rosetta framework for protein design are provided. Our implementation is compatible with multiple protein design frameworks and is freely available as a set of Python scripts (https://github.com/JiangTian/adaptive-geometric-search-for-protein-design).


Subjects
Models, Molecular , Protein Engineering/methods , Recombinant Proteins , Algorithms , Catalytic Domain/genetics , Computational Biology , Metals/chemistry , Metals/metabolism , Protein Binding , Protein Conformation , Protein Folding , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism
2.
Genome Biol ; 17(1): 184, 2016 09 07.
Article in English | MEDLINE | ID: mdl-27604469

ABSTRACT

BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific and that different performance metrics can be used to probe the nature of accurate predictions, and it highlighted the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.


Subjects
Computational Biology , Proteins/chemistry , Software , Structure-Activity Relationship , Algorithms , Databases, Protein , Gene Ontology , Humans , Molecular Sequence Annotation , Proteins/genetics
3.
PLoS Comput Biol ; 10(6): e1003644, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24922051

ABSTRACT

Negative examples (genes that are known not to carry out a given protein function) are rarely recorded in genome and proteome annotation databases, such as the Gene Ontology database. Negative examples are required, however, for several of the most powerful machine learning methods for integrative protein function prediction. Most protein function prediction efforts have relied on a variety of heuristics for the choice of negative examples. Determining the accuracy of methods for negative example prediction is itself a non-trivial task, given that the Open World Assumption as applied to gene annotations rules out many traditional validation metrics. We present a rigorous comparison of these heuristics, utilizing a temporal holdout, and a novel evaluation strategy for negative examples. We add to this comparison several algorithms adapted from Positive-Unlabeled learning scenarios in text classification, which are the current state-of-the-art methods for generating negative examples in low-density annotation contexts. Lastly, we present two novel algorithms of our own construction, one based on empirical conditional probability, and the other using topic modeling applied to genes and annotations. We demonstrate that our algorithms achieve significantly fewer incorrect negative example predictions than the current state of the art, using multiple benchmarks covering multiple organisms. Our methods may be applied to generate negative examples for any type of method that deals with protein function, and to this end we provide a database of negative examples in several well-studied organisms, for general use (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).


Subjects
Algorithms , Databases, Genetic , Gene Ontology , Proteins/genetics , Proteins/physiology , Animals , Arabidopsis Proteins/genetics , Arabidopsis Proteins/physiology , Artificial Intelligence , Computational Biology , Genome , Humans , Mice , Molecular Sequence Annotation , Proteome , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/physiology
4.
Bioinformatics ; 29(9): 1190-8, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23511543

ABSTRACT

MOTIVATION: Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best-performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction. RESULTS: We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested. AVAILABILITY: Code and data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html


Subjects
Algorithms , Proteins/physiology , Animals , Artificial Intelligence , Bayes Theorem , Gene Regulatory Networks , Genome , Mice , Molecular Sequence Annotation , Protein Interaction Mapping , Proteins/genetics , Proteins/metabolism , Proteome/metabolism , Yeasts/genetics , Yeasts/metabolism
5.
Mol Cell ; 46(5): 674-90, 2012 Jun 08.
Article in English | MEDLINE | ID: mdl-22681889

ABSTRACT

Protein-RNA interactions are fundamental to core biological processes, such as mRNA splicing, localization, degradation, and translation. We developed a photoreactive nucleotide-enhanced UV crosslinking and oligo(dT) purification approach to identify the mRNA-bound proteome using quantitative proteomics and to display the protein occupancy on mRNA transcripts by next-generation sequencing. Application to a human embryonic kidney cell line identified close to 800 proteins. To our knowledge, nearly one-third were not previously annotated as RNA binding, and about 15% were not predictable by computational methods to interact with RNA. Protein occupancy profiling provides a transcriptome-wide catalog of potential cis-regulatory regions on mammalian mRNAs and showed that large stretches in 3' UTRs can be contacted by the mRNA-bound proteome, with numerous putative binding sites in regions harboring disease-associated nucleotide polymorphisms. Our observations indicate the presence of a large number of mRNA binders with diverse molecular functions participating in combinatorial posttranscriptional gene-expression networks.


Subjects
Proteomics/methods , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Binding Sites , Cell Line , Humans , Mass Spectrometry , RNA-Binding Proteins/chemistry , Sequence Analysis, RNA