Search | BVS Infectious and Parasitic Diseases

A large-scale evaluation of computational protein function prediction.

Radivojac, Predrag; Clark, Wyatt T; Oron, Tal Ronnen; Schnoes, Alexandra M; Wittkop, Tobias; Sokolov, Artem; Graim, Kiley; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa; Pandey, Gaurav; Yunes, Jeffrey M; Talwalkar, Ameet S; Repo, Susanna; Souza, Michael L; Piovesan, Damiano; Casadio, Rita; Wang, Zheng; Cheng, Jianlin; Fang, Hai; Gough, Julian; Koskinen, Patrik; Törönen, Petri; Nokso-Koivisto, Jussi; Holm, Liisa; Cozzetto, Domenico; Buchan, Daniel W A; Bryson, Kevin; Jones, David T; Limaye, Bhakti; Inamdar, Harshal; Datta, Avik; Manjari, Sunitha K; Joshi, Rajendra; Chitale, Meghana; Kihara, Daisuke; Lisewski, Andreas M; Erdin, Serkan; Venner, Eric; Lichtarge, Olivier; Rentzsch, Robert; Yang, Haixuan; Romero, Alfonso E; Bhat, Prajwal; Paccanaro, Alberto; Hamp, Tobias; Kaßner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian.

Nat Methods ; 10(3): 221-7, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23353650

ABSTRACT

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

Subjects

Computational Biology/methods; Molecular Biology/methods; Molecular Sequence Annotation; Proteins/physiology; Algorithms; Animals; Databases, Protein; Exoribonucleases/classification; Exoribonucleases/genetics; Exoribonucleases/physiology; Forecasting; Humans; Proteins/chemistry; Proteins/classification; Proteins/genetics; Species Specificity

Highlights from the tenth ISCB Student Council Symposium 2014.

Rahman, Farzana; Wilkins, Katie; Jacobsen, Annika; Junge, Alexander; Vicedo, Esmeralda; DeBlasio, Dan; Jigisha, Anupama; Di Domenico, Tomás.

BMC Bioinformatics ; 16 Suppl 2: A1-10, 2015.

Article in English | MEDLINE | ID: mdl-25708534

ABSTRACT

This report summarizes the scientific content and activities of the annual symposium organized by the Student Council of the International Society for Computational Biology (ISCB), held in conjunction with the Intelligent Systems for Molecular Biology (ISMB) conference in Boston, USA, on July 11th, 2014.

Subjects

Computational Biology; Drug Resistance, Multiple; High-Throughput Nucleotide Sequencing; Microsatellite Repeats/genetics; Peer Review, Research; Publishing; RNA, Messenger/metabolism; Sequence Analysis, DNA

Highlights from the Third International Society for Computational Biology (ISCB) European Student Council Symposium 2014.

Francescatto, Margherita; Hermans, Susanne M A; Babaei, Sepideh; Vicedo, Esmeralda; Borrel, Alexandre; Meysman, Pieter.

BMC Bioinformatics ; 16 Suppl 3: A1-9, 2015.

Article in English | MEDLINE | ID: mdl-25708611

ABSTRACT

In this meeting report, we give an overview of the talks, presentations and posters presented at the third European Symposium of the International Society for Computational Biology (ISCB) Student Council. The event was organized as a satellite meeting of the 13th European Conference for Computational Biology (ECCB) and took place in Strasbourg, France on September 6th, 2014.

Subjects

Computational Biology; Awards and Prizes; Databases, Factual; Gene Regulatory Networks; Models, Statistical; Peer Review, Research

Homology-based inference sets the bar high for protein function prediction.

Hamp, Tobias; Kassner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian; Achten, Dominik; Auer, Florian; Boehm, Ariane; Braun, Tatjana; Hecht, Maximilian; Heron, Mark; Hönigschmid, Peter; Hopf, Thomas A; Kaufmann, Stefanie; Kiening, Michael; Krompass, Denis; Landerer, Cedric; Mahlich, Yannick; Roos, Manfred; Rost, Burkhard.

BMC Bioinformatics ; 14 Suppl 3: S7, 2013.

Article in English | MEDLINE | ID: mdl-23514582

ABSTRACT

BACKGROUND: Any method that de novo predicts protein function should do better than random. More challenging, it also ought to outperform simple homology-based inference. METHODS: Here, we describe a few methods that predict protein function exclusively through homology. Together, they set the bar or lower limit for future improvements. RESULTS AND CONCLUSIONS: During the development of these methods, we faced two surprises. Firstly, our most successful implementation for the baseline ranked very high at CAFA1. In fact, our best combination of homology-based methods fared only slightly worse than the top-of-the-line prediction method from the Jones group. Secondly, although the concept of homology-based inference is simple, this work revealed that the precise details of the implementation are crucial: not only did the methods span from top to bottom performers at CAFA, but also the reasons for these differences were unexpected. In this work, we also propose a new rigorous measure to compare predicted and experimental annotations. It puts more emphasis on the details of protein function than the other measures employed by CAFA and may best reflect the expectations of users. Clearly, the definition of proper goals remains one major objective for CAFA.
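To make the idea of homology-based inference concrete, the following is a minimal sketch of one possible baseline: copy the annotations of the best-scoring annotated homolog. It is not the implementation evaluated in the paper. The function name `transfer_annotations`, the `(hit_id, score)` input format, the `min_score` threshold, and the toy identifiers are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def transfer_annotations(
    hits: List[Tuple[str, float]],
    known_annotations: Dict[str, Set[str]],
    min_score: float = 0.0,
) -> Set[str]:
    """Return the terms of the best-scoring annotated hit (hypothetical baseline)."""
    # Keep only hits that pass the threshold and actually carry annotations.
    annotated = [
        (hit_id, score)
        for hit_id, score in hits
        if score >= min_score and known_annotations.get(hit_id)
    ]
    if not annotated:
        return set()
    # Transfer the full annotation set of the single best hit.
    best_id, _ = max(annotated, key=lambda pair: pair[1])
    return set(known_annotations[best_id])


# Toy input: alignment hits for one query and made-up annotations.
hits = [("P1", 52.0), ("P2", 310.5), ("P3", 298.0)]
known = {
    "P1": {"GO:0005515"},
    "P2": set(),  # best score, but no annotation to transfer
    "P3": {"GO:0003824", "GO:0008152"},
}
predicted = transfer_annotations(hits, known)
```

Even in this toy form, choices such as the score threshold and whether to use one hit or several change the output, which is in line with the abstract's remark that implementation details matter.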

Subjects

Proteins/physiology; Sequence Homology, Amino Acid; Algorithms; Proteins/genetics

Environmental Pressure May Change the Composition Protein Disorder in Prokaryotes.

Vicedo, Esmeralda; Schlessinger, Avner; Rost, Burkhard.

PLoS One ; 10(8): e0133990, 2015.

Article in English | MEDLINE | ID: mdl-26252577

ABSTRACT

Many prokaryotic organisms have adapted to incredibly extreme habitats. The genomes of such extremophiles differ from their non-extremophile relatives. For example, some proteins in thermophiles sustain high temperatures by being more compact than homologs in non-extremophiles. Conversely, some proteins have increased volumes to compensate for freezing effects in psychrophiles that survive in the cold. Here, we revealed that some differences in organisms surviving in extreme habitats correlate with a simple single feature, namely the fraction of proteins predicted to have long disordered regions. We predicted disorder with different methods for 46 completely sequenced organisms from diverse habitats and found a correlation between protein disorder and the extremity of the environment. More specifically, the overall percentage of proteins with long disordered regions tended to be more similar between organisms of similar habitats than between organisms of similar taxonomy. For example, predictions tended to detect substantially more proteins with long disordered regions in prokaryotic halophiles (survive high salt) than in their taxonomic neighbors. Another peculiar environment is that of high radiation survived, e.g. by Deinococcus radiodurans. The relatively high fraction of disorder predicted in this extremophile might provide a shield against mutations. Although our analysis fails to establish causation, the observed correlation between such a simplistic, coarse-grained, microscopic molecular feature (disorder content) and a macroscopic variable (habitat) remains stunning.
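The feature discussed in this abstract, the fraction of proteins with long disordered regions, can be illustrated with a short sketch. It assumes that a disorder predictor has already produced one boolean per residue; the function names and the default threshold `min_len=30` are assumptions for illustration, since the abstract does not state how "long" is defined.

```python
from itertools import groupby
from typing import Iterable, Sequence


def has_long_disorder(states: Sequence[bool], min_len: int = 30) -> bool:
    """True if the protein has a run of >= min_len consecutive disordered residues."""
    longest = max(
        (sum(1 for _ in run) for is_disordered, run in groupby(states) if is_disordered),
        default=0,
    )
    return longest >= min_len


def fraction_long_disorder(proteome: Iterable[Sequence[bool]], min_len: int = 30) -> float:
    """Fraction of proteins in a proteome with at least one long disordered region."""
    proteins = list(proteome)
    if not proteins:
        return 0.0
    return sum(has_long_disorder(p, min_len) for p in proteins) / len(proteins)


# Toy proteome with a small threshold so the example stays readable.
toy_proteome = [
    [True, True, True, False],        # run of 3 -> counts
    [True, False, True, True],        # longest run is 2 -> does not count
    [False] * 5,                      # no disorder
    [False, True, True, True, True],  # run of 4 -> counts
]
toy_fraction = fraction_long_disorder(toy_proteome, min_len=3)
```

Computing this single number per organism is what would allow habitats and taxonomic groups to be compared on one axis.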

Subjects

Environment; Prokaryotic Cells/metabolism; Proteins/chemistry; Ecosystem; Phylogeny; Radiation; Salinity

Protein disorder reduced in Saccharomyces cerevisiae to survive heat shock.

Vicedo, Esmeralda; Gasik, Zofia; Dong, Yu-An; Goldberg, Tatyana; Rost, Burkhard.

F1000Res ; 4: 1222, 2015.

Article in English | MEDLINE | ID: mdl-26673203

ABSTRACT

Recent experiments established that a culture of Saccharomyces cerevisiae (baker's yeast) survives sudden high temperatures by specifically duplicating the entire chromosome III and two chromosomal fragments (from IV and XII). Heat shock proteins (HSPs) are not significantly over-abundant in the duplication. In contrast, we suggest a simple algorithm to "postdict" the experimental results: Find a small enough chromosome with minimal protein disorder and duplicate this region. This algorithm largely explains all observed duplications. In particular, all regions duplicated in the experiment reduced the overall content of protein disorder. The differential analysis of the functional makeup of the duplication remained inconclusive. Gene Ontology (GO) enrichment suggested over-representation in processes related to reproduction and nutrient uptake. Analyzing the protein-protein interaction network (PPI) revealed that few network-central proteins were duplicated. The predictive hypothesis hinges upon the concept of reducing proteins with long regions of disorder in order to become less sensitive to heat shock attack.
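The selection rule quoted in the abstract (a small enough chromosome with minimal protein disorder) can be sketched as a filter followed by a minimum. This is one possible reading of that sentence, not the authors' code. The `Chromosome` record, the `max_size_bp` cutoff, and every number in the toy data are invented for illustration and are not measurements from yeast.

```python
from typing import NamedTuple, Optional, Sequence


class Chromosome(NamedTuple):
    name: str
    size_bp: int              # chromosome length in base pairs
    disorder_fraction: float  # fraction of encoded proteins with long disorder


def pick_duplication_candidate(
    chromosomes: Sequence[Chromosome], max_size_bp: int
) -> Optional[Chromosome]:
    """Among chromosomes no larger than max_size_bp, pick the least disordered one."""
    small_enough = [c for c in chromosomes if c.size_bp <= max_size_bp]
    if not small_enough:
        return None
    return min(small_enough, key=lambda c: c.disorder_fraction)


# Made-up values, chosen only to exercise the rule.
toy_genome = [
    Chromosome("chrA", 300_000, 0.20),
    Chromosome("chrB", 250_000, 0.12),
    Chromosome("chrC", 1_500_000, 0.08),  # least disordered, but too large
]
candidate = pick_duplication_candidate(toy_genome, max_size_bp=500_000)
```

The size cutoff is applied first, so a large chromosome with very low disorder is never selected; what counts as "small enough" is left as a free parameter here because the abstract does not quantify it.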

Cloud prediction of protein structure and function with PredictProtein for Debian.

Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard.

Biomed Res Int ; 2013: 398968, 2013.

Article in English | MEDLINE | ID: mdl-23971032

ABSTRACT

We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.

Subjects

Internet; Models, Chemical; Models, Genetic; Models, Molecular; Programming Languages; Proteins; Software; Amino Acid Sequence; Base Sequence; Computer Simulation; Data Mining/methods; Databases, Protein; Molecular Sequence Data; Proteins/chemistry; Proteins/genetics; Proteins/ultrastructure; Sequence Analysis, Protein/methods; Structure-Activity Relationship

Protein disorder--a breakthrough invention of evolution?

Schlessinger, Avner; Schaefer, Christian; Vicedo, Esmeralda; Schmidberger, Markus; Punta, Marco; Rost, Burkhard.

Curr Opin Struct Biol ; 21(3): 412-8, 2011 Jun.

Article in English | MEDLINE | ID: mdl-21514145

ABSTRACT

As an operational definition, we refer to regions in proteins that do not adopt regular three-dimensional structures in isolation, as disordered regions. An antipode to disorder would be 'well-structured' rather than 'ordered'. Here, we argue for the following three hypotheses. Firstly, it is more useful to picture disorder as a distinct phenomenon in structural biology than as an extreme example of protein flexibility. Secondly, there are many very different flavors of protein disorder, nevertheless, it seems advantageous to portray the universe of all possible proteins in terms of two main types: well-structured, disordered. There might be a third type 'other' but we have so far no positive evidence for this. Thirdly, nature uses protein disorder as a tool to adapt to different environments. Protein disorder is evolutionarily conserved and this maintenance of disorder is highly nontrivial. Increasingly integrating protein disorder into the toolbox of a living cell was a crucial step in the evolution from simple bacteria to complex eukaryotes. We need new advanced computational methods to study this new milestone in the advance of protein biology.

Subjects

Protein Conformation; Proteins/chemistry; Proteins/genetics; Animals; Evolution, Molecular; Humans; Proteins/metabolism

affyPara-a Bioconductor Package for Parallelized Preprocessing Algorithms of Affymetrix Microarray Data.

Schmidberger, Markus; Vicedo, Esmeralda; Mansmann, Ulrich.

Bioinform Biol Insights ; 3: 83-7, 2009 Jul 22.

Article in English | MEDLINE | ID: mdl-20140068

ABSTRACT

Microarray data repositories as well as large clinical applications of gene expression make it possible to analyse several hundred microarrays at one time. The preprocessing of large amounts of microarrays is still a challenge. The algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets will be very time consuming. Here, preprocessing has to be a part of the cross-validation and resampling strategy which is necessary to estimate the rule's prediction quality honestly. This paper proposes the new Bioconductor package affyPara for parallelized preprocessing of Affymetrix microarray data. Partition of data can be applied on arrays and parallelization of algorithms is a straightforward consequence. The partition of data and distribution to several nodes solves the main memory problems and accelerates preprocessing by up to the factor 20 for 200 or more arrays. affyPara is a free and open source package, under GPL license, available from the Bioconductor project at www.bioconductor.org. A user guide and examples are provided with the package.
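The partition-then-parallelize pattern described in this abstract can be illustrated generically. The sketch below is written in Python rather than R and does not use or reproduce the affyPara API. The helper names are hypothetical, a per-array log2 transform stands in for real preprocessing, and threads are used only to keep the example self-contained; CPU-bound work would need processes or separate cluster nodes to see a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def partition(items: Sequence, n_parts: int) -> List[list]:
    """Split items into contiguous chunks of near-equal size."""
    n_parts = max(1, min(n_parts, len(items))) if items else 1
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(list(items[start:end]))
        start = end
    return chunks


def preprocess_chunk(chunk: List[List[float]]) -> List[List[float]]:
    """Stand-in preprocessing step: log2-transform every intensity in every array."""
    return [[math.log2(value) for value in array] for array in chunk]


def preprocess_parallel(arrays: List[List[float]], n_workers: int = 4) -> List[List[float]]:
    """Partition the arrays, process each chunk on a worker, and merge in order."""
    chunks = partition(arrays, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(preprocess_chunk, chunks)  # map preserves chunk order
    return [array for chunk in results for array in chunk]


# Three toy "arrays" of intensities, split across two workers.
toy_arrays = [[1.0, 2.0], [4.0, 8.0], [16.0, 32.0]]
processed = preprocess_parallel(toy_arrays, n_workers=2)
```

Because each worker holds only its own chunk, peak memory per worker scales with the chunk size rather than the full data set, which is the memory benefit the abstract attributes to partitioning.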
