Navigating through metaproteomics data: a logbook of database searching.

Muth, Thilo; Kolmeder, Carolin A; Salojärvi, Jarkko; Keskitalo, Salla; Varjosalo, Markku; Verdam, Froukje J; Rensen, Sander S; Reichl, Udo; de Vos, Willem M; Rapp, Erdmann; Martens, Lennart

Affiliation

Muth T; Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany.
Kolmeder CA; Department of Veterinary Biosciences, University of Helsinki, Helsinki, Finland.
Salojärvi J; Department of Veterinary Biosciences, University of Helsinki, Helsinki, Finland.
Keskitalo S; Institute of Biotechnology, University of Helsinki, Helsinki, Finland.
Varjosalo M; Institute of Biotechnology, University of Helsinki, Helsinki, Finland.
Verdam FJ; Department of General Surgery, NUTRIM, Maastricht University Medical Center, Maastricht, The Netherlands.
Rensen SS; Department of General Surgery, NUTRIM, Maastricht University Medical Center, Maastricht, The Netherlands.
Reichl U; Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany.
de Vos WM; Otto-von-Guericke University, Bioprocess Engineering, Magdeburg, Germany.
Rapp E; Department of Veterinary Biosciences, University of Helsinki, Helsinki, Finland.
Martens L; Department of Bacteriology and Immunology, University of Helsinki, Helsinki, Finland.

Proteomics ; 15(20): 3439-53, 2015 Oct.

Article in En | MEDLINE | ID: mdl-25778831

ABSTRACT

Metaproteomic research involves various computational challenges during the identification of fragmentation spectra acquired from the proteome of a complex microbiome. These issues are manifold and range from the construction of customized sequence databases and the optimal setting of search parameters to limitations in the identification search algorithms themselves. To assess the importance of these individual factors, we studied the effect of strategies to combine different search algorithms, explored the influence of chosen database search settings, and investigated the impact of the size of the protein sequence database used for identification. Furthermore, we applied de novo sequencing as a complementary approach to classic database searching. All evaluations were performed on a human intestinal metaproteome dataset. Pyrococcus furiosus proteome data were used to contrast database searching of metaproteomic data with a classic proteomic experiment. Searching against subsets of metaproteome databases and the use of multiple search engines increased the number of identifications. The integration of P. furiosus sequences in a metaproteomic sequence database showcased the limitation of the target-decoy-controlled false discovery rate approach in combination with large sequence databases. The selection of varying search engine parameters and the application of de novo sequencing represented useful methods to increase the reliability of the results. Based on our findings, we provide recommendations for data analysis that help researchers to establish or improve analysis workflows in metaproteomics.
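The target-decoy-controlled false discovery rate mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `fdr_threshold`, the `(score, is_decoy)` input layout, and the simple estimator FDR ≈ decoys / targets are assumptions chosen for illustration, and real search engines and post-processors may use different estimators.

```python
from typing import List, Tuple


def fdr_threshold(
    psms: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Tuple[float, int]:
    """Find the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: hypothetical list of (score, is_decoy) pairs, one per
          peptide-spectrum match; a higher score means a better match.
    Returns (cutoff, accepted_targets). The cutoff is infinity and the
    count is 0 if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = float("inf")
    best_targets = 0
    for i, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Only evaluate a cutoff once all PSMs sharing this score are counted.
        is_last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if not is_last_of_tie:
            continue
        # Assumed estimator: decoy hits approximate false target hits.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
            best_targets = targets
    return best_cutoff, best_targets


if __name__ == "__main__":
    example = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, False)]
    print(fdr_threshold(example, max_fdr=0.0))
```

In this sketch, every decoy that outscores a target raises the estimated FDR at that target's cutoff, so fewer targets pass a fixed limit when more decoys reach high scores.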

Subjects

Metagenome/genetics; Proteome/genetics; Proteomics; Algorithms; Amino Acid Sequence/genetics; Humans; Pyrococcus furiosus/genetics; Software; Tandem Mass Spectrometry

Keywords

Bioinformatics; De novo sequencing; False discovery rate; Metaproteomics; Search parameters

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteome / Proteomics / Metagenome Type of study: Prognostic_studies Limits: Humans Language: En Year of publication: 2015 Document type: Article
