Efficient Indexing of Peptides for Database Search Using Tide.

Nii Adoquaye Acquaye, Frank Lawrence; Kertesz-Farkas, Attila; Noble, William Stafford

Affiliation

Nii Adoquaye Acquaye FL; Department of Data Analysis and Artificial Intelligence and Laboratory on AI for Computational Biology, Faculty of Computer Science, HSE University, Moscow 109028, Russia.
Kertesz-Farkas A; Department of Data Analysis and Artificial Intelligence and Laboratory on AI for Computational Biology, Faculty of Computer Science, HSE University, Moscow 109028, Russia.
Noble WS; Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States.

J Proteome Res; 22(2): 577-584, 2023 02 03.

Article in En | MEDLINE | ID: mdl-36633229

ABSTRACT

The first step in the analysis of protein tandem mass spectrometry data typically involves searching the observed spectra against a protein database. During database search, the search engine must digest the proteins in the database into peptides, subject to digestion rules that are under user control. The choice of these digestion parameters, as well as the selection of post-translational modifications (PTMs), can dramatically affect the size of the search space and hence the statistical power of the search. The Tide search engine separates the creation of the peptide index from the database search step, thereby saving time by allowing a peptide index to be reused in multiple searches. Here we describe an improved implementation of the indexing component of Tide that consumes roughly four times fewer resources (CPU and RAM) than the previous version and can generate arbitrarily large peptide databases, limited only by the amount of available disk space. We use this improved implementation to explore the relationship between database size and the parameters controlling digestion and PTMs, as well as between database size and statistical power. Our results can help guide practitioners in the proper selection of these important parameters.
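
To illustrate how digestion rules change the size of the search space, the following is a minimal Python sketch of in silico tryptic digestion. It is not Tide's indexing code. It assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P), and the function names `tryptic_sites` and `digest`, the parameters `missed_cleavages`, `min_len`, and `max_len`, and the toy sequence are all hypothetical choices made for this example.

```python
import re


def tryptic_sites(protein):
    """Positions where trypsin is assumed to cleave: after K or R, unless followed by P."""
    return [m.end() for m in re.finditer(r"[KR](?!P)", protein)]


def digest(protein, missed_cleavages=0, min_len=6, max_len=50):
    """Return the set of fully tryptic peptides allowed by the given (hypothetical) rules."""
    # Fragment boundaries: sequence start, every internal cleavage site, sequence end.
    internal = [s for s in tryptic_sites(protein) if s < len(protein)]
    bounds = [0] + internal + [len(protein)]
    peptides = set()
    for i in range(len(bounds) - 1):
        # Joining j + 1 consecutive fragments corresponds to j missed cleavages.
        for j in range(missed_cleavages + 1):
            if i + j + 1 >= len(bounds):
                break
            peptide = protein[bounds[i]:bounds[i + j + 1]]
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return peptides


if __name__ == "__main__":
    toy_protein = "AAKGGGRPCCKDD"  # toy sequence, not taken from the paper
    for mc in range(3):
        count = len(digest(toy_protein, missed_cleavages=mc, min_len=1))
        print(f"missed_cleavages={mc}: {count} peptides")
```

Each additional allowed missed cleavage can only add peptides, so the index grows as the digestion rules are relaxed. Variable PTMs, which this sketch does not model, enlarge the peptide set further.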

Key words

database search; spectrum identification; tandem mass spectrometry

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Peptides / Algorithms | Language: En | Journal: J Proteome Res | Journal subject: Biochemistry | Year: 2023 | Document type: Article