mzDB: a file format using multiple indexing strategies for the efficient analysis of large LC-MS/MS and SWATH-MS data sets.
Bouyssié, David; Dubois, Marc; Nasso, Sara; Gonzalez de Peredo, Anne; Burlet-Schiltz, Odile; Aebersold, Ruedi; Monsarrat, Bernard.
Affiliation
  • Bouyssié D; From the ‡CNRS; IPBS (Institut de Pharmacologie et de Biologie Structurale); 205 route de Narbonne, F-31077 Toulouse, France; §Université de Toulouse; UPS; IPBS; F-31077 Toulouse, France; david.bouyssie@ipbs.fr.
  • Dubois M; From the ‡CNRS; IPBS (Institut de Pharmacologie et de Biologie Structurale); 205 route de Narbonne, F-31077 Toulouse, France; §Université de Toulouse; UPS; IPBS; F-31077 Toulouse, France;
  • Nasso S; ¶Department of Biology, Institute of Molecular Systems Biology, ETH, Auguste-Piccard-Hof 1, ETH Hönggerberg, CH-8093 Zürich, Switzerland;
  • Gonzalez de Peredo A; From the ‡CNRS; IPBS (Institut de Pharmacologie et de Biologie Structurale); 205 route de Narbonne, F-31077 Toulouse, France; §Université de Toulouse; UPS; IPBS; F-31077 Toulouse, France;
  • Burlet-Schiltz O; From the ‡CNRS; IPBS (Institut de Pharmacologie et de Biologie Structurale); 205 route de Narbonne, F-31077 Toulouse, France; §Université de Toulouse; UPS; IPBS; F-31077 Toulouse, France;
  • Aebersold R; ¶Department of Biology, Institute of Molecular Systems Biology, ETH, Auguste-Piccard-Hof 1, ETH Hönggerberg, CH-8093 Zürich, Switzerland; ‖Faculty of Science, University of Zurich, Zurich, Switzerland.
  • Monsarrat B; From the ‡CNRS; IPBS (Institut de Pharmacologie et de Biologie Structurale); 205 route de Narbonne, F-31077 Toulouse, France; §Université de Toulouse; UPS; IPBS; F-31077 Toulouse, France;
Mol Cell Proteomics; 14(3): 771-81, 2015 Mar.
Article in En | MEDLINE | ID: mdl-25505153
ABSTRACT
The analysis and management of MS data, especially those generated by data-independent MS acquisition, exemplified by SWATH-MS, pose significant challenges for proteomics bioinformatics. The large size and vast amount of information inherent to these data sets need to be properly structured to enable efficient and straightforward extraction of the signals used to identify specific target peptides. Standard XML-based formats are not well suited to large MS data files, such as those generated by SWATH-MS, and compromise high-throughput data processing and storage. We developed mzDB, an efficient file format for large MS data sets. It relies on the SQLite software library and consists of a standardized and portable server-less single-file database. An optimized 3D indexing approach is adopted, where the LC-MS coordinates (retention time and m/z), along with the precursor m/z for SWATH-MS data, are used to query the database for data extraction. In comparison with XML formats, mzDB saves ∼25% of storage space and improves access times by a factor of 2 to 2000, depending on the particular data access. Similarly, mzDB also shows slightly to significantly lower access times in comparison with other formats such as mz5. Both C++ and Java implementations, converting raw or XML formats to mzDB and providing access methods, will be released under a permissive license. mzDB can be easily accessed by the SQLite C library and its drivers for all major languages, and browsed with existing dedicated GUIs. The mzDB described here can boost existing mass spectrometry data analysis pipelines, offering unprecedented performance in terms of efficiency, portability, compactness, and flexibility.
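To make the querying idea concrete, here is a minimal sketch of a range query over retention time, m/z, and precursor m/z in a single-file SQLite database, using Python's standard sqlite3 driver. The abstract does not give the mzDB schema, so the table name bounding_box, all column names, the plain composite index, and the demo values are assumptions made for illustration only; they are not the actual mzDB layout or indexing implementation.

```python
import sqlite3


def build_demo_db(path=":memory:"):
    """Create a hypothetical table of signal 'boxes' (NOT the real mzDB schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE bounding_box (
               id INTEGER PRIMARY KEY,
               min_rt REAL, max_rt REAL,                -- retention time range (s)
               min_mz REAL, max_mz REAL,                -- fragment/peak m/z range
               min_parent_mz REAL, max_parent_mz REAL,  -- precursor isolation window
               data BLOB                                -- encoded peaks (empty here)
           )"""
    )
    # Assumed plain composite index; a real format may use a different strategy.
    conn.execute(
        "CREATE INDEX bb_idx ON bounding_box (min_parent_mz, min_mz, min_rt)"
    )
    demo_rows = [
        (1, 0.0, 60.0, 400.0, 405.0, 400.0, 425.0, b""),
        (2, 0.0, 60.0, 405.0, 410.0, 400.0, 425.0, b""),
        (3, 60.0, 120.0, 400.0, 405.0, 400.0, 425.0, b""),
        (4, 0.0, 60.0, 400.0, 405.0, 425.0, 450.0, b""),
    ]
    conn.executemany(
        "INSERT INTO bounding_box VALUES (?, ?, ?, ?, ?, ?, ?, ?)", demo_rows
    )
    conn.commit()
    return conn


def query_boxes(conn, rt_range, mz_range, parent_mz=None):
    """Return ids of boxes overlapping the given RT and m/z ranges.

    If parent_mz is given, keep only boxes whose precursor window contains it.
    """
    sql = (
        "SELECT id FROM bounding_box "
        "WHERE max_rt >= ? AND min_rt <= ? "
        "AND max_mz >= ? AND min_mz <= ?"
    )
    params = [rt_range[0], rt_range[1], mz_range[0], mz_range[1]]
    if parent_mz is not None:
        sql += " AND min_parent_mz <= ? AND max_parent_mz >= ?"
        params += [parent_mz, parent_mz]
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(query_boxes(conn, (10.0, 20.0), (402.0, 403.0), parent_mz=410.0))
```

Only the boxes that intersect the requested region are read, which is the general reason a coordinate-indexed layout can avoid scanning whole spectra; the performance figures quoted in the abstract refer to mzDB itself, not to this sketch.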
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Mass Spectrometry / Database Management Systems Type of study: Prognostic_studies Limit: Humans Language: En Year of publication: 2015 Document type: Article
