Results 1 - 17 of 17
1.
Mol Cell Proteomics ; 23(9): 100827, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39128790

ABSTRACT

This work presents a detailed determination of site-specific N-glycan distributions of the recombinant influenza glycoproteins hemagglutinin (HA) and neuraminidase. Variation in glycosylation among recombinant glycoproteins is not predictable and can depend on details of the biomanufacturing process as well as details of protein structure. In this study, recombinant influenza proteins were analyzed from eight strains of four different suppliers. These include five HA and three neuraminidase proteins, each produced from a HEK293 cell line. Digestion was conducted using a series of complex multienzymatic methods designed to isolate glycopeptides containing single N-glycosylated sites. Site-specific glycosylation profiles of intact glycopeptides were produced using a recently developed method and comparisons were made using spectral similarity scores. Variation in glycan abundances and distribution was most pronounced between different strains of virus (similarity score = 383 out of 999), whereas digestion replicates and injection replicates showed relatively little variation (similarity score = 957). Notably, glycan distributions for homologous regions of influenza glycoprotein variants showed low variability. Due to the multiple possible sources of variation and inherent analytical difficulties in site-specific glycan determinations, variations were individually examined for multiple factors, including differences in supplier, production batch, protease digestion, and replicate measurement. After comparing all glycosylation distributions, four distinguishable classes could be identified for the majority of sites. Finally, attempts to identify glycosylation distributions on adjacent potential N-glycosylated sites of one HA variant were made. Only the second site (NnST) was found to be occupied using two rarely used proteases in proteomics, subtilisin and esperase, both of which did selectively cleave these adjacent sites.


Subjects
Neuraminidase , Polysaccharides , Recombinant Proteins , Glycosylation , Humans , HEK293 Cells , Recombinant Proteins/metabolism , Polysaccharides/metabolism , Neuraminidase/metabolism , Influenza Virus Hemagglutinin Glycoproteins/metabolism , Influenza Virus Hemagglutinin Glycoproteins/chemistry , Glycoproteins/metabolism , Glycopeptides/metabolism
2.
J Proteome Res ; 23(4): 1443-1457, 2024 04 05.
Article in English | MEDLINE | ID: mdl-38450643

ABSTRACT

We report the comparison of mass-spectral-based abundances of tryptic glycopeptides to fluorescence abundances of released labeled glycans and the effects of mass and charge state and in-source fragmentation on glycopeptide abundances. The primary glycoforms derived from Rituximab, NISTmAb, Evolocumab, and Infliximab were high-mannose and biantennary complex galactosylated and fucosylated N-glycans. Except for Evolocumab, in-source ions derived from the loss of HexNAc or HexNAc-Hex sugars are prominent for the therapeutic IgGs. After excluding in-source fragmentation of glycopeptide ions from the results, a linear correlation was observed between fluorescently labeled N-glycan and glycopeptide abundances over a dynamic range of 500. Different charge states of human IgG-derived glycopeptides containing a wider variety of abundant attached glycans were also investigated to examine the effects of the charge state on ion abundances. These revealed a linear dependence of glycopeptide abundance on the mass of the glycan, with higher charge states favoring higher-mass glycans. Findings indicate that the mass spectrometry-based bottom-up approach can provide results as accurate as those of glycan release studies while revealing the origin of each attached glycan. These site-specific relative abundances are conveniently displayed and compared using previously described glycopeptide abundance distribution spectra (GADS) representations. Mass spectrometry data are available from the MassIVE repository (MSV000093562).


Subjects
Immunoglobulin G , Tandem Mass Spectrometry , Humans , Glycosylation , Glycopeptides/analysis , Polysaccharides/chemistry , Ions
3.
J Proteome Res ; 22(10): 3225-3241, 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37647588

ABSTRACT

Glycopeptide Abundance Distribution Spectra (GADS) were recently introduced as a means of representing, storing, and comparing glycan profiles of intact glycopeptides. Here, using that representation, an extensive analysis is made of multiple commercial sources of the recombinant SARS-CoV-2 spike protein, each containing 22 N-linked glycan sites (sequons). Multiple proteases are used along with variable energy fragmentation followed by ion trap confirmation. This enables a detailed examination of the reproducibility of the method across multiple types of variability. These results show that GADS are consistent between replicates and laboratories for sufficiently abundant glycopeptides. Derived GADS enable the examination and comparison of the glycan profiles between commercial sources of the spike protein. Multiple distinct glycopeptide distributions, generated by multiple proteases, confirm these profiles. Comparisons of GADS derived from 11 sources of recombinant spike protein reveal that sources for which protein expression methods were the same produced near-identical glycan profiles, thereby demonstrating the ability of this method to measure GADS of sufficient reliability to distinguish different glycoform distributions between commercial vendors and potentially to reliably determine and compare differences in glycosylation for any glycoprotein under different conditions of production. All mass spectrometry data files have been deposited in the MassIVE repository under the identifier MSV000091776.

4.
J Proteome Res ; 21(10): 2421-2434, 2022 Oct 07.
Article in English | MEDLINE | ID: mdl-36112477

ABSTRACT

We present a mass spectral library-based method for analyzing site-specific N-linked protein glycosylation. Its operation and utility are illustrated by applying it to both newly measured and available proteomics data of human milk glycoproteins. It generates two varieties of mass spectral libraries. One contains glycopeptide abundance distribution spectra (GADS). The other contains tandem mass spectra of the underlying glycopeptides. Both originate from identified glycopeptides in proteolytic digests of human milk and purified glycoproteins, which include tenascin, lactoferrin, and several antibodies. Analysis was also applied to digests of a NIST human milk standard reference material (SRM), leading to a GADS library of N-glycopeptides, enabling the direct comparison of glycopeptide distributions for individual proteins. Tandem spectra underlying each glycopeptide GADS peak are combined to create a second type of library that contains spectra of the underlying glycopeptide spectra. These were acquired by higher-energy (stepped) collision dissociation fragmentation followed by ion-trap fragmentation. Spectra are annotated using MS_Piano, recently reported annotation software. This data, with extensions of a widely used spectral library search and display software, provides accessible mass spectral libraries.


Subjects
Milk Proteins , Human Milk , Glycopeptides/analysis , Glycoproteins/metabolism , Glycosylation , Humans , Lactoferrin/metabolism , Milk Proteins/metabolism , Human Milk/chemistry , Tenascin/metabolism
5.
J Proteome Res ; 20(9): 4475-4486, 2021 09 03.
Article in English | MEDLINE | ID: mdl-34327998

ABSTRACT

A method for representing and comparing distributions of N-linked glycans located at specific sites on proteins is presented. The representation takes the form of a simple mass spectrum for a given peptide sequence, with each peak corresponding to a different glycopeptide. The mass (in place of m/z) of each peak is that of the glycan mass, and its abundance corresponds to its relative abundance in the electrospray MS1 spectrum. This provides a facile means of representing all identifiable glycopeptides arising from a single protein "sequon" on a specific sequence, thereby enabling the comparison and searching of these distributions as routinely done for mass spectra. Likewise, these reference glycopeptide abundance distribution spectra (GADS) can be stored in searchable libraries. A set of such libraries created from available data is provided along with an adapted version of the widely used NIST-MS library-search software. Since GADS contain only MS1 abundances and identifications, they are equally suitable for expressing collision-induced fragmentation and electron-transfer dissociation determinations of glycopeptide identity. Comparisons of GADS for N-glycosylated sites on several proteins, especially the SARS-CoV-2 spike protein, demonstrate the potential reproducibility of GADS and their utility for comparing site-specific distributions.


Subjects
COVID-19 , Glycopeptides/metabolism , Glycoproteins , Glycosylation , Humans , Polysaccharides , Reproducibility of Results , SARS-CoV-2 , Coronavirus Spike Glycoprotein
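
The representation described in this abstract (one peak per glycopeptide, placed at the glycan mass, with height given by relative MS1 abundance) can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names `glycan_mass`, `build_gads`, and `similarity` are hypothetical, glycan masses are computed as sums of monoisotopic residue masses, the example abundances are invented, and the 0-999 cosine score merely stands in for whatever scoring the adapted NIST search software actually uses.

```python
import math

# Monoisotopic residue masses (Da) of common N-glycan building blocks.
RESIDUE_MASS = {"HexNAc": 203.07937, "Hex": 162.05282,
                "Fuc": 146.05791, "NeuAc": 291.09542}

def glycan_mass(composition):
    """Sum residue masses for a composition such as {"HexNAc": 2, "Hex": 5}."""
    return round(sum(RESIDUE_MASS[r] * n for r, n in composition.items()), 2)

def build_gads(glycoforms):
    """Build a GADS as {glycan mass: abundance}, scaled so the base peak is 999.

    glycoforms: list of (composition, MS1 abundance) for one peptide sequence.
    """
    base = max(abundance for _, abundance in glycoforms)
    return {glycan_mass(c): 999.0 * a / base for c, a in glycoforms}

def similarity(gads_a, gads_b):
    """Cosine similarity of two GADS on a 0-999 scale (assumed scoring)."""
    masses = set(gads_a) | set(gads_b)
    dot = sum(gads_a.get(m, 0.0) * gads_b.get(m, 0.0) for m in masses)
    norm = math.sqrt(sum(v * v for v in gads_a.values()))
    norm *= math.sqrt(sum(v * v for v in gads_b.values()))
    return round(999 * dot / norm) if norm else 0

# Invented abundances for the same sequon from two hypothetical sources.
man5 = {"HexNAc": 2, "Hex": 5}
g0f = {"HexNAc": 4, "Hex": 3, "Fuc": 1}
source_a = build_gads([(man5, 8.0e6), (g0f, 2.0e6)])
source_b = build_gads([(man5, 1.0e6), (g0f, 9.0e6)])
print(similarity(source_a, source_a), similarity(source_a, source_b))
```

Because a GADS built this way is just a list of (mass, abundance) pairs, it can be stored and searched like an ordinary mass spectrum, which is the point the abstract makes.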
6.
J Proteome Res ; 20(9): 4603-4609, 2021 Sep 03.
Article in English | MEDLINE | ID: mdl-34264676

ABSTRACT

Annotating product ion peaks in tandem mass spectra is essential for evaluating spectral quality and validating peptide identification. This task is more complex for glycopeptides and is crucial for the confident determination of glycosylation sites in glycoproteins. MS_Piano (Mass Spectrum Peptide Annotation) software was developed for reliable annotation of peaks in collision induced dissociation (CID) tandem mass spectra of peptides or N-glycopeptides for given peptide sequences, charge states, and optional modifications. The program annotates each peak in high or low resolution spectra with possible product ion(s) and the mass difference between the measured and theoretical m/z values. Spectral quality is measured by two major parameters: the ratio between the sum of unannotated vs all peak intensities in the top 20 peaks, and the intensity of the highest unannotated peak. The product ions of peptides, glycans, and glycopeptides in spectra are labeled in different class-type colors to facilitate interpretation. MS_Piano assists validating peptide and N-glycopeptide identification from database and library searches and provides quality control and optimizes search reliability in custom developed peptide mass spectral libraries. The software is freely available in .exe and .dll formats for the Windows operating system.


Subjects
Glycopeptides , Proteomics , Reproducibility of Results , Software , Tandem Mass Spectrometry
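
The two spectral-quality parameters named in this abstract can be sketched in a few lines of Python. This is not MS_Piano code: the function name `quality_metrics` and the `(intensity, annotated)` peak layout are hypothetical, and it is an assumption here that the highest unannotated peak is taken over the whole spectrum rather than only the top 20 peaks.

```python
def quality_metrics(peaks, top_n=20):
    """Return (unannotated fraction in the top_n peaks, highest unannotated intensity).

    peaks: list of (intensity, annotated) tuples, where annotated is True
    when the peak was assigned to at least one product ion.
    """
    top = sorted(peaks, key=lambda p: p[0], reverse=True)[:top_n]
    total = sum(intensity for intensity, _ in top)
    unannotated_top = sum(intensity for intensity, ok in top if not ok)
    ratio = unannotated_top / total if total else 0.0
    # Assumption: the highest unannotated peak is searched over all peaks.
    highest = max((intensity for intensity, ok in peaks if not ok), default=0.0)
    return ratio, highest

example = [(100.0, True), (50.0, False), (25.0, True), (25.0, False)]
print(quality_metrics(example))
```

A low ratio and a weak highest unannotated peak both suggest that the assigned sequence explains most of the observed signal.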
7.
Anal Chem ; 91(21): 13924-13932, 2019 11 05.
Article in English | MEDLINE | ID: mdl-31600070

ABSTRACT

Metabolomics has a critical need for better tools for mass spectral identification. Common metabolites may be identified by searching libraries of tandem mass spectra, which offers important advantages over other approaches to identification. But tandem libraries are not nearly complete enough to represent the full molecular diversity present in complex biological samples. We present a novel hybrid search method that can help identify metabolites not in the library by similarity to compounds that are. We call it "hybrid" searching because it combines conventional, direct peak matching with the logical equivalent of neutral-loss matching. A successful hybrid search requires the library to contain "cognates" of the unknown: similar compounds with a structural difference confined to a single region of the molecule, that does not substantially alter its fragmentation behavior. We demonstrate that the hybrid search is highly likely to find similar compounds under such circumstances.


Subjects
Factual Databases , Metabolomics/methods , Tandem Mass Spectrometry , Peptide Fragments/chemistry , Proteomics/methods
8.
J Proteome Res ; 17(2): 846-857, 2018 02 02.
Article in English | MEDLINE | ID: mdl-29281288

ABSTRACT

Spectral library searching (SLS) is an attractive alternative to sequence database searching (SDS) for peptide identification due to its speed, sensitivity, and ability to include any selected mass spectra. While decoy methods for SLS have been developed for low mass accuracy peptide spectral libraries, it is not clear that they are optimal or directly applicable to high mass accuracy spectra. Therefore, we report the development and validation of methods for high mass accuracy decoy libraries. Two types of decoy libraries were found to be suitable for this purpose. The first, referred to as Reverse, constructs spectra by reversing a library's peptide sequences except for the C-terminal residue. The second, termed Random, randomly replaces all non-C-terminal residues and either retains the original C-terminal residue or replaces it based on the amino-acid frequency of the library's C-terminus. In both cases the m/z values of fragment ions are shifted accordingly. Determination of FDR is performed in a manner equivalent to SDS, concatenating a library with its decoy prior to a search. The utility of Reverse and Random libraries for target-decoy SLS in estimating false-positives and FDRs was demonstrated using spectra derived from a recently published synthetic human proteome project (Zolg, D. P.; et al. Nat. Methods 2017, 14, 259-262). For data sets from two large-scale label-free and iTRAQ experiments, these decoy building methods yielded highly similar score thresholds and spectral identifications at 1% FDR. The results were also found to be equivalent to those of using the decoy-free PeptideProphet algorithm. Using these new methods for FDR estimation, MSPepSearch, which is freely available search software, led to 18% more identifications at 1% FDR and 23% more at 0.1% FDR when compared with other widely used SDS engines coupled to postprocessing approaches such as Percolator. An application of these methods for FDR estimation for the recently reported "hybrid" library search (Burke, M. C.; et al. J. Proteome Res. 2017, 16, 1924-1935) method is also made. The application of decoy methods for high mass accuracy SLS permits the merging of these results with those of SDS, thereby increasing the assignment of more peptides, leading to deeper proteome coverage.


Subjects
Algorithms , Amino Acids/chemistry , Computer-Assisted Image Processing/statistics & numerical data , Special Libraries/methods , Peptide Library , Peptides/analysis , Amino Acid Sequence , Humans , Peptides/chemistry , Software , Tandem Mass Spectrometry
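
The sequence-level part of the Reverse and Random decoy constructions in this abstract, together with a target-decoy FDR estimate on a concatenated search, might look like the following Python sketch. The function names are hypothetical, the Random variant here keeps the C-terminal residue and draws replacements uniformly from the 20 standard amino acids (the sampling scheme is an assumption), and the corresponding shifting of fragment-ion m/z values is not shown.

```python
import random

def reverse_decoy(peptide):
    """Reverse decoy: reverse the sequence but keep the C-terminal residue in place."""
    return peptide[:-1][::-1] + peptide[-1]

def random_decoy(peptide, rng, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    """Random decoy: replace every non-C-terminal residue, keep the C-terminal one."""
    return "".join(rng.choice(alphabet) for _ in peptide[:-1]) + peptide[-1]

def estimate_fdr(hits, threshold):
    """Estimate FDR as decoy hits / target hits at or above a score threshold.

    hits: list of (score, is_decoy) from searching a concatenated
    target + decoy library.
    """
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0

print(reverse_decoy("PEPTIDEK"))                  # EDITPEPK
print(random_decoy("PEPTIDEK", random.Random(0)))
hits = [(900, False), (850, False), (800, True), (700, False), (400, True)]
print(estimate_fdr(hits, threshold=650))
```

Keeping the C-terminal residue fixed preserves the tryptic character of the decoy sequences (K or R at the C-terminus), so decoys remain plausible competitors for real spectra.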
9.
J Proteome Res ; 16(5): 1924-1935, 2017 05 05.
Article in English | MEDLINE | ID: mdl-28367633

ABSTRACT

We present a mass spectral library-based method to identify tandem mass spectra of peptides that contain unanticipated modifications and amino acid variants. We describe this as a "hybrid" method because it combines matching both ion m/z and mass losses. The mass loss is the difference between the mass of an ion peak and the mass of its precursor. This difference, termed DeltaMass, is used to shift the product ions in the library spectrum that contain the modification, thereby allowing library product ions that contain the unexpected modification to match the query spectrum. Clustered unidentified spectra from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and Chinese hamster ovary cells were used to evaluate this method. The results demonstrate the ability of the hybrid method to identify unanticipated modifications, insertions, and deletions, which may include those due to an incomplete protein sequence database or to search settings that exclude the correct identification, in high-resolution tandem mass spectra without regard to their precursor mass. This has been made possible by indexing of the m/z value of each fragment ion and its difference in mass from its precursor ion.


Subjects
Algorithms , Protein Databases , Proteomics/methods , Tandem Mass Spectrometry/methods , Animals , CHO Cells , Tumor Cell Line , Cricetulus , Factual Databases , Humans , Ions , Molecular Weight , Proteomics/standards
10.
Anal Chem ; 89(24): 13261-13268, 2017 12 19.
Article in English | MEDLINE | ID: mdl-29156120

ABSTRACT

A mass spectral library search algorithm that identifies compounds that differ from library compounds by a single "inert" structural component is described. This algorithm, the Hybrid Similarity Search, generates a similarity score based on matching both fragment ions and neutral losses. It employs the parameter DeltaMass, defined as the mass difference between query and library compounds, to shift neutral loss peaks in the library spectrum to match corresponding neutral loss peaks in the query spectrum. When the spectra being compared differ by a single structural feature, these matching neutral loss peaks should contain that structural feature. This method extends the scope of the library to include spectra of "nearest-neighbor" compounds that differ from library compounds by a single chemical moiety. Additionally, determination of the structural origin of the shifted peaks can aid in the determination of the chemical structure and fragmentation mechanism of the query compound. A variety of examples are presented, including the identification of designer drugs and chemical derivatives not present in the library.


Subjects
Algorithms , Illicit Drugs/analysis , Search Engine , Ions/chemistry , Molecular Structure , Molecular Weight , Tandem Mass Spectrometry
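
The peak-matching idea behind the Hybrid Similarity Search in this abstract can be sketched as follows. This is a simplified illustration, not the published algorithm: the function name `hybrid_matches` is hypothetical, all ions are assumed to be singly charged so that a shift in mass equals a shift in m/z, the example peak lists are invented, and the sketch only reports which peaks match rather than computing a similarity score.

```python
def hybrid_matches(query_peaks, library_peaks, query_precursor, library_precursor, tol=0.01):
    """Match library peaks to query peaks directly or after a DeltaMass shift.

    DeltaMass is the precursor mass difference (query minus library). A library
    peak that shares a neutral loss with the query appears DeltaMass higher in
    the query spectrum, so it is also tested at lib_mz + DeltaMass.
    """
    delta_mass = query_precursor - library_precursor
    matches = []
    for lib_mz in library_peaks:
        for shift, kind in ((0.0, "direct"), (delta_mass, "shifted")):
            target = lib_mz + shift
            hit = next((q for q in query_peaks if abs(q - target) <= tol), None)
            if hit is not None:
                matches.append((lib_mz, hit, kind))
                break
    return delta_mass, matches

# Invented example: the query carries one extra CH2 (14.0157 Da) relative
# to the library compound.
library = [91.05, 119.05, 182.0]
query = [91.05, 133.0657, 196.0157]
delta, found = hybrid_matches(query, library, 214.0157, 200.0)
for lib_mz, query_mz, kind in found:
    print(lib_mz, query_mz, kind)
```

In this example the fragment at 91.05 does not contain the modified region and matches directly, while the other two peaks retain the extra group and match only after the shift.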
11.
J Proteome Res ; 15(9): 3180-7, 2016 09 02.
Article in English | MEDLINE | ID: mdl-27386737

ABSTRACT

Derivatization of peptides with isobaric tags such as iTRAQ and TMT is widely employed in proteomics due to their compatibility with multiplex quantitative measurements. We recently made publicly available a large peptide library derived from iTRAQ 4-plex labeled spectra. This resource has not been used for identifying peptides labeled with related tags with different masses, because values for virtually all masses of precursor and most product ions would differ for ions containing the different tags as well as containing different tag-specific peaks. We describe a method for interconverting spectra from iTRAQ 4-plex to TMT (6- and 10-plex) and to iTRAQ 8-plex. We interconvert spectra by appropriately mass shifting sequence ions and discarding derivative-specific peaks. After this "cleaning" of search spectra, we demonstrate that the converted libraries perform well in terms of peptide spectral matches. This is demonstrated by comparing results using sequence database searches as well as by comparing search effectiveness using original and converted libraries. At 1% FDR, TMT labeled query spectra match 97% as many spectra against a converted iTRAQ library as compared to an original TMT library. Overall, this interconversion strategy provides a practical way to extend results from one derivatization method to others that share related chemistry and do not significantly alter fragmentation profiles.


Subjects
Peptide Library , Proteomics/methods , Protein Databases , Mass Spectrometry , Molecular Weight , Staining and Labeling
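
The mass-shifting step described in this abstract can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from the paper: the function name `shift_fragment` is hypothetical, tags are assumed to sit on the peptide N-terminus and on every lysine, only b- and y-type sequence ions are handled, and the tag masses are the commonly tabulated monoisotopic modification masses. The removal of derivative-specific peaks such as reporter ions is not shown.

```python
# Commonly tabulated monoisotopic tag masses (Da); treated here as assumptions.
TAG_MASS = {"iTRAQ4": 144.102063, "TMT6": 229.162932, "iTRAQ8": 304.205360}

def shift_fragment(mz, ion_type, fragment_seq, charge, src, dst):
    """Shift one sequence-ion m/z from the src labeling scheme to dst.

    Assumes one tag on the peptide N-terminus (carried by b ions only)
    and one tag on each lysine contained in the fragment.
    """
    n_tags = fragment_seq.count("K") + (1 if ion_type == "b" else 0)
    return mz + n_tags * (TAG_MASS[dst] - TAG_MASS[src]) / charge

# A singly charged y ion containing one lysine gains one tag-mass difference.
print(round(shift_fragment(500.0, "y", "GAK", 1, "iTRAQ4", "TMT6"), 6))
# A doubly charged b ion with one lysine carries two tags (N-terminus + K).
print(round(shift_fragment(300.0, "b", "AK", 2, "iTRAQ4", "TMT6"), 6))
```

Fragments that carry no tag (for example, a y ion without lysine) are left unchanged, which is why only part of a library spectrum needs to move during conversion.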
12.
J Proteome Res ; 15(3): 1023-32, 2016 Mar 04.
Article in English | MEDLINE | ID: mdl-26860878

ABSTRACT

The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.


Subjects
Neoplasms/diagnosis , Neoplasms/metabolism , Proteomics , Tumor Biomarkers/metabolism , Humans , Proteome/metabolism
13.
Anal Chem ; 85(24): 11725-31, 2013 Dec 17.
Article in English | MEDLINE | ID: mdl-24147600

ABSTRACT

Recent progress in metabolomics and the development of increasingly sensitive analytical techniques have renewed interest in global profiling, i.e., semiquantitative monitoring of all chemical constituents of biological fluids. In this work, we have performed global profiling of NIST SRM 1950, "Metabolites in Human Plasma", using GC-MS, LC-MS, and NMR. Metabolome coverage, difficulties, and reproducibility of the experiments on each platform are discussed. A total of 353 metabolites have been identified in this material. GC-MS provides 65 unique identifications, and most of the identifications from NMR overlap with the LC-MS identifications, except for some small sugars that are not directly found by LC-MS. Also, repeatability and intermediate precision analyses show that the SRM 1950 profiling is reproducible enough to consider this material as a good choice to distinguish between analytical and biological variability. Clinical laboratory data shows that most results are within the reference ranges for each assay. In-house computational tools have been developed or modified for MS data processing and interactive web display. All data and programs are freely available online at http://peptide.nist.gov/ and http://srmd.nist.gov/ .


Subjects
Blood Chemical Analysis/standards , Liquid Chromatography/standards , Gas Chromatography-Mass Spectrometry/standards , Internet , Magnetic Resonance Spectroscopy/standards , Metabolomics/standards , United States Government Agencies , Analytic Sample Preparation Methods , Humans , Reference Standards , Software , United States
14.
Mol Cell Proteomics ; 9(2): 225-41, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19837981

ABSTRACT

A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 system performance metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than 10% and thus can reveal even subtle differences in performance of system components. Analyses of data from interlaboratory studies conducted under a common standard operating procedure identified outlier data and provided clues to specific causes. Moreover, interlaboratory variation reflected by the metrics indicates which system components vary the most between laboratories. Application of these metrics enables rational, quantitative quality assessment for proteomics and other LC-MS/MS analytical applications.


Subjects
Liquid Chromatography/methods , Liquid Chromatography/standards , Proteomics/methods , Proteomics/standards , Tandem Mass Spectrometry/methods , Tandem Mass Spectrometry/standards , Animals , Chickens , Egg Proteins/analysis , Laboratories , Proteome/analysis , Reproducibility of Results , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/analysis , Software
15.
J Forensic Sci ; 65(2): 406-420, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31670846

ABSTRACT

Recent reports have demonstrated that genetically variant peptides (GVPs) derived from human hair shaft proteins can be used to differentiate individuals of different biogeographic origins. We report a method, based on direct extraction of hair shaft proteins, that is more sensitive for GVP detection than previously published methods. It involves one step for protein extraction and was found to provide reproducible results. A detailed proteomic analysis of this data is presented that led to the following four results: (i) a peptide spectral library was created and made available for download. It contains all identified peptides from this work, including GVPs that, when appropriately expanded with diverse hair-derived peptides, can provide a routine, reliable, and sensitive means of analyzing hair digests; (ii) an analysis of artifact peptides arising from side reactions is also made using a new method for finding unexpected modifications; (iii) detailed analysis of the gel-based method employed clearly shows the high degree of cross-linking or protein association involved in hair digestion, with major GVPs eluting over a wide range of high molecular weights while others apparently arise from distinct non-cross-linked proteins; and (iv) finally, we show that some of the specific GVP identifications depend on the sample preparation method.


Subjects
Hair/metabolism , Hair-Specific Keratins/metabolism , Peptides/metabolism , Proteome/metabolism , Artifacts , Liquid Chromatography , Protein Databases , Forensic Medicine , Humans , Male , Mass Spectrometry , Proteomics , Reproducibility of Results
16.
MAbs ; 10(3): 354-369, 2018 04.
Article in English | MEDLINE | ID: mdl-29425077

ABSTRACT

We describe the creation of a mass spectral library composed of all identifiable spectra derived from the tryptic digest of the NISTmAb IgG1κ. The library is a unique reference spectral collection developed from over six million peptide-spectrum matches acquired by liquid chromatography-mass spectrometry (LC-MS) over a wide range of collision energy. Conventional one-dimensional (1D) LC-MS was used for various digestion conditions and 20- and 24-fraction two-dimensional (2D) LC-MS studies permitted in-depth analyses of single digests. Computer methods were developed for automated analysis of LC-MS isotopic clusters to determine the attributes for all ions detected in the 1D and 2D studies. The library contains a selection of over 12,600 high-quality tandem spectra of more than 3,300 peptide ions identified and validated by accurate mass, differential elution pattern, and expected peptide classes in peptide map experiments. These include a variety of biologically modified peptide spectra involving glycosylated, oxidized, deamidated, glycated, and N/C-terminal modified peptides, as well as artifacts. A complete glycation profile was obtained for the NISTmAb with spectra for 58% and 100% of all possible glycation sites in the heavy and light chains, respectively. The site-specific quantification of methionine oxidation in the protein is described. The utility of this reference library is demonstrated by the analysis of a commercial monoclonal antibody (adalimumab, Humira®), where 691 peptide ion spectra are identifiable in the constant regions, accounting for 60% coverage for both heavy and light chains. The NIST reference library platform may be used as a tool for facile identification of the primary sequence and post-translational modifications, as well as the recognition of LC-MS method-induced artifacts for human and recombinant IgG antibodies. Its development also provides a general method for creating comprehensive peptide libraries of individual proteins.


Subjects
Adalimumab/analysis , Adalimumab/chemistry , Mass Spectrometry/methods , Peptide Library , Animals , Liquid Chromatography/instrumentation , Humans
17.
J Am Soc Mass Spectrom ; 28(4): 733-738, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28127680

ABSTRACT

A method to discover and correct errors in mass spectral libraries is described. Comparing, across a set of highly curated reference libraries, compounds that have the same chemical structure quickly identifies entries that are outliers. In cases where three or more entries for the same compound are compared, the outlier as determined by visual inspection was almost always found to contain the error. These errors were either in the spectrum itself or in the chemical descriptors that accompanied it. The method is demonstrated on finding errors in compounds of forensic interest in the NIST/EPA/NIH Mass Spectral Library. The target list of compounds checked was the Scientific Working Group for the Analysis of Seized Drugs (SWGDRUG) mass spectral library. Some examples of errors found are described. A checklist of errors that curators should look for when performing inter-library comparisons is provided.
