Results 1 - 20 of 69
1.
Open Res Eur ; 4: 71, 2024.
Article in English | MEDLINE | ID: mdl-38903702

ABSTRACT

Background: Data-dependent, bottom-up proteomics is widely used for identifying proteins and peptides. However, one key challenge is that 70% of fragment ion spectra consistently fail to be assigned by conventional database searching. This 'dark matter' of bottom-up proteomics seems to affect fields where non-model organisms, low-abundance proteins, non-tryptic peptides, and complex modifications may be present. While palaeoproteomics may appear as a niche field, understanding and reporting unidentified ancient spectra require collaborative innovation in bioinformatics strategies. This may advance the analysis of complex datasets. Methods: 14.97 million high-impact ancient spectra published in Nature and Science portfolios were mined from public repositories. Identification rates, defined as the proportion of assigned fragment ion spectra, were collected as part of deposited database search outputs or parsed using open-source Python packages. Results and Conclusions: We report that typically 94% of the published ancient spectra remain unidentified. This phenomenon may be caused by multiple factors, notably the limitations of database searching and the selection of user-defined reference data with advanced modification patterns. These 'spectra without stories' highlight the need for widespread data sharing to facilitate methodological development and minimise the loss of often irreplaceable ancient materials. Testing and validating alternative search strategies, such as open searching and de novo sequencing, may also improve overall identification rates. Hence, lessons learnt in palaeoproteomics may benefit other fields grappling with challenging data.
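
The identification rate defined in this abstract (the proportion of fragment ion spectra that were assigned) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the counts are hypothetical and are not taken from the study's own scripts.

```python
def identification_rate(assigned_spectra: int, total_spectra: int) -> float:
    """Return the fraction of fragment ion spectra that received an assignment."""
    if total_spectra <= 0:
        raise ValueError("total_spectra must be positive")
    if not 0 <= assigned_spectra <= total_spectra:
        raise ValueError("assigned_spectra must lie between 0 and total_spectra")
    return assigned_spectra / total_spectra


# Hypothetical counts chosen only to illustrate the calculation.
rate = identification_rate(assigned_spectra=6_000, total_spectra=100_000)
unidentified = 1.0 - rate
print(f"identified: {rate:.1%}, unidentified: {unidentified:.1%}")
```

With these made-up counts, the unidentified share is simply one minus the identification rate, which is the quantity the abstract reports for published ancient spectra.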

2.
J Proteome Res ; 23(7): 2518-2531, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38810119

ABSTRACT

Phosphorylation is the most studied post-translational modification, and has multiple biological functions. In this study, we have reanalyzed publicly available mass spectrometry proteomics data sets enriched for phosphopeptides from Asian rice (Oryza sativa). In total we identified 15,565 phosphosites on serine, threonine, and tyrosine residues on rice proteins. We identified sequence motifs for phosphosites, and linked the motifs to enrichment of different biological processes, indicating different downstream regulation likely caused by different kinase groups. We cross-referenced phosphosites against the rice 3,000 genomes, to identify single amino acid variations (SAAVs) within or proximal to phosphosites that could cause loss of a site in a given rice variety and clustered the data to identify groups of sites with similar patterns across rice family groups. The data has been loaded into UniProt Knowledge-Base (enabling researchers to visualize sites alongside other data on rice proteins, e.g., structural models from AlphaFold2), PeptideAtlas, and the PRIDE database (enabling visualization of source evidence, including scores and supporting mass spectra).


Subjects
Genome, Plant; Oryza; Phosphoproteins; Plant Proteins; Proteomics; Signal Transduction; Oryza/genetics; Oryza/metabolism; Oryza/chemistry; Proteomics/methods; Phosphoproteins/metabolism; Phosphoproteins/genetics; Phosphoproteins/chemistry; Phosphoproteins/analysis; Plant Proteins/genetics; Plant Proteins/metabolism; Phosphorylation; Protein Processing, Post-Translational; Phosphopeptides/metabolism; Phosphopeptides/analysis; Databases, Protein; Amino Acid Motifs; Mass Spectrometry
3.
Acta Biomater ; 180: 61-81, 2024 05.
Article in English | MEDLINE | ID: mdl-38588997

ABSTRACT

A plethora of biomaterials for heart repair are being tested worldwide for potential clinical application. These therapeutics aim to enhance the quality of life of patients with heart disease using various methods to improve cardiac function. Despite the myriad of therapeutics tested, only a minority of these studied biomaterials have entered clinical trials. This rapid scoping review aims to analyze literature available from 2012 to 2022 with a focus on clinical trials using biomaterials for direct cardiac repair, i.e., where the intended function of the biomaterial is to enhance the repair of the endocardium, myocardium, epicardium or pericardium. This review included neither biomaterials related to stents and valve repair nor biomaterials serving as vehicles for the delivery of drugs. Surprisingly, the literature search revealed that only 8 different biomaterials mentioned in 23 different studies out of 7038 documents (journal articles, conference abstracts or clinical trial entries) have been tested in clinical trials since 2012. All of these, intended to treat various forms of ischaemic heart disease (heart failure, myocardial infarction), were of natural origin and most used direct injections as their delivery method. This review thus reveals notable gaps between groups of biomaterials tested pre-clinically and clinically. STATEMENT OF SIGNIFICANCE: Rapid scoping review of clinical application of biomaterials for cardiac repair. 7038 documents screened; 23 studies mention 8 different biomaterials only. Biomaterials for repair of endocardium, myocardium, epicardium or pericardium. Only 8 different biomaterials entered clinical trials in the past 10 years. All of the clinically translated biomaterials were of natural origin.


Subjects
Biocompatible Materials; Humans; Biocompatible Materials/chemistry; Biocompatible Materials/therapeutic use; Animals
4.
J Proteome Res ; 23(8): 3041-3051, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-38426863

ABSTRACT

Neuropeptides represent a unique class of signaling molecules that have garnered much attention but require special consideration when identifications are gleaned from mass spectra. With highly variable sequence lengths, neuropeptides must be analyzed in their endogenous state. Further, neuropeptides share great homology within families, differing by as little as a single amino acid residue, complicating even routine analyses and necessitating optimized computational strategies for confident and accurate identifications. We present EndoGenius, a database searching strategy designed specifically for elucidating neuropeptide identifications from mass spectra by leveraging optimized peptide-spectrum matching approaches, an expansive motif database, and a novel scoring algorithm to achieve broader representation of the neuropeptidome and minimize reidentification. This work describes an algorithm capable of reporting more neuropeptide identifications at 1% false-discovery rate than alternative software in five Callinectes sapidus neuronal tissue types.


Subjects
Algorithms; Databases, Protein; Neuropeptides; Software; Neuropeptides/analysis; Neuropeptides/chemistry; Animals; Mass Spectrometry/methods; Amino Acid Sequence; Proteomics/methods; Tandem Mass Spectrometry/methods
5.
Health Info Libr J ; 41(1): 1-3, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38418378

ABSTRACT

In this editorial, Anthea Sutton and Veronica Parisi reflect on ChatGPT, how it may contribute to systematic searching, and provide their overview of some recent training they attended on ChatGPT, AI and systematic literature reviews.

6.
Forensic Sci Int Genet ; 69: 103000, 2024 03.
Article in English | MEDLINE | ID: mdl-38199167

ABSTRACT

In the absence of a suspect the forensic aim is investigative, and the focus is one of discerning what genotypes best explain the evidence. In traditional systems, the list of candidate genotypes may become vast if the sample contains DNA from many donors or the information from a minor contributor is swamped by that of major contributors, leading to lower evidential value for a true donor's contribution and, as a result, possibly overlooked or inefficient investigative leads. Recent developments in single-cell analysis offer a way forward, by producing data capable of discriminating genotypes. This is accomplished by first clustering single-cell data by similarity without reference to a known genotype. With good clustering it is reasonable to assume that the scEPGs in a cluster are of a single contributor. With that assumption we determine the probability of a cluster's content given each possible genotype at each locus, which is then used to determine the posterior probability mass distribution for all genotypes by application of Bayes' rule. A decision criterion is then applied such that the sum of the ranked probabilities of all genotypes falling in the set is at least 1-α. This is the credible genotype set and is used to inform database search criteria. Within this work we demonstrate the salience of single-cell analysis by performance testing a set of 630 previously constructed admixtures containing up to 5 donors of balanced and unbalanced contributions. We use scEPGs that were generated by isolating single cells, employing a direct-to-PCR extraction treatment, amplifying STRs that are compliant with existing national databases and applying post-PCR treatments that elicit a detection limit of one DNA copy. We determined that, for these test data, 99.3% of the true genotypes are included in the 99.8% credible set, regardless of the number of donors that comprised the mixture. We also determined that the most probable genotype was the true genotype for 97% of the loci when the number of cells in a cluster was at least two. Since efficient investigative leads will be borne by posterior mass distributions that are narrow and concentrated at the true genotype, we report that, for this test set, 47,900 (86%) loci returned only one credible genotype and of these 47,551 (99%) were the true genotype. When determining the LR for true contributors, 91% of the clusters rendered LR > 10^18, showing the potential of single-cell data to positively affect investigative reporting.
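
The credible-set step described in this abstract (Bayes' rule per locus, then accumulating ranked posterior probabilities until they reach at least 1-α) might look roughly like the Python sketch below. It is a minimal illustration and not the authors' implementation: the function names, allele labels, likelihoods, and the uniform prior are all hypothetical.

```python
def posterior(likelihoods: dict[str, float], priors: dict[str, float]) -> dict[str, float]:
    """Bayes' rule: the posterior is proportional to likelihood times prior."""
    unnormalised = {g: likelihoods[g] * priors[g] for g in likelihoods}
    total = sum(unnormalised.values())
    if total <= 0:
        raise ValueError("at least one genotype needs non-zero posterior mass")
    return {g: v / total for g, v in unnormalised.items()}


def credible_set(post: dict[str, float], alpha: float) -> list[str]:
    """Top-ranked genotypes whose cumulative posterior mass first reaches 1 - alpha."""
    ranked = sorted(post.items(), key=lambda item: item[1], reverse=True)
    chosen, mass = [], 0.0
    for genotype, p in ranked:
        chosen.append(genotype)
        mass += p
        if mass >= 1.0 - alpha:
            break
    return chosen


# Hypothetical single-locus example; every number here is made up.
likelihoods = {"12,14": 0.90, "12,12": 0.06, "14,14": 0.03, "12,15": 0.01}
priors = {g: 0.25 for g in likelihoods}  # assumed uniform prior
post = posterior(likelihoods, priors)
print(credible_set(post, alpha=0.05))
print(credible_set(post, alpha=0.002))
```

A smaller α demands more posterior mass, so the set can only grow; a narrow posterior concentrated on one genotype yields the single-genotype sets the abstract counts.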


Subjects
DNA Fingerprinting; Microsatellite Repeats; Humans; DNA Fingerprinting/methods; Bayes Theorem; Genotype; DNA/genetics; Likelihood Functions
7.
Health Info Libr J ; 41(1): 76-83, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37574776

ABSTRACT

BACKGROUND: Latin American and Caribbean Health Sciences Literature (LILACS) is the main reference database in the region; however, the way in which this resource is used in Cochrane systematic reviews has not been studied. OBJECTIVES: To assess the search methods of Cochrane reviews that used LILACS as a source of information and explore the Cochrane community's perceptions about this resource. METHODS: We identified all Cochrane reviews of interventions published during 2019, which included LILACS as a source of information, and analysed their search methods and also ran a survey through the Cochrane Community. RESULTS: We found 133 Cochrane reviews that reported the full search strategies, identifying heterogeneity in search details. The respondents to our survey highlighted many areas for improvement in the use of LILACS, including the usability of the search platform for this purpose. DISCUSSION: The use and reporting of LILACS in Cochrane reviews demonstrate inconsistencies, as evidenced by the analysis of search reports from systematic reviews and surveys conducted among members of the Cochrane community. CONCLUSION: With better guidance on how LILACS database is structured, information specialists working on Cochrane reviews should be able to make more effective use of this unique resource.


Subjects
Information Services; Medicine; Humans; Publications; Surveys and Questionnaires
8.
Proteomes ; 11(4), 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37987316

ABSTRACT

Cannabis has been used historically for both medicinal and recreational purposes, with the most notable cannabinoids being cannabidiol (CBD) and tetrahydrocannabinol (THC). Although their therapeutic effects have been well studied and their recreational use is highly debated, the underlying mechanisms of their biological effects remain poorly defined. In this study, we use isobaric tag-based sample multiplexed proteome profiling to investigate protein abundance differences in the human neuroblastoma SH-SY5Y cell line treated with CBD and THC. We identified significantly regulated proteins by each treatment and performed a pathway classification and associated protein-protein interaction analysis. Our findings suggest that these treatments may lead to mitochondrial dysfunction and induce endoplasmic reticulum stress. These data can potentially be interrogated further to investigate the potential role of CBD and THC in various biological and disease contexts, providing a foundation for future studies.

9.
Health Info Libr J ; 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38013506

ABSTRACT

BACKGROUND: Medication discontinuation studies explore the outcomes of stopping a medication compared to continuing it. Comprehensively identifying medication discontinuation articles in bibliographic databases remains challenging due to variability in terminology. OBJECTIVES: To develop and validate search filters to retrieve medication discontinuation articles in Medline and Embase. METHODS: We identified medication discontinuation articles in a convenience sample of systematic reviews. We used primary articles to create two reference sets for Medline and Embase, respectively. The reference sets were equally divided by randomization into development sets and validation sets. Terms relevant for discontinuation were identified by term frequency analysis in development sets and combined to develop two search filters that maximized relative recalls. The filters were validated against validation sets. Relative recalls were calculated with their 95% confidence intervals (95% CI). RESULTS: We included 316 articles for Medline and 407 articles for Embase, from 15 systematic reviews. The Medline optimized search filter combined 7 terms. The Embase optimized search filter combined 8 terms. The relative recalls were respectively 92% (95% CI: 87-96) and 91% (95% CI: 86-94). CONCLUSIONS: We developed two search filters for retrieving medication discontinuation articles in Medline and Embase. Further research is needed to estimate precision and specificity of the filters.
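
Relative recall with a 95% confidence interval, as reported in this abstract, can be sketched as below. The abstract does not say which interval method was used, so the Wilson score interval here is an assumption, and the counts are hypothetical rather than the study's data.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


def relative_recall(retrieved_relevant: int, reference_set_size: int) -> float:
    """Share of the reference-set articles that the search filter retrieves."""
    return retrieved_relevant / reference_set_size


# Hypothetical validation set: 158 reference articles, 145 retrieved by the filter.
found, total = 145, 158
recall = relative_recall(found, total)
low, high = wilson_interval(found, total)
print(f"relative recall = {recall:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Relative recall measures retrieval only against the known reference set, which is why the abstract notes that precision and specificity still need to be estimated separately.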

10.
BMC Bioinformatics ; 24(1): 351, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37730532

ABSTRACT

BACKGROUND: Cross-linking mass spectrometry (XL-MS) is a powerful technique for detecting protein-protein interactions (PPIs) and modeling protein structures in a high-throughput manner. In XL-MS experiments, proteins are cross-linked by a chemical reagent (a cross-linker), fragmented, and then analyzed by tandem mass spectrometry (MS/MS). Cross-linkers are either cleavable or non-cleavable, and each type requires distinct data analysis tools. However, both types of cross-linkers suffer from imbalanced fragmentation efficiency, resulting in a large number of unidentifiable spectra that hinder the discovery of PPIs and protein conformations. To address this challenge, researchers have sought to improve the sensitivity of XL-MS through invention of novel cross-linking reagents, optimization of sample preparation protocols, and development of data analysis algorithms. One promising approach to developing new data analysis methods is to apply a protein feedback mechanism in the analysis. It has significantly improved the sensitivity of analysis methods in the cleavable cross-linking data. The application of the protein feedback mechanism to the analysis of non-cleavable cross-linking data is expected to have an even greater impact because the majority of XL-MS experiments currently employ non-cleavable cross-linkers. RESULTS: In this study, we applied the protein feedback mechanism to the analysis of both non-cleavable and cleavable cross-linking data and observed a substantial improvement in cross-link spectrum matches (CSMs) compared to conventional methods. Furthermore, we developed a new software program, ECL 3.0, that integrates two algorithms and includes a user-friendly graphical interface to facilitate wider applications of this new program. CONCLUSIONS: ECL 3.0 source code is available at https://github.com/yuweichuan/ECL-PF.git . A quick tutorial is available at https://youtu.be/PpZgbi8V2xI .


Subjects
Peptides; Tandem Mass Spectrometry; Algorithms; Cross-Linking Reagents; Data Analysis
11.
Health Info Libr J ; 40(3): 233-261, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37531012

ABSTRACT

BACKGROUND: Traditional and complementary medicine (T&CM) is highly utilised and draws on traditional knowledge (TK) as evidence, raising a need to explore how TK is currently used. OBJECTIVES: Examine criteria used to select, evaluate and apply TK in contemporary health contexts. METHODS: Systematic search utilising academic databases (AMED, CINAHL, MEDLINE, EMBASE, SSCI, ProQuest Dissertations Theses Global), Trip clinical database and Google search engine. Citations and reference lists of included articles were searched. Reported use of TK in contemporary settings was mapped against a modified 'Exploration-Preparation-Implementation-Sustainment' (EPIS) implementation framework. RESULTS: From the 54 included articles, EPIS mapping found TK is primarily used in the Exploration phase of implementation (n = 54), with little reporting on Preparation (n = 16), Implementation process (n = 6) or Sustainment (n = 4) of TK implementation. Criteria used in selection, evaluation and application of TK commonly involved validation with other scientific/traditional evidence sources, or assessment of factors influencing knowledge translation. DISCUSSION: One of the difficulties in validation of TK (as a co-opted treatment) against other evidence sources is comparing like with like as TK often takes a holistic approach. This complicates further planning and evaluation of implementation. CONCLUSION: This review identifies important criteria for evaluating current and potential contemporary use of TK, identifying gaps in research and practice for finding, appraising and applying relevant TK studies for clinical care.


Subjects
Health Education; Knowledge; Policy; Humans
12.
Med Ref Serv Q ; 42(3): 211-227, 2023.
Article in English | MEDLINE | ID: mdl-37459485

ABSTRACT

This study examines the frequency of misspellings in health sciences literature and explores how they affect citation retrieval in multiple databases. Searches for commonly misspelled medical words were conducted in PubMed, CINAHL Complete, APA PsycArticles (ProQuest), APA PsycInfo, and ProQuest Psychology databases. Citations that would be retrieved using a word's correct spelling were removed from the search results. Remaining results were citations that could only be retrieved if the word was misspelled in the search. Articles with clinical significance were targeted. The top five most commonly misspelled words were occurrence, ophthalmology, pruritus, sagittal, and resistance. Ophthalmology had the highest number of citations that contained at least one misspelling, with 57% of those citations "missing" when searched with the correct spelling of the word. The word with the highest percentage (82%) of missed citations was arrhythmia. The results of this study indicate that misspellings in scholarly literature are more prevalent than searchers might realize. The ability to retrieve citations is adversely affected by misspellings, which has the potential to affect patient care. Many opportunities exist in the editorial process to identify and correct misspellings before publication. This is less so once a journal is published. The implications for database searching and manuscript evaluation are discussed.
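
The filtering step in this abstract (dropping every citation that the correct spelling would also retrieve, so that only misspelling-dependent citations remain) amounts to a set difference. The Python sketch below illustrates that idea with hypothetical citation identifiers; it does not query PubMed or any other database, and the function names are invented for illustration.

```python
def misspelling_only(correct_hits: set[str], misspelled_hits: set[str]) -> set[str]:
    """Citations reachable only through the misspelled search term."""
    return misspelled_hits - correct_hits


def percent_missed(correct_hits: set[str], misspelled_hits: set[str]) -> float:
    """Share of misspelling-containing citations that a correctly spelled search misses."""
    if not misspelled_hits:
        return 0.0
    missed = misspelling_only(correct_hits, misspelled_hits)
    return 100.0 * len(missed) / len(misspelled_hits)


# Hypothetical citation identifiers returned for one word and one misspelling of it.
correct = {"c1", "c2", "c3", "c4"}
misspelled = {"c3", "c4", "c5", "c6", "c7"}
print(sorted(misspelling_only(correct, misspelled)))
print(percent_missed(correct, misspelled))
```

Citations that contain both spellings (c3 and c4 in this toy example) are still retrievable with the correct term, so only the remainder counts as "missing".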


Subjects
Medicine; Humans; PubMed; Databases, Factual
13.
J Proteome Res ; 22(4): 1298-1308, 2023 04 07.
Article in English | MEDLINE | ID: mdl-36892105

ABSTRACT

Single-cell proteomics is emerging as an important subfield in the proteomics and mass spectrometry communities, with potential to reshape our understanding of cell development, cell differentiation, disease diagnosis, and the development of new therapies. Compared with significant advancements in the "hardware" that is used in single-cell proteomics, there has been little work comparing the effects of using different "software" packages to analyze single-cell proteomics datasets. To this end, seven popular proteomics programs were compared here, applying them to search three single-cell proteomics datasets generated by three different platforms. The results suggest that MSGF+, MSFragger, and Proteome Discoverer are generally more efficient in maximizing protein identifications, that MaxQuant is better suited for the identification of low-abundance proteins, that MSFragger is superior in elucidating peptide modifications, and that Mascot and X!Tandem are better for analyzing long peptides. Furthermore, an experiment with different loading amounts was carried out to investigate changes in identification results and to explore areas in which single-cell proteomics data analysis may be improved in the future. We propose that this comparative study may provide insight for experts and beginners alike operating in the emerging subfield of single-cell proteomics.


Subjects
Proteomics; Tandem Mass Spectrometry; Proteomics/methods; Tandem Mass Spectrometry/methods; Search Engine/methods; Software; Proteome/analysis; Databases, Protein
14.
J Proteome Res ; 22(10): 3123-3134, 2023 10 06.
Article in English | MEDLINE | ID: mdl-36809008

ABSTRACT

Protein database search engines are an integral component of mass spectrometry-based peptidomic analyses. Given the unique computational challenges of peptidomics, many factors must be taken into consideration when optimizing search engine selection, as each platform has different algorithms by which tandem mass spectra are scored for subsequent peptide identifications. In this study, four different database search engines, PEAKS, MS-GF+, OMSSA, and X! Tandem, were compared with Aplysia californica and Rattus norvegicus peptidomics data sets, and various metrics were assessed such as the number of unique peptide and neuropeptide identifications, and peptide length distributions. Given the tested conditions, PEAKS was found to have the highest number of peptide and neuropeptide identifications out of the four search engines in both data sets. Furthermore, principal component analysis and multivariate logistic regression were employed to determine whether specific spectral features contribute to false C-terminal amidation assignments by each search engine. From this analysis, it was found that the primary features influencing incorrect peptide assignments were the precursor and fragment ion m/z errors. Finally, an assessment employing a mixed species protein database was performed to evaluate search engine precision and sensitivity when searched against an enlarged search space containing human proteins.


Subjects
Neuropeptides; Search Engine; Humans; Animals; Rats; Peptides; Algorithms; Tandem Mass Spectrometry; Databases, Protein; Software
15.
J Proteome Res ; 22(2): 482-490, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36695531

ABSTRACT

Spectrum library searching is a powerful alternative to database searching for data-dependent acquisition (DDA) experiments, but has been historically limited to identifying previously observed peptides in libraries. Here we present Scribe, a new library search engine designed to leverage deep learning fragmentation prediction software such as Prosit. Rather than relying on highly curated DDA libraries, this approach predicts fragmentation and retention times for every peptide in a FASTA database. Scribe embeds Percolator for false discovery rate correction and an interference-tolerant, label-free quantification integrator for an end-to-end proteomics workflow. By leveraging expected relative fragmentation and retention time values, we find that library searching with Scribe can outperform traditional database searching tools both in terms of sensitivity and quantitative precision. Scribe and its graphical interface are easy to use, freely accessible, and fully open source.


Subjects
Peptides; Tandem Mass Spectrometry; Software; Proteomics; Search Engine; Peptide Library; Databases, Protein
16.
J Proteome Res ; 22(2): 334-342, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36414539

ABSTRACT

Stochastic, intensity-based precursor isolation can result in isotopically enriched fragment ions. This problem is exacerbated for large peptides and stable isotope labeling experiments using deuterium or 15N. For stable isotope labeling experiments, incomplete and ubiquitous labeling strategies result in the isolation of peptide ions composed of many distinct structural isomers. Unfortunately, existing proteomics search algorithms do not account for this variability in isotopic incorporation, and thus often yield poor peptide and protein identification rates. We sought to resolve this shortcoming by deriving the expected isotopic distributions of each fragment ion and incorporating them into the theoretical mass spectra used for peptide-spectrum-matching. We adapted the Comet search platform to integrate a modified spectral prediction algorithm we term Conditional fragment Ion Distribution Search (CIDS). Comet-CIDS uses a traditional database searching strategy, but for each candidate peptide we compute the isotopic distribution of each fragment to better match the observed m/z distributions. Evaluating previously generated D2O and 15N labeled data sets, we found that Comet-CIDS identified more confident peptide spectral matches and higher protein sequence coverage compared to traditional theoretical spectra generation, with the magnitude of improvement largely determined by the amount of labeling in the sample.


Subjects
Peptides; Proteins; Peptides/chemistry; Proteins/metabolism; Amino Acid Sequence; Probability; Ions
17.
Health Info Libr J ; 39(3): 203-206, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36150183

ABSTRACT

Health science libraries have been using information technology since the late 1960s, shaping both the profession and the mission of these libraries. To explore the impact of technology, a series of articles has been commissioned for the HILJ Regular Feature, International Perspectives and Initiatives. This editorial sets the scene for this series of articles, which starts in this issue. These articles, written by health science librarians from around the globe, will explore the impact of technology on the way health science libraries provide information in the digital age. Some articles will look at national trends and others will focus on a particular library. A key theme is how technology is being used to support the mission of health science libraries and whether technology has altered that mission. This editorial provides a brief overview of the technologies libraries have adopted, from the 1970s to the present day. From this, it is clear that information technology has transformed the way health information is collected, catalogued, and disseminated to users. And it is certain that in the coming decade new technologies will be incorporated into health science libraries, which will pose challenges for both users and librarians. However, librarians will continue to find ways to adapt and use these tools to meet the needs of their users.


Subjects
Librarians; Libraries, Medical; Library Science; Humans; Technology
18.
Res Synth Methods ; 13(6): 760-789, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35657294

ABSTRACT

Systematic searches are integral to identifying the evidence that is used in National Institute for Health and Care Excellence (NICE) public health guidelines (PHGs). This study analyses the sources, including bibliographic databases and other techniques, required for PHGs. The aims were to analyse the sources used to identify the publications included in NICE PHGs; and to assess whether fewer sources could have been searched to retrieve these publications. Data showing how the included publications had been identified was collated using search summary tables. Three scenarios were created to test various combinations of sources to determine whether fewer sources could have been used. The sample included 29 evidence reviews, compiled using 13 searches, to support 10 PHG topics. Across the PHGs, 23 databases and six other techniques retrieved included publications. A mean reduction in total results of 6.5% could have been made if the minimum set of sources plus Cochrane Library, Embase, and MEDLINE were searched. On average, Cochrane Library, Embase, and MEDLINE contributed 76.8% of the included publications, with other databases adding 11% and other techniques 12.2%. None of the searches had a minimum set that was comprised entirely of databases. There was not a core set of sources for PHGs. A range of databases and techniques, covering a multi-disciplinary evidence base, was required to identify all included publications. It would be possible to reduce the number of sources searched and make some gains in productivity. It is important to create a tailored set of sources to do an efficient search.


Subjects
Information Storage and Retrieval; Public Health; Databases, Bibliographic; MEDLINE
19.
Methods Mol Biol ; 2499: 1-41, 2022.
Article in English | MEDLINE | ID: mdl-35696073

ABSTRACT

Post-translational modifications (PTMs) regulate complex biological processes through the modulation of protein activity, stability, and localization. Insights into the specific modification type and localization within a protein sequence can help ascertain functional significance. Computational models are increasingly demonstrated to offer a low-cost, high-throughput method for comprehensive PTM predictions. Algorithms are optimized using existing experimental PTM data, thus accurate prediction performance relies on the creation of robust datasets. Herein, advancements in mass spectrometry-based proteomics technologies to maximize PTM coverage are reviewed. Further, requisite experimental validation approaches for PTM predictions are explored to ensure that follow-up mechanistic studies are focused on accurate modification sites.


Subjects
Computational Biology; Protein Processing, Post-Translational; Computational Biology/methods; Computer Simulation; Mass Spectrometry; Proteomics/methods
20.
J Proteome Res ; 21(7): 1603-1615, 2022 07 01.
Article in English | MEDLINE | ID: mdl-35640880

ABSTRACT

Phosphoproteomic methods are commonly employed to identify and quantify phosphorylation sites on proteins. In recent years, various tools have been developed, incorporating scores or statistics related to whether a given phosphosite has been correctly identified or to estimate the global false localization rate (FLR) within a given data set for all sites reported. These scores have generally been calibrated using synthetic datasets, and their statistical reliability on real datasets is largely unknown, potentially leading to studies reporting incorrectly localized phosphosites, due to inadequate statistical control. In this work, we develop the concept of scoring modifications on a decoy amino acid, that is, one that cannot be modified, to allow for independent estimation of global FLR. We test a variety of amino acids, on both synthetic and real data sets, demonstrating that the selection can make a substantial difference to the estimated global FLR. We conclude that while several different amino acids might be appropriate, the most reliable FLR results were achieved using alanine and leucine as decoys. We propose the use of a decoy amino acid to control false reporting in the literature and in public databases that re-distribute the data. Data are available via ProteomeXchange with identifier PXD028840.
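
One plausible way to turn decoy-amino-acid hits into a global FLR estimate is sketched below. The abstract does not give a formula, so the scaling rule here is an assumption rather than the authors' published calculation: false localizations on real target residues are guessed by scaling the decoy hits by the ratio of target to decoy residue counts. All names and numbers are hypothetical.

```python
def estimated_global_flr(
    target_sites: int,
    decoy_sites: int,
    target_residues: int,
    decoy_residues: int,
) -> float:
    """Estimate the global false localization rate from decoy-residue hits.

    Assumption (not from the abstract): wrong localizations land on any
    candidate residue in proportion to how often that residue occurs.
    """
    if target_sites <= 0 or decoy_residues <= 0:
        raise ValueError("target_sites and decoy_residues must be positive")
    scale = target_residues / decoy_residues   # candidate-residue ratio
    estimated_false = decoy_sites * scale      # expected wrong target sites
    return min(1.0, estimated_false / target_sites)


# Hypothetical counts: 1,000 reported S/T/Y sites and 10 sites reported on the decoy
# residue, with twice as many S/T/Y residues as decoy residues in the peptides.
flr = estimated_global_flr(
    target_sites=1_000, decoy_sites=10, target_residues=30_000, decoy_residues=15_000
)
print(f"estimated global FLR: {flr:.1%}")
```

Under this assumption the choice of decoy residue changes the scaling factor, which is consistent with the abstract's observation that the selection can make a substantial difference to the estimated global FLR.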


Subjects
Amino Acids; Tandem Mass Spectrometry; Databases, Protein; Reproducibility of Results; Tandem Mass Spectrometry/methods