Search | BVS - MINISTRY OF HEALTH

1.

ECCB2024: The 23rd European Conference on Computational Biology.

Kukkonen-Macchi, Anu; Hautaniemi, Sampsa; Heil, Katharina F; Heinäniemi, Merja; Jensen, Lars Juhl; Junttila, Sini; Käll, Lukas; Laiho, Asta; Maccallum, Peter; Nykter, Matti; Persson, Bengt; Suomi, Tomi; Van Den Bossche, Tim; Nyrönen, Tommi H; Elo, Laura L.

Bioinformatics ; 40(Suppl 2): ii1-ii3, 2024 09 01.

Article in English | MEDLINE | ID: mdl-39230712

Subjects

Computational Biology , Computational Biology/methods

2.

Quantitative proteomics of patient fibroblasts reveal biomarkers and diagnostic signatures of mitochondrial disease.

Correia, Sandrina P; Moedas, Marco F; Taylor, Lucie S; Naess, Karin; Lim, Albert Z; McFarland, Robert; Kazior, Zuzanna; Rumyantseva, Anastasia; Wibom, Rolf; Engvall, Martin; Bruhn, Helene; Lesko, Nicole; Végvári, Ákos; Käll, Lukas; Trost, Matthias; Alston, Charlotte L; Freyer, Christoph; Taylor, Robert W; Wedell, Anna; Wredenberg, Anna.

JCI Insight ; 2024 Sep 17.

Article in English | MEDLINE | ID: mdl-39288270

ABSTRACT

BACKGROUND: Mitochondrial diseases belong to the group of inborn errors of metabolism (IEM), with a prevalence of 1:2,000-1:5,000. They are the most common form of IEM, but despite advances in next-generation sequencing technologies, almost half of the patients are left genetically undiagnosed. METHODS: We investigated a cohort of 61 patients with defined mitochondrial disease to improve diagnostics, identify biomarkers, and correlate metabolic pathways to specific disease groups. Clinical presentations were structured using human phenotype ontology terms, and mass spectrometry-based proteomics was performed on primary fibroblasts. Additionally, we integrated six patients carrying variants of uncertain significance (VUS) to test proteomics as a diagnostic expansion. RESULTS: Proteomic profiles from patient samples could be classified according to their biochemical and genetic characteristics, with the expression of five proteins (GPX4, MORF4L1, MOXD1, MSRA and TMED9) correlating with the disease cohort, and thus, acting as putative biomarkers. Pathway analysis showed a deregulation of inflammatory and mitochondrial stress responses. This included the upregulation of glycosphingolipid metabolism and mitochondrial protein import, as well as the downregulation of arachidonic acid metabolism. Furthermore, we could assign pathogenicity to a VUS in MRPS23 by demonstrating the loss of associated mitochondrial ribosome subunits. CONCLUSION: We established mass spectrometry-based proteomics on patient fibroblasts as a viable and versatile tool for diagnosing patients with mitochondrial disease. FUNDING: The NovoNordisk Foundation, Knut and Alice Wallenberg Foundation, Wellcome Centre for Mitochondrial Research, UK Medical Research Council, and the UK NHS Highly Specialised Service for Rare Mitochondrial Disorders of Adults and Children.

3.

quantms: a cloud-based pipeline for quantitative proteomics enables the reanalysis of public proteomics data.

Dai, Chengxin; Pfeuffer, Julianus; Wang, Hong; Zheng, Ping; Käll, Lukas; Sachsenberg, Timo; Demichev, Vadim; Bai, Mingze; Kohlbacher, Oliver; Perez-Riverol, Yasset.

Nat Methods ; 21(9): 1603-1607, 2024 Sep.

Article in English | MEDLINE | ID: mdl-38965444

ABSTRACT

The volume of public proteomics data is rapidly increasing, causing a computational challenge for large-scale reanalysis. Here, we introduce quantms ( https://quantms.org/ ), an open-source cloud-based pipeline for massively parallel proteomics data analysis. We used quantms to reanalyze 83 public ProteomeXchange datasets, comprising 29,354 instrument files from 13,132 human samples, to quantify 16,599 proteins based on 1.03 million unique peptides. quantms is based on standard file formats improving the reproducibility, submission and dissemination of the data to ProteomeXchange.

Subjects

Cloud Computing , Proteomics , Software , Proteomics/methods , Humans , Databases, Protein , Proteome/analysis , Reproducibility of Results , Computational Biology/methods , Peptides/analysis , Peptides/chemistry

4.

Spatial landmark detection and tissue registration with deep learning.

Ekvall, Markus; Bergenstråhle, Ludvig; Andersson, Alma; Czarnewski, Paulo; Olegård, Johannes; Käll, Lukas; Lundeberg, Joakim.

Nat Methods ; 21(4): 673-679, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38438615

ABSTRACT

Spatial landmarks are crucial in describing histological features between samples or sites, tracking regions of interest in microscopy, and registering tissue samples within a common coordinate framework. Although other studies have explored unsupervised landmark detection, existing methods are not well-suited for histological image data as they often require a large number of images to converge, are unable to handle nonlinear deformations between tissue sections and are ineffective for z-stack alignment, other modalities beyond image data or multimodal data. We address these challenges by introducing effortless landmark detection, a new unsupervised landmark detection and registration method using neural-network-guided thin-plate splines. Our proposed method is evaluated on a diverse range of datasets including histology and spatially resolved transcriptomics, demonstrating superior performance in both accuracy and stability compared to existing approaches.

Subjects

Deep Learning , Image Processing, Computer-Assisted/methods

5.

Automated model building and protein identification in cryo-EM maps.

Jamali, Kiarash; Käll, Lukas; Zhang, Rui; Brown, Alan; Kimanius, Dari; Scheres, Sjors H W.

Nature ; 628(8007): 450-457, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38408488

ABSTRACT

Interpreting electron cryo-microscopy (cryo-EM) maps with atomic models requires high levels of expertise and labour-intensive manual intervention in three-dimensional computer graphics programs. Here we present ModelAngelo, a machine-learning approach for automated atomic model building in cryo-EM maps. By combining information from the cryo-EM map with information from protein sequence and structure in a single graph neural network, ModelAngelo builds atomic models for proteins that are of similar quality to those generated by human experts. For nucleotides, ModelAngelo builds backbones with similar accuracy to those built by humans. By using its predicted amino acid probabilities for each residue in hidden Markov model sequence searches, ModelAngelo outperforms human experts in the identification of proteins with unknown sequences. ModelAngelo will therefore remove bottlenecks and increase objectivity in cryo-EM structure determination.

Subjects

Cryoelectron Microscopy , Machine Learning , Models, Molecular , Proteins , Amino Acid Sequence , Cryoelectron Microscopy/methods , Cryoelectron Microscopy/standards , Markov Chains , Neural Networks, Computer , Protein Conformation , Proteins/chemistry , Proteins/ultrastructure , Computer Graphics

6.

Pathway analysis through mutual information.

Jeuken, Gustavo S; Käll, Lukas.

Bioinformatics ; 40(1), 2024 01 02.

Article in English | MEDLINE | ID: mdl-38195928

ABSTRACT

MOTIVATION: In pathway analysis, we aim to establish a connection between the activity of a particular biological pathway and a difference in phenotype. There are many available methods to perform pathway analysis, many of them rely on an upstream differential expression analysis, and many model the relations between the abundances of the analytes in a pathway as linear relationships. RESULTS: Here, we propose a new method for pathway analysis, MIPath, that relies on information theoretical principles and, therefore, does not model the association between pathway activity and phenotype, resulting in relatively few assumptions. For this, we construct a graph of the data points for each pathway using a nearest-neighbor approach and score the association between the structure of this graph and the phenotype of these same samples using Mutual Information while adjusting for the effects of random chance in each score. The initial nearest neighbor approach evades individual gene-level comparisons, hence making the method scalable and less vulnerable to missing values. These properties make our method particularly useful for single-cell data. We benchmarked our method on several single-cell datasets, comparing it to established and new methods, and found that it produces robust, reproducible, and meaningful scores. AVAILABILITY AND IMPLEMENTATION: Source code is available at https://github.com/statisticalbiotechnology/mipath, or through Python Package Index as "mipathway."
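The scoring idea in the abstract above (mutual information between a sample grouping and the phenotype, adjusted for chance) can be illustrated with a minimal sketch. This is not the MIPath implementation: the nearest-neighbor graph construction is omitted, the grouping is assumed to be given as discrete cluster labels, and the permutation-based baseline and the function names `mutual_information` and `chance_adjusted_mi` are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long label sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def chance_adjusted_mi(clusters, phenotype, n_permutations=200, seed=0):
    """Observed MI minus the mean MI over shuffled phenotype labels.

    The permutation baseline is an illustrative stand-in for a chance
    adjustment; it is not taken from the MIPath paper.
    """
    rng = random.Random(seed)
    observed = mutual_information(clusters, phenotype)
    shuffled = list(phenotype)
    baseline = 0.0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        baseline += mutual_information(clusters, shuffled)
    return observed - baseline / n_permutations


if __name__ == "__main__":
    # Hypothetical cluster assignments and phenotype labels for eight samples.
    clusters = [0, 0, 0, 0, 1, 1, 1, 1]
    phenotype = ["case", "case", "case", "ctrl", "ctrl", "ctrl", "ctrl", "case"]
    print(chance_adjusted_mi(clusters, phenotype))
```

A score near zero would suggest that the grouping carries no more information about the phenotype than a random relabeling does.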

Subjects

Software , Phenotype , Cluster Analysis

7.

The Association of Biomolecular Resource Facilities Proteome Informatics Research Group Study on Metaproteomics (iPRG-2020).

Jagtap, Pratik D; Hoopmann, Michael R; Neely, Benjamin A; Harvey, Antony; Käll, Lukas; Perez-Riverol, Yasset; Abajorga, Milky K; Thomas, Julie A; Weintraub, Susan T; Palmblad, Magnus.

J Biomol Tech ; 34(3), 2023 Sep 30.

Article in English | MEDLINE | ID: mdl-37969874

ABSTRACT

Metaproteomics research using mass spectrometry data has emerged as a powerful strategy to understand the mechanisms underlying microbiome dynamics and the interaction of microbiomes with their immediate environment. Recent advances in sample preparation, data acquisition, and bioinformatics workflows have greatly contributed to progress in this field. In 2020, the Association of Biomolecular Resource Facilities Proteome Informatics Research Group launched a collaborative study to assess the bioinformatics options available for metaproteomics research. The study was conducted in 2 phases. In the first phase, participants were provided with mass spectrometry data files and were asked to identify the taxonomic composition and relative taxa abundances in the samples without supplying any protein sequence databases. The most challenging question asked of the participants was to postulate the nature of any biological phenomena that may have taken place in the samples, such as interactions among taxonomic species. In the second phase, participants were provided a protein sequence database composed of the species present in the sample and were asked to answer the same set of questions as for phase 1. In this report, we summarize the data processing methods and tools used by participants, including database searching and software tools used for taxonomic and functional analysis. This study provides insights into the status of metaproteomics bioinformatics in participating laboratories and core facilities.

Subjects

Proteome , Proteomics , Humans , Proteomics/methods , Software , Computational Biology , Databases, Protein

8.

Spatial multimodal analysis of transcriptomes and metabolomes in tissues.

Vicari, Marco; Mirzazadeh, Reza; Nilsson, Anna; Shariatgorji, Reza; Bjärterot, Patrik; Larsson, Ludvig; Lee, Hower; Nilsson, Mats; Foyer, Julia; Ekvall, Markus; Czarnewski, Paulo; Zhang, Xiaoqun; Svenningsson, Per; Käll, Lukas; Andrén, Per E; Lundeberg, Joakim.

Nat Biotechnol ; 2023 Sep 04.

Article in English | MEDLINE | ID: mdl-37667091

ABSTRACT

We present a spatial omics approach that combines histology, mass spectrometry imaging and spatial transcriptomics to facilitate precise measurements of mRNA transcripts and low-molecular-weight metabolites across tissue regions. The workflow is compatible with commercially available Visium glass slides. We demonstrate the potential of our method using mouse and human brain samples in the context of dopamine and Parkinson's disease.

9.

Retention Time and Fragmentation Predictors Increase Confidence in Identification of Common Variant Peptides.

Skiadopoulou, Dafni; Vasícek, Jakub; Kuznetsova, Ksenia; Bouyssié, David; Käll, Lukas; Vaudel, Marc.

J Proteome Res ; 22(10): 3190-3199, 2023 Oct 06.

Article in English | MEDLINE | ID: mdl-37656829

ABSTRACT

Precision medicine focuses on adapting care to the individual profile of patients, for example, accounting for their unique genetic makeup. Being able to account for the effect of genetic variation on the proteome holds great promise toward this goal. However, identifying the protein products of genetic variation using mass spectrometry has proven very challenging. Here we show that the identification of variant peptides can be improved by the integration of retention time and fragmentation predictors into a unified proteogenomic pipeline. By combining these intrinsic peptide characteristics using the search-engine post-processor Percolator, we demonstrate improved discrimination power between correct and incorrect peptide-spectrum matches. Our results demonstrate that the drop in performance that is induced when expanding a protein sequence database can be compensated, hence enabling efficient identification of genetic variation products in proteomics data. We anticipate that this enhancement of proteogenomic pipelines can provide a more refined picture of the unique proteome of patients and thereby contribute to improving patient care.

10.

Automated model building and protein identification in cryo-EM maps.

Jamali, Kiarash; Käll, Lukas; Zhang, Rui; Brown, Alan; Kimanius, Dari; Scheres, Sjors H W.

bioRxiv ; 2023 Oct 17.

Article in English | MEDLINE | ID: mdl-37292681

ABSTRACT

Interpreting electron cryo-microscopy (cryo-EM) maps with atomic models requires high levels of expertise and labour-intensive manual intervention. We present ModelAngelo, a machine-learning approach for automated atomic model building in cryo-EM maps. By combining information from the cryo-EM map with information from protein sequence and structure in a single graph neural network, ModelAngelo builds atomic models for proteins that are of similar quality as those generated by human experts. For nucleotides, ModelAngelo builds backbones with similar accuracy as humans. By using its predicted amino acid probabilities for each residue in hidden Markov model sequence searches, ModelAngelo outperforms human experts in the identification of proteins with unknown sequences. ModelAngelo will thus remove bottlenecks and increase objectivity in cryo-EM structure determination.

11.

Triqler for Protein Summarization of Data from Data-Independent Acquisition Mass Spectrometry.

Truong, Patrick; The, Matthew; Käll, Lukas.

J Proteome Res ; 22(4): 1359-1366, 2023 04 07.

Article in English | MEDLINE | ID: mdl-36988210

ABSTRACT

A frequent goal, or subgoal, when processing data from a quantitative shotgun proteomics experiment is a list of proteins that are differentially abundant under the examined experimental conditions. Unfortunately, obtaining such a list is a challenging process, as the mass spectrometer analyzes the proteolytic peptides of a protein rather than the proteins themselves. We have previously designed a Bayesian hierarchical probabilistic model, Triqler, for combining peptide identification and quantification errors into probabilities of proteins being differentially abundant. However, the model was developed for data from data-dependent acquisition. Here, we show that Triqler is also compatible with data-independent acquisition data after applying minor alterations for the missing value distribution. Furthermore, we find that it has better performance than a set of compared state-of-the-art protein summarization tools when evaluated on data-independent acquisition data.

Subjects

Peptides , Proteins , Bayes Theorem , Proteins/analysis , Peptides/analysis , Mass Spectrometry/methods , Proteomics/methods

12.

Toward an Integrated Machine Learning Model of a Proteomics Experiment.

Neely, Benjamin A; Dorfer, Viktoria; Martens, Lennart; Bludau, Isabell; Bouwmeester, Robbin; Degroeve, Sven; Deutsch, Eric W; Gessulat, Siegfried; Käll, Lukas; Palczynski, Pawel; Payne, Samuel H; Rehfeldt, Tobias Greisager; Schmidt, Tobias; Schwämmle, Veit; Uszkoreit, Julian; Vizcaíno, Juan Antonio; Wilhelm, Mathias; Palmblad, Magnus.

J Proteome Res ; 22(3): 681-696, 2023 03 03.

Article in English | MEDLINE | ID: mdl-36744821

ABSTRACT

In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and future opportunities and challenges. In the following perspective we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and to inspire future research.

Subjects

Machine Learning , Proteomics , Proteomics/methods , Algorithms , Mass Spectrometry

13.

Integrating Identification and Quantification Uncertainty for Differential Protein Abundance Analysis with Triqler.

The, Matthew; Käll, Lukas.

Methods Mol Biol ; 2426: 91-117, 2023.

Article in English | MEDLINE | ID: mdl-36308686

ABSTRACT

Protein quantification for shotgun proteomics is a complicated process where errors can be introduced in each of the steps. Triqler is a Python package that estimates and integrates errors of the different parts of the label-free protein quantification pipeline into a single Bayesian model. Specifically, it weighs the quantitative values by the confidence we have in the correctness of the corresponding PSM. Furthermore, it treats missing values in a way that reflects their uncertainty relative to observed values. Finally, it combines these error estimates in a single differential abundance FDR that not only reflects the errors and uncertainties in quantification but also in identification. In this tutorial, we show how to (1) generate input data for Triqler from quantification packages such as MaxQuant and Quandenser, (2) run Triqler and what the different options are, (3) interpret the results, (4) investigate the posterior distributions of a protein of interest in detail, and (5) verify that the hyperparameter estimations are sensible.

Subjects

Proteins , Proteomics , Bayes Theorem , Uncertainty , Proteomics/methods , Software

14.

A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics.

Luo, Xiyang; Bittremieux, Wout; Griss, Johannes; Deutsch, Eric W; Sachsenberg, Timo; Levitsky, Lev I; Ivanov, Mark V; Bubis, Julia A; Gabriels, Ralf; Webel, Henry; Sanchez, Aniel; Bai, Mingze; Käll, Lukas; Perez-Riverol, Yasset.

J Proteome Res ; 21(6): 1566-1574, 2022 06 03.

Article in English | MEDLINE | ID: mdl-35549218

ABSTRACT

Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTM) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark.
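As a rough illustration of the binning strategy named in the abstract above, the sketch below merges the spectra of one cluster by assigning peaks to fixed-width m/z bins. It is not the benchmark's code from the linked repository: the function name `bin_consensus`, the bin width, the `min_fraction` filter, and the intensity-weighted m/z are all assumptions chosen for the example.

```python
from collections import defaultdict


def bin_consensus(spectra, bin_width=0.02, min_fraction=0.5):
    """Build a consensus spectrum from a cluster of spectra by m/z binning.

    spectra: list of spectra, each a list of (mz, intensity) tuples.
    A bin is kept if at least `min_fraction` of the spectra have a peak in it.
    Returns a list of (mz, intensity) tuples sorted by m/z, where mz is the
    intensity-weighted mean m/z of the bin and intensity is the summed
    intensity divided by the number of spectra in the cluster.
    """
    intensity_sum = defaultdict(float)
    weighted_mz_sum = defaultdict(float)
    spectra_with_peak = defaultdict(int)

    for peaks in spectra:
        bins_seen = set()
        for mz, intensity in peaks:
            b = int(mz // bin_width)
            intensity_sum[b] += intensity
            weighted_mz_sum[b] += mz * intensity
            bins_seen.add(b)
        # Count each spectrum at most once per bin.
        for b in bins_seen:
            spectra_with_peak[b] += 1

    n = len(spectra)
    consensus = []
    for b in sorted(intensity_sum):
        if intensity_sum[b] <= 0 or spectra_with_peak[b] / n < min_fraction:
            continue
        consensus.append((weighted_mz_sum[b] / intensity_sum[b],
                          intensity_sum[b] / n))
    return consensus


if __name__ == "__main__":
    # Two hypothetical spectra from the same cluster.
    cluster = [
        [(100.2, 10.0), (250.7, 5.0)],
        [(100.4, 30.0)],
    ]
    print(bin_consensus(cluster, bin_width=1.0, min_fraction=1.0))
```

The alternative the abstract calls the best-identified spectrum would instead pick one existing member of the cluster, so no merging step like this is needed there.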

Subjects

Proteomics , Tandem Mass Spectrometry , Algorithms , Cluster Analysis , Consensus , Databases, Protein , Proteomics/methods , Software , Tandem Mass Spectrometry/methods

15.

Prosit Transformer: A transformer for Prediction of MS2 Spectrum Intensities.

Ekvall, Markus; Truong, Patrick; Gabriel, Wassim; Wilhelm, Mathias; Käll, Lukas.

J Proteome Res ; 21(5): 1359-1364, 2022 05 06.

Article in English | MEDLINE | ID: mdl-35413196

ABSTRACT

Machine learning has been an integral part of interpreting data from mass spectrometry (MS)-based proteomics for a long time. Relatively recently, a machine-learning structure appeared successful in other areas of bioinformatics, Transformers. Furthermore, the implementation of Transformers within bioinformatics has become relatively convenient due to transfer learning, i.e., adapting a network trained for other tasks to new functionality. Transfer learning makes these relatively large networks more accessible as it generally requires less data, and the training time improves substantially. We implemented a Transformer based on the pretrained model TAPE to predict MS2 intensities. TAPE is a general model trained to predict missing residues from protein sequences. Despite being trained for a different task, we could modify its behavior by adding a prediction head at the end of the TAPE model and fine-tune it using the spectrum intensity from the training set to the well-known predictor Prosit. We demonstrate that the predictor, which we call Prosit Transformer, outperforms the recurrent neural-network-based predictor Prosit, increasing the median angular similarity on its hold-out set from 0.908 to 0.929. We believe that Transformers will significantly increase prediction accuracy for other types of predictions within MS-based proteomics.
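The angular similarity reported in the abstract above can be sketched as follows. The abstract does not state the formula; the sketch assumes the commonly used normalized spectral angle, 1 - 2·arccos(cosine similarity)/π, computed between a predicted and an observed intensity vector. The function name `spectral_angle` is hypothetical.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle between two intensity vectors.

    Returns a value in [0, 1] for non-negative intensities:
    1 for identical direction, 0 for orthogonal vectors.
    Assumed definition: 1 - 2 * arccos(cosine similarity) / pi.
    """
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


if __name__ == "__main__":
    # Hypothetical fragment-ion intensities for one peptide.
    predicted = [0.9, 0.1, 0.4, 0.0]
    observed = [1.0, 0.2, 0.3, 0.0]
    print(spectral_angle(predicted, observed))
```

Under this definition the score depends only on the relative intensity pattern, so rescaling either vector by a positive constant leaves it unchanged.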

Subjects

Machine Learning , Neural Networks, Computer , Amino Acid Sequence , Mass Spectrometry , Proteomics

16.

Survival analysis of pathway activity as a prognostic determinant in breast cancer.

Jeuken, Gustavo S; Tobin, Nicholas P; Käll, Lukas.

PLoS Comput Biol ; 18(3): e1010020, 2022 03.

Article in English | MEDLINE | ID: mdl-35344554

ABSTRACT

High throughput biology enables the measurements of relative concentrations of thousands of biomolecules from e.g. tissue samples. The process leaves the investigator with the problem of how to best interpret the potentially large number of differences between samples. Many activities in a cell depend on ordered reactions involving multiple biomolecules, often referred to as pathways. It hence makes sense to study differences between samples in terms of altered pathway activity, using so-called pathway analysis. Traditional pathway analysis gives significance to differences in the pathway components' concentrations between sample groups, however, less frequently used methods for estimating individual samples' pathway activities have been suggested. Here we demonstrate that such a method can be used for pathway-based survival analysis. Specifically, we investigate the pathway activities' association with patients' survival time based on the transcription profiles of the METABRIC dataset. Our implementation shows that pathway activities are better prognostic markers for survival time in METABRIC than the individual transcripts. We also demonstrate that we can regress out the effect of individual pathways on other pathways, which allows us to estimate the other pathways' residual pathway activity on survival. Furthermore, we illustrate how one can visualize the often interdependent measures over hierarchical pathway databases using sunburst plots.

Subjects

Breast Neoplasms , Breast Neoplasms/metabolism , Female , Humans , Prognosis , Survival Analysis

17.

Interpretation of the DOME Recommendations for Machine Learning in Proteomics and Metabolomics.

Palmblad, Magnus; Böcker, Sebastian; Degroeve, Sven; Kohlbacher, Oliver; Käll, Lukas; Noble, William Stafford; Wilhelm, Mathias.

J Proteome Res ; 21(4): 1204-1207, 2022 04 01.

Article in English | MEDLINE | ID: mdl-35119864

ABSTRACT

Machine learning is increasingly applied in proteomics and metabolomics to predict molecular structure, function, and physicochemical properties, including behavior in chromatography, ion mobility, and tandem mass spectrometry. These must be described in sufficient detail to apply or evaluate the performance of trained models. Here we look at and interpret the recently published and general DOME (Data, Optimization, Model, Evaluation) recommendations for conducting and reporting on machine learning in the specific context of proteomics and metabolomics.

Subjects

Metabolomics , Proteomics , Machine Learning , Metabolomics/methods , Proteomics/methods , Tandem Mass Spectrometry

18.

Putting Humpty Dumpty Back Together Again: What Does Protein Quantification Mean in Bottom-Up Proteomics?

Plubell, Deanna L; Käll, Lukas; Webb-Robertson, Bobbie-Jo; Bramer, Lisa M; Ives, Ashley; Kelleher, Neil L; Smith, Lloyd M; Montine, Thomas J; Wu, Christine C; MacCoss, Michael J.

J Proteome Res ; 21(4): 891-898, 2022 04 01.

Article in English | MEDLINE | ID: mdl-35220718

ABSTRACT

Bottom-up proteomics provides peptide measurements and has been invaluable for moving proteomics into large-scale analyses. Commonly, a single quantitative value is reported for each protein-coding gene by aggregating peptide quantities into protein groups following protein inference or parsimony. However, given the complexity of both RNA splicing and post-translational protein modification, it is overly simplistic to assume that all peptides that map to a singular protein-coding gene will demonstrate the same quantitative response. By assuming that all peptides from a protein-coding sequence are representative of the same protein, we may miss the discovery of important biological differences. To capture the contributions of existing proteoforms, we need to reconsider the practice of aggregating protein values to a single quantity per protein-coding gene.

Subjects

Proteins , Proteomics , Peptides/genetics , Peptides/metabolism , Protein Processing, Post-Translational , Proteins/metabolism , Proteome/genetics , Proteome/metabolism

19.

Finding haplotypic signatures in proteins.

Vasícek, Jakub; Skiadopoulou, Dafni; Kuznetsova, Ksenia G; Wen, Bo; Johansson, Stefan; Njølstad, Pål R; Bruckner, Stefan; Käll, Lukas; Vaudel, Marc.

Gigascience ; 12, 2022 12 28.

Article in English | MEDLINE | ID: mdl-37919975

ABSTRACT

BACKGROUND: The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Consequently, protein-coding genes with different co-occurring sets of alleles can encode different amino acid sequences: protein haplotypes. These protein haplotypes are present in biological samples and detectable by mass spectrometry, but they are not accounted for in proteomic searches. Consequently, the impact of haplotypic variation on the results of proteomic searches and the discoverability of peptides specific to haplotypes remain unknown. FINDINGS: Here, we study how common genetic haplotypes influence the proteomic search space and investigate the possibility to match peptides containing multiple amino acid substitutions to a publicly available data set of mass spectra. We found that for 12.42% of the discoverable amino acid substitutions encoded by common haplotypes, 2 or more substitutions may co-occur in the same peptide after tryptic digestion of the protein haplotypes. We identified 352 spectra that matched to such multivariant peptides, and out of the 4,582 amino acid substitutions identified, 6.37% were covered by multivariant peptides. However, the evaluation of the reliability of these matches remains challenging, suggesting that refined error rate estimation procedures are needed for such complex proteomic searches. CONCLUSIONS: As these procedures become available and the ability to analyze protein haplotypes increases, we anticipate that proteomics will provide new information on the consequences of common variation, across tissues and time.

Subjects

Proteins , Proteomics , Proteomics/methods , Haplotypes , Reproducibility of Results , Proteins/genetics , Peptides

20.

Triqler for MaxQuant: Enhancing Results from MaxQuant by Bayesian Error Propagation and Integration.

The, Matthew; Käll, Lukas.

J Proteome Res ; 20(4): 2062-2068, 2021 04 02.

Article in English | MEDLINE | ID: mdl-33661646

ABSTRACT

Error estimation for differential protein quantification by label-free shotgun proteomics is challenging due to the multitude of error sources, each contributing uncertainty to the final results. We have previously designed a Bayesian model, Triqler, to combine such error terms into one combined quantification error. Here we present an interface for Triqler that takes MaxQuant results as input, allowing quick reanalysis of already processed data. We demonstrate that Triqler outperforms the original processing for a large set of both engineered and clinical/biological relevant data sets. Triqler and its interface to MaxQuant are available as a Python module under an Apache 2.0 license from https://pypi.org/project/triqler/.

Subjects

Proteomics , Software , Bayes Theorem , Proteins
