Results 1 - 7 of 7
1.
Proteomics ; 24(8): e2300154, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38044297

ABSTRACT

We propose an updated approach for approximating the isotope distribution of average peptides given their monoisotopic mass. Our methodology involves in-silico cleavage of the entire UNIPROT database of human-reviewed proteins using Trypsin, generating a theoretical peptide dataset. The isotope distribution is computed using BRAIN. We apply a compositional data modelling strategy that utilizes an additive log-ratio transformation for the isotope probabilities followed by a penalized spline regression. Furthermore, due to the impact of the number of sulphur atoms on the course of the isotope distribution, we develop separate models for peptides containing zero up to five sulphur atoms. Additionally, we propose three methods to estimate the number of sulphur atoms based on an observed isotope distribution. The performance of the spline models and the sulphur prediction approaches is evaluated using a mean squared error and a modified Pearson's χ² goodness-of-fit measure on an experimental UPS2 data set. Our analysis reveals that the variability in spectral accuracy, that is, the variability between MS1 scans, contributes more to the errors than the approximation of the theoretical isotope distribution by our proposed average peptide model. Moreover, we find that the accuracy of predicting the number of sulphur atoms based on the observed isotope distribution is limited by measurement accuracy.
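
The additive log-ratio (ALR) transformation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the last isotope probability is used as the reference component, which the abstract does not specify; the function names and the example probabilities are hypothetical, and the penalized spline regression step is not shown.

```python
import math

def alr(probabilities):
    """Additive log-ratio transform of a composition.

    Assumption: the last component is the reference (denominator).
    Returns len(probabilities) - 1 unconstrained real coordinates.
    """
    if any(p <= 0 for p in probabilities):
        raise ValueError("all components must be strictly positive")
    reference = probabilities[-1]
    return [math.log(p / reference) for p in probabilities[:-1]]

def alr_inverse(coordinates):
    """Map ALR coordinates back to a composition that sums to 1."""
    exps = [math.exp(z) for z in coordinates] + [1.0]  # reference maps to exp(0)
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical isotope probabilities for the first four aggregated peaks.
isotope_probs = [0.55, 0.30, 0.11, 0.04]
coords = alr(isotope_probs)
recovered = alr_inverse(coords)
```

Modelling in ALR space lets a regression predict unconstrained values, while the inverse transform guarantees that the predicted isotope probabilities are positive and sum to one.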


Subject(s)
Isotopes , Peptides , Humans , Sulfur
2.
Rapid Commun Mass Spectrom ; 37(9): e9480, 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36798055

ABSTRACT

RATIONALE: The observed isotope distribution is an important attribute for the identification of peptides and proteins in mass spectrometry-based proteomics. Sulphur atoms have a very distinctive elemental isotope definition, and therefore, the presence of sulphur atoms has a substantial effect on the isotope distribution of biomolecules. Hence, knowledge of the number of sulphur atoms can improve the identification of peptides and proteins. METHODS: In this paper, we conducted a theoretical investigation on the isotope properties of sulphur-containing peptides. We proposed a gradient boosting approach to predict the number of sulphur atoms based on the aggregated isotope distribution. We compared prediction accuracy and assessed the predictive power of the features using the mass and isotope abundance information from the first three, five and eight aggregated isotope peaks. RESULTS: Mass features alone are not sufficient to accurately predict the number of sulphur atoms. However, we reach near-perfect prediction when we include isotope abundance features. The abundance ratios of the eighth and the seventh, the fifth and the fourth, and the third and the second aggregated isotope peaks are the most important abundance features. The mass difference between the eighth, the fifth or the third aggregated isotope peaks and the monoisotopic peak are the most predictive mass features. CONCLUSIONS: Based on the validation analysis it can be concluded that the prediction of the number of sulphur atoms based on the isotope profile fails, because the isotope ratios are not measured accurately. These results indicate that it is valuable for future instrument developments to focus more on improving spectral accuracy to measure peak intensities of higher-order isotope peaks more accurately.
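
The abundance-ratio features named in the RESULTS can be sketched as follows. This is an illustration only: it assumes peak 1 is the monoisotopic peak and that each ratio is taken as the higher-order peak divided by its lower neighbour, neither of which is stated in the abstract. The function name and intensities are hypothetical, and the gradient boosting model itself is not shown.

```python
def adjacent_peak_ratios(intensities):
    """Ratios of consecutive aggregated isotope peaks.

    Keys are 1-based (upper, lower) peak numbers, with peak 1 assumed
    to be the monoisotopic peak. Pairs with a zero lower peak are skipped.
    """
    ratios = {}
    for k in range(1, len(intensities)):
        lower = intensities[k - 1]
        upper = intensities[k]
        if lower > 0:
            ratios[(k + 1, k)] = upper / lower
    return ratios

# Hypothetical intensities for the first three aggregated isotope peaks.
features = adjacent_peak_ratios([100.0, 50.0, 25.0])
```

With eight peaks as input, the entries keyed `(8, 7)`, `(5, 4)` and `(3, 2)` would correspond to the ratios the abstract reports as most important.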


Subject(s)
Peptides , Proteins , Peptides/chemistry , Proteins/chemistry , Isotopes/chemistry , Mass Spectrometry/methods , Sulfur
3.
J Proteome Res ; 18(5): 2221-2227, 2019 May 3.
Article in English | MEDLINE | ID: mdl-30942071

ABSTRACT

In the context of omics disciplines and especially proteomics and biomarker discovery, the analysis of a clinical sample using label-based tandem mass spectrometry (MS) can be affected by sample preparation effects or by the measurement process itself, resulting in an incorrect outcome. Detection and correction of these mistakes using state-of-the-art methods based on mixed models can use large amounts of (computing) time. MS-based proteomics laboratories are high-throughput and need to avoid a bottleneck in their quantitative pipeline by quickly discriminating between high- and low-quality data. To this end we developed an easy-to-use web-tool called QCQuan (available at qcquan.net ) which is built around the CONSTANd normalization algorithm. It automatically provides the user with exploratory and quality control information as well as a differential expression analysis based on conservative, simple statistics. In this document we describe in detail the scientifically relevant steps that constitute the workflow and assess its qualitative and quantitative performance on three reference data sets. We find that QCQuan provides clear and accurate indications about the scientific value of both a high- and a low-quality data set. Moreover, it performed quantitatively better on a third data set than a comparable workflow assembled using established, reliable software.


Subject(s)
Algorithms , Bacterial Proteins/isolation & purification , Data Accuracy , Pectobacterium carotovorum/chemistry , Proteomics/statistics & numerical data , Software , Animals , Cattle , Chromatography, Liquid , Complex Mixtures/chemistry , Cytochromes c/isolation & purification , Datasets as Topic , Glycogen Phosphorylase/isolation & purification , Internet , Phosphopyruvate Hydratase/isolation & purification , Proteomics/methods , Quality Control , Rabbits , Serum Albumin, Bovine/isolation & purification , Staining and Labeling/methods , Tandem Mass Spectrometry
4.
Comput Biol Med ; 171: 108231, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38422965

ABSTRACT

Spatial heterogeneity of cells in liver biopsies can be used as a biomarker for disease severity in patients. This heterogeneity can be quantified by non-parametric statistics of point pattern data, which make use of an aggregation of the point locations. The method and scale of aggregation are usually chosen ad hoc, despite the values of these statistics being heavily dependent on them. Moreover, in the context of measuring heterogeneity, increasing spatial resolution will not endlessly provide more accuracy. The question then becomes how changes in resolution influence heterogeneity indicators, and subsequently how they influence their predictive abilities. In this paper, cell-level data of liver biopsy tissue taken from chronic hepatitis B patients are used to analyze this issue. Firstly, Morisita-Horn indices, Shannon indices and Getis-Ord statistics were evaluated as heterogeneity indicators of different types of cells, using multiple resolutions. Secondly, the effect of resolution on the predictive performance of the indices in an ordinal regression model was investigated, as well as their importance in the model. A simulation study was subsequently performed to validate these methods. In general, for specific heterogeneity indicators, a downward trend in predictive performance could be observed. For local measures of heterogeneity a smaller grid size performs better, whereas global measures perform better with medium-sized grids. In addition, the use of both local and global measures of heterogeneity is recommended to improve the predictive performance.
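
How the grid resolution changes a heterogeneity indicator can be illustrated with a minimal sketch of a Shannon index computed on cell counts aggregated into square grid cells. The aggregation scheme, the function name and the coordinates are assumptions for illustration; the abstract does not describe the exact procedure used in the paper.

```python
import math
from collections import Counter

def shannon_index(points, cell_size):
    """Shannon index H = -sum(p_i * ln(p_i)) over occupied grid cells.

    points: iterable of (x, y) cell coordinates.
    cell_size: side length of the square grid cells (the resolution).
    p_i is the fraction of all points that fall in grid cell i.
    """
    counts = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Hypothetical cell locations, evaluated at two resolutions.
cells = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
fine = shannon_index(cells, cell_size=1.0)    # four occupied cells
coarse = shannon_index(cells, cell_size=2.0)  # one occupied cell
```

The same point pattern yields a different index value at each grid size, which is the dependence on aggregation scale that the abstract investigates.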


Subject(s)
Liver Cirrhosis , Humans , Liver Cirrhosis/diagnosis , Biopsy , Computer Simulation , Biomarkers
5.
Comput Biol Med ; 165: 107382, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37634463

ABSTRACT

The organization and interaction between hepatocytes and other hepatic non-parenchymal cells play a pivotal role in maintaining normal liver function and structure. Although spatial heterogeneity within the tumor micro-environment has been proven to be a fundamental feature in cancer progression, the role of liver tissue topology and micro-environmental factors in the context of liver damage in chronic infection has not been widely studied yet. We obtained images from 110 core needle biopsies from a cohort of chronic hepatitis B patients with different fibrosis stages according to METAVIR score. The tissue sections were immunofluorescently stained and imaged to determine the locations of CD45 positive immune cells and HBsAg-negative and HBsAg-positive hepatocytes within the tissue. We applied several descriptive techniques adopted from ecology, including Getis-Ord, the Shannon Index and the Morisita-Horn Index, to quantify the extent to which immune cells and different types of liver cells co-localize in the tissue biopsies. Additionally, we modeled the spatial distribution of the different cell types using a joint log-Gaussian Cox process and proposed several features to quantify spatial heterogeneity. We then related these measures to the patient fibrosis stage by using a linear discriminant analysis approach. Our analysis revealed that the co-localization of HBsAg-negative hepatocytes with immune cells and the co-localization of HBsAg-positive hepatocytes with immune cells are equally important factors for explaining the METAVIR score in chronic hepatitis B patients. Moreover, we found that if we allow for an error of 1 on the METAVIR score, we are able to reach an accuracy of around 80%. With this study we demonstrate how methods adopted from ecology and applied to the liver tissue micro-environment can be used to quantify heterogeneity and how these approaches can be valuable in biomarker analyses for liver topology.
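
As an illustration of how co-localization of two cell types might be quantified, the sketch below computes the standard Morisita-Horn overlap index from per-grid-cell counts. How the paper defines its grid cells and which cell-type pairs it compares are not given in the abstract; the function name and the counts are hypothetical.

```python
def morisita_horn(counts_a, counts_b):
    """Morisita-Horn overlap between two count vectors over the same grid cells.

    C = 2 * sum(a_i * b_i) / ((D_a + D_b) * A * B),
    where A and B are the totals and D_x = sum(x_i^2) / X^2.
    Returns 1 for proportionally identical patterns and 0 for no overlap.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("count vectors must cover the same grid cells")
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each cell type needs at least one observation")
    cross = sum(a * b for a, b in zip(counts_a, counts_b))
    d_a = sum(a * a for a in counts_a) / total_a ** 2
    d_b = sum(b * b for b in counts_b) / total_b ** 2
    return 2 * cross / ((d_a + d_b) * total_a * total_b)

# Hypothetical counts of immune cells and hepatocytes in three grid cells.
immune = [3, 1, 0]
hepatocytes = [6, 2, 0]
overlap = morisita_horn(immune, hepatocytes)
```

Because the index depends only on relative abundances, the two hypothetical patterns above overlap fully even though one cell type is twice as numerous.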


Subject(s)
Hepatitis B, Chronic , Humans , Hepatitis B Surface Antigens , Liver/pathology , Hepatocytes/metabolism , Hepatocytes/pathology , Fibrosis , Liver Cirrhosis
6.
Metabolites ; 11(6), 2021 Jun 18.
Article in English | MEDLINE | ID: mdl-34207227

ABSTRACT

Structural modifications of DNA and RNA molecules play a pivotal role in epigenetic and posttranscriptional regulation. To characterise these modifications, more and more MS- and MS/MS-based tools for the analysis of nucleic acids are being developed. To identify an oligonucleotide in a mass spectrum, it is useful to compare the obtained isotope pattern of the molecule of interest to the one that is theoretically expected based on its elemental composition. However, this is not straightforward when the identity of the molecule under investigation is unknown. Here, we present a modelling approach for the prediction of the aggregated isotope distribution of an average DNA or RNA molecule when a particular (monoisotopic) mass is available. For this purpose, a theoretical database of all possible DNA/RNA oligonucleotides up to a mass of 25 kDa is created, and the aggregated isotope distribution for the entire database of oligonucleotides is generated using the BRAIN algorithm. Since this isotope information is compositional in nature, the modelling method is based on the additive log-ratio analysis of Aitchison. As a result, a univariate weighted polynomial regression model of order 10 is fitted to predict the first 20 isotope peaks for DNA and RNA molecules. The performance of the prediction model is assessed by using a mean squared error approach and a modified Pearson's χ² goodness-of-fit measure on experimental data. Our analysis has indicated that the variability in spectral accuracy contributed more to the errors than the approximation of the theoretical isotope distribution by our proposed average DNA/RNA model. The prediction model is implemented as an online tool. An R function can be downloaded to incorporate the method in custom analysis workflows to process mass spectral data.
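
The two evaluation measures named in this abstract can be sketched for a single observed isotope pattern. Note the assumption: the abstract refers to a modified Pearson's χ² measure without defining the modification, so the sketch below uses the plain Pearson statistic on normalized intensities as a stand-in. Function names and numbers are hypothetical.

```python
def normalize(intensities):
    """Scale peak intensities so that they sum to 1."""
    total = sum(intensities)
    return [value / total for value in intensities]

def mean_squared_error(observed, predicted):
    """Average squared difference between two isotope distributions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def pearson_chi_square(observed, predicted):
    """Plain Pearson statistic sum((o - p)^2 / p).

    Assumption: stands in for the paper's modified measure, which the
    abstract does not specify.
    """
    return sum((o - p) ** 2 / p for o, p in zip(observed, predicted))

# Hypothetical observed peak intensities and model-predicted probabilities.
observed = normalize([60.0, 30.0, 10.0])
predicted = [0.5, 0.3, 0.2]
mse = mean_squared_error(observed, predicted)
chi2 = pearson_chi_square(observed, predicted)
```

Computing these measures both between replicate scans and between model and theory is one way to compare measurement variability with model approximation error, which is the comparison the abstract reports.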

7.
J Mass Spectrom ; 55(8): e4471, 2020 Aug.
Article in English | MEDLINE | ID: mdl-31713933

ABSTRACT

There is a trend in the analysis of shotgun proteomics data that aims to combine information from multiple search engines to increase the number of peptide annotations in an experiment. Typically, the degree of search engine complementarity and search engine agreement is visually illustrated by means of Venn diagrams that present the findings of a database search on the level of the nonredundant peptide annotations. We argue that this practice is not fit for purpose, since the diagrams do not take into account, and often conceal, the information on complementarity and agreement at the level of the spectrum identification. We promote a new type of visualization that provides insight on the peptide sequence agreement at the level of the peptide-spectrum match (PSM) as a measure of consensus between two search engines with nominal outcomes. We applied the visualizations and percentage sequence agreement to an in-house data set of our benchmark organism, Caenorhabditis elegans, and illustrated that when assessing the agreement between search engines, one should disentangle the notion of PSM confidence and PSM identity. The visualizations presented in this manuscript provide a more informative assessment of pairs of search engines and are made available as an R function in the Supporting Information.
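
The difference between peptide-level and PSM-level agreement can be illustrated with a minimal sketch of a percentage sequence agreement. It assumes each engine reports one peptide sequence per spectrum and that agreement is counted only over spectra identified by both engines; the paper's exact definition may differ. The function name, spectrum identifiers and sequences are hypothetical.

```python
def psm_sequence_agreement(engine_a, engine_b):
    """Percentage of shared spectra for which both engines report the same peptide.

    engine_a, engine_b: dicts mapping spectrum identifier -> peptide sequence.
    Returns None when the engines share no identified spectra.
    """
    shared = engine_a.keys() & engine_b.keys()
    if not shared:
        return None
    agreeing = sum(1 for spectrum in shared if engine_a[spectrum] == engine_b[spectrum])
    return 100.0 * agreeing / len(shared)

# Hypothetical PSM lists from two search engines.
engine_a = {"scan1": "PEPTIDEK", "scan2": "ACDEFGHK", "scan3": "LMNPQR"}
engine_b = {"scan1": "PEPTIDEK", "scan2": "ACDEFGHR", "scan4": "WWWK"}
agreement = psm_sequence_agreement(engine_a, engine_b)
```

A Venn diagram of nonredundant peptides would show one shared peptide here but would not reveal that the two engines disagree on the sequence assigned to `scan2`.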


Subject(s)
Databases, Protein , Peptides , Proteomics , Peptides/analysis , Peptides/chemistry , Peptides/classification , Proteomics/methods , Proteomics/standards , Search Engine/methods , Search Engine/standards , Tandem Mass Spectrometry