Results 1 - 20 of 413
1.
J Proteome Res ; 23(6): 1983-1999, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38728051

ABSTRACT

In recent years, several deep learning-based methods have been proposed for predicting peptide fragment intensities. This study aims to provide a comprehensive assessment of six such methods, namely Prosit, DeepMass:Prism, pDeep3, AlphaPeptDeep, Prosit Transformer, and the method proposed by Guan et al. To this end, we evaluated the accuracy of the predicted intensity profiles for close to 1.7 million precursors (including both tryptic and HLA peptides) corresponding to more than 18 million experimental spectra procured from 40 independent submissions to the PRIDE repository that were acquired for different species using a variety of instruments and different dissociation types/energies. Specifically, for each method, distributions of similarity (measured by Pearson's correlation and normalized angle) between the predicted and the corresponding experimental b and y fragment intensities were generated. These distributions were used to ascertain the prediction accuracy and rank the prediction methods for particular types of experimental conditions. The effect of variables like precursor charge, length, and collision energy on the prediction accuracy was also investigated. In addition to prediction accuracy, the methods were evaluated in terms of prediction speed. The systematic assessment of these six methods may help in choosing the right method for MS/MS spectra prediction for particular needs.
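The two similarity measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the commonly used definition of the normalized spectral angle, 1 - 2*arccos(cosine similarity)/pi, and the function names `pearson` and `normalized_spectral_angle` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def normalized_spectral_angle(x, y):
    """Assumed definition: 1 - 2*arccos(cosine similarity)/pi.

    Returns 1.0 for identical directions and 0.0 for orthogonal vectors.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi

# Hypothetical predicted vs. experimental b/y fragment intensities.
predicted = [0.10, 0.55, 1.00, 0.05, 0.30]
experimental = [0.12, 0.50, 1.00, 0.00, 0.35]
print(pearson(predicted, experimental))
print(normalized_spectral_angle(predicted, experimental))
```

Both functions expect the predicted and experimental vectors to list the same fragment ions in the same order; neither handles an all-zero or constant vector, which would need a guard in practice.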


Subject(s)
Deep Learning , Humans , Peptide Fragments/chemistry , Peptide Fragments/analysis , Tandem Mass Spectrometry/methods , Tandem Mass Spectrometry/statistics & numerical data , Proteomics/methods , Proteomics/statistics & numerical data
2.
J Proteome Res ; 23(6): 2078-2089, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38666436

ABSTRACT

Data-independent acquisition (DIA) has become a well-established method for MS-based proteomics. However, the list of options to analyze this type of data is quite extensive, and the use of spectral libraries has become an important factor in DIA data analysis. More specifically, the use of in silico predicted libraries is gaining more interest. By working with a differential spike-in of human standard proteins (UPS2) in a constant yeast tryptic digest background, we evaluated the sensitivity, precision, and accuracy of the use of in silico predicted libraries in DIA data analysis workflows compared to more established workflows. Three commonly used DIA software tools, DIA-NN, EncyclopeDIA, and Spectronaut, were each tested in spectral library mode and spectral library-free mode. In spectral library mode, we used independent spectral library prediction tools PROSIT and MS2PIP together with DeepLC, next to classical data-dependent acquisition (DDA)-based spectral libraries. In total, we benchmarked 12 computational workflows for DIA. Our comparison showed that DIA-NN reached the highest sensitivity while maintaining a good compromise on the reproducibility and accuracy levels in either library-free mode or using in silico predicted libraries, pointing to a general benefit in using in silico predicted libraries.


Subject(s)
Computer Simulation , Proteomics , Software , Workflow , Proteomics/methods , Proteomics/statistics & numerical data , Humans , Reproducibility of Results , Data Analysis , Peptide Library
3.
Comput Math Methods Med ; 2022: 4049169, 2022.
Article in English | MEDLINE | ID: mdl-35186113

ABSTRACT

Sport is a type of comprehensive activity that the human body consciously engages in to improve physical fitness. Proteomics is a comprehensive technology dedicated to the study of all protein profiles expressed by a species, individual organ, tissue, or cell under specific conditions and at specific times. It studies the protein composition of cells, tissues, or organisms and how that composition changes, taking the proteome as its research object. Related technologies are now widely used in sports and other fields. The purpose of this article is to study myocardial proteomic technology and its application in sports. The main methods used in this study are a literature survey and a controlled experiment. The literature survey reviewed the results achieved and the problems in this field; 30 SD rats were then divided into 3 groups for the controlled experiments. The results of the study showed that among the three groups of rats, the left ventricular ejection fraction of the sham operation group was the highest, which was 7.7% and 4.6% higher than that of the operation group and the model group, respectively. The operation group had the highest left ventricular short axis shortening rate and the longest left ventricular diastolic inner diameter. It can be seen that myocardial proteomics can accurately reflect the heart condition of rats. In addition, the length, diastolic velocity, and diastolic time of cardiomyocytes of the three groups of rats were different. Among them, the cardiomyocytes of the operation group had the longest time and the longest diastolic time, which were 37.1% and 8.5% higher than those of the sham operation group and the model group, respectively.


Subject(s)
Algorithms , Myocardium/metabolism , Proteomics/statistics & numerical data , Sports/physiology , Animals , Computational Biology , Humans , Rats , Rats, Sprague-Dawley
4.
Nat Biotechnol ; 40(5): 692-702, 2022 05.
Article in English | MEDLINE | ID: mdl-35102292

ABSTRACT

Implementing precision medicine hinges on the integration of omics data, such as proteomics, into the clinical decision-making process, but the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across multiple biomedical databases and publications, pose a challenge to data integration. Here we present the Clinical Knowledge Graph (CKG), an open-source platform currently comprising close to 20 million nodes and 220 million relationships that represent relevant experimental data, public databases and literature. The graph structure provides a flexible data model that is easily extendable to new nodes and relationships as new databases become available. The CKG incorporates statistical and machine learning algorithms that accelerate the analysis and interpretation of typical proteomics workflows. Using a set of proof-of-concept biomarker studies, we show how the CKG might augment and enrich proteomics data and help inform clinical decision-making.


Subject(s)
Knowledge Bases , Precision Medicine/methods , Proteomics , Algorithms , Decision Making, Computer-Assisted , Machine Learning , Pattern Recognition, Automated , Precision Medicine/standards , Proteomics/standards , Proteomics/statistics & numerical data
5.
Sci Rep ; 12(1): 1067, 2022 01 20.
Article in English | MEDLINE | ID: mdl-35058491

ABSTRACT

Missing values are a major issue in quantitative proteomics analysis. While many methods have been developed for imputing missing values in high-throughput proteomics data, a comparative assessment of imputation accuracy remains inconclusive, mainly because mechanisms contributing to true missing values are complex and existing evaluation methodologies are imperfect. Moreover, few studies have provided an outlook of future methodological development. We first re-evaluate the performance of eight representative methods targeting three typical missing mechanisms. These methods are compared on both simulated and masked missing values embedded within real proteomics datasets, and performance is evaluated using three quantitative measures. We then introduce fused regularization matrix factorization, a low-rank global matrix factorization framework, capable of integrating local similarity derived from additional data types. We also explore a biologically-inspired latent variable modeling strategy, convex analysis of mixtures, for missing value imputation and present preliminary experimental results. While some winners emerged from our comparative assessment, the evaluation is intrinsically imperfect because performance is evaluated indirectly on artificial missing or masked values, not authentic missing values. Nevertheless, we show that our fused regularization matrix factorization provides a novel incorporation of external and local information, and the exploratory implementation of convex analysis of mixtures presents a biologically plausible new approach.


Subject(s)
Data Interpretation, Statistical , Proteomics/statistics & numerical data , Algorithms , Proteomics/methods
6.
PLoS Comput Biol ; 17(11): e1009161, 2021 11.
Article in English | MEDLINE | ID: mdl-34762640

ABSTRACT

Network propagation refers to a class of algorithms that integrate information from input data across connected nodes in a given network. These algorithms have wide applications in systems biology, protein function prediction, inferring condition-specifically altered sub-networks, and prioritizing disease genes. Despite the popularity of network propagation, there is a lack of comparative analyses of different algorithms on real data and little guidance on how to select and parameterize the various algorithms. Here, we address this problem by analyzing different combinations of network normalization and propagation methods and by demonstrating schemes for the identification of optimal parameter settings on real proteome and transcriptome data. Our work highlights the risk of a 'topology bias' caused by the incorrect use of network normalization approaches. Capitalizing on the fact that network propagation is a regularization approach, we show that minimizing the bias-variance tradeoff can be utilized for selecting optimal parameters. The application to real multi-omics data demonstrated that optimal parameters could also be obtained by either maximizing the agreement between different omics layers (e.g. proteome and transcriptome) or by maximizing the consistency between biological replicates. Furthermore, we exemplified the utility and robustness of network propagation on multi-omics datasets for identifying ageing-associated genes in brain and liver tissues of rats and for elucidating molecular mechanisms underlying prostate cancer progression. Overall, this work compares different network propagation approaches and it presents strategies for how to use network propagation algorithms to optimally address a specific research question at hand.
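As an illustration of what a network propagation algorithm does, the following is a minimal sketch of random walk with restart on a degree-normalized adjacency matrix. It is only one of several normalization and propagation combinations and is not claimed to be the one the authors recommend. The function names, the `restart` parameter, and the toy network are assumptions made for this example.

```python
def column_normalize(adj):
    """Divide each column by its sum so every column is a probability distribution."""
    n = len(adj)
    degree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / degree[j] if degree[j] else 0.0 for j in range(n)]
            for i in range(n)]

def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stops changing."""
    n = len(adj)
    w = column_normalize(adj)
    total = sum(seeds)
    p0 = [s / total for s in seeds]  # seed scores rescaled to sum to 1
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i]
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Toy path network 0 - 1 - 2 with all input signal on node 0.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
scores = random_walk_with_restart(adjacency, seeds=[1.0, 0.0, 0.0])
print(scores)
```

The `restart` value controls how far the signal spreads: values near 1 keep scores close to the input, and values near 0 let network topology dominate, which is the kind of parameter choice the abstract says needs a principled selection scheme.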


Subject(s)
Algorithms , Computational Biology/methods , Aging/genetics , Aging/metabolism , Animals , Bias , Brain/metabolism , Computational Biology/statistics & numerical data , Data Interpretation, Statistical , Disease Progression , Gene Expression Profiling/statistics & numerical data , Gene Regulatory Networks , Genomics/statistics & numerical data , Humans , Liver/metabolism , Male , Prostatic Neoplasms/etiology , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , Protein Interaction Maps , Proteomics/statistics & numerical data , RNA, Messenger/genetics , RNA, Messenger/metabolism , Rats , Systems Biology
7.
Comput Math Methods Med ; 2021: 5799348, 2021.
Article in English | MEDLINE | ID: mdl-34646335

ABSTRACT

The biological mechanism underlying the pathogenesis of systemic lupus erythematosus (SLE) remains unclear. In this study, we found 21 proteins upregulated and 38 proteins downregulated by SLE relative to normal protein metabolism in our samples using liquid chromatography-mass spectrometry. By PPI network analysis, we identified 9 key proteins of SLE, including AHSG, VWF, IGF1, ORM2, ORM1, SERPINA1, IGF2, IGFBP3, and LEP. In addition, we identified 4569 differentially expressed metabolites in SLE sera, including 1145 reduced metabolites and 3424 induced metabolites. Bioinformatics analysis showed that protein alterations in SLE were associated with modulation of multiple immune pathways, TP53 signaling, and AMPK signaling. In addition, we found altered metabolites associated with valine, leucine, and isoleucine biosynthesis; one carbon pool by folate; tyrosine metabolism; arginine and proline metabolism; glycine, serine, and threonine metabolism; limonene and pinene degradation; tryptophan metabolism; caffeine metabolism; and vitamin B6 metabolism. We also constructed a differentially expressed protein-metabolite network to reveal the interactions among differentially expressed proteins and metabolites in SLE. A total of 481 proteins and 327 metabolites were included in this network. Although the role of altered metabolites and proteins in the diagnosis and therapy of SLE needs to be further investigated, the present study may provide new insights into the role of metabolites in SLE.


Subject(s)
Lupus Erythematosus, Systemic/genetics , Lupus Erythematosus, Systemic/metabolism , Biomarkers/metabolism , Chromatography, Liquid , Computational Biology , Female , Genetic Markers , Humans , Lupus Erythematosus, Systemic/immunology , Male , Mass Spectrometry , Metabolic Networks and Pathways/genetics , Metabolic Networks and Pathways/immunology , Metabolomics/statistics & numerical data , Protein Interaction Maps/genetics , Protein Interaction Maps/immunology , Proteomics/statistics & numerical data
8.
Sci Rep ; 11(1): 18936, 2021 09 23.
Article in English | MEDLINE | ID: mdl-34556748

ABSTRACT

Prostate cancer (PCa) is a heterogeneous group of tumors with variable clinical courses. In order to improve patient outcomes, it is critical to clinically separate aggressive PCa (AG) from non-aggressive PCa (NAG). Although recent genomic studies have identified a spectrum of molecular abnormalities associated with aggressive PCa, it is still challenging to separate AG from NAG. To better understand the functional consequences of PCa progression and the unique features of the AG subtype, we studied the proteomic signatures of primary AG, NAG and metastatic PCa. 39 PCa and 10 benign prostate controls in a discovery cohort and 57 PCa in a validation cohort were analyzed using a data-independent acquisition (DIA) SWATH-MS platform. Proteins with the highest variances (top 500 proteins) were annotated for the pathway enrichment analysis. Functional analysis of differentially expressed proteins in NAG and AG was performed. Data was further validated using a validation cohort; and was also compared with a TCGA mRNA expression dataset and confirmed by immunohistochemistry (IHC) using PCa tissue microarray (TMA). 4,415 proteins were identified in the tumor and benign control tissues, including 158 up-regulated and 116 down-regulated proteins in AG tumors. A functional analysis of tumor-associated proteins revealed reduced expressions of several proteinases, including dipeptidyl peptidase 4 (DPP4), carboxypeptidase E (CPE) and prostate specific antigen (KLK3) in AG and metastatic PCa. A targeted analysis further identified that the reduced expression of DPP4 was associated with the accumulation of DPP4 substrates and the reduced ratio of DPP4 cleaved peptide to intact substrate peptide. Findings were further validated using an independently-collected tumor cohort, correlated with a TCGA mRNA dataset, and confirmed by immunohistochemical stains of PCa tumor microarray (TMA). Our study is the first large-scale proteomics analysis of PCa tissue using a DIA SWATH-MS platform. It provides not only an interrogative proteomic signature of PCa subtypes, but also indicates the critical roles played by certain proteinases during tumor progression. The spectrum map and protein profile generated in the study can be used to investigate potential biological mechanisms involved in PCa and for the development of a clinical assay to distinguish aggressive from indolent PCa.


Subject(s)
Carboxypeptidase H/metabolism , Dipeptidyl Peptidase 4/metabolism , Gene Expression Regulation, Neoplastic , Kallikreins/metabolism , Prostate-Specific Antigen/metabolism , Prostatic Neoplasms/genetics , Datasets as Topic , Follow-Up Studies , Gene Expression Profiling , Humans , Male , Neoplasm Grading , Prostate/pathology , Prostatic Neoplasms/diagnosis , Prostatic Neoplasms/pathology , Proteomics/statistics & numerical data , Tissue Array Analysis
9.
Int J Mol Sci ; 22(17)2021 Sep 06.
Article in English | MEDLINE | ID: mdl-34502557

ABSTRACT

Analysis of differential abundance in proteomics data sets requires careful application of missing value imputation. Missing abundance values widely vary when performing comparisons across different sample treatments. For example, one would expect a consistent rate of "missing at random" (MAR) across batches of samples and varying rates of "missing not at random" (MNAR) depending on the inherent difference in sample treatments within the study. The missing value imputation strategy must thus be selected that best accounts for both MAR and MNAR simultaneously. Several important issues must be considered when deciding the appropriate missing value imputation strategy: (1) when it is appropriate to impute data; (2) how to choose a method that reflects the combinatorial manner of MAR and MNAR that occurs in an experiment. This paper provides an evaluation of missing value imputation strategies used in proteomics and presents a case for the use of hybrid left-censored missing value imputation approaches that can handle the MNAR problem common to proteomics data.
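The idea of treating MAR and MNAR values differently can be illustrated with a small hybrid rule. This is a sketch under stated assumptions, not the strategy evaluated in the paper: a protein missing in more than `mnar_fraction` of the replicates of a condition is treated as left-censored (MNAR) and filled with the lowest observed value, and otherwise as MAR and filled with the mean of its observed replicates. The function name, the threshold, and the example values are hypothetical.

```python
def hybrid_impute(rows, mnar_fraction=0.5):
    """Impute one condition's matrix of log-intensities (None marks a missing value).

    rows: one list per protein, one entry per replicate.
    """
    observed = [v for row in rows for v in row if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    floor = min(observed)  # stand-in for a detection-limit estimate
    imputed = []
    for row in rows:
        present = [v for v in row if v is not None]
        missing_fraction = 1.0 - len(present) / len(row)
        if not present or missing_fraction > mnar_fraction:
            fill = floor                        # mostly missing: assume below detection
        else:
            fill = sum(present) / len(present)  # sporadic gap: assume missing at random
        imputed.append([fill if v is None else v for v in row])
    return imputed

# Three proteins, three replicates of one condition (log2 intensities).
condition = [
    [20.0, None, 22.0],   # one sporadic gap, filled with the row mean
    [None, None, 15.0],   # mostly missing, filled with the global minimum
    [18.0, 19.0, 17.0],   # complete row, left unchanged
]
print(hybrid_impute(condition))
```

Applying the rule per condition rather than across the whole data set keeps a protein that is truly absent in one treatment from being filled with values borrowed from another.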


Subject(s)
Data Accuracy , Databases, Protein/statistics & numerical data , Mass Spectrometry/methods , Proteomics/statistics & numerical data , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Cell Line, Tumor , Glucose/metabolism , Humans , Proteomics/methods , Proteomics/standards
10.
Sci Rep ; 11(1): 17170, 2021 08 26.
Article in English | MEDLINE | ID: mdl-34446747

ABSTRACT

The present study aimed to construct and evaluate a novel experiment-based hypoxia signature to help evaluate GBM patient status. First, the 426 proteins previously found to be differentially expressed with statistical significance between normal and hypoxia groups in glioblastoma cells were converted into the corresponding genes, among which 212 genes were found annotated in TCGA. Second, after evaluation by single-variable Cox analysis, 19 differentially expressed genes (DEGs) with prognostic value were identified. Based on the λ value from LASSO, a gene-based survival risk score model, named RiskScore, was built from 7 genes with LASSO coefficients: FKBP2, GLO1, IGFBP5, NSUN5, RBMX, TAGLN2 and UBE2V2. Kaplan-Meier (K-M) survival curves and the area under the curve (AUC) were plotted to further estimate the efficacy of this risk score model. Furthermore, survival curves were also plotted for subgroups defined by age, IDH, radiotherapy and chemotherapy. Meanwhile, immune infiltration, GSVA, GSEA and chemo drug sensitivity of this risk score model were evaluated. Third, the expression of the 7 genes was evaluated by AUC, overall survival (OS) and IDH subtype in datasets and, importantly, was also experimentally verified in GBM cell lines exposed to hypoxic or normal oxygen conditions, showing significantly higher expression in the hypoxia group than in the normal group. Last, combining the hypoxia RiskScore with clinical and molecular features, a prognostic composite nomogram was generated, showing good sensitivity and specificity by AUC and OS. Meanwhile, univariate and multivariate analyses were performed to identify variables in the nomogram that were significant in independently predicting duration of survival. This is the first time that we have successfully established and validated an independent prognostic risk model based on the hypoxia microenvironment from glioblastoma cells and a public database. The 7 key genes may provide potential directions for future biochemical and pharmaco-therapeutic research.


Subject(s)
Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Glioblastoma/genetics , Proteome/metabolism , Proteomics/methods , Tumor Microenvironment/genetics , Aged , Cell Line, Tumor , Databases, Factual/statistics & numerical data , Female , Gene Expression Profiling/statistics & numerical data , Glioblastoma/diagnosis , Glioblastoma/metabolism , Humans , Hypoxia , Kaplan-Meier Estimate , Male , Middle Aged , Multivariate Analysis , Nomograms , Pharmacogenetics/methods , Pharmacogenetics/statistics & numerical data , Prognosis , Proteome/genetics , Proteomics/statistics & numerical data
11.
J Hepatol ; 75(6): 1377-1386, 2021 12.
Article in English | MEDLINE | ID: mdl-34329660

ABSTRACT

BACKGROUND & AIMS: The microenvironment of intrahepatic cholangiocarcinoma (iCCA) is hypovascularized, with an extensive lymphatic network. This leads to rapid cancer spread into regional lymph nodes and the liver parenchyma, precluding curative treatments. Herein, we investigated which factors released in the iCCA stroma drive the inhibition of angiogenesis and promote lymphangiogenesis. METHODS: Quantitative proteomics was performed on extracellular fluid (ECF) proteins extracted both from cancerous and non-cancerous tissues (NCT) of patients with iCCA. Computational biology was applied on a proteomic dataset to identify proteins involved in the regulation of vessel formation. Endothelial cells incubated with ECF from either iCCA or NCT specimens were used to assess the role of candidate proteins in 3D vascular assembly, cell migration, proliferation and viability. Angiogenesis and lymphangiogenesis were further investigated in vivo by a heterotopic transplantation of bone marrow stromal cells, along with endothelial cells in SCID/beige mice. RESULTS: Functional analysis of upregulated proteins in iCCA unveils a soluble angio-inhibitory milieu made up of thrombospondin (THBS)1, THBS2 and pigment epithelium-derived factor (PEDF). iCCA ECF was able to inhibit in vitro vessel morphogenesis and viability. Antibodies blocking THBS1, THBS2 and PEDF restored tube formation and endothelial cell viability to levels observed in NCT ECF. Moreover, in transplanted mice, the inhibition of blood vessel formation, the de novo generation of the lymphatic network and the dissemination of iCCA cells in lymph nodes were shown to depend on THBS1, THBS2 and PEDF expression. CONCLUSIONS: THBS1, THBS2 and PEDF reduce blood vessel formation and promote tumor-associated lymphangiogenesis in iCCA. Our results identify new potential targets for interventions to counteract the dissemination process in iCCA. 
LAY SUMMARY: Intrahepatic cholangiocarcinoma is a highly aggressive cancer arising from epithelial cells lining the biliary tree, characterized by dissemination into the liver parenchyma via lymphatic vessels. Herein, we show that the proteins THBS1, THBS2 and PEDF, once released in the tumor microenvironment, inhibit vascular growth, while promoting cancer-associated lymphangiogenesis. Therefore, targeting THBS1, THBS2 and PEDF may be a promising strategy to reduce cancer-associated lymphangiogenesis and counteract the invasiveness of intrahepatic cholangiocarcinoma.


Subject(s)
Angiogenesis Inducing Agents/metabolism , Cholangiocarcinoma/etiology , Lymphangiogenesis/drug effects , Thrombospondin 1/pharmacology , Thrombospondins/pharmacology , Angiogenesis Inhibitors/pharmacology , Angiogenesis Inhibitors/therapeutic use , Animals , Cholangiocarcinoma/physiopathology , Disease Models, Animal , Mice , Proteomics/methods , Proteomics/statistics & numerical data , Thrombospondin 1/administration & dosage , Thrombospondins/administration & dosage , Tumor Microenvironment/drug effects
12.
Clin Epigenetics ; 13(1): 145, 2021 07 28.
Article in English | MEDLINE | ID: mdl-34315505

ABSTRACT

BACKGROUND: Increasing evidence linking epigenetic mechanisms and different diseases, including cancer, has prompted in the last 15 years the investigation of histone post-translational modifications (PTMs) in clinical samples. Methods allowing the isolation of histones from patient samples followed by the accurate and comprehensive quantification of their PTMs by mass spectrometry (MS) have been developed. However, the applicability of these methods is limited by the requirement for substantial amounts of material. RESULTS: To address this issue, in this study we streamlined the protein extraction procedure from low-amount clinical samples and tested and implemented different in-gel digestion strategies, obtaining a protocol that allows the MS-based analysis of the most common histone PTMs from laser microdissected tissue areas containing as low as 1000 cells, an amount approximately 500 times lower than what is required by available methods. We then applied this protocol to breast cancer patient laser microdissected tissues in two proof-of-concept experiments, identifying differences in histone marks in heterogeneous regions selected by either morphological evaluation or MALDI MS imaging. CONCLUSIONS: These results demonstrate that analyzing histone PTMs from very small tissue areas and detecting differences from adjacent tumor regions is technically feasible. Our method opens the way for spatial epi-proteomics, namely the investigation of epigenetic features in the context of tissue and tumor heterogeneity, which will be instrumental for the identification of novel epigenetic biomarkers and aberrant epigenetic mechanisms.


Subject(s)
Histones/drug effects , Protein Processing, Post-Translational/genetics , Cell Line, Tumor/drug effects , DNA Methylation , Histones/genetics , Humans , Proteomics/methods , Proteomics/statistics & numerical data
13.
Methods Mol Biol ; 2228: 1-20, 2021.
Article in English | MEDLINE | ID: mdl-33950479

ABSTRACT

Mass spectrometry is frequently used in quantitative proteomics to detect differentially regulated proteins. A very important but unfortunately oftentimes neglected part in detecting differential proteins is the statistical analysis. Data from proteomics experiments are usually high-dimensional and hence require profound statistical methods. It is especially important to already correctly design a proteomic experiment before it is conducted in the laboratory. Only this can ensure that the statistical analysis is capable of detecting truly differential proteins afterward. This chapter thus covers aspects of both statistical planning as well as the actual analysis of quantitative proteomic experiments.


Subject(s)
Mass Spectrometry/statistics & numerical data , Proteins/analysis , Proteome , Proteomics/statistics & numerical data , Research Design/statistics & numerical data , Animals , Data Interpretation, Statistical , Humans , Models, Statistical
14.
Methods Mol Biol ; 2228: 409-417, 2021.
Article in English | MEDLINE | ID: mdl-33950506

ABSTRACT

In mass spectrometry-based proteomics, relative quantitative approaches enable differential protein abundance analysis. Isobaric labeling strategies, such as tandem mass tags (TMT), provide simultaneous quantification of several samples (e.g., up to 16 using 16plex TMTpro) owing to its multiplexing capability. This technology improves sample throughput and thereby minimizes both measurement time and overall experimental variation. However, TMT-based MS data processing and statistical analysis are probably the crucial parts of this pipeline to obtain reliable, plausible, and significantly quantified results. Here, we provide a step-by-step guide to the analysis and evaluation of TMT quantitative proteomics data.


Subject(s)
Proteins/analysis , Proteome , Proteomics , Tandem Mass Spectrometry , Animals , Chromatography, High Pressure Liquid , Data Interpretation, Statistical , Humans , Proteomics/statistics & numerical data , Research Design , Tandem Mass Spectrometry/statistics & numerical data
15.
Methods Mol Biol ; 2228: 433-451, 2021.
Article in English | MEDLINE | ID: mdl-33950508

ABSTRACT

Data clustering facilitates the identification of biologically relevant molecular features in quantitative proteomics experiments with thousands of measurements over multiple conditions. It finds groups of proteins or peptides with similar quantitative behavior across multiple experimental conditions. This co-regulatory behavior suggests that the proteins of such a group share their functional behavior and thus often can be mapped to the same biological processes and molecular subnetworks. While usual clustering approaches dismiss the variance of the measured proteins, VSClust combines statistical testing with pattern recognition into a common algorithm. Here, we show how to use the VSClust web service on a large proteomics data set and present further tools to assess the quantitative behavior of protein complexes.


Subject(s)
Breast Neoplasms/metabolism , Neoplasm Proteins/analysis , Proteome , Proteomics , Cluster Analysis , Data Interpretation, Statistical , Databases, Protein , Female , Humans , Multiprotein Complexes , Protein Binding , Proteomics/statistics & numerical data , Research Design , Software
16.
Biochemistry (Mosc) ; 86(3): 338-349, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33838633

ABSTRACT

One of the main goals of quantitative proteomics is molecular profiling of cellular response to stress at the protein level. To perform this profiling, statistical analysis of experimental data involves multiple testing of a hypothesis about the equality of protein concentrations between the cells under normal and stress conditions. This analysis is then associated with the multiple testing problem dealing with the increased chance of obtaining false positive results. A number of solutions to this problem are known, yet they may lead to the loss of potentially important biological information when applied with commonly accepted thresholds of statistical significance. Using the proteomic data obtained earlier for the yeast samples containing proteins at known concentrations and the biological models of early and late cellular responses to stress, we analyzed dependences of distributions of false positive and false negative rates on the protein fold changes and thresholds of statistical significance. Based on the analysis of the density of data points in the volcano plots, the Benjamini-Hochberg method, and gene ontology analysis, a visual approach for optimization of the statistical threshold and selection of the differentially regulated proteins has been suggested, which could be useful for researchers working in the field of quantitative proteomics.
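For readers unfamiliar with the Benjamini-Hochberg correction mentioned in this abstract, the following is a minimal sketch of the standard step-up adjustment. It does not reproduce the visual threshold-selection approach the authors propose, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical per-protein p-values from a differential abundance test.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value <= 0.05]
print(q, significant)
```

A protein is then called differentially regulated when its adjusted value falls at or below the chosen false discovery rate, and that cutoff is the threshold the abstract argues should not be fixed by convention alone.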


Subject(s)
Astrocytes/physiology , Proteomics/standards , Saccharomyces cerevisiae/physiology , Stress, Physiological , Astrocytes/metabolism , False Positive Reactions , Humans , Proteomics/statistics & numerical data , Saccharomyces cerevisiae/metabolism
17.
Sci Rep ; 11(1): 2932, 2021 02 03.
Article in English | MEDLINE | ID: mdl-33536534

ABSTRACT

Chronic lymphocytic leukaemia (CLL) exhibits variable clinical course and response to therapy, but the molecular basis of this variability remains incompletely understood. Data independent acquisition (DIA)-MS technologies, such as SWATH (Sequential Windowed Acquisition of all THeoretical fragments), provide an opportunity to study the pathophysiology of CLL at the proteome level. Here, a CLL-specific spectral library (7736 proteins) is described alongside an analysis of sample replication and data handling requirements for quantitative SWATH-MS analysis of clinical samples. The analysis was performed on 6 CLL samples, incorporating biological (IGHV mutational status), sample preparation and MS technical replicates. Quantitative information was obtained for 5169 proteins across 54 SWATH-MS acquisitions: the sources of variation and different computational approaches for batch correction were assessed. Functional enrichment analysis of proteins associated with IGHV mutational status showed significant overlap with previous studies based on gene expression profiling. Finally, an approach to perform statistical power analysis in proteomics studies was implemented. This study provides a valuable resource for researchers working on the proteomics of CLL. It also establishes a sound framework for the design of sufficiently powered clinical proteomics studies. Indeed, this study shows that it is possible to derive biologically plausible hypotheses from a relatively small dataset.


Subject(s)
Biological Variation, Population/genetics , Genetic Heterogeneity , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Proteomics/statistics & numerical data , Aged , Datasets as Topic , Female , Gene Expression Profiling , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Male , Middle Aged , Mutation , Proteome , Receptors, Antigen, B-Cell/genetics , Tandem Mass Spectrometry
18.
PLoS Comput Biol ; 17(2): e1008101, 2021 02.
Article in English | MEDLINE | ID: mdl-33617527

ABSTRACT

Proteases are an important class of enzymes whose activity is central to many physiologic and pathologic processes. Detailed knowledge of protease specificity is key to understanding their function. Although many methods have been developed to profile specificities of proteases, few have the diversity and quantitative grasp necessary to fully define the specificity of a protease, both in terms of substrate numbers and their catalytic efficiencies. We have developed the concept of a "selectome": the set of substrate amino acid sequences that uniquely represent the specificity of a protease. We applied it to two closely related members of the Matrixin family, MMP-2 and MMP-9, using substrate phage display coupled with Next Generation Sequencing and information theory-based data analysis. We have also derived a quantitative measure of substrate specificity, which accounts for both the number of substrates and their relative catalytic efficiencies. Using these advances greatly facilitates elucidation of substrate selectivity between closely related members of a protease family. The study also provides insight into the degree to which the catalytic cleft defines substrate recognition, thus providing a basis for overcoming two of the major challenges in the field of proteolysis: 1) development of highly selective activity probes for studying proteases with overlapping specificities, and 2) distinguishing targeted proteolysis from bystander proteolytic events.


Subject(s)
Models, Biological , Peptide Hydrolases/genetics , Peptide Hydrolases/metabolism , Amino Acid Sequence , Catalytic Domain/genetics , Computational Biology , High-Throughput Nucleotide Sequencing , Information Theory , Matrix Metalloproteinase 2/chemistry , Matrix Metalloproteinase 2/genetics , Matrix Metalloproteinase 2/metabolism , Matrix Metalloproteinase 9/chemistry , Matrix Metalloproteinase 9/genetics , Matrix Metalloproteinase 9/metabolism , Models, Molecular , Peptide Hydrolases/classification , Peptide Library , Protein Folding , Proteolysis , Proteomics/methods , Proteomics/statistics & numerical data , Substrate Specificity/genetics , Substrate Specificity/physiology
19.
J Proteome Res ; 20(3): 1457-1463, 2021 03 05.
Article in English | MEDLINE | ID: mdl-33617253

ABSTRACT

Since the outset of COVID-19, the pandemic has prompted immediate global efforts to sequence SARS-CoV-2, and over 450 000 complete genomes have been publicly deposited over the course of 12 months. Despite this, comparative nucleotide and amino acid sequence analyses often fall short in answering key questions in vaccine design. For example, the binding affinity between different ACE2 receptors and the SARS-CoV-2 spike protein cannot be fully explained by amino acid similarity at ACE2 contact sites because protein structure similarities are not fully reflected by amino acid sequence similarities. To comprehensively compare protein homology, secondary structure (SS) analysis is required. While protein structure is slow and difficult to obtain, SS predictions can be made rapidly, and a well-predicted SS may serve as a viable proxy to gain biological insight. Here we review algorithms and information used in predicting protein SS to highlight its potential application in pandemics research. We also show examples of how SS predictions can be used to compare ACE2 proteins and to evaluate the zoonotic origins of viruses. As computational tools are much faster than wet-lab experiments, these applications can be important for research, especially in times when quickly obtained biological insights can help speed up the response to pandemics.


Subject(s)
COVID-19/virology , SARS-CoV-2/chemistry , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/genetics , Algorithms , Angiotensin-Converting Enzyme 2/chemistry , Angiotensin-Converting Enzyme 2/genetics , Animals , COVID-19/genetics , Genome, Viral , Host Microbial Interactions/genetics , Humans , Models, Molecular , Pandemics , Protein Interaction Domains and Motifs , Protein Structure, Secondary , Proteomics/statistics & numerical data , Receptors, Virus/chemistry , Receptors, Virus/genetics , SARS-CoV-2/pathogenicity , Sequence Alignment
20.
J Proteome Res ; 20(3): 1464-1475, 2021 03 05.
Article in English | MEDLINE | ID: mdl-33605735

ABSTRACT

The SARS-CoV-2 virus is the causative agent of the 2020 pandemic leading to the COVID-19 respiratory disease. With many scientific and humanitarian efforts ongoing to develop diagnostic tests, vaccines, and treatments for COVID-19, and to prevent the spread of SARS-CoV-2, mass spectrometry research, including proteomics, is playing a role in determining the biology of this viral infection. Proteomics studies are starting to lead to an understanding of the roles of viral and host proteins during SARS-CoV-2 infection, their protein-protein interactions, and post-translational modifications. This is beginning to provide insights into potential therapeutic targets or diagnostic strategies that can be used to reduce the long-term burden of the pandemic. However, the extraordinary situation caused by the global pandemic is also highlighting the need to improve mass spectrometry data and workflow sharing. We therefore describe freely available data and computational resources that can facilitate and assist the mass spectrometry-based analysis of SARS-CoV-2. We exemplify this by reanalyzing a virus-host interactome data set to detect protein-protein interactions and identify host proteins that could potentially be used as targets for drug repurposing.


Subject(s)
COVID-19/virology , Information Dissemination/methods , Mass Spectrometry/methods , SARS-CoV-2/chemistry , COVID-19/epidemiology , COVID-19 Testing/methods , COVID-19 Testing/statistics & numerical data , Computational Biology , Databases, Protein/statistics & numerical data , Drug Repositioning , Host Microbial Interactions/physiology , Humans , Mass Spectrometry/statistics & numerical data , Pandemics , Protein Interaction Domains and Motifs , Protein Interaction Maps , Protein Processing, Post-Translational , Proteomics/methods , Proteomics/statistics & numerical data , SARS-CoV-2/pathogenicity , SARS-CoV-2/physiology , Viral Proteins/chemistry , Viral Proteins/physiology , COVID-19 Drug Treatment