Results 1 - 11 of 11
1.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35189633

ABSTRACT

Unrestrained cellular growth and immune escape of a tumor are associated with the incidental errors of the genome and transcriptome. Advances in next-generation sequencing have identified thousands of genomic and transcriptomic aberrations that generate variant peptides that assemble the hidden proteome, further expanding the immunopeptidome. Emerging next-generation sequencing technologies and a number of computational methods that estimate the abundance of immune infiltration from bulk transcriptomes have advanced our understanding of tumor microenvironments. Here, we characterize several major types of tumor-specific antigens arising from single-nucleotide variants, insertions and deletions, gene fusion, alternative splicing, RNA editing and non-coding RNAs. Finally, we summarize the current state-of-the-art computational and experimental approaches and resources and provide an integrative pipeline for the identification of candidate tumor antigens. Together, the systematic investigation of the hidden proteome in cancer will help facilitate the development of effective and durable immunotherapy targets for cancer.


Subject(s)
Neoplasms , Proteome , Antigens, Neoplasm/genetics , Genomics , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/genetics , Proteome/genetics , Transcriptome , Tumor Microenvironment
2.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35368072

ABSTRACT

Liquid chromatography-mass spectrometry-based quantitative proteomics can measure the expression of thousands of proteins from biological samples and has been increasingly applied in cancer research. Identifying differentially expressed proteins (DEPs) between tumors and normal controls is commonly used to investigate carcinogenesis mechanisms. While differential expression analysis (DEA) at an individual level is desired to identify patient-specific molecular defects for better patient stratification, most statistical DEP analysis methods only identify deregulated proteins at the population level. To date, robust individualized DEA algorithms have been proposed for ribonucleic acid data, but their performance on proteomics data is underexplored. Herein, we performed a systematic evaluation of five individualized DEA algorithms for proteins on cancer proteomic datasets from seven cancer types. Results show that the within-sample relative expression orderings (REOs) of protein pairs in normal tissues were highly stable, providing the basis for individualized DEA for proteins using REOs. Moreover, individualized DEA algorithms achieve higher precision in detecting sample-specific deregulated proteins than population-level methods. To facilitate the utilization of individualized DEA algorithms in proteomics for prognostic biomarker discovery and personalized medicine, we provide Individualized DEP Analysis (IDEPA-XMBD; XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China; https://github.com/xmuyulab/IDEPA-XMBD), a user-friendly and open-source Python toolkit that integrates individualized DEA algorithms for DEP-associated deregulation pattern recognition.


Subject(s)
Neoplasms , Proteome , Humans , Mass Spectrometry/methods , Neoplasms/genetics , Proteome/analysis , Proteomics/methods , Software
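
The REO idea in the abstract above can be pictured with a small sketch. It is not the IDEPA-XMBD API and not any of the five evaluated algorithms; the function names, the `min_fraction` threshold, and the toy abundances are all assumptions made for illustration. A protein pair counts as "stable" when one protein is more abundant than the other in nearly all normal samples, and a tumor sample is then checked for pairs whose ordering is reversed.

```python
from itertools import combinations


def stable_pairs(normal_samples, min_fraction=0.95):
    """Return ordered pairs (hi, lo) where hi > lo in at least
    min_fraction of the normal samples (threshold is an assumption)."""
    proteins = sorted(normal_samples[0])
    n = len(normal_samples)
    stable = []
    for a, b in combinations(proteins, 2):
        a_higher = sum(1 for s in normal_samples if s[a] > s[b])
        b_higher = sum(1 for s in normal_samples if s[b] > s[a])
        if a_higher / n >= min_fraction:
            stable.append((a, b))
        elif b_higher / n >= min_fraction:
            stable.append((b, a))
    return stable


def reversed_pairs(sample, stable):
    """List the stable pairs whose ordering is flipped in one sample."""
    return [(hi, lo) for hi, lo in stable if sample[hi] < sample[lo]]


# Hypothetical abundances for three proteins in three normal samples.
normals = [
    {"P1": 9.0, "P2": 5.0, "P3": 2.0},
    {"P1": 8.5, "P2": 5.5, "P3": 2.5},
    {"P1": 9.2, "P2": 4.8, "P3": 1.9},
]
# One hypothetical tumor sample in which P1 has dropped below P2.
tumor = {"P1": 4.0, "P2": 5.1, "P3": 2.2}

stable = stable_pairs(normals)
flipped = reversed_pairs(tumor, stable)
print(flipped)
```

Because only within-sample orderings are compared, the check needs no normalization across samples, which is one reason such orderings are attractive for single-sample analysis. Deciding which protein in a reversed pair is actually deregulated requires further statistics that this sketch leaves out.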
3.
BMC Cancer ; 23(1): 412, 2023 May 08.
Article in English | MEDLINE | ID: mdl-37158852

ABSTRACT

Papillary thyroid cancer (PTC) is the most frequent subtype of thyroid cancer, but 20% of cases are indeterminate (i.e., cannot be accurately diagnosed) based on preoperative cytology, which might lead to surgical removal of a normal thyroid gland. To address this concern, we performed an in-depth analysis of the serum proteomes of 26 PTC patients and 23 healthy controls using antibody microarrays and data-independent acquisition mass spectrometry (DIA-MS). We identified a total of 1091 serum proteins spanning 10-12 orders of magnitude. A total of 166 differentially expressed proteins were identified that participate in complement activation, coagulation cascades, and platelet degranulation pathways. Furthermore, the analysis of serum proteomes before and after surgery indicated that the expression of proteins such as lactate dehydrogenase A and olfactory receptor family 52 subfamily B member 4, which participate in fibrin clot formation and extracellular matrix-receptor interaction pathways, was changed. Further analysis of the proteomes of PTC and neighboring tissues revealed integrin-mediated pathways with possible crosstalk between the tissue and circulating compartments. Among these crosstalk proteins, circulating fibronectin 1 (FN1), gelsolin (GSN) and UDP-glucose 4-epimerase (GALE) were indicated as promising biomarkers for PTC identification and validated in an independent cohort. In differentiating between patients with benign nodules or PTC, FN1 produced the best ELISA result (sensitivity = 96.89%, specificity = 91.67%). Overall, our results present proteomic landscapes of PTC before and after surgery as well as the crosstalk between tissue and the circulatory system, which is valuable for understanding PTC pathology and improving PTC diagnostics in the future.


Subject(s)
Fibronectins , Thyroid Neoplasms , Humans , Thyroid Cancer, Papillary/diagnosis , Proteome , Proteomics , Thyroid Neoplasms/diagnosis , Thyroid Neoplasms/surgery , Biomarkers
4.
Entropy (Basel) ; 25(7)2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37509950

ABSTRACT

Feature selection plays an important role in improving the performance of classification or reducing the dimensionality of high-dimensional datasets, such as high-throughput genomics/proteomics data in bioinformatics. As a popular approach with computational efficiency and scalability, information theory has been widely incorporated into feature selection. In this study, we propose a unique weight-based feature selection (WBFS) algorithm that assesses selected features and candidate features to identify the key protein biomarkers for classifying lung cancer subtypes from The Cancer Proteome Atlas (TCPA) database. We further performed survival analysis relating the selected biomarkers to lung cancer subtypes. Results show good performance of the combination of our WBFS method and Bayesian network for mining potential biomarkers. These candidate signatures have valuable biological significance in tumor classification and patient survival analysis. Taken together, this study proposes the WBFS method that helps to explore candidate biomarkers from biomedical datasets and provides useful information for tumor diagnosis or therapy strategies.
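
The abstract does not spell out the WBFS weighting, so the sketch below shows only the generic information-theoretic building block that such methods rest on: scoring a discretized feature by its mutual information with the class label. The function name, the feature names, and the toy data are assumptions for illustration, not part of WBFS.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Hypothetical subtype labels and two discretized protein features.
labels = ["A", "A", "B", "B"]
features = {
    "protein_1": ["hi", "hi", "lo", "lo"],  # tracks the label exactly
    "protein_2": ["hi", "lo", "hi", "lo"],  # independent of the label
}

scores = {name: mutual_information(vals, labels) for name, vals in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

A relevance-only ranking like this ignores redundancy between features. Methods that assess candidate features against already selected ones, as the abstract describes, add terms for that, which this sketch does not attempt.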

5.
Adv Exp Med Biol ; 1188: 113-147, 2019.
Article in English | MEDLINE | ID: mdl-31820386

ABSTRACT

Reverse phase protein array (RPPA) is a functional proteomics technology amenable to moderately high throughputs of samples and antibodies. The University of Texas MD Anderson Cancer Center RPPA Core Facility has implemented various processes and techniques to maximize RPPA throughput; key among them are maximizing array configuration and relying on database management and automation. One major tool used by the RPPA Core is a semi-automated RPPA process management system referred to as the RPPA Pipeline. The RPPA Pipeline, developed with the aid of MD Anderson's Department of Bioinformatics and Computational Biology and InSilico Solutions, has streamlined sample and antibody tracking as well as advanced quality control measures of various RPPA processes. This chapter covers RPPA Core processes associated with the RPPA Pipeline workflow, from sample receipt to sample printing to slide staining and RPPA report generation, that enable the RPPA Core to process at least 13,000 samples per year with approximately 450 individual RPPA-quality antibodies. Additionally, this chapter will cover results of large-scale clinical sample processing, including The Cancer Genome Atlas Project and The Cancer Proteome Atlas.


Subject(s)
Protein Array Analysis , Proteomics , Clinical Studies as Topic , Humans , Proteome , Proteomics/instrumentation , Proteomics/methods , Proteomics/trends , Quality Control
6.
Clin Proteomics ; 15: 4, 2018.
Article in English | MEDLINE | ID: mdl-29416445

ABSTRACT

The Human Cancer Proteome Project (Cancer-HPP) is an international initiative organized by HUPO whose key objective is to decipher the human cancer proteome through a coordinated effort by cancer proteome researchers around the world. The ultimate goal is to map the entire human cancer proteome to disclose tumor biology and drive improved diagnostics, treatment and management of cancer. Here we report the progress in the cancer proteomics field to date, and discuss future proteomic developments that will be needed to optimally delineate cancer phenotypes and advance the molecular characterization of this significant disease that is one of the leading causes of death worldwide.

7.
J Proteome Res ; 16(5): 1936-1943, 2017 05 05.
Article in English | MEDLINE | ID: mdl-28317375

ABSTRACT

Proteogenomic studies aiming at identification of variant peptides using customized database searches of mass spectrometry data are facing a dilemma of selecting the most efficient database search strategy: A choice has to be made between using combined or sequential searches against reference (wild-type) and mutant protein databases or directly against the mutant database without the wild-type one. Here we called these approaches "all-together", "one-by-one", and "direct", respectively. We share the results of the comparison of these search strategies obtained for large data sets of publicly available proteogenomic data. On the basis of the results of this evaluation, we found that the "all-together" strategy provided, in general, more variant peptide identifications compared with the "one-by-one" approach, while showing similar performance for some specific cases. To validate further the results of this study, we performed a control comparison of the strategies in question using publicly available data for a mixture of the annotated human protein standard UPS1 and E. coli. For these data, both "all-together" and "one-by-one" approaches showed similar sensitivity and specificity of the searches, while the "direct" approach resulted in an increased number of false identifications.


Subject(s)
Databases, Protein , Proteogenomics/methods , Databases, Factual , Escherichia coli Proteins , Humans , Mass Spectrometry , Mutant Proteins , Peptides/genetics , Proteogenomics/standards , Sensitivity and Specificity
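
The three search strategies compared in the abstract above can be sketched with a toy model. Everything here is assumed for illustration: the peptide sequences, the match scores, the `min_score` cutoff, and the reading of "one-by-one" as searching the reference database first and then sending unassigned spectra to the mutant database. Real search engines and false discovery rate control are not modelled.

```python
def search(scores, database, min_score=0.5):
    """Assign each spectrum its best-scoring peptide within `database`,
    keeping the hit only if it reaches min_score (assumed cutoff)."""
    hits = {}
    for spectrum, candidates in scores.items():
        in_db = {p: s for p, s in candidates.items() if p in database}
        if in_db:
            best = max(in_db, key=in_db.get)
            if in_db[best] >= min_score:
                hits[spectrum] = best
    return hits


def all_together(scores, reference, mutant):
    """Single search against the combined reference + mutant database."""
    return search(scores, reference | mutant)


def one_by_one(scores, reference, mutant):
    """Reference search first; unassigned spectra then go to the mutant database."""
    ref_hits = search(scores, reference)
    rest = {s: c for s, c in scores.items() if s not in ref_hits}
    return {**ref_hits, **search(rest, mutant)}


def direct(scores, mutant):
    """Search the mutant database only."""
    return search(scores, mutant)


# Hypothetical databases: one wild-type peptide and its single-residue variant.
reference = {"PEPTIDEK"}
mutant = {"PEPTIDER"}
# Hypothetical match scores; s1 is really wild-type, s2 really the variant.
scores = {
    "s1": {"PEPTIDEK": 0.9, "PEPTIDER": 0.6},
    "s2": {"PEPTIDEK": 0.4, "PEPTIDER": 0.8},
}

combined = all_together(scores, reference, mutant)
sequential = one_by_one(scores, reference, mutant)
mutant_only = direct(scores, mutant)
print(combined, sequential, mutant_only)
```

In this toy case the combined and sequential searches agree, while the mutant-only search assigns the wild-type spectrum `s1` to the variant peptide because its true match is absent from the database. That is one plausible way to picture the extra false identifications the abstract reports for the "direct" approach.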
8.
J Proteomics ; 280: 104895, 2023 05 30.
Article in English | MEDLINE | ID: mdl-37024076

ABSTRACT

The Cancer Proteome Atlas (TCPA) project collects reverse-phase protein array (RPPA)-based proteome datasets from nearly 8000 samples across 32 cancer types. This study aims to investigate the pan-cancer proteome signature and identify cancer subtypes of glioma, kidney cancer, and lung cancer based on TCPA data. We first visualized the tumor clustering models using t-distributed stochastic neighbour embedding (t-SNE) and bi-clustering heatmap. Then, three feature selection methods (pyHSICLasso, XGBoost, and Random Forest) were applied to select protein features for classifying cancer subtypes in the training dataset, and the LibSVM algorithm was employed to test classification accuracy in the validation dataset. Clustering analysis revealed that different kinds of tumors have relatively distinct proteomic profiles based on tissue of origin. We identified 20, 10, and 20 protein features with the highest accuracies in classifying subtypes of glioma, kidney cancer, and lung cancer, respectively. The predictive abilities of the selected proteins were confirmed by receiver operating characteristic (ROC) analysis. Finally, the Bayesian network was utilized to explore the protein biomarkers that have direct causal relationships with cancer subtypes. Overall, we highlight the theoretical and technical applications of machine learning-based feature selection approaches in the analysis of high-throughput biological data, particularly for cancer biomarker research.
SIGNIFICANCE: Functional proteomics is a powerful approach for characterizing cell signaling pathways and understanding their phenotypic effects on cancer development. The TCPA database provides a platform to explore and analyze TCGA pan-cancer RPPA-based protein expression. With the advent of the RPPA technology, the availability of high-throughput data in the TCPA platform has made it possible to use machine learning methods to identify protein biomarkers and further differentiate subtypes of cancer based on proteomic data. In this study, we highlight the role of feature selection and Bayesian network in discovering protein biomarkers for classifying cancer subtypes based on functional proteomic data. The application of machine learning methods to the analysis of high-throughput biological data, particularly in cancer biomarker research, has potential clinical value in developing individualized treatment strategies.


Subject(s)
Carcinoma, Renal Cell , Glioma , Kidney Neoplasms , Lung Neoplasms , Humans , Proteomics/methods , Proteome/metabolism , Bayes Theorem , Biomarkers, Tumor/metabolism
9.
Aging (Albany NY) ; 12(19): 19740-19755, 2020 Oct 13.
Article in English | MEDLINE | ID: mdl-33049713

ABSTRACT

Currently no reliable indicators are available for predicting the clinical outcome of head and neck squamous cell carcinoma (HNSCC). This study aimed to develop a protein-based model to improve the prognosis prediction of HNSCC. The proteome data of the HNSCC cohort were downloaded from The Cancer Proteome Atlas (TCPA) portal. The TCPA HNSCC cohort was randomly divided into discovery and validation cohorts. A protein-based risk signature was developed with the discovery cohort, and then verified with the validation cohort. The prognostic value of HER3_pY1289 was further determined. We constructed a five-protein risk signature which was strongly associated with overall survival (OS) in the discovery cohort. Similar findings were observed in the validation cohort. The protein-based risk signature was identified as an independent prognostic factor for HNSCC. A nomogram model built on the protein-based risk signature exhibited good performance for predicting OS. Our immunohistochemistry (IHC) analysis showed that higher HER3_pY1289 staining intensity was closely associated with unfavorable prognosis of HNSCC. HER3 suppression inhibited the proliferation and invasion capacity of HNSCC cells. Collectively, we have developed a protein-based risk signature for accurately predicting the prognosis of HNSCC, which might provide valuable information for optimal individualized treatment regimens.

10.
Oncotarget ; 9(10): 9400-9414, 2018 Feb 06.
Article in English | MEDLINE | ID: mdl-29507698

ABSTRACT

Glioblastoma (GBM) is a highly aggressive brain cancer with poor prognosis and low survival rate. Invasive cancer stem-like cells (CSCs) are responsible for tumor recurrence because they escape current treatments. Our main goal was to study the proteome of three GBM subpopulations to identify key molecules behind GBM cell phenotypes and potential cell markers for migrating cells. We used SuperQuant, an enhanced quantitative proteome approach, to increase proteome coverage. We found 148 proteins differentially regulated in migrating CSCs and 199 proteins differentially regulated in differentiated cells. We used Ingenuity Pathway Analysis (IPA) to predict upstream regulators, downstream effects and canonical pathways associated with regulated proteins. IPA analysis predicted activation of integrin-linked kinase (ILK) signaling, actin cytoskeleton signaling, and lysine demethylase 5B (KDM5B) in CSC migration. Moreover, our data suggested that microRNA-122 (miR-122) is a potential upstream regulator of GBM phenotypes as miR-122 activation was predicted for differentiated cells while its inhibition was predicted for migrating CSCs. Finally, we validated transferrin (TF) and procollagen-lysine 2-oxoglutarate 5-dioxygenase 2 (PLOD2) as potential markers for migrating cells.

11.
Proteomes ; 5(4)2017 Oct 25.
Article in English | MEDLINE | ID: mdl-29068423

ABSTRACT

During the past century, our understanding of cancer diagnosis and treatment has been based on a monogenic approach, and as a consequence our knowledge of the clinical genetic underpinnings of cancer is incomplete. The completion of the human genome in 2003 has steered us toward therapeutic target discovery, enabling us to mine the genome using cutting-edge proteogenomics tools. A number of novel and promising cancer targets have emerged from the genome project for diagnostics, therapeutics, and prognostic markers, which are being used to monitor response to cancer treatment. The heterogeneous nature of cancer has hindered progress in understanding the underlying mechanisms that lead to abnormal cellular growth. Since the start of The Cancer Genome Atlas (TCGA) and the International Genome consortium projects, there has been tremendous progress in genome sequencing and immense numbers of cancer genomes have been completed, and this approach has transformed our understanding of the diagnosis and treatment of different types of cancers. By employing genomics and proteomics technologies, an immense amount of genomic data is being generated on clinical tumors, which has transformed the cancer landscape and has the potential to transform cancer diagnosis and prognosis. A complete molecular view of the cancer landscape is necessary for understanding the underlying mechanisms of cancer initiation to improve diagnosis and prognosis, which ultimately will lead to personalized treatment. Interestingly, cancer proteome analysis has also allowed us to identify biomarkers to monitor drug and radiation resistance in patients undergoing cancer treatment. Further, TCGA-funded studies have allowed for the genomic and transcriptomic characterization of targeted cancers, and this analysis has aided the development of targeted therapies for highly lethal malignancies.
High-throughput technologies, which produce complete proteome, epigenome, protein-protein interaction, and pharmacogenomics data, are indispensable for gaining insight into the cancer genome and proteome, and these approaches have generated multidimensional universal studies of genes and proteins (OMICS) data, which have the potential to facilitate precision medicine. However, due to slow progress in computational technologies, the translation of big omics data into clinical practice has been slow. In this review, attempts have been made to describe the role of high-throughput genomic and proteomic technologies in identifying a panel of biomarkers which could be used for the early diagnosis and prognosis of cancer.
