ABSTRACT
BACKGROUND: Breast cancer (BC) is the most commonly diagnosed cancer and the leading cause of cancer death among women globally. Despite advances, there is considerable variation in clinical outcomes for patients with non-luminal A tumors, classified as difficult-to-treat breast cancers (DTBC). This study aims to delineate the proteogenomic landscape of DTBC tumors compared to luminal A (LumA) tumors. METHODS: We retrospectively collected a total of 117 untreated primary breast tumor specimens, focusing on DTBC subtypes. Breast tumors were processed by laser microdissection (LMD) to enrich tumor cells. DNA, RNA, and protein were simultaneously extracted from each tumor preparation, followed by whole genome sequencing, paired-end RNA sequencing, global proteomics and phosphoproteomics. Differential feature analysis, pathway analysis and survival analysis were performed to better understand DTBC and investigate biomarkers. RESULTS: We observed distinct variations in gene mutations, structural variations, and chromosomal alterations between DTBC and LumA breast tumors. DTBC tumors predominantly had more mutations in TP53, PLXNB3, and zinc finger genes, and fewer mutations in SDC2, CDH1, PIK3CA, SVIL, and PTEN. Notably, cytoband 1q21, which contains numerous cell proliferation-related genes, was significantly amplified in the DTBC tumors. LMD successfully minimized stromal components and increased RNA-protein concordance, as evidenced by stromal score comparisons and proteomic analysis. Distinct DTBC and LumA-enriched clusters were observed by proteomic and phosphoproteomic clustering analysis, some with survival differences. Phosphoproteomics identified two distinct phosphoproteomic profiles for high relapse-risk and low relapse-risk basal-like tumors, involving several genes known to be associated with breast cancer oncogenesis and progression, including KIAA1522, DCK, FOXO3, MYO9B, ARID1A, EPRS, ZC3HAV1, and RBM14. Lastly, an integrated pathway analysis of multi-omics data highlighted a robust enrichment of proliferation pathways in DTBC tumors. CONCLUSIONS: This study provides an integrated proteogenomic characterization of DTBC vs LumA with tumor cells enriched through laser microdissection. We identified many common features of DTBC tumors and the phosphopeptides that could serve as potential biomarkers for high/low relapse-risk basal-like BC and possibly guide treatment selection.
Subjects
Tumor Biomarkers , Breast Neoplasms , Proteogenomics , Humans , Female , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Breast Neoplasms/metabolism , Breast Neoplasms/mortality , Tumor Biomarkers/genetics , Proteogenomics/methods , Mutation , Laser Capture Microdissection , Middle Aged , Retrospective Studies , Aged , Adult , Proteomics/methods , Prognosis
ABSTRACT
PURPOSE: To explore the association of clinicopathologic and molecular factors with the occurrence of positive margins after first surgery in breast cancer. METHODS: The clinical and RNA-Seq data for 951 (75 positive and 876 negative margins) primary breast cancer patients from The Cancer Genome Atlas (TCGA) were used. The role of each clinicopathologic factor in margin prediction and its impact on survival were evaluated using logistic regression, Fisher's exact test, and Cox proportional hazards regression models. In addition, differential expression analysis on a matched dataset (71 positive and 71 negative margins) was performed using DESeq2 and LASSO regression. RESULTS: Association studies showed that higher stage, larger tumor size (T), positive lymph nodes (N), and presence of distant metastasis (M) significantly contributed (p ≤ 0.05) to positive surgical margins. Regarding surgery type, lumpectomy was significantly associated with positive margins compared to mastectomy. Moreover, the PAM50 Luminal A subtype had a higher chance of positive margin resection compared to the Basal-like subtype. Survival models demonstrated that positive margin status along with higher stage, higher TNM, and negative hormone receptor status was significant for disease progression. We also found that margin status might be a surrogate of tumor stage. In addition, 29 genes that could be potential positive margin predictors and 8 pathways were identified from molecular data analysis. CONCLUSION: The occurrence of positive margins after surgery was associated with various clinical factors, similar to the findings reported in earlier studies. In addition, we found that the PAM50 intrinsic subtype Luminal A has a higher chance of positive margins than the Basal-like subtype. As the first effort to pursue a molecular understanding of margin status, a panel of 29 genes, including 17 protein-coding genes, was also identified for potential prediction of margin status; this panel needs to be validated using a larger sample set.
Subjects
Breast Neoplasms , Humans , Female , Breast Neoplasms/genetics , Breast Neoplasms/surgery , Breast Neoplasms/metabolism , Mastectomy , Margins of Excision , Breast/pathology , Segmental Mastectomy , Retrospective Studies , Local Neoplasm Recurrence/pathology
ABSTRACT
PURPOSE: Molecular similarities have been reported between basal-like breast cancer (BLBC) and high-grade serous ovarian cancer (HGSOC). To date, there have been no prognostic biomarkers that can provide risk stratification and inform treatment decisions for both BLBC and HGSOC. In this study, we developed a molecular signature for risk stratification in BLBC and further validated this signature in HGSOC. METHODS: RNA-seq data were downloaded from The Cancer Genome Atlas (TCGA) project for 190 BLBC and 314 HGSOC patients. Analyses of differentially expressed genes between recurrent vs. non-recurrent cases were performed using different bioinformatics methods. A gene signature was established using a weighted linear combination of gene expression levels. Its prognostic performance was evaluated using survival analysis based on progression-free interval (PFI) and disease-free interval (DFI). RESULTS: A total of 63 genes were differentially expressed between 18 recurrent and 40 non-recurrent BLBC patients by two different methods. The recurrence index (RI) calculated from this 63-gene signature significantly stratified BLBC patients into two risk groups with 38 and 152 patients in the low-risk (RI-Low) and high-risk (RI-High) groups, respectively (p = 0.0004 and 0.0023 for PFI and DFI, respectively). Similar performance was obtained in the HGSOC cohort (p = 0.0131 and 0.004 for PFI and DFI, respectively). Multivariate Cox regression adjusting for age, grade, and stage showed that the 63-gene signature remained statistically significant in stratifying HGSOC patients (p = 0.0005). CONCLUSION: A gene signature was identified to predict recurrence in BLBC and HGSOC patients. With further validation, this signature may provide an additional prognostic tool for clinicians to better manage patients with BLBC (many of which are triple-negative) and HGSOC, who are currently difficult to treat.
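The "weighted linear combination of gene expression levels" described above can be illustrated with a minimal Python sketch. The gene names, weights, and cutoff here are hypothetical placeholders: the abstract does not report the actual 63 genes, their weights, or the threshold separating RI-Low from RI-High.

```python
from typing import Dict


def recurrence_index(expression: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted linear combination of expression values over the signature genes."""
    return sum(weight * expression[gene] for gene, weight in weights.items())


def stratify(ri: float, cutoff: float) -> str:
    """Assign a risk group; the cutoff is an assumption, not taken from the study."""
    return "RI-High" if ri > cutoff else "RI-Low"


# Hypothetical three-gene signature and one hypothetical patient profile.
weights = {"GENE_A": 1.0, "GENE_B": -0.5, "GENE_C": 2.0}
patient = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 0.5}

ri = recurrence_index(patient, weights)  # 1.0*2.0 - 0.5*4.0 + 2.0*0.5
print(ri, stratify(ri, cutoff=0.0))
```

In a sketch like this, positive weights would correspond to genes higher in recurrent cases and negative weights to genes higher in non-recurrent cases; how the study actually derived its weights is not stated in the abstract.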
Subjects
Breast Neoplasms , Serous Cystadenocarcinoma , Ovarian Neoplasms , Tumor Biomarkers/genetics , Breast Neoplasms/genetics , Serous Cystadenocarcinoma/genetics , Female , Humans , Local Neoplasm Recurrence/genetics , Ovarian Neoplasms/genetics , Prognosis
ABSTRACT
BACKGROUND: Proteomic studies are typically conducted using flash-frozen (FF) samples utilizing tandem mass spectrometry (MS). However, FF specimens are composed of multiple cell types, making it difficult to ascertain the proteomic profiles of specific cells. Conversely, OCT-embedded (Optimal Cutting Temperature compound) specimens can undergo laser microdissection (LMD) to capture and study specific cell types separately from the cell mixture. In the current study, we compared proteomic data obtained from FF and OCT samples to determine if samples that are stored and processed differently produce comparable results. METHODS: Proteins were extracted from FF and OCT-embedded invasive breast tumors from 5 female patients. FF specimens were lysed via homogenization (FF/HOM) while OCT-embedded specimens underwent LMD to collect only tumor cells (OCT/LMD-T) or both tumor and stromal cells (OCT/LMD-TS) followed by incubation at 37 °C. Proteins were extracted using the illustra triplePrep kit and then trypsin-digested, TMT-labeled, and processed by two-dimensional liquid chromatography-tandem mass spectrometry (2D LC-MS/MS). Proteins were identified and quantified with Proteome Discoverer v1.4, and comparative analyses were performed to identify proteins that were significantly differentially expressed among the different processing methods. RESULTS: Among the 4,950 proteins consistently quantified across all samples, 216 and 171 proteins were significantly differentially expressed (adjusted p-value < 0.05; |log2 FC| > 1) between FF/HOM vs. OCT/LMD-T and FF/HOM vs. OCT/LMD-TS, respectively, with most proteins being more abundant in the FF/HOM samples. PCA and unsupervised hierarchical clustering analysis with these 216 and 171 proteins were able to distinguish FF/HOM from OCT/LMD-T and OCT/LMD-TS samples, respectively. Similar analyses using significantly differentially enriched GO terms also discriminated FF/HOM from OCT/LMD samples. No significantly differentially expressed proteins were detected between the OCT/LMD-T and OCT/LMD-TS samples, although trends toward differences were detected. CONCLUSIONS: The proteomic profiles of the OCT/LMD-TS samples were more similar to those from OCT/LMD-T samples than FF/HOM samples, suggesting a strong influence from the sample processing methods. These results indicate that in LC-MS/MS proteomic studies, FF/HOM samples exhibit different protein expression profiles from OCT/LMD samples and thus results from these two different methods cannot be directly compared.
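The significance criterion quoted in the results (adjusted p-value < 0.05 and |log2 FC| > 1) amounts to a two-condition filter. The Python sketch below shows only that filter, under the assumption that fold changes and adjusted p-values have already been computed upstream; the protein names and numbers are invented for illustration and are not results from the study.

```python
from typing import List, NamedTuple


class ProteinResult(NamedTuple):
    protein: str
    log2_fc: float  # log2 fold change between the two processing methods
    adj_p: float    # multiple-testing-adjusted p-value


def significant(results: List[ProteinResult],
                max_adj_p: float = 0.05,
                min_abs_log2_fc: float = 1.0) -> List[ProteinResult]:
    """Keep proteins passing both thresholds (strict inequalities, as in the abstract)."""
    return [r for r in results
            if r.adj_p < max_adj_p and abs(r.log2_fc) > min_abs_log2_fc]


# Invented example values, for illustration only.
results = [
    ProteinResult("PROT_1", 1.8, 0.01),    # passes both thresholds
    ProteinResult("PROT_2", -2.3, 0.003),  # passes both (negative direction)
    ProteinResult("PROT_3", 0.4, 0.001),   # fold change too small
    ProteinResult("PROT_4", 3.0, 0.20),    # not significant after adjustment
]

hits = significant(results)
print([r.protein for r in hits])
```

The sign of `log2_fc` in the retained rows indicates in which processing method a protein is more abundant, which is how a statement such as "most proteins being more abundant in the FF/HOM samples" could be tallied.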
ABSTRACT
Alternatively spliced introns are those that are usually spliced out but are occasionally retained in a transcript isoform. Intron retention is the most frequent form of alternative splicing in plants (~50% of alternative splicing events). Chlamydomonas reinhardtii, a unicellular alga, is a good model for understanding alternative splicing (AS) in plants from an evolutionary perspective, as it diverged from land plants a billion years ago. Using over 7 million cDNA sequences from both pyrosequencing and Sanger sequencing, we found that a much higher percentage of genes (~20% of multi-exon genes) undergo AS than previously reported (3-5%). We found a full complement of SR and SR-like proteins possibly involved in AS. The most prevalent type of AS event (40%) was retention of introns, most of which (72%) were supported by multiple cDNAs, while only 20% had coding capacity. By comparing retained and constitutive introns, we identified sequence features potentially responsible for the retention of introns, in the framework of an "intron definition" model for splicing. We find that retained introns tend to have a weaker 5' splice site, more Gs in their poly-pyrimidine tract, and lower conservation of the nucleotide 'C' at position -3 of the 3' splice site. In addition, the sequence motifs found in the potential branch-point region differed between retained and constitutive introns. Furthermore, the enrichment of G-triplets and C-triplets among the first and last 50 nt of the introns differs significantly between constitutive and retained introns. These could serve as intronic splicing enhancers. All the alternative splice forms can be accessed at http://bioinfolab.miamioh.edu/cgi-bin/PASA_r20140417/cgi-bin/status_report.cgi?db=Chre_AS .
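As an illustration of the G-triplet and C-triplet feature mentioned above, the following Python sketch counts "GGG" and "CCC" occurrences in the first and last 50 nt of an intron. It is a minimal sketch under stated assumptions: overlapping occurrences are counted separately (so "GGGG" contains two G-triplets), and the toy intron sequence is invented. The abstract does not specify how the authors counted triplets or tested enrichment.

```python
from typing import Dict


def count_triplet(seq: str, base: str) -> int:
    """Count overlapping occurrences of a homopolymer triplet such as 'GGG'."""
    triplet = base * 3
    return sum(1 for i in range(len(seq) - 2) if seq[i:i + 3] == triplet)


def terminal_triplet_counts(intron: str, window: int = 50) -> Dict[str, int]:
    """G- and C-triplet counts in the first and last `window` nt of an intron."""
    intron = intron.upper()
    head, tail = intron[:window], intron[-window:]
    return {
        "G_head": count_triplet(head, "G"),
        "C_head": count_triplet(head, "C"),
        "G_tail": count_triplet(tail, "G"),
        "C_tail": count_triplet(tail, "C"),
    }


# Invented 71-nt toy intron: GT donor, one GGG near the 5' end,
# a CCCC run near the 3' end, and an AG acceptor.
toy_intron = "GT" + "GGG" + "A" * 60 + "CCCC" + "AG"
counts = terminal_triplet_counts(toy_intron)
print(counts)
```

Counts like these, computed for every retained and every constitutive intron, would be the inputs to whatever statistical comparison is used to test for a difference between the two classes.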
Subjects
Alternative Splicing/genetics , Chlamydomonas reinhardtii/genetics , Computer Simulation , Gene Expression Regulation/physiology , Introns/genetics , Base Sequence
ABSTRACT
Numerous multi-omic investigations of cancer tissue have documented varying and poor pairwise transcript:protein quantitative correlations, and most deconvolution tools aiming to predict cell type proportions (cell admixture) have been developed and credentialed using transcript-level data alone. To estimate cell admixture using protein abundance data, we analyzed proteome and transcriptome data generated from contrived admixtures of tumor, stroma, and immune cell models or those selectively harvested from the tissue microenvironment by laser microdissection from high-grade serous ovarian cancer (HGSOC) tumors. Co-quantified transcripts and proteins performed similarly in estimating stroma and immune cell admixture (r ≥ 0.63) in two commonly used deconvolution algorithms, ESTIMATE and ConsensusTME. We further developed and optimized protein-based signatures estimating cell admixture proportions and benchmarked these using bulk tumor proteomic data from over 150 patients with HGSOC. The optimized protein signatures supporting cell type proportion estimates from bulk tissue proteomic data are available at https://lmdomics.org/ProteoMixture/.
ABSTRACT
The PAM50 classifier is widely used for breast tumor intrinsic subtyping based on gene expression. Clinical subtyping, however, is based on immunohistochemistry (IHC) assays of 3-4 biomarkers. Subtype calls by these two methods do not completely match, even for comparable subtypes. Nevertheless, the estrogen receptor (ER)-balanced subset for gene centering in PAM50 subtyping is selected based on clinical ER status. Here we present a new method called Principal Component Analysis-based iterative PAM50 subtyping (PCA-PAM50) to perform intrinsic subtyping in ER status-unbalanced cohorts. This method leverages PCA and iterative PAM50 calls to derive the gene expression-based ER status and a subsequent ER-balanced subset for gene centering. Applying PCA-PAM50 to three different breast cancer study cohorts, we observed improved consistency (by 6-9.3%) between intrinsic and clinical subtyping for all three cohorts. In particular, a more aggressive subset of luminal A (LA) tumors, as evidenced by higher MKI67 gene expression and worse patient survival outcomes, was reclassified as luminal B (LB), increasing the LB subtype consistency with IHC by 25-49%. In conclusion, we show that PCA-PAM50 enhances the consistency of breast cancer intrinsic and clinical subtyping by reclassifying an aggressive subset of LA tumors into LB. PCA-PAM50 code is available at ftp://ftp.wriwindber.org/ .
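The role of an ER-balanced subset in gene centering can be sketched in Python as follows. This is not the PCA-PAM50 code, and it does not implement the PCA step or the iterative PAM50 calls. It assumes ER labels are already available and shows only the centering step: draw equal numbers of ER-positive and ER-negative samples, take per-gene medians on that subset, and subtract them from every sample. All function names, sample IDs, and expression values are hypothetical.

```python
import random
from statistics import median
from typing import Dict, List

Expression = Dict[str, Dict[str, float]]  # sample ID -> gene -> expression value


def er_balanced_subset(er_status: Dict[str, bool], seed: int = 0) -> List[str]:
    """Randomly pick equal numbers of ER-positive and ER-negative samples."""
    positive = sorted(s for s, er in er_status.items() if er)
    negative = sorted(s for s, er in er_status.items() if not er)
    n = min(len(positive), len(negative))
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(positive, n) + rng.sample(negative, n)


def center_genes(expr: Expression, subset: List[str]) -> Expression:
    """Subtract per-gene medians computed on the balanced subset from all samples."""
    genes = list(next(iter(expr.values())).keys())
    medians = {g: median(expr[s][g] for s in subset) for g in genes}
    return {s: {g: v - medians[g] for g, v in row.items()} for s, row in expr.items()}


# Hypothetical ER-unbalanced cohort: four ER-positive and two ER-negative samples.
er_status = {"P1": True, "P2": True, "P3": True, "P4": True, "N1": False, "N2": False}
expr = {
    "P1": {"ESR1": 10.0}, "P2": {"ESR1": 10.0}, "P3": {"ESR1": 10.0},
    "P4": {"ESR1": 10.0}, "N1": {"ESR1": 2.0}, "N2": {"ESR1": 4.0},
}

subset = er_balanced_subset(er_status)
centered = center_genes(expr, subset)
print(centered["P1"]["ESR1"], centered["N1"]["ESR1"])
```

In this toy cohort the median over all six samples would be 10.0, which would center every ER-positive sample at zero; the balanced subset gives a median of 7.0 instead, so ER-positive and ER-negative samples fall on opposite sides of zero. That shift is the reason a balanced reference matters in an ER-unbalanced cohort.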
Subjects
Tumor Biomarkers/genetics , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Estrogen Receptor alpha/genetics , Ki-67 Antigen/genetics , ErbB-2 Receptor/genetics , Tumor Biomarkers/metabolism , Breast Neoplasms/classification , Breast Neoplasms/mortality , Cohort Studies , Estrogen Receptor alpha/metabolism , Female , Gene Expression , Gene Expression Profiling , Humans , Immunohistochemistry , Ki-67 Antigen/metabolism , Principal Component Analysis , Prognosis , Protein Array Analysis , ErbB-2 Receptor/metabolism , Survival Analysis , Terminology as Topic
ABSTRACT
One of the important modes of pre-mRNA post-transcriptional modification is alternative splicing. Alternative splicing allows creation of many distinct mature mRNA transcripts from a single gene by utilizing different splice sites. In plants like Arabidopsis thaliana, the most common type of alternative splicing is intron retention. Many past studies have focused on the positional distribution of retained introns (RIs) among different genic regions and the regulation of their expression, while little systematic classification of RIs versus constitutively spliced introns (CSIs) has been conducted using machine learning approaches. We used random forest and support vector machine (SVM) with radial basis kernel function (RBF) to differentiate these two types of introns in Arabidopsis. By comparing coordinates of introns of all annotated mRNAs from TAIR10, we obtained our high-quality experimental data. To distinguish RIs from CSIs, we investigated the unique characteristics of RIs in comparison with CSIs and finally extracted 37 quantitative features: local and global nucleotide sequence features of introns, frequent motifs, the signal strength of splice sites, and the similarity between sequences of introns and their flanking regions. We demonstrated that our proposed feature extraction approach classified RIs versus CSIs more accurately than four other approaches. The optimal penalty parameter C and the RBF kernel parameter in SVM were set using a particle swarm optimization algorithm (PSOSVM). Our classifiers achieved F-measures of 80.8% (random forest) and 77.4% (PSOSVM). We obtained not only the basic sequence features and positional distribution characteristics of RIs, but also predicted putative regulatory motifs in intron splicing based on our feature extraction approach. Our study will facilitate a better understanding of the underlying mechanisms involved in intron retention.