ABSTRACT
Breast cancer is a heterogeneous disease, and treatment is guided by biomarker profiles representing distinct molecular subtypes. Breast cancer arises from the breast ductal epithelium, and experimental data suggest that breast cancer subtypes have different cells of origin within that lineage. The precise cells of origin for each subtype and the transcriptional networks that characterize these tumor-normal lineages are not established. In this work, we applied bulk, single-cell (sc), and single-nucleus (sn) multi-omic techniques as well as spatial transcriptomics and multiplex imaging on 61 samples from 37 breast cancer patients to show characteristic links in gene expression and chromatin accessibility between breast cancer subtypes and their putative cells of origin. We applied the PAM50 subtyping algorithm in tandem with bulk RNA-seq and snRNA-seq to reliably subtype even low-purity tumor samples and confirm promoter accessibility using snATAC. Trajectory analysis of chromatin accessibility and differentially accessible motifs clearly connected progenitor populations with breast cancer subtypes, supporting the cell of origin for basal-like and luminal A and B tumors. Regulatory network analysis of transcription factors underscored the importance of BHLHE40 in luminal breast cancer and luminal mature cells, and KLF5 in basal-like tumors and luminal progenitor cells. Furthermore, we identify key genes defining the basal-like (PRKCA, SOX6, RGS6, KCNQ3) and luminal A/B (FAM155A, LRP1B) lineages, with expression in both precursor and cancer cells and further upregulation in tumors. Exhausted CTLA4-expressing CD8+ T cells were enriched in basal-like breast cancer, suggesting altered means of immune dysfunction among breast cancer subtypes. We used spatial transcriptomics and multiplex imaging to provide spatial detail for key markers of benign and malignant cell types and immune cell colocation. These findings demonstrate that analysis of paired transcription and chromatin accessibility at the single-cell level is a powerful tool for investigating breast cancer lineage development and highlight transcriptional networks that define basal and luminal breast cancer lineages.
ABSTRACT
The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) investigates tumors from a proteogenomic perspective, creating rich multi-omics datasets connecting genomic aberrations to cancer phenotypes. To facilitate pan-cancer investigations, we have generated harmonized genomic, transcriptomic, proteomic, and clinical data for >1000 tumors in 10 cohorts to create a cohesive and powerful dataset for scientific discovery. We outline efforts by the CPTAC pan-cancer working group in data harmonization, data dissemination, and computational resources for aiding biological discoveries. We also discuss challenges for multi-omics data integration and analysis, specifically the unique challenges of working with both nucleotide sequencing and mass spectrometry proteomics data.
Subject(s)
Neoplasms; Proteogenomics; Humans; Proteomics; Genomics; Neoplasms/genetics; Gene Expression Profiling
ABSTRACT
Cancer driver events refer to key genetic aberrations that drive oncogenesis; however, their exact molecular mechanisms remain insufficiently understood. Here, our multi-omics pan-cancer analysis uncovers insights into the impacts of cancer drivers by identifying their significant cis-effects and distal trans-effects quantified at the RNA, protein, and phosphoprotein levels. Salient observations include the association of point mutations and copy-number alterations with the rewiring of protein interaction networks, and notably, most cancer genes converge toward similar molecular states denoted by sequence-based kinase activity profiles. A correlation between predicted neoantigen burden and measured T cell infiltration suggests potential vulnerabilities for immunotherapies. Patterns of cancer hallmarks vary by polygenic protein abundance ranging from uniform to heterogeneous. Overall, our work demonstrates the value of comprehensive proteogenomics in understanding the functional states of oncogenic drivers and their links to cancer development, surpassing the limitations of studying individual cancer types.
Subject(s)
Neoplasms; Proteogenomics; Humans; Neoplasms/genetics; Oncogenes; Cell Transformation, Neoplastic/genetics; DNA Copy Number Variations
ABSTRACT
Identifying tumor-cell-specific markers and elucidating their epigenetic regulation and spatial heterogeneity provides mechanistic insights into cancer etiology. Here, we perform snRNA-seq and snATAC-seq in 34 and 28 human clear cell renal cell carcinoma (ccRCC) specimens, respectively, with matched bulk proteogenomics data. By identifying 20 tumor-specific markers through a multi-omics tiered approach, we reveal an association between higher ceruloplasmin (CP) expression and reduced survival. CP knockdown, combined with spatial transcriptomics, suggests a role for CP in regulating hyalinized stroma and tumor-stroma interactions in ccRCC. Intratumoral heterogeneity analysis portrays tumor cell-intrinsic inflammation and epithelial-mesenchymal transition (EMT) as two distinguishing features of tumor subpopulations. Finally, BAP1 mutations are associated with widespread reduction of chromatin accessibility, while PBRM1 mutations generally increase accessibility, with the former affecting five times more accessible peaks than the latter. These integrated analyses reveal the cellular architecture of ccRCC, providing insights into key markers and pathways in ccRCC tumorigenesis.
Subject(s)
Carcinoma, Renal Cell; Kidney Neoplasms; Humans; Carcinoma, Renal Cell/pathology; Kidney Neoplasms/pathology; Transcriptome; Epigenesis, Genetic; Tumor Suppressor Proteins/genetics; Gene Expression Regulation, Neoplastic
ABSTRACT
Tumor-associated macrophages (TAMs) are abundant in pancreatic ductal adenocarcinomas (PDACs). While TAMs are known to proliferate in cancer tissues, the impact of this proliferation on macrophage phenotype and disease progression is poorly understood. We showed that in PDAC, proliferation of TAMs could be driven by colony stimulating factor-1 (CSF1) produced by cancer-associated fibroblasts. CSF1 induced high levels of p21 in macrophages, which regulated both TAM proliferation and phenotype. TAMs in human and mouse PDACs with high levels of p21 had more inflammatory and immunosuppressive phenotypes. p21 expression in TAMs was induced by stromal interaction and/or chemotherapy treatment. Finally, by modeling p21 expression levels in TAMs, we found that p21-driven macrophage immunosuppression in vivo drove tumor progression. Serendipitously, the same p21-driven pathways that drive tumor progression also drove response to CD40 agonist. These data suggest that stromal or therapy-induced regulation of cell cycle machinery can regulate both macrophage-mediated immune suppression and susceptibility to innate immunotherapy.
Subject(s)
Carcinoma, Pancreatic Ductal; Pancreatic Neoplasms; Animals; Mice; Humans; Pancreatic Neoplasms/metabolism; Macrophages/metabolism; Carcinoma, Pancreatic Ductal/metabolism; Immunotherapy; Cell Proliferation; Tumor Microenvironment; Cell Line, Tumor
ABSTRACT
Multiple myeloma (MM) is a highly refractory hematologic cancer. Targeted immunotherapy has shown promise in MM but remains hindered by the challenge of identifying specific yet broadly representative tumor markers. We analyzed 53 bone marrow (BM) aspirates from 41 MM patients using an unbiased, high-throughput pipeline for therapeutic target discovery via single-cell transcriptomic profiling, yielding 38 MM marker genes encoding cell-surface proteins and 15 encoding intracellular proteins. Of these, 20 candidate genes were highlighted that are not yet under clinical study, 11 of which were previously uncharacterized as therapeutic targets. The findings were cross-validated using bulk RNA sequencing, flow cytometry, and proteomic mass spectrometry of MM cell lines and patient BM, demonstrating high overall concordance across data types. Independent discovery using bulk RNA sequencing reiterated top candidates, further affirming the ability of single-cell transcriptomics to accurately capture marker expression despite limitations in sample size or sequencing depth. Target dynamics and heterogeneity were further examined using both transcriptomic and immuno-imaging methods. In summary, this study presents a robust and broadly applicable strategy for identifying tumor markers to better inform the development of targeted cancer therapy. SIGNIFICANCE: Single-cell transcriptomic profiling and multiomic cross-validation to uncover therapeutic targets identifies 38 myeloma marker genes, including 11 transcribing surface proteins with previously uncharacterized potential for targeted antitumor therapy.
Subject(s)
Multiple Myeloma; Humans; Multiple Myeloma/drug therapy; Multiple Myeloma/genetics; Multiomics; Proteomics; Biomarkers, Tumor/genetics; Gene Expression Profiling/methods
ABSTRACT
Changes in metabolism are a hallmark of cancer, but molecular signatures of altered bioenergetics to aid in clinical decision-making do not currently exist. We recently identified a group of human tumors with constitutively reduced expression of the mitochondrial structural protein, Mic60, also called mitofilin or inner membrane mitochondrial protein (IMMT). These Mic60-low tumors exhibit severe loss of mitochondrial fitness, paradoxically accompanied by increased metastatic propensity and upregulation of a unique transcriptome of Interferon (IFN) signaling and Senescence-Associated Secretory Phenotype (SASP). Here, we show that an optimized, 11-gene signature of Mic60-low tumors is differentially expressed in multiple malignancies, compared to normal tissues, and correlates with poor patient outcome. When analyzed in three independent patient cohorts of pancreatic ductal adenocarcinoma (PDAC), the Mic60-low gene signature was associated with aggressive disease variants, local inflammation, FOLFIRINOX failure and shortened survival, independently of age, gender, or stage. Therefore, the 11-gene Mic60-low signature may provide an easily accessible molecular tool to stratify patient risk in PDAC and potentially other malignancies.
Subject(s)
Carcinoma, Pancreatic Ductal; Pancreatic Neoplasms; Antineoplastic Combined Chemotherapy Protocols; Carcinoma, Pancreatic Ductal/pathology; Humans; Interferons; Mitochondrial Proteins/metabolism; Pancreatic Neoplasms/pathology
ABSTRACT
Motivation: The use of single-cell methods is expanding at an ever-increasing rate. While there are established algorithms that address cell classification, they are limited in terms of cross-platform compatibility, reliance on the availability of a reference dataset, and classification interpretability. Here, we introduce Pollock, a suite of algorithms for cell type identification that is compatible with popular single-cell methods and analysis platforms, provides a set of pretrained human cancer reference models, and reports interpretability scores that identify the genes that drive cell type classifications. Results: Pollock performs comparably to existing classification methods, while offering easily deployable pretrained classification models across a wide variety of tissue and data types. Additionally, it demonstrates utility in immune pan-cancer analysis. Availability and implementation: Source code and documentation are available at https://github.com/ding-lab/pollock. Pretrained models and datasets are available for download at https://zenodo.org/record/5895221. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
ABSTRACT
Advances in mass spectrometry have generated increasingly large-scale proteomics datasets containing tens of thousands of phosphorylation sites (phosphosites) that require prioritization. We develop a bioinformatics tool called HotPho and systematically discover 3D co-clustering of phosphosites and cancer mutations on protein structures. HotPho identifies 474 such hybrid clusters containing 1,255 co-clustering phosphosites, including RET p.S904/Y928, the conserved HRAS/KRAS p.Y96, and IDH1 p.Y139/IDH2 p.Y179 that are adjacent to recurrent mutations on protein structures not found by linear proximity approaches. Hybrid clusters, enriched in histone and kinase domains, frequently include expression-associated mutations experimentally shown as activating and conferring genetic dependency. Approximately 300 co-clustering phosphosites are verified in patient samples of 5 cancer types or previously implicated in cancer, including CTNNB1 p.S29/Y30, EGFR p.S720, MAPK1 p.S142, and PTPN12 p.S275. In summary, systematic 3D clustering analysis highlights nearly 3,000 likely functional mutations and over 1,000 cancer phosphosites for downstream investigation and evaluation of potential clinical relevance.
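The contrast between 3D co-clustering and linear proximity described in this abstract can be illustrated with a minimal sketch. This is not HotPho's actual algorithm: the function name `co_clustered_pairs`, the 10 Å cutoff, and the toy site names and coordinates are all assumptions made up for illustration. The sketch only shows the general idea that a phosphosite and a mutation far apart in sequence can still be neighbors in the folded structure.

```python
import math
from itertools import product


def co_clustered_pairs(phosphosites, mutations, cutoff_angstrom=10.0):
    """Return (phosphosite, mutation, distance) triples within a 3D cutoff.

    Both inputs map a residue label to its (x, y, z) coordinate in angstroms.
    The cutoff is an arbitrary illustrative value, not a published parameter.
    """
    pairs = []
    for (p_name, p_xyz), (m_name, m_xyz) in product(
        phosphosites.items(), mutations.items()
    ):
        distance = math.dist(p_xyz, m_xyz)  # Euclidean distance in 3D
        if distance <= cutoff_angstrom:
            pairs.append((p_name, m_name, round(distance, 2)))
    return sorted(pairs)


# Toy, made-up coordinates: residue 29 and residue 250 are far apart in
# sequence but are placed close together in space.
toy_phosphosites = {"pSite_29": (0.0, 0.0, 0.0), "pSite_300": (50.0, 0.0, 0.0)}
toy_mutations = {"mut_250": (3.0, 4.0, 0.0), "mut_12": (100.0, 100.0, 100.0)}
```

Calling `co_clustered_pairs(toy_phosphosites, toy_mutations)` pairs `pSite_29` with `mut_250`, which a sequence-window approach would miss because the two residues are more than 200 positions apart.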
Subject(s)
Computational Biology/methods; Mutation; Neoplasms/genetics; Proteomics/methods; Binding Sites/genetics; Cluster Analysis; ErbB Receptors/metabolism; Humans; Mass Spectrometry/methods; Neoplasms/metabolism; Phosphorylation; Protein Tyrosine Phosphatase, Non-Receptor Type 12/metabolism; beta Catenin/metabolism
ABSTRACT
We present a proteogenomic study of 108 human papilloma virus (HPV)-negative head and neck squamous cell carcinomas (HNSCCs). Proteomic analysis systematically catalogs HNSCC-associated proteins and phosphosites, prioritizes copy number drivers, and highlights an oncogenic role for RNA processing genes. Proteomic investigation of mutual exclusivity between FAT1 truncating mutations and 11q13.3 amplifications reveals dysregulated actin dynamics as a common functional consequence. Phosphoproteomics characterizes two modes of EGFR activation, suggesting a new strategy to stratify HNSCCs based on EGFR ligand abundance for effective treatment with inhibitory EGFR monoclonal antibodies. Widespread deletion of immune modulatory genes accounts for low immune infiltration in immune-cold tumors, whereas concordant upregulation of multiple immune checkpoint proteins may underlie resistance to anti-programmed cell death protein 1 monotherapy in immune-hot tumors. Multi-omic analysis identifies three molecular subtypes with high potential for treatment with CDK inhibitors, anti-EGFR antibody therapy, and immunotherapy, respectively. Altogether, proteogenomics provides a systematic framework to inform HNSCC biology and treatment.
Subject(s)
Antineoplastic Agents, Immunological/therapeutic use; Papillomavirus Infections/genetics; Squamous Cell Carcinoma of Head and Neck/drug therapy; Squamous Cell Carcinoma of Head and Neck/genetics; Adult; Aged; Aged, 80 and over; ErbB Receptors/genetics; Female; Humans; Immunotherapy/methods; Male; Middle Aged; Papillomavirus Infections/drug therapy; Papillomavirus Infections/virology; Proteogenomics/methods; Proteomics/methods; Young Adult
ABSTRACT
MOTIVATION: Microsatellite instability (MSI) is a promising biomarker for cancer prognosis and chemosensitivity. Techniques are rapidly evolving for the detection of MSI from tumor-normal paired or tumor-only sequencing data. However, tumor tissues are often insufficient, unavailable, or otherwise difficult to procure. Increasing clinical evidence indicates the enormous potential of plasma circulating cell-free DNA (cfDNA) technology as a noninvasive MSI detection approach. RESULTS: We developed MSIsensor-ct, a bioinformatics tool based on a machine learning protocol, dedicated to detecting MSI status using cfDNA sequencing data with a potential stable MSIscore threshold of 20%. Evaluation of MSIsensor-ct on independent testing datasets with various levels of circulating tumor DNA (ctDNA) and sequencing depth showed 100% accuracy within the limit of detection (LOD) of 0.05% ctDNA content. MSIsensor-ct requires only BAM files as input, rendering it user-friendly and readily integrated into next-generation sequencing (NGS) analysis pipelines. AVAILABILITY: MSIsensor-ct is freely available at https://github.com/niu-lab/MSIsensor-ct. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.
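The 20% MSIscore threshold mentioned in this abstract can be read as a simple decision rule, sketched below. This is not the MSIsensor-ct implementation or its interface: the `SiteCall` type and the function names are hypothetical, and the sketch assumes that the MSIscore is the percentage of evaluated microsatellite sites called unstable and that a score strictly above the threshold is reported as MSI. The per-site calls themselves, which the tool derives with its machine learning models, are taken as given.

```python
from dataclasses import dataclass

# Threshold quoted in the abstract, expressed in percent.
MSI_SCORE_THRESHOLD = 20.0


@dataclass
class SiteCall:
    """Hypothetical per-site result: was this microsatellite called unstable?"""
    site_id: str
    unstable: bool


def msi_score(calls):
    """Percentage of evaluated sites called unstable (assumed definition)."""
    if not calls:
        raise ValueError("no microsatellite sites were evaluated")
    unstable = sum(1 for call in calls if call.unstable)
    return 100.0 * unstable / len(calls)


def classify_msi(calls, threshold=MSI_SCORE_THRESHOLD):
    """Label a sample 'MSI' when its score exceeds the threshold, else 'MSS'.

    Using a strict '>' comparison is an assumption of this sketch.
    """
    return "MSI" if msi_score(calls) > threshold else "MSS"
```

For example, a sample with 3 unstable sites out of 10 evaluated has a score of 30% and would be labeled MSI under these assumptions, while 1 unstable site out of 10 would be labeled MSS.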
Subject(s)
Circulating Tumor DNA/genetics; Machine Learning; Microsatellite Instability; Neoplasms/genetics; Software; Circulating Tumor DNA/blood; Computational Biology/methods; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Limit of Detection; Microsatellite Repeats; Neoplasms/blood; Neoplasms/diagnosis; Neoplasms/pathology; Sequence Analysis, DNA
ABSTRACT
Non-coding mutations can create splice sites; however, the true extent to which such somatic non-coding mutations affect RNA splicing is largely unexplored. Here we use the MiSplice pipeline to analyze 783 cancer cases with WGS data and 9494 cases with WES data, discovering 562 non-coding mutations that lead to splicing alterations. Notably, most of these mutations create new exons. Introns associated with new exon creation are significantly larger than the genome-wide average intron size. We find that some mutation-induced splicing alterations are located in genes important in tumorigenesis (ATRX, BCOR, CDKN2B, MAP3K1, MAP3K4, MDM2, SMAD4, STK11, TP53, etc.), often leading to truncated proteins and affecting gene expression. The pattern emerging from these exon-creating mutations suggests that splice sites created by non-coding mutations interact with pre-existing potential splice sites that originally lacked a suitable splicing pair to induce new exon formation. Our study suggests the importance of investigating biological and clinical consequences of non-coding splice-inducing mutations that were previously neglected by conventional annotation pipelines. MiSplice will be useful for automatically annotating the splicing impact of coding and non-coding mutations in future large-scale analyses.
Subject(s)
Neoplasms/genetics; RNA Precursors/genetics; RNA Splice Sites; RNA Splicing; AMP-Activated Protein Kinase Kinases; Cyclin-Dependent Kinase Inhibitor p15/genetics; Cyclin-Dependent Kinase Inhibitor p15/metabolism; Databases, Genetic; Exons; Gene Expression Regulation, Neoplastic/genetics; Humans; Introns; MAP Kinase Kinase Kinase 1/genetics; MAP Kinase Kinase Kinase 1/metabolism; MAP Kinase Kinase Kinase 4/genetics; MAP Kinase Kinase Kinase 4/metabolism; Mutation; Neoplasms/metabolism; Protein Serine-Threonine Kinases/genetics; Protein Serine-Threonine Kinases/metabolism; Proto-Oncogene Proteins/genetics; Proto-Oncogene Proteins/metabolism; Proto-Oncogene Proteins c-mdm2/genetics; Proto-Oncogene Proteins c-mdm2/metabolism; RNA, Untranslated; RNA-Seq; Repressor Proteins/genetics; Repressor Proteins/metabolism; Smad4 Protein/genetics; Smad4 Protein/metabolism; Tumor Suppressor Protein p53/genetics; Tumor Suppressor Protein p53/metabolism; Exome Sequencing; X-linked Nuclear Protein/genetics; X-linked Nuclear Protein/metabolism
ABSTRACT
Identifying molecular cancer drivers is critical for precision oncology. Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors.
Subject(s)
Neoplasms/pathology; Algorithms; B7-H1 Antigen/genetics; Computational Biology; Databases, Genetic; Entropy; Humans; Microsatellite Instability; Mutation; Neoplasms/genetics; Neoplasms/immunology; Principal Component Analysis; Programmed Cell Death 1 Receptor/genetics
ABSTRACT
We conducted the largest investigation of predisposition variants in cancer to date, discovering 853 pathogenic or likely pathogenic variants in 8% of 10,389 cases from 33 cancer types. Twenty-one genes showed single or cross-cancer associations, including novel associations of SDHA in melanoma and PALB2 in stomach adenocarcinoma. The 659 predisposition variants and 18 additional large deletions in tumor suppressors, including ATM, BRCA1, and NF1, showed low gene expression and frequent (43%) loss of heterozygosity or biallelic two-hit events. We also discovered 33 such variants in oncogenes, including missense variants in MET, RET, and PTPN11 associated with high gene expression. We nominated 47 additional predisposition variants from prioritized VUSs supported by multiple lines of evidence involving case-control frequency, loss of heterozygosity, expression effect, and co-localization with mutations and modified residues. Our integrative approach links rare predisposition variants to functional consequences, informing future guidelines of variant classification and germline genetic testing in cancer.
Subject(s)
Germ Cells/metabolism; Neoplasms/pathology; DNA Copy Number Variations; Databases, Genetic; Gene Deletion; Gene Frequency; Genetic Predisposition to Disease; Genotype; Germ Cells/cytology; Germ-Line Mutation; Humans; Loss of Heterozygosity/genetics; Mutation, Missense; Neoplasms/genetics; Polymorphism, Single Nucleotide; Proto-Oncogene Proteins c-met/genetics; Proto-Oncogene Proteins c-ret/genetics; Tumor Suppressor Proteins/genetics
ABSTRACT
The functional impact of the vast majority of cancer somatic mutations remains unknown, representing a critical knowledge gap for implementing precision oncology. Here, we report the development of a moderate-throughput functional genomic platform consisting of efficient mutant generation, sensitive viability assays using two growth factor-dependent cell models, and functional proteomic profiling of signaling effects for select aberrations. We apply the platform to annotate >1,000 genomic aberrations, including gene amplifications, point mutations, indels, and gene fusions, potentially doubling the number of driver mutations characterized in clinically actionable genes. Further, the platform is sufficiently sensitive to identify weak drivers. Our data are accessible through a user-friendly, public data portal. Our study will facilitate biomarker discovery, prediction algorithm improvement, and drug development.