1.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35380614

ABSTRACT

High-dimensional, localized ribonucleic acid (RNA) sequencing is now possible owing to recent developments in spatial transcriptomics (ST). ST is based on highly multiplexed sequence analysis and uses barcodes to match the sequenced reads to their respective tissue locations. ST expression data suffer from high noise and dropout events; however, smoothing techniques have the promise to improve the data interpretability prior to performing downstream analyses. Single-cell RNA sequencing (scRNA-seq) data similarly suffer from these limitations, and smoothing methods developed for scRNA-seq can only utilize associations in transcriptome space (also known as one-factor smoothing methods). Since they do not account for spatial relationships, these one-factor smoothing methods cannot take full advantage of ST data. In this study, we present a novel two-factor smoothing technique, spatial and pattern combined smoothing (SPCS), that employs the k-nearest neighbor (kNN) technique to utilize information from transcriptome and spatial relationships. By performing SPCS on multiple ST slides from pancreatic ductal adenocarcinoma (PDAC), dorsolateral prefrontal cortex (DLPFC) and simulated high-grade serous ovarian cancer (HGSOC) datasets, smoothed ST slides have better separability, partition accuracy and biological interpretability than the ones smoothed by preexisting one-factor methods. Source code of SPCS is provided in Github (https://github.com/Usos/SPCS).
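The idea of two-factor smoothing described in this abstract can be illustrated with a minimal sketch. This is not the SPCS implementation (see the authors' repository for that); it is a generic kNN blend, and the function names, the weights `alpha` and `beta`, and the use of plain Euclidean distance are all assumptions made for illustration.

```python
from math import dist

def knn(points, i, k):
    """Indices of the k points closest to points[i] (Euclidean), excluding i."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: dist(points[i], points[j]))[:k]

def mean_rows(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def two_factor_smooth(expr, coords, k=2, alpha=0.5, beta=0.5):
    """Blend each spot with its expression-space and spatial neighbors.

    expr   -- one expression vector per spot
    coords -- one (x, y) tissue position per spot
    alpha  -- hypothetical weight given to the neighbors overall
    beta   -- hypothetical share of that weight given to spatial neighbors
    """
    smoothed = []
    for i, own in enumerate(expr):
        pattern = mean_rows([expr[j] for j in knn(expr, i, k)])    # transcriptome factor
        spatial = mean_rows([expr[j] for j in knn(coords, i, k)])  # spatial factor
        smoothed.append([
            (1 - alpha) * o + alpha * (beta * s + (1 - beta) * p)
            for o, s, p in zip(own, spatial, pattern)
        ])
    return smoothed
```

A one-factor method corresponds to `beta=0` in this sketch: only neighbors in expression space contribute, and the tissue coordinates are ignored.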


Subject(s)
Single-Cell Analysis , Transcriptome , Gene Expression Profiling/methods , RNA , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software
2.
Bioinformatics ; 38(9): 2422-2427, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35191489

ABSTRACT

MOTIVATION: Tumor-specific antigen (TSA) identification in human cancer predicts response to immunotherapy and provides targets for cancer vaccine and adoptive T-cell therapies with curative potential, and TSAs that are highly expressed at the RNA level are more likely to be presented on major histocompatibility complex (MHC)-I. Direct measurements of the RNA expression of peptides would allow for generalized prediction of TSAs. Human leukocyte antigen (HLA)-I genotypes were predicted with seq2HLA. RNA sequencing (RNAseq) fastq files were translated into all possible peptides of length 8-11, and peptides with high and low expressions in the tumor and control samples, respectively, were tested for their MHC-I binding potential with netMHCpan-4.0. RESULTS: A novel pipeline for TSA prediction from RNAseq was used to predict all possible unique peptides size 8-11 on previously published murine and human lung and lymphoma tumors and validated on matched tumor and control lung adenocarcinoma (LUAD) samples. We show that neoantigens predicted by exomeSeq are typically poorly expressed at the RNA level, and a fraction is expressed in matched normal samples. TSAs presented in the proteomics data have higher RNA abundance and lower MHC-I binding percentile, and these attributes are used to discover high confidence TSAs within the validation cohort. Finally, a subset of these high confidence TSAs is expressed in a majority of LUAD tumors and represents attractive vaccine targets. AVAILABILITY AND IMPLEMENTATION: The datasets were derived from sources in the public domain as follows: TSAFinder is open-source software written in python and R. It is licensed under CC-BY-NC-SA and can be downloaded at https://github.com/RNAseqTSA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
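As a rough illustration of the translation step described in this abstract, the sketch below enumerates every peptide of length 8-11 encoded by a single read. It is not TSAFinder code: the function names are hypothetical, and the choices to use only the three forward reading frames, to break at stop codons, and to skip codons containing ambiguous bases are assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(seq):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def peptides(read, min_len=8, max_len=11):
    """All distinct peptides of min_len..max_len residues from three forward frames."""
    found = set()
    for frame in range(3):
        # Split at stop codons so no peptide spans a stop.
        for segment in translate(read[frame:]).split("*"):
            for k in range(min_len, max_len + 1):
                for i in range(len(segment) - k + 1):
                    pep = segment[i:i + k]
                    if "X" not in pep:
                        found.add(pep)
    return found
```

Counting how often each peptide occurs across the reads of a tumor sample and of a control sample would then give the per-peptide expression contrast that the abstract describes.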


Subject(s)
Adenocarcinoma of Lung , Lung Neoplasms , Animals , Humans , Mice , Antigens, Neoplasm/genetics , Lung Neoplasms/genetics , Peptides/metabolism , RNA , Sequence Analysis, RNA
3.
Bioinformatics ; 35(10): 1653-1659, 2019 05 15.
Article in English | MEDLINE | ID: mdl-30329022

ABSTRACT

MOTIVATION: Technologies that generate high-throughput omics data are flourishing, creating enormous, publicly available repositories of multi-omics data. As many data repositories continue to grow, there is an urgent need for computational methods that can leverage these data to create comprehensive clusters of patients with a given disease. RESULTS: Our proposed approach creates a patient-to-patient similarity graph for each data type as an intermediate representation of each omics data type and merges the graphs through subspace analysis on a Grassmann manifold. We hypothesize that this approach generates more informative clusters by preserving the complementary information from each level of omics data. We applied our approach to The Cancer Genome Atlas (TCGA) breast cancer dataset and show that by integrating gene expression, microRNA and DNA methylation data, our proposed method can produce clinically useful subtypes of breast cancer. We then investigate the molecular characteristics underlying these subtypes. We discover a highly expressed cluster of genes on chromosome 19p13 that strongly correlates with survival in TCGA breast cancer patients and validate these results in three additional breast cancer datasets. We also compare our approach with previous integrative clustering approaches and obtain comparable or superior results. AVAILABILITY AND IMPLEMENTATION: https://github.com/michaelsharpnack/GrassmannCluster. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Breast Neoplasms , Cluster Analysis , DNA Methylation , Genome , Humans
4.
Cancer Res ; 81(16): 4194-4204, 2021 08 15.
Article in English | MEDLINE | ID: mdl-34045189

ABSTRACT

STK11 (liver kinase B1, LKB1) is the fourth most frequently mutated gene in lung adenocarcinoma, with loss of function observed in up to 30% of all cases. Our previous work identified a 16-gene signature for LKB1 loss of function through mutational and nonmutational mechanisms. In this study, we applied this genetic signature to The Cancer Genome Atlas (TCGA) lung adenocarcinoma samples and discovered a novel association between LKB1 loss and widespread DNA demethylation. LKB1-deficient tumors showed depletion of S-adenosyl-methionine (SAM-e), which is the primary substrate for DNMT1 activity. Lower methylation following LKB1 loss involved repetitive elements (RE) and altered RE transcription, as well as decreased sensitivity to azacytidine. Demethylated CpGs were enriched for FOXA family consensus binding sites, and nuclear expression, localization, and turnover of FOXA was dependent upon LKB1. Overall, these findings demonstrate that a large number of lung adenocarcinomas exhibit global hypomethylation driven by LKB1 loss, which has implications for both epigenetic therapy and immunotherapy in these cancers. SIGNIFICANCE: Lung adenocarcinomas with LKB1 loss demonstrate global genomic hypomethylation associated with depletion of SAM-e, reduced expression of DNMT1, and increased transcription of repetitive elements.


Subject(s)
AMP-Activated Protein Kinase Kinases/physiology , Adenocarcinoma/genetics , DNA Methylation , Lung Neoplasms/genetics , S-Adenosylmethionine/metabolism , AMP-Activated Protein Kinase Kinases/genetics , Adenocarcinoma/metabolism , Cell Line , Cell Survival , Cluster Analysis , Computational Biology , CpG Islands , Databases, Genetic , Epigenesis, Genetic , Genes, ras , Humans , Lung Neoplasms/metabolism , Methionine , Mutation , Oligonucleotide Array Sequence Analysis , Proto-Oncogene Proteins p21(ras)/genetics , Repetitive Sequences, Nucleic Acid
5.
Pac Symp Biocomput ; 25: 475-486, 2020.
Article in English | MEDLINE | ID: mdl-31797620

ABSTRACT

Integration of transcriptomic and proteomic data should reveal multi-layered regulatory processes governing cancer cell behaviors. Traditional correlation-based analyses have demonstrated limited ability to identify the post-transcriptional regulatory (PTR) processes that drive the non-linear relationship between transcript and protein abundances. In this work, we propose an integrative approach to explore the variety of post-transcriptional mechanisms that dictate relationships between genes and corresponding proteins. The proposed workflow utilizes the intuitive technique of scatterplot diagnostics, or scagnostics, to characterize and examine the diverse scatterplots built from transcript and protein abundances in a proteogenomic experiment. The workflow includes representing gene-protein relationships as scatterplots, clustering on geometric scagnostic features of these scatterplots, and finally identifying and grouping the potential gene-protein relationships according to their disposition to various PTR mechanisms. Our study verifies the efficacy of the implemented approach to uncover possible regulatory mechanisms by utilizing comprehensive tests on a synthetic dataset. We also propose a variety of 2D pattern-specific downstream analysis methodologies, such as mixture modeling and mapping miRNA post-transcriptional effects, to explore each mechanism further. This work suggests that the proposed methodology has the potential for discovering and categorizing post-transcriptional regulatory mechanisms manifesting in proteogenomic trends. These trends subsequently provide evidence for cancer specificity, miRNA targeting, and identification of regulation impacted by biological functionality and different types of degradation. (Supplementary Material - https://github.com/arunima2/PTRE_PSB_2020).


Subject(s)
MicroRNAs , Proteogenomics , Computational Biology , Gene Expression Regulation , Proteomics
6.
Lung Cancer ; 146: 36-41, 2020 08.
Article in English | MEDLINE | ID: mdl-32505734

ABSTRACT

INTRODUCTION: Recent clinical studies have identified tumor mutation burden (TMB) as a promising therapeutic biomarker of anti-tumor immune checkpoint blockade. However, given the relatively slow turnaround time and high expense in measuring TMB, tobacco smoking history (TSH) is an attractive replacement biomarker. The carcinogenic effects of tobacco smoking may be modified by the protective effects of genome stability genes. This study aims to test the associations between tobacco smoking, genome stability gene inactivation, and TMB. METHODS: Publicly available TSH and DNA somatic alteration data from NSCLC were downloaded from The Cancer Genome Atlas. Correlations and enrichments were calculated with Spearman and Fisher's exact test methods, respectively. Multivariate modeling of TMB was performed with penalized linear regression. RESULTS: 85% of never smokers in adenocarcinomas (LUAD) had low TMB, but a positive TSH was not predictive of hypermutancy. The limited utility of TSH in predicting TMB was reproduced on an independent LUAD dataset. To expand our search for predictors of TMB, we further investigated the contributions of genome stability related genes (GSGs) to TMB. 242/461 (52%) and 300/465 (65%) patients with LUAD and squamous carcinomas (LUSC), respectively, showed evidence of loss of function in at least one of the 182 GSGs. 182 GSGs from 16 pathways were assessed for associations with TMB high tumor status using Fisher's exact test. We performed univariate gene and pathway enrichments in TMB high tumors and found roles for POLE, REV3L, and FANCE genes, as well as several key GSG pathways. CONCLUSIONS: This study comprehensively tested the association between GSG, tobacco smoking, and TMB in NSCLC. In LUAD, never-smoking status was predictive of low TMB, but overall TSH was not an adequate surrogate biomarker for TMB in NSCLC. Furthermore, we identified an association between GSG inactivation and TMB.
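For readers unfamiliar with the enrichment test named in the methods, here is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table, for example gene inactivated (yes/no) against TMB-high status (yes/no). The function name and table layout are assumptions for illustration, the one-sided alternative is a choice made here rather than something the abstract specifies, and no counts from the study are reproduced.

```python
from math import comb

def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Returns the hypergeometric probability of seeing a top-left count of
    at least `a`, with all row and column totals held fixed.
    """
    row1 = a + b          # e.g. tumors with the gene inactivated
    col1 = a + c          # e.g. TMB-high tumors
    n = a + b + c + d     # all tumors
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)
```

Running one such test per gene or pathway, as the abstract describes for 182 genes, would normally be followed by a multiple-testing correction; the abstract does not say which, so none is shown.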


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , B7-H1 Antigen , Biomarkers, Tumor/genetics , Carcinoma, Non-Small-Cell Lung/genetics , DNA-Binding Proteins , DNA-Directed DNA Polymerase , Humans , Lung Neoplasms/genetics , Mutation
7.
EBioMedicine ; 27: 167-175, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29273356

ABSTRACT

Despite tremendous advances in targeted therapies against lung adenocarcinoma, the majority of patients do not benefit from personalized treatments. A deeper understanding of potential therapeutic targets is crucial to increase the survival of patients. One promising target, ADAR, is amplified in 13% of lung adenocarcinomas and in-vitro studies have demonstrated the potential of its therapeutic inhibition to inhibit tumor growth. ADAR edits millions of adenosines to inosines within the transcriptome, and while previous studies of ADAR in cancer have solely focused on protein-coding edits, >99% of edits occur in non-protein coding regions. Here, we develop a pipeline to discover the regulatory potential of RNA editing sites across the entire transcriptome and apply it to lung adenocarcinoma tumors from The Cancer Genome Atlas. This method predicts that 1413 genes contain regulatory edits, predominantly in non-coding regions. Genes with the largest numbers of regulatory edits are enriched in both apoptotic and innate immune pathways, providing a link between these known functions of ADAR and its role in cancer. We further show that despite a positive association between ADAR RNA expression and apoptotic and immune pathways, ADAR copy number is negatively associated with apoptosis and several immune cell types' signatures.


Subject(s)
Adenocarcinoma/genetics , Adenosine Deaminase/metabolism , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Lung Neoplasms/genetics , RNA, Neoplasm/genetics , RNA-Binding Proteins/metabolism , 3' Untranslated Regions/genetics , Adenocarcinoma of Lung , Adenosine Deaminase/genetics , Apolipoprotein L1/genetics , Apoptosis/genetics , Gene Amplification , Gene Dosage , Humans , Immunity, Innate/genetics , RNA Editing , RNA, Neoplasm/metabolism , RNA-Binding Proteins/genetics , Survival Analysis
8.
J Thorac Oncol ; 13(10): 1519-1529, 2018 10.
Article in English | MEDLINE | ID: mdl-30017829

ABSTRACT

INTRODUCTION: Despite apparently complete surgical resection, approximately half of resected early-stage lung cancer patients relapse and die of their disease. Adjuvant chemotherapy reduces this risk by only 5% to 8%. Thus, there is a need for better identifying who benefits from adjuvant therapy, the drivers of relapse, and novel targets in this setting. METHODS: RNA sequencing and liquid chromatography/liquid chromatography-mass spectrometry proteomics data were generated from 51 surgically resected non-small cell lung tumors with known recurrence status. RESULTS: We present a rationale and framework for the incorporation of high-content RNA and protein measurements into integrative biomarkers and show the potential of this approach for predicting risk of recurrence in a group of lung adenocarcinomas. In addition, we characterize the relationship between mRNA and protein measurements in lung adenocarcinoma and show that it is outcome specific. CONCLUSIONS: Our results suggest that mRNA and protein data possess independent biological and clinical importance, which can be leveraged to create higher-powered expression biomarkers.


Subject(s)
Adenocarcinoma of Lung/surgery , Lung Neoplasms/surgery , Proteogenomics/methods , Adenocarcinoma of Lung/pathology , Female , Humans , Lung Neoplasms/pathology , Male
9.
Article in English | MEDLINE | ID: mdl-26306231

ABSTRACT

Biological pathway regulation is complex, yet it underlies the functional coordination in a cell. Cancer is a disease that is characterized by unregulated growth, driven by underlying pathway deregulation. This pathway deregulation is both within pathways and between pathways. Here, we propose a method to detect inter-pathway coordination using distance correlation. Utilizing data generated from microarray experiments, we separate the genes into pathways and calculate the pairwise distance correlation between them. The result is intuitively viewed as a network of differentially dependent pathways. We find intuitive, yet surprising significant hub pathways, including glycophosphatidylinositol anchor synthesis in lung cancer.
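The statistic at the core of this method, sample distance correlation, can be sketched as follows. This is a plain implementation of the standard definition (double-centered pairwise distance matrices), not the authors' code; treating each pathway as a matrix with one row per sample and one column per gene is an assumption about how the data would be arranged.

```python
from math import dist, sqrt

def centered_distances(rows):
    """Double-centered matrix of pairwise Euclidean distances between rows."""
    n = len(rows)
    d = [[dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]  # equals the column means by symmetry
    grand = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
        for i in range(n)
    ]

def distance_correlation(x, y):
    """Sample distance correlation between two sets of paired observations.

    x, y -- lists of equal length; element i of each is the vector of
            measurements (e.g. one pathway's genes) for sample i.
    """
    a, b = centered_distances(x), centered_distances(y)
    n = len(x)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for row in a for v in row) / n**2
    dvar_y = sum(v * v for row in b for v in row) / n**2
    denom = sqrt(dvar_x * dvar_y)
    # max() guards against tiny negative values from floating-point rounding.
    return sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0
```

Unlike Pearson correlation, this statistic accepts vectors of different dimension on each side, which lets two pathways with different numbers of genes be compared, and it also responds to non-linear dependence.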
