Results 1 - 20 of 87
1.
Cell Rep Methods ; 4(6): 100799, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38889686

ABSTRACT

The cellular components of tumors and their microenvironment play pivotal roles in tumor progression, patient survival, and the response to cancer treatments. Unveiling a comprehensive cellular profile within bulk tumors via single-cell RNA sequencing (scRNA-seq) data is crucial, as it reveals intrinsic tumor cellular traits that elude identification through conventional cancer subtyping methods. Our contribution, scBeacon, is a tool that derives cell-type signatures by integrating and clustering multiple scRNA-seq datasets to extract signatures for deconvolving unrelated tumor datasets on bulk samples. Applying scBeacon to The Cancer Genome Atlas (TCGA) cohort, we find cellular and molecular attributes within specific tumor categories, many with patient outcome relevance. We developed a tumor cell-type map to visually depict the relationships among TCGA samples based on the cell-type inferences.


Subject(s)
Neoplasms , Single-Cell Analysis , Tumor Microenvironment , Humans , Tumor Microenvironment/genetics , Single-Cell Analysis/methods , Neoplasms/genetics , Neoplasms/pathology , Sequence Analysis, RNA , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Cluster Analysis
2.
Sci Rep ; 14(1): 11794, 2024 05 23.
Article in English | MEDLINE | ID: mdl-38782963

ABSTRACT

We present the Manatee variational autoencoder model to predict transcription factor (TF) perturbation-induced transcriptomes. We demonstrate that the Manatee in silico perturbation analysis recapitulates target transcriptomic phenotypes in diverse cellular lineage transitions. We further propose the Manatee in silico screening analysis for prioritizing TF combinations targeting desired transcriptomic phenotypes.


Subject(s)
Transcription Factors , Transcriptome , Transcription Factors/metabolism , Transcription Factors/genetics , Humans , Gene Expression Profiling , Computer Simulation , Computational Biology/methods , Algorithms
3.
bioRxiv ; 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37961271

ABSTRACT

Human pluripotent stem cell-derived tissue engineering offers great promise in designer cell-based personalized therapeutics. To harness such potential, a broader approach requires a deeper understanding of tissue-level interactions. We previously developed a manufacturing system for the ectoderm-derived skin epithelium for cell replacement therapy. However, it remains challenging to manufacture the endoderm-derived esophageal epithelium, despite both possessing similar stratified structure. Here we employ single cell and spatial technologies to generate a spatiotemporal multi-omics cell atlas for human esophageal development. We illuminate the cellular diversity, dynamics and signal communications for the developing esophageal epithelium and stroma. Using the machine-learning based Manatee, we prioritize the combinations of candidate human developmental signals for in vitro derivation of esophageal basal cells. Functional validation of the Manatee predictions leads to a clinically-compatible system for manufacturing human esophageal mucosa. Our approach creates a versatile platform to accelerate human tissue manufacturing for future cell replacement therapies to treat human genetic defects and wounds.

5.
Sci Rep ; 12(1): 1329, 2022 01 25.
Article in English | MEDLINE | ID: mdl-35079083

ABSTRACT

The SARS-CoV-2 pandemic has challenged humankind's ability to quickly determine the cascade of health effects caused by a novel infection. Even with the unprecedented speed at which vaccines were developed and introduced into society, identifying therapeutic interventions and drug targets for patients infected with the virus remains important, as new strains of the virus evolve and future coronaviruses resistant to current vaccines may emerge. The application of transcriptomic RNA sequencing of infected samples may shed new light on the pathways involved in viral mechanisms and host responses. We describe the application of the previously developed "dual RNA-seq" approach to investigate, for the first time, the co-regulation between the human and SARS-CoV-2 transcriptomes. Together with differential expression analysis, we describe the tissue specificity of SARS-CoV-2 expression, an inferred lipopolysaccharide response, and co-regulation of CXCLs, SPRRs, and S100s with SARS-CoV-2 expression. Lipopolysaccharide response pathways in particular offer promise for future therapeutic research and the prospect of subgrouping patients based on chemokine expression, which may help explain the vastly different reactions patients have to infection. Taken together, these findings highlight unappreciated SARS-CoV-2 expression signatures and emphasize new considerations and mechanisms for SARS-CoV-2 therapeutic intervention.


Subject(s)
COVID-19 , Gene Expression Regulation, Viral , RNA-Seq , SARS-CoV-2 , Transcriptome , A549 Cells , COVID-19/genetics , COVID-19/metabolism , Humans , SARS-CoV-2/genetics , SARS-CoV-2/metabolism
6.
Nat Commun ; 12(1): 6545, 2021 11 11.
Article in English | MEDLINE | ID: mdl-34764310

ABSTRACT

The characteristic ionic currents of nucleotide kmers are commonly used in analyzing nanopore sequencing readouts. We present a graph convolutional network-based deep learning framework for predicting kmer characteristic ionic currents from corresponding chemical structures. We show such a framework can generalize the chemical information of the 5-methyl group from thymine to cytosine by correctly predicting 5-methylcytosine-containing DNA 6mers, thus shedding light on the de novo detection of nucleotide modifications.


Subject(s)
Nucleotides/metabolism , Cytosine/metabolism , Nanopore Sequencing/methods , Sequence Analysis, DNA/methods
7.
Nat Commun ; 12(1): 5684, 2021 09 28.
Article in English | MEDLINE | ID: mdl-34584103

ABSTRACT

Deep learning architectures such as variational autoencoders have revolutionized the analysis of transcriptomics data. However, the latent space of these variational autoencoders offers little to no interpretability. To provide further biological insights, we introduce a novel sparse Variational Autoencoder architecture, VEGA (VAE Enhanced by Gene Annotations), whose decoder wiring mirrors user-provided gene modules, providing direct interpretability to the latent variables. We demonstrate the performance of VEGA in diverse biological contexts using pathways, gene regulatory networks and cell type identities as the gene modules that define its latent space. VEGA successfully recapitulates the mechanism of cellular-specific response to treatments, the status of master regulators as well as jointly revealing the cell type and cellular state identity in developing cells. We envision the approach could serve as an explanatory biological model for development and drug treatment experiments.


Subject(s)
Deep Learning , Gene Regulatory Networks , Models, Genetic , RNA-Seq/methods , Single-Cell Analysis/methods , Animals , Datasets as Topic , Humans , Mice
8.
Cell Syst ; 12(8): 827-838.e5, 2021 08 18.
Article in English | MEDLINE | ID: mdl-34146471

ABSTRACT

The accurate identification and quantitation of RNA isoforms present in the cancer transcriptome is key for analyses ranging from the inference of the impacts of somatic variants to pathway analysis to biomarker development and subtype discovery. The ICGC-TCGA DREAM Somatic Mutation Calling in RNA (SMC-RNA) challenge was a crowd-sourced effort to benchmark methods for RNA isoform quantification and fusion detection from bulk cancer RNA sequencing (RNA-seq) data. It concluded in 2018 with a comparison of 77 fusion detection entries and 65 isoform quantification entries on 51 synthetic tumors and 32 cell lines with spiked-in fusion constructs. We report the entries used to build this benchmark, the leaderboard results, and the experimental features associated with the accurate prediction of RNA species. This challenge required submissions to be in the form of containerized workflows, meaning each of the entries described is easily reusable through CWL and Docker containers at https://github.com/SMC-RNA-challenge. A record of this paper's transparent peer review process is included in the supplemental information.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Protein Isoforms/genetics , RNA/genetics , RNA-Seq , Sequence Analysis, RNA
9.
PLoS Comput Biol ; 17(4): e1008878, 2021 04.
Article in English | MEDLINE | ID: mdl-33861732

ABSTRACT

Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene "signatures" (patterns of gene expression associated with specific cellular and phenotypic contexts). An important current challenge in systems biology is to leverage such knowledge about gene coordination to maximize the predictive power and generalization of models applied to high-throughput datasets. However, few such integrative approaches exist that also provide interpretable results quantifying the importance of individual genes and pathways to model accuracy. We introduce AKLIMATE, a first kernel-based stacked learner that seamlessly incorporates multi-omics feature data with prior information in the form of pathways for either regression or classification tasks. AKLIMATE uses a novel multiple-kernel learning framework where individual kernels capture the prediction propensities recorded in random forests, each built from a specific pathway gene set that integrates all omics data for its member genes. AKLIMATE has comparable or improved performance relative to state-of-the-art methods on diverse phenotype learning tasks, including predicting microsatellite instability in endometrial and colorectal cancer, survival in breast cancer, and cell line response to gene knockdowns. We show how AKLIMATE is able to connect feature data across data platforms through their common pathways to identify examples of several known and novel contributors of cancer and synthetic lethality.


Subject(s)
Genomics , Machine Learning , Neoplasms/classification , Neoplasms/genetics , Cell Line, Tumor , Gene Knockdown Techniques , Humans , Phenotype , RNA, Small Interfering/genetics , Survival Analysis
10.
iScience ; 24(1): 102017, 2021 Jan 22.
Article in English | MEDLINE | ID: mdl-33490923

ABSTRACT

Biological states are controlled by orchestrated transcription factors (TFs) within gene regulatory networks. Here we show TFs responsible for the dynamic changes of biological states can be prioritized with temporal PageRank. We further show such TF prioritization can be extended by integrating gene regulatory networks reverse engineered from multi-omics profiles, e.g., gene expression, chromatin accessibility, and chromosome conformation assays, using multiplex PageRank.
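
As a rough illustration of the ranking idea only, the sketch below runs plain (static) PageRank by power iteration on a small directed network. It is not the temporal or multiplex variant named in the abstract, and the node names, edge list, edge direction, and damping factor are all hypothetical choices made for the example.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Plain PageRank by power iteration on a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Every other node splits its rank evenly among its targets.
        for u in nodes:
            for v in out_links[u]:
                new_rank[v] += damping * rank[u] / len(out_links[u])
        converged = sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical toy network: an edge (a, b) means "a points to b".
toy_edges = [("TF_A", "TF_B"), ("TF_A", "TF_C"), ("TF_B", "TF_C"), ("TF_C", "TF_A")]
scores = pagerank(toy_edges)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `ranked` orders nodes by how much rank flows into them; whether edges should run from regulator to target or the reverse is a modeling decision that the abstract does not specify.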

11.
Dev Cell ; 56(3): 292-309.e9, 2021 02 08.
Article in English | MEDLINE | ID: mdl-33321106

ABSTRACT

Haploinsufficiency of transcriptional regulators causes human congenital heart disease (CHD); however, the underlying CHD gene regulatory network (GRN) imbalances are unknown. Here, we define transcriptional consequences of reduced dosage of the CHD transcription factor, TBX5, in individual cells during cardiomyocyte differentiation from human induced pluripotent stem cells (iPSCs). We discovered highly sensitive dysregulation of TBX5-dependent pathways (including lineage decisions and genes associated with heart development, cardiomyocyte function, and CHD genetics) in discrete subpopulations of cardiomyocytes. Spatial transcriptomic mapping revealed chamber-restricted expression for many TBX5-sensitive transcripts. GRN analysis indicated that cardiac network stability, including vulnerable CHD-linked nodes, is sensitive to TBX5 dosage. A GRN-predicted genetic interaction between Tbx5 and Mef2c, manifesting as ventricular septation defects, was validated in mice. These results demonstrate exquisite and diverse sensitivity to TBX5 dosage in heterogeneous subsets of iPSC-derived cardiomyocytes and predict candidate GRNs for human CHDs, with implications for quantitative transcriptional regulation in disease.


Subject(s)
Gene Regulatory Networks , Haploinsufficiency/genetics , Heart Defects, Congenital/genetics , Models, Biological , T-Box Domain Proteins/genetics , Animals , Body Patterning/genetics , Cell Differentiation , Gene Dosage , Heart Ventricles/pathology , Humans , MEF2 Transcription Factors/metabolism , Mice , Mutation/genetics , Myocytes, Cardiac/metabolism , Transcription, Genetic
12.
Prostate Cancer Prostatic Dis ; 24(1): 81-87, 2021 03.
Article in English | MEDLINE | ID: mdl-32286548

ABSTRACT

BACKGROUND: Metastatic disease burden out of proportion to serum PSA has been used as a marker of aggressive phenotype prostate cancer but is not well defined as a distinct subgroup. We sought to prospectively characterize the molecular features and clinical outcomes of Low PSA Secretors. METHODS: Eligible metastatic castration resistant prostate cancer (mCRPC) patients without prior small cell histology underwent metastatic tumor biopsy with molecular characterization. Low PSA secretion was defined as serum PSA < 2, 5, or 10 ng/mL plus >5 metastases with radiographic progression at study entry. Clinical and molecular features were compared between low PSA vs. normal secretors in a post-hoc fashion. RESULTS: 183 patients were enrolled, including 15 (8%) identified as Low PSA Secretors using optimal PSA cut point of 5 ng/mL. Biopsies from Low PSA Secretors demonstrated higher t-SCNC and RB1 loss and lower AR transcriptional signature scores compared with normal secretors. Genomic loss of RB1 and/or TP53 was more common in Low PSA Secretors (80% vs. 41%). Overall survival (OS) was shorter in Low PSA Secretors (median OS = 26.7 vs. 46.0 months, hazard ratio = 2.465 [95% CI: 0.982-6.183]). Progression-free survival (PFS) on post-biopsy treatment with AR-targeted therapy was shorter than with chemotherapy (median PFS 6.2 vs. 4.1 months). CONCLUSIONS: Low PSA secretion in relation to metastatic tumor burden may be a readily available clinical selection tool for de-differentiated mCRPC with molecular features consistent with t-SCNC. Prospective validation is warranted.


Subject(s)
Adenocarcinoma/blood , Neoplasm Staging , Prostatic Neoplasms, Castration-Resistant/blood , Retinoblastoma Binding Proteins/genetics , Tumor Suppressor Protein p53/genetics , Ubiquitin-Protein Ligases/genetics , Adenocarcinoma/genetics , Adenocarcinoma/secondary , Aged , Aged, 80 and over , Biomarkers, Tumor/blood , Biopsy , DNA, Neoplasm/genetics , Disease-Free Survival , Female , Follow-Up Studies , Genomics , Humans , Male , Middle Aged , Neoplasm Metastasis , Prostate-Specific Antigen/blood , Prostatic Neoplasms, Castration-Resistant/genetics , Prostatic Neoplasms, Castration-Resistant/pathology , Retinoblastoma Binding Proteins/metabolism , Retrospective Studies , Tumor Suppressor Protein p53/metabolism , Ubiquitin-Protein Ligases/metabolism
13.
Clin Cancer Res ; 26(17): 4616-4624, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32727885

ABSTRACT

PURPOSE: The purpose of this study was to measure genomic changes that emerge with enzalutamide treatment using analyses of whole-genome sequencing and RNA sequencing. EXPERIMENTAL DESIGN: One hundred and one tumors from men with metastatic castration-resistant prostate cancer (mCRPC) who had not been treated with enzalutamide (n = 64) or who had enzalutamide-resistant mCRPC (n = 37) underwent whole genome sequencing. Ninety-nine of these tumors also underwent RNA sequencing. We analyzed the genomes and transcriptomes of these mCRPC tumors. RESULTS: Copy number loss was more common than gain in enzalutamide-resistant tumors. Specifically, we identified 124 protein-coding genes that were more commonly lost in enzalutamide-resistant samples. These 124 genes included eight putative tumor suppressors located at nine distinct genomic regions. We demonstrated that focal deletion of the 17q22 locus that includes RNF43 and SRSF1 was not present in any patient with enzalutamide-naïve mCRPC but was present in 16% (6/37) of patients with enzalutamide-resistant mCRPC. 17q22 loss was associated with lower RNF43 and SRSF1 expression and poor overall survival from time of biopsy [median overall survival of 19.3 months in 17q22 intact vs. 8.9 months in 17q22 loss; HR, 3.44; 95% confidence interval (CI), 1.338-8.867; log-rank P = 0.006]. Finally, 17q22 loss was linked with activation of several targetable factors, including CDK1/2, Akt, and PLK1, demonstrating the potential therapeutic relevance of 17q22 loss in mCRPC. CONCLUSIONS: Copy number loss is common in enzalutamide-resistant tumors. Focal deletion of chromosome 17q22 defines a previously unappreciated molecular subset of enzalutamide-resistant mCRPC associated with poor clinical outcome.


Subject(s)
Benzamides/pharmacology , Biomarkers, Tumor/genetics , Chromosomes, Human, Pair 17/genetics , Drug Resistance, Neoplasm/genetics , Nitriles/pharmacology , Phenylthiohydantoin/pharmacology , Prostatic Neoplasms, Castration-Resistant/genetics , Benzamides/therapeutic use , Biopsy , DNA Copy Number Variations , Disease-Free Survival , Humans , Male , Nitriles/therapeutic use , Phenylthiohydantoin/therapeutic use , Prostate/pathology , Prostatic Neoplasms, Castration-Resistant/drug therapy , Prostatic Neoplasms, Castration-Resistant/mortality , Prostatic Neoplasms, Castration-Resistant/pathology , RNA-Seq , Survival Analysis
14.
Urol Oncol ; 38(12): 931.e9-931.e16, 2020 12.
Article in English | MEDLINE | ID: mdl-32624423

ABSTRACT

OBJECTIVES: The net oncogenic effect of β2-adrenergic receptor ADRB2, whose downstream elements induce neuroendocrine differentiation and whose expression is regulated by EZH2, is unclear. ADRB2 expression and associated clinical outcomes in metastatic castration-resistant prostate cancer (mCRPC) are unknown. METHODS AND MATERIALS: This was a retrospective analysis of a multi-center, prospectively enrolled cohort of mCRPC patients. Metastatic biopsies were obtained at progression, and specimens underwent laser capture microdissection and RNA-seq. ADRB2 expression was stratified by histology and clustering based on unsupervised hierarchical transcriptome analysis and correlated with EZH2 expression; an external dataset was used for validation. The association between ADRB2 expression and overall survival (OS) was assessed by log-rank test and a multivariable Cox proportional hazard model. RESULTS: One hundred and twenty-seven patients with progressive mCRPC had sufficient metastatic tumor for RNA-seq. ADRB2 expression was lowest in the small cell-enriched transcriptional cluster (P < 0.01) and correlated inversely with EZH2 expression (r = -0.28, P < 0.01). These findings were validated in an external cohort enriched for neuroendocrine differentiation. Patients with tumors harboring low ADRB2 expression (lowest quartile) had a shorter median OS than those with higher (9.5 vs. 20.5 months, P = 0.02). In multivariable analysis, low ADRB2 expression was associated with a trend toward shorter OS (HR for death = 1.54, 95%CI 0.98-2.44). Conversely, higher expression of upstream transcriptional regulator EZH2 was associated with shortened OS (HR for death = 3.01, 95%CI 1.12-8.09). CONCLUSIONS: Low ADRB2 expression is associated with neuroendocrine differentiation and is associated with shortened survival. EZH2 is a potential therapeutic target for preventing neuroendocrine transdifferentiation and improving outcomes in mCRPC. Further studies of agents targeting β-adrenergic signaling are warranted.


Subject(s)
Carcinoma, Neuroendocrine/genetics , Carcinoma, Small Cell/genetics , Gene Expression Regulation, Neoplastic , Prostatic Neoplasms, Castration-Resistant/genetics , Aged , Aged, 80 and over , Carcinoma, Neuroendocrine/mortality , Carcinoma, Small Cell/mortality , Down-Regulation , Humans , Male , Middle Aged , Prostatic Neoplasms, Castration-Resistant/mortality , Receptors, Adrenergic, beta-2 , Retrospective Studies , Survival Rate
15.
Proc Natl Acad Sci U S A ; 117(22): 12315-12323, 2020 06 02.
Article in English | MEDLINE | ID: mdl-32424106

ABSTRACT

The androgen receptor (AR) antagonist enzalutamide is one of the principal treatments for men with castration-resistant prostate cancer (CRPC). However, not all patients respond, and resistance mechanisms are largely unknown. We hypothesized that genomic and transcriptional features from metastatic CRPC biopsies prior to treatment would be predictive of de novo treatment resistance. To this end, we conducted a phase II trial of enzalutamide treatment (160 mg/d) in 36 men with metastatic CRPC. Thirty-four patients were evaluable for the primary end point of a prostate-specific antigen (PSA)50 response (PSA decline ≥50% at 12 wk vs. baseline). Nine patients were classified as nonresponders (PSA decline <50%), and 25 patients were classified as responders (PSA decline ≥50%). Failure to achieve a PSA50 was associated with shorter progression-free survival, time on treatment, and overall survival, demonstrating PSA50's utility. Targeted DNA-sequencing was performed on 26 of 36 biopsies, and RNA-sequencing was performed on 25 of 36 biopsies that contained sufficient material. Using computational methods, we measured AR transcriptional function and performed gene set enrichment analysis (GSEA) to identify pathways whose activity state correlated with de novo resistance. TP53 gene alterations were more common in nonresponders, although this did not reach statistical significance (P = 0.055). AR gene alterations and AR expression were similar between groups. Importantly, however, transcriptional measurements demonstrated that specific gene sets (including those linked to low AR transcriptional activity and a stemness program) were activated in nonresponders. Our results suggest that patients whose tumors harbor this program should be considered for clinical trials testing rational agents to overcome de novo enzalutamide resistance.


Subject(s)
Antineoplastic Agents/administration & dosage , Drug Resistance, Neoplasm , Phenylthiohydantoin/analogs & derivatives , Prostatic Neoplasms, Castration-Resistant/genetics , Receptors, Androgen/administration & dosage , Receptors, Androgen/genetics , Aged , Aged, 80 and over , Benzamides , Gene Expression Profiling , Humans , Male , Middle Aged , Nitriles , Phenylthiohydantoin/administration & dosage , Prostate-Specific Antigen/metabolism , Prostatic Neoplasms, Castration-Resistant/drug therapy , Prostatic Neoplasms, Castration-Resistant/metabolism , Receptors, Androgen/metabolism
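
The PSA50 end point quoted in the abstract above (PSA decline ≥50% at 12 wk vs. baseline) is simple arithmetic, sketched below. The function names and the PSA values are hypothetical and are not taken from the trial; the sketch assumes a single baseline and a single 12-week measurement per patient.

```python
def psa_percent_decline(baseline, week12):
    """Percent decline in PSA from baseline to the 12-week measurement."""
    if baseline <= 0:
        raise ValueError("baseline PSA must be positive")
    return 100.0 * (baseline - week12) / baseline


def is_psa50_responder(baseline, week12):
    """True when the decline meets the >=50% threshold quoted in the abstract."""
    return psa_percent_decline(baseline, week12) >= 50.0


# Hypothetical patients: (baseline PSA, 12-week PSA), both in ng/mL.
patients = {"P1": (40.0, 10.0), "P2": (40.0, 21.0), "P3": (12.0, 6.0)}
responders = sorted(p for p, (b, w) in patients.items() if is_psa50_responder(b, w))
```

A rising PSA yields a negative decline and is therefore classified as a nonresponse under this definition.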
16.
JCO Clin Cancer Inform ; 4: 147-159, 2020 02.
Article in English | MEDLINE | ID: mdl-32097025

ABSTRACT

PURPOSE: The analysis of cancer biology data involves extremely heterogeneous data sets, including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenetic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrated data set analysis. METHODS: We introduce the BioMedical Evidence Graph (BMEG), a graph database and query engine for discovery and analysis of cancer biology. The BMEG is unique from other biologic data graphs in that sample-level molecular and clinical information is connected to reference knowledge bases. It combines gene expression and mutation data with drug-response experiments, pathway information databases, and literature-derived associations. RESULTS: The construction of the BMEG has resulted in a graph containing > 41 million vertices and 57 million edges. The BMEG system provides a graph query-based application programming interface to enable analysis, with client code available for Python, JavaScript, and R, and a server online at bmeg.io. Using this system, we have demonstrated several forms of cross-data set analysis to show the utility of the system. CONCLUSION: The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug-response machine learning, patient-level knowledge-base queries, and pathway level analysis. We have compared the resulting graph to other available integrated graph systems and demonstrated that the BMEG is unique in the scale of the graph and the type of data it makes available.


Subject(s)
Antineoplastic Agents/therapeutic use , Biomarkers, Tumor/genetics , Computational Biology/methods , Gene Expression Regulation, Neoplastic/drug effects , Medical Informatics , Neoplasms/diagnosis , Neoplasms/drug therapy , Computer Graphics , Databases, Factual , Gene Regulatory Networks , Humans , Neoplasms/genetics , Signal Transduction
17.
Nat Commun ; 11(1): 729, 2020 02 05.
Article in English | MEDLINE | ID: mdl-32024854

ABSTRACT

The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project. This work was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit gene expression signatures similar to those of samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.


Subject(s)
Gene Expression Regulation, Neoplastic , Mutation , Neoplasms/genetics , RNA Splicing , Chromatin Assembly and Disassembly , Computational Biology/methods , Databases, Genetic , Genome, Human , Humans , Metabolic Networks and Pathways/genetics , Neoplasms/metabolism , Promoter Regions, Genetic
18.
Nat Biotechnol ; 38(1): 97-107, 2020 01.
Article in English | MEDLINE | ID: mdl-31919445

ABSTRACT

Tumor DNA sequencing data can be interpreted by computational methods that analyze genomic heterogeneity to infer evolutionary dynamics. A growing number of studies have used these approaches to link cancer evolution with clinical progression and response to therapy. Although the inference of tumor phylogenies is rapidly becoming standard practice in cancer genome analyses, standards for evaluating them are lacking. To address this need, we systematically assess methods for reconstructing tumor subclonality. First, we elucidate the main algorithmic problems in subclonal reconstruction and develop quantitative metrics for evaluating them. Then we simulate realistic tumor genomes that harbor all known clonal and subclonal mutation types and processes. Finally, we benchmark 580 tumor reconstructions, varying tumor read depth, tumor type and somatic variant detection. Our analysis provides a baseline for the establishment of gold-standard methods to analyze tumor heterogeneity.


Subject(s)
Algorithms , Neoplasms/pathology , Clone Cells , Computer Simulation , DNA Copy Number Variations/genetics , Gene Dosage , Genome , Humans , Mutation/genetics , Neoplasms/genetics , Polymorphism, Single Nucleotide/genetics , Reference Standards
19.
Pac Symp Biocomput ; 25: 343-354, 2020.
Article in English | MEDLINE | ID: mdl-31797609

ABSTRACT

Cancer genome projects have produced multidimensional datasets on thousands of samples. Yet, depending on the tumor type, 5-50% of samples have no known driving event. We introduce a semi-supervised method called Learning UnRealized Events (LURE) that uses a progressive label learning framework and minimum spanning analysis to predict cancer drivers based on their altered samples sharing a gene expression signature with the samples of a known event. We demonstrate the utility of the method on the TCGA Pan-Cancer Atlas dataset for which it produced a high-confidence result relating 59 new connections to 18 known mutation events including alterations in the same gene, family, and pathway. We give examples of predicted drivers involved in TP53, telomere maintenance, and MAPK/RTK signaling pathways. LURE identifies connections between genes with no known prior relationship, some of which may offer clues for targeting specific forms of cancer. Code and Supplemental Material are available on the LURE website: https://sysbiowiki.soe.ucsc.edu/lure.


Subject(s)
Computational Biology , Neoplasms , Humans , Mutation , Neoplasms/genetics
20.
Nat Commun ; 10(1): 4899, 2019 10 25.
Article in English | MEDLINE | ID: mdl-31653878

ABSTRACT

The maintenance and transition of cellular states are controlled by biological processes. Here we present a gene set-based transformation of single cell RNA-Seq data into biological process activities that provides a robust description of cellular states. Moreover, as these activities represent species-independent descriptors, they facilitate the alignment of single cell states across different organisms.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Animals , Gene Expression , Gene Expression Regulation, Developmental/genetics , Humans , Leukocytes, Mononuclear/metabolism , Mice , Mouse Embryonic Stem Cells/metabolism , Signal-To-Noise Ratio , Single-Cell Analysis/methods , Systems Biology , Zebrafish/genetics
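
One simple way to picture a gene set-based transformation of the kind described in the abstract above is to replace each cell's gene-level profile with one score per gene set. The sketch below uses the mean z-scored expression of each set's member genes, which is an assumption made for illustration; the abstract does not say how activities are computed, and the gene names, set names, and expression values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_rows(expr):
    """Z-score each gene across cells; constant genes become all zeros."""
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def gene_set_activity(expr, gene_sets):
    """Map a gene-by-cell table to a gene-set-by-cell table of mean z-scores."""
    z = zscore_rows(expr)
    n_cells = len(next(iter(expr.values())))
    activity = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in z]  # ignore genes not measured
        if present:
            activity[name] = [mean(z[g][c] for g in present) for c in range(n_cells)]
    return activity


# Hypothetical expression table: gene -> expression in four cells.
expr = {"G1": [0, 0, 4, 4], "G2": [1, 1, 3, 3], "G3": [5, 5, 1, 1]}
gene_sets = {"set_up": ["G1", "G2"], "set_down": ["G3", "G_missing"]}
activity = gene_set_activity(expr, gene_sets)
```

Because the scores are attached to gene sets rather than to individual genes, the same sets can in principle be scored in another species through orthologous genes, which is the kind of cross-organism comparison the abstract describes.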