ABSTRACT
Tailoring optimal treatment for individual cancer patients remains a significant challenge. To address this issue, we developed PERCEPTION (PERsonalized Single-Cell Expression-Based Planning for Treatments In ONcology), a precision oncology computational pipeline. Our approach uses publicly available matched bulk and single-cell (sc) expression profiles from large-scale cell-line drug screens. These profiles help build treatment response models based on patients' sc-tumor transcriptomics. PERCEPTION demonstrates success in predicting responses to targeted therapies in cultured and patient-tumor-derived primary cells, as well as in two clinical trials for multiple myeloma and breast cancer. It also captures the development of resistance in patients with lung cancer treated with tyrosine kinase inhibitors. PERCEPTION outperforms published state-of-the-art sc-based and bulk-based predictors in all clinical cohorts. PERCEPTION is accessible at https://github.com/ruppinlab/PERCEPTION. Our work, showcasing patient stratification using sc-expression profiles of their tumors, will encourage the adoption of sc-omics profiling in clinical settings, enhancing precision oncology tools based on sc-omics.
Subjects
Drug Resistance, Neoplasm; Precision Medicine; Single-Cell Analysis; Transcriptome; Humans; Single-Cell Analysis/methods; Precision Medicine/methods; Drug Resistance, Neoplasm/genetics; Neoplasms/genetics; Neoplasms/drug therapy; Gene Expression Profiling/methods; Female; Lung Neoplasms/genetics; Lung Neoplasms/drug therapy; Gene Expression Regulation, Neoplastic; Cell Line, Tumor; Computational Biology/methods
ABSTRACT
Mining a large cohort of single-cell transcriptomics data, here we employ combinatorial optimization techniques to chart the landscape of optimal combination therapies in cancer. We assume that each individual therapy can target any one of 1269 genes encoding cell surface receptors, which may be targets of CAR-T, conjugated antibodies or coated nanoparticle therapies. We find that in most cancer types, personalized combinations composed of at most four targets are then sufficient for killing at least 80% of tumor cells while sparing at least 90% of nontumor cells in the tumor microenvironment. However, as more stringent and selective killing is required, the number of targets needed rises rapidly. Emerging individual targets include PTPRZ1 for brain and head and neck cancers and EGFR in multiple tumor types. In sum, this study provides a computational estimate of the identity and number of targets needed in combination to target cancers selectively and precisely.
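The selection criterion described above (kill at least 80% of tumor cells while sparing at least 90% of nontumor cells, with at most four targets) can be illustrated with a minimal sketch. This is not the authors' optimization formulation: it is a brute-force search over small combinations, and it assumes that a cell is killed whenever it expresses at least one chosen target. The function names and the toy cells are hypothetical.

```python
from itertools import combinations

def killed_fraction(cells, targets):
    """Fraction of cells expressing at least one chosen target (assumed kill rule)."""
    if not cells:
        return 0.0
    hit = sum(1 for expressed in cells if expressed & targets)
    return hit / len(cells)

def smallest_combination(tumor, nontumor, genes,
                         min_kill=0.8, min_spare=0.9, max_size=4):
    """Return the first smallest target set meeting both thresholds, or None."""
    for size in range(1, max_size + 1):
        for combo in combinations(genes, size):
            chosen = frozenset(combo)
            kill = killed_fraction(tumor, chosen)
            spare = 1.0 - killed_fraction(nontumor, chosen)
            if kill >= min_kill and spare >= min_spare:
                return chosen
    return None

# Toy data (hypothetical): each cell is the set of surface-receptor genes it expresses.
tumor = [{"EGFR"}, {"EGFR", "PTPRZ1"}, {"PTPRZ1"}, {"PTPRZ1"}, {"EGFR", "CD44"}]
nontumor = [{"CD44"}] * 5 + [{"EGFR"}] + [set()] * 14
genes = ["CD44", "EGFR", "PTPRZ1"]

best = smallest_combination(tumor, nontumor, genes)
print(sorted(best))
```

Exhaustive search is only practical for a handful of candidate genes; with 1269 candidates a dedicated combinatorial solver would be needed, which is consistent with the abstract's mention of combinatorial optimization techniques.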
Subjects
Head and Neck Neoplasms; Tumor Microenvironment; Head and Neck Neoplasms/drug therapy; Head and Neck Neoplasms/genetics; Humans; Receptor-Like Protein Tyrosine Phosphatases, Class 5
ABSTRACT
The mammalian male-specific Y chromosome plays a critical role in sex determination and male fertility. However, because of its repetitive and haploid nature, it is frequently absent from genome assemblies and remains enigmatic. The Y chromosomes of great apes represent a particular puzzle: their gene content is more similar between human and gorilla than between human and chimpanzee, even though human and chimpanzee share a more recent common ancestor. To solve this puzzle, here we constructed a dataset including Ys from all extant great ape genera. We generated assemblies of bonobo and orangutan Ys from short and long sequencing reads and aligned them with the publicly available human, chimpanzee, and gorilla Y assemblies. Analyzing this dataset, we found that the genus Pan, which includes chimpanzee and bonobo, experienced accelerated substitution rates. Pan also exhibited elevated gene death rates. These observations are consistent with high levels of sperm competition in Pan. Furthermore, we inferred that the great ape common ancestor already possessed multicopy sequences homologous to most human and chimpanzee palindromes. Nonetheless, each species also acquired distinct ampliconic sequences. We also detected increased chromatin contacts between and within palindromes (from Hi-C data), likely facilitating gene conversion and structural rearrangements. Our results highlight the dynamic mode of Y chromosome evolution and open avenues for studies of male-specific dispersal in endangered great ape species.
Subjects
Hominidae/genetics; Y Chromosome/genetics; Animals; Biological Evolution; Evolution, Molecular; Gene Conversion; Gorilla gorilla/genetics; Humans; Pan paniscus/genetics; Pan troglodytes/genetics; Pongo/genetics; Sequence Analysis, DNA
ABSTRACT
Multicopy ampliconic gene families on the Y chromosome play an important role in spermatogenesis. Thus, studying their genetic variation in endangered great ape species is critical. We estimated the sizes (copy number) of nine Y ampliconic gene families in population samples of chimpanzee, bonobo, and orangutan with droplet digital polymerase chain reaction, combined these estimates with published data for human and gorilla, and produced genome-wide testis gene expression data for great apes. Analyzing this comprehensive data set within an evolutionary framework, we first found high inter- and intraspecific variation in gene family size, with larger families exhibiting higher variation than smaller families, a pattern consistent with random genetic drift. Second, for four gene families, we observed significant interspecific size differences, sometimes even between the sister species chimpanzee and bonobo. Third, despite substantial variation in copy number, Y ampliconic gene families' expression levels did not differ significantly among species, suggesting dosage regulation. Fourth, for three gene families, size was positively correlated with gene expression levels across species, suggesting that, given sufficient evolutionary time, copy number influences gene expression. Our results indicate high variability in size but conservation in gene expression levels in Y ampliconic gene families, significantly advancing our understanding of Y-chromosome evolution in great apes.
Subjects
Biological Evolution; Gene Dosage; Gene Expression; Hominidae/genetics; Y Chromosome; Animals; Hominidae/metabolism; Male; Multigene Family
ABSTRACT
The Y chromosome harbors nine multi-copy ampliconic gene families expressed exclusively in testis. The gene copies within each family are >99% identical to each other, which poses a major challenge in evaluating their copy number. Recent studies demonstrated high variation in Y ampliconic gene copy number among humans. However, how this variation affects expression levels in human testis remains understudied. Here we developed a novel computational tool, Ampliconic Copy Number Estimator (AmpliCoNE), that utilizes read sequencing depth information to estimate Y ampliconic gene copy number per family. We applied this tool to whole-genome sequencing data of 149 men with matched testis expression data whose samples are part of the Genotype-Tissue Expression (GTEx) project. We found that the Y ampliconic gene families with low copy number in humans were deleted or pseudogenized in non-human great apes, suggesting relaxation of functional constraints. Among the Y ampliconic gene families, higher copy number leads to higher expression. Within the Y ampliconic gene families, copy number does not influence gene expression; rather, a high tolerance for variation in gene expression was observed in testis of presumably healthy men. No differences in gene expression levels were found among major Y haplogroups. Age positively correlated with expression levels of the HSFY and PRY gene families in the African subhaplogroup E1b, but not in the European subhaplogroups R1b and I1. We also found that expression of five Y ampliconic gene families is coordinated with that of their non-Y (i.e. X or autosomal) homologs. Indeed, five ampliconic gene families had consistently lower expression levels when compared to their non-Y homologs, suggesting dosage regulation, while the HSFY family had higher expression levels than its X homolog and thus lacked dosage regulation.
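The abstract states only that AmpliCoNE uses read sequencing depth to estimate copy number per family. A minimal sketch of the general depth-ratio idea follows, assuming that depth over family-specific positions scales linearly with copy number and that a set of single-copy control positions provides the per-copy baseline. The function name, inputs, and example depths are hypothetical and do not reproduce AmpliCoNE's actual procedure.

```python
from statistics import mean

def estimate_copy_number(family_depths, control_depths):
    """Depth-ratio estimate: mean family depth divided by mean single-copy depth."""
    baseline = mean(control_depths)
    if baseline <= 0:
        raise ValueError("control regions must have positive mean depth")
    return mean(family_depths) / baseline

# Hypothetical per-position read depths for one gene family and for control regions.
family_depths = [58, 62, 60, 60]
control_depths = [15, 15, 14, 16]

print(estimate_copy_number(family_depths, control_depths))
```

Real data would also need corrections that this sketch ignores, such as sequencing biases and ambiguous read mapping among near-identical copies.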
Subjects
Chromosomes, Human, Y/genetics; Genes, Y-Linked/genetics; Sequence Analysis, DNA/methods; Animals; Chromosomes, Human, Y/physiology; DNA Copy Number Variations/genetics; Databases, Genetic; Dosage Compensation, Genetic/genetics; Dosage Compensation, Genetic/physiology; Epigenesis, Genetic/genetics; Gene Dosage/genetics; Gene Expression/genetics; Gene Expression Regulation/genetics; Genes, Y-Linked/physiology; Heat Shock Transcription Factors/genetics; Heat Shock Transcription Factors/metabolism; Humans; Male; Multigene Family/genetics; Testis/metabolism
ABSTRACT
BACKGROUND: Next-generation sequencing requires sufficient DNA to be available. If limited, whole-genome amplification is applied to generate additional amounts of DNA. Such amplification often results in many chimeric DNA fragments, in particular artificial palindromic sequences, which limit the usefulness of long sequencing reads. RESULTS: Here, we present Pacasus, a tool for correcting such errors. Two datasets show that it markedly improves read mapping and de novo assembly, yielding results similar to those that would be obtained with non-amplified DNA. CONCLUSIONS: With Pacasus, long-read technologies become available for sequencing targets with very small amounts of DNA, such as single cells or even single chromosomes.
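To make the notion of an artificial palindromic read concrete, the sketch below checks whether the end of a read is the exact reverse complement of its start and, if so, trims the duplicated arm. It is an illustration under simplifying assumptions (exact matches only, no sequencing errors) and is not the Pacasus algorithm; the function names and the `min_arm` parameter are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def palindrome_arm_length(read, min_arm=4):
    """Largest k such that the last k bases are the reverse complement of the first k."""
    for k in range(len(read) // 2, min_arm - 1, -1):
        if read[-k:] == reverse_complement(read[:k]):
            return k
    return 0

def trim_palindrome(read, min_arm=4):
    """Drop the trailing arm of a palindromic chimera; leave other reads unchanged."""
    k = palindrome_arm_length(read, min_arm)
    return read[:len(read) - k] if k else read

chimeric = "GATTACAG" + reverse_complement("GATTACAG")
print(trim_palindrome(chimeric))
print(trim_palindrome("GATTACAGGATTACAG"))
```

Long reads carry substantial error rates, so a practical tool would need approximate matching (for example, local alignment of a read against its own reverse complement) rather than the exact comparison used here.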
Subjects
Arabidopsis/genetics; DNA/analysis; Gorilla gorilla/genetics; Nucleotides/genetics; Sequence Analysis, DNA/methods; Whole Genome Sequencing/methods; Y Chromosome/genetics; Algorithms; Animals; DNA/genetics; Research Design
ABSTRACT
SUMMARY: Technological advances in high-throughput sequencing necessitate improved computational tools for processing and analyzing large-scale datasets in a systematic automated manner. For that purpose, we have developed PRADA (Pipeline for RNA-Sequencing Data Analysis), a flexible, modular and highly scalable software platform that provides many different types of information available by multifaceted analysis starting from raw paired-end RNA-seq data: gene expression levels, quality metrics, detection of unsupervised and supervised fusion transcripts, detection of intragenic fusion variants, homology scores and fusion frame classification. PRADA uses a dual-mapping strategy that increases sensitivity and refines the analytical endpoints. PRADA has been used extensively and successfully in the glioblastoma and renal clear cell projects of The Cancer Genome Atlas program. AVAILABILITY AND IMPLEMENTATION: http://sourceforge.net/projects/prada/ CONTACT: gadgetz@broadinstitute.org or rverhaak@mdanderson.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Sequence Analysis, RNA/methods; Software; Statistics as Topic/methods; Base Sequence; Gene Fusion; Genome, Human/genetics; Humans; Neoplasms/genetics; Oligonucleotide Array Sequence Analysis; RNA, Messenger/genetics; RNA, Messenger/metabolism
ABSTRACT
Halogen bonding interactions between halogenated ligands and proteins were examined using the crystal structures deposited to date in the PDB. The data was analyzed as a function of halogen bonding to main chain Lewis bases, viz. oxygen of backbone carbonyl and backbone amide nitrogen. This analysis also examined halogen bonding to side-chain Lewis bases (O, N, and S) and to the electron-rich aromatic amino acids. All interactions were restricted to van der Waals radii with respective atoms. The data reveals that while fluorine and chlorine have strong tendencies favoring interactions with the backbone Lewis bases at glycine, the trend is not restricted to the achiral amino acid backbone for larger halogens. Halogen side-chain interactions are not restricted to amino acids containing O, N, and S as Lewis bases. Electron-rich aromatic amino acids host a high frequency of halogen bonds as does Leu. A closer examination of the latter hydrophobic side chain reveals that the "propensity of interactions" of halogen ligands at this oily residue is an outcome of strong classical halogen bonds with Lewis bases in the vicinity. Finally, an examination of Θ1 (C-X···O and C-X···N) and Θ2 (X···O-Z and X···N-Z) angles reveals that very few ligands adopt classical halogen bonding angles, suggesting that steric and other factors may influence these angles. The data is discussed in the context of ligand design for pharmaceutical applications.
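The two angles named above can be computed directly from atomic coordinates: Θ1 is the angle at the halogen X in C-X···O (or C-X···N), and Θ2 is the angle at the acceptor atom in X···O-Z (or X···N-Z). The following is a minimal geometric sketch; the coordinates are invented for illustration and the helper name is hypothetical.

```python
import math

def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees for 3D points given as (x, y, z) tuples."""
    va = [p - q for p, q in zip(a, vertex)]
    vb = [p - q for p, q in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    if norm == 0:
        raise ValueError("points must be distinct from the vertex")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))

# Hypothetical coordinates (in angstroms) for C, X (halogen), O (acceptor), Z (atom bonded to O).
c, x, o, z = (-2.0, 0.0, 0.0), (0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 1.0, 0.0)

theta1 = angle_deg(c, x, o)  # C-X...O, vertex at the halogen
theta2 = angle_deg(x, o, z)  # X...O-Z, vertex at the acceptor
print(round(theta1, 1), round(theta2, 1))
```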
Subjects
Amino Acids/chemistry; Electrons; Lewis Bases/chemistry; Proteins/chemistry; Databases, Protein; Drug Design; Hydrophobic and Hydrophilic Interactions; Kinetics; Ligands; Nitrogen/chemistry; Oxygen/chemistry; Sulfur/chemistry; Thermodynamics
ABSTRACT
We describe the landscape of somatic genomic alterations based on multidimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs). We identify several novel mutated genes as well as complex rearrangements of signature receptors, including EGFR and PDGFRA. TERT promoter mutations are shown to correlate with elevated mRNA expression, supporting a role in telomerase reactivation. Correlative analyses confirm that the survival advantage of the proneural subtype is conferred by the G-CIMP phenotype, and MGMT DNA methylation may be a predictive biomarker for treatment response only in classical subtype GBM. Integrative analysis of genomic and proteomic profiles challenges the notion of therapeutic inhibition of a pathway as an alternative to inhibition of the target itself. These data will facilitate the discovery of therapeutic and diagnostic target candidates, the validation of research and clinical observations and the generation of unanticipated hypotheses that can advance our molecular understanding of this lethal cancer.
Subjects
Brain Neoplasms/genetics; Glioblastoma/genetics; Brain Neoplasms/metabolism; Female; Gene Expression Profiling; Gene Regulatory Networks; Glioblastoma/metabolism; Humans; Male; Mutation; Proteome/analysis; Signal Transduction
ABSTRACT
Infiltrating stromal and immune cells form the major fraction of normal cells in tumour tissue and not only perturb the tumour signal in molecular studies but also have an important role in cancer biology. Here we describe 'Estimation of STromal and Immune cells in MAlignant Tumours using Expression data' (ESTIMATE)--a method that uses gene expression signatures to infer the fraction of stromal and immune cells in tumour samples. ESTIMATE scores correlate with DNA copy number-based tumour purity across samples from 11 different tumour types, profiled on Agilent, Affymetrix platforms or based on RNA sequencing and available through The Cancer Genome Atlas. The prediction accuracy is further corroborated using 3,809 transcriptional profiles available elsewhere in the public domain. The ESTIMATE method allows consideration of tumour-associated normal cells in genomic and transcriptomic studies. An R-library is available on https://sourceforge.net/projects/estimateproject/.
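As a rough illustration of scoring a sample with a gene expression signature, the sketch below ranks the genes in one sample and compares the mean rank of signature genes with that of all other genes. This simplified rank-difference score is an assumption made for illustration and is not the scoring procedure used by ESTIMATE; the gene labels, values, and function name are hypothetical.

```python
from statistics import mean

def signature_score(expression, signature):
    """Mean rank of signature genes minus mean rank of the rest, scaled by gene count."""
    ranked = sorted(expression, key=expression.get)  # ascending expression
    rank = {gene: i + 1 for i, gene in enumerate(ranked)}
    inside = [rank[g] for g in signature if g in rank]
    outside = [r for g, r in rank.items() if g not in signature]
    if not inside or not outside:
        raise ValueError("need genes both inside and outside the signature")
    return (mean(inside) - mean(outside)) / len(ranked)

# Hypothetical single-sample expression values and a two-gene signature.
sample = {"g1": 9.0, "g2": 8.0, "g3": 1.0, "g4": 2.0}
stromal_like = {"g1", "g2"}

print(signature_score(sample, stromal_like))
```

A higher score means the signature genes sit nearer the top of the sample's expression ranking, which is the intuition behind using signatures to infer the presence of stromal or immune cells.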
Subjects
Leukocytes/metabolism; Neoplasms/genetics; Transcriptome; Algorithms; Cell Separation; DNA Copy Number Variations; Female; Gene Expression Profiling; Gene Library; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Leukocytes/cytology; Neoplasms/immunology; Neoplasms/pathology; Oligonucleotide Array Sequence Analysis; Research Design; Sensitivity and Specificity; Software; Stromal Cells/cytology; Stromal Cells/metabolism
ABSTRACT
With the advent of high-throughput sequencing technologies, much progress has been made in the identification of somatic structural rearrangements in cancer genomes. However, characterization of the complex alterations and their associated mechanisms remains inadequate. Here, we report a comprehensive analysis of whole-genome sequencing and DNA copy number data sets from The Cancer Genome Atlas to relate chromosomal alterations to imbalances in DNA dosage and describe the landscape of intragenic breakpoints in glioblastoma multiforme (GBM). Gene length, guanine-cytosine (GC) content, and local presence of a copy number alteration were closely associated with breakpoint susceptibility. A dense pattern of repeated focal amplifications involving the murine double minute 2 (MDM2)/cyclin-dependent kinase 4 (CDK4) oncogenes and associated with poor survival was identified in 5% of GBMs. Gene fusions and rearrangements were detected concomitant within the breakpoint-enriched region. At the gene level, we noted recurrent breakpoints in genes such as apoptosis regulator FAF1. Structural alterations of the FAF1 gene disrupted expression and led to protein depletion. Restoration of the FAF1 protein in glioma cell lines significantly increased the FAS-mediated apoptosis response. Our study uncovered a previously underappreciated genomic mechanism of gene deregulation that can confer growth advantages on tumor cells and may generate cancer-specific vulnerabilities in subsets of GBM.