ABSTRACT
Polypharmacology drugs (compounds that inhibit multiple proteins) have many applications but are difficult to design. To address this challenge, we have developed POLYGON, an approach to polypharmacology based on generative reinforcement learning. POLYGON embeds chemical space and iteratively samples it to generate new molecular structures; these are rewarded by the predicted ability to inhibit each of two protein targets and by drug-likeness and ease of synthesis. In binding data for >100,000 compounds, POLYGON correctly recognizes polypharmacology interactions with 82.5% accuracy. We subsequently generate de novo compounds targeting ten pairs of proteins with documented co-dependency. Docking analysis indicates that top structures bind their two targets with low free energies and similar 3D orientations to canonical single-protein inhibitors. We synthesize 32 compounds targeting MEK1 and mTOR, with most yielding >50% reduction in each protein activity and in cell viability when dosed at 1-10 µM. These results support the potential of generative modeling for polypharmacology.
Subject(s)
Molecular Docking Simulation, Humans, TOR Serine-Threonine Kinases/metabolism, Polypharmacology, MAP Kinase Kinase 1/antagonists & inhibitors, MAP Kinase Kinase 1/metabolism, MAP Kinase Kinase 1/chemistry, Protein Kinase Inhibitors/pharmacology, Protein Kinase Inhibitors/chemistry, Protein Binding, Drug Discovery/methods, Drug Design, Cell Survival/drug effects
ABSTRACT
A vexing observation in genome-wide association studies (GWASs) is that parallel analyses in different species may not identify orthologous genes. Here, we demonstrate that cross-species translation of GWASs can be greatly improved by an analysis of co-localization within molecular networks. Using body mass index (BMI) as an example, we show that the genes associated with BMI in humans lack significant agreement with those identified in rats. However, the networks interconnecting these genes show substantial overlap, highlighting common mechanisms including synaptic signaling, epigenetic modification, and hormonal regulation. Genetic perturbations within these networks cause abnormal BMI phenotypes in mice, too, supporting their broad conservation across mammals. Other mechanisms appear species specific, including carbohydrate biosynthesis (humans) and glycerolipid metabolism (rodents). Finally, network co-localization also identifies cross-species convergence for height/body length. This study advances a general paradigm for determining whether and how phenotypes measured in model species recapitulate human biology.
Subject(s)
Body Mass Index, Gene Regulatory Networks, Genome-Wide Association Study, Humans, Animals, Rats, Body Size, Mice, Species Specificity
ABSTRACT
Cell-cycle control is accomplished by cyclin-dependent kinases (CDKs), motivating extensive research into CDK targeting small-molecule drugs as cancer therapeutics. Here we use combinatorial CRISPR/Cas9 perturbations to uncover an extensive network of functional interdependencies among CDKs and related factors, identifying 43 synthetic-lethal and 12 synergistic interactions. We dissect CDK perturbations using single-cell RNAseq, for which we develop a novel computational framework to precisely quantify cell-cycle effects and diverse cell states orchestrated by specific CDKs. While pairwise disruption of CDK4/6 is synthetic-lethal, only CDK6 is required for normal cell-cycle progression and transcriptional activation. Multiple CDKs (CDK1/7/9/12) are synthetic-lethal in combination with PRMT5, independent of cell-cycle control. In-depth analysis of mRNA expression and splicing patterns provides multiple lines of evidence that the CDK-PRMT5 dependency is due to aberrant transcriptional regulation resulting in premature termination. These inter-dependencies translate to drug-drug synergies, with therapeutic implications in cancer and other diseases.
Subject(s)
Neoplasms, Humans, Cell Cycle Checkpoints, Cell Cycle/genetics, Neoplasms/drug therapy, Protein-Arginine N-Methyltransferases/pharmacology
ABSTRACT
A longstanding goal of biomedicine is to understand how alterations in molecular and cellular networks give rise to the spectrum of human diseases. For diseases with shared etiology, understanding the common causes allows for improved diagnosis of each disease, development of new therapies and more comprehensive identification of disease genes. Accordingly, this protocol describes how to evaluate the extent to which two diseases, each characterized by a set of mapped genes, are colocalized in a reference gene interaction network. This procedure uses network propagation to measure the network 'distance' between gene sets. For colocalized diseases, the network can be further analyzed to extract common gene communities at progressive granularities. In particular, we show how to: (1) obtain input gene sets and a reference gene interaction network; (2) identify common subnetworks of genes that encompass or are in close proximity to all gene sets; (3) use multiscale community detection to identify systems and pathways represented by each common subnetwork to generate a network colocalized systems map; (4) validate identified genes and systems using a mouse variant database; and (5) visualize and further investigate select genes, interactions and systems for relevance to phenotype(s) of interest. We demonstrate the utility of this approach by identifying shared biological mechanisms underlying autism and congenital heart disease. However, this protocol is general and can be applied to any gene sets attributed to diseases or other phenotypes with suspected joint association. A typical NetColoc run takes less than an hour. Software and documentation are available at https://github.com/ucsd-ccbb/NetColoc .
Subject(s)
Gene Regulatory Networks, Software, Humans, Factual Databases, Computational Biology/methods
ABSTRACT
The cell is a multi-scale structure with modular organization across at least four orders of magnitude [1]. Two central approaches for mapping this structure (protein fluorescent imaging and protein biophysical association) each generate extensive datasets, but of distinct qualities and resolutions that are typically treated separately [2,3]. Here we integrate immunofluorescence images in the Human Protein Atlas [4] with affinity purifications in BioPlex [5] to create a unified hierarchical map of human cell architecture. Integration is achieved by configuring each approach as a general measure of protein distance, then calibrating the two measures using machine learning. The map, known as the multi-scale integrated cell (MuSIC 1.0), resolves 69 subcellular systems, of which approximately half are to our knowledge undocumented. Accordingly, we perform 134 additional affinity purifications and validate subunit associations for the majority of systems. The map reveals a pre-ribosomal RNA processing assembly and accessory factors, which we show govern rRNA maturation, and functional roles for SRRM1 and FAM120C in chromatin and RPS3A in splicing. By integration across scales, MuSIC increases the resolution of imaging while giving protein interactions a spatial dimension, paving the way to incorporate diverse types of data in proteome-wide cell maps.
Subject(s)
Chromosomes, Proteome, Nuclear Antigens/genetics, Nuclear Antigens/metabolism, Chromatin/genetics, Chromosomes/metabolism, Humans, Nuclear Matrix-Associated Proteins/metabolism, Proteome/metabolism, Ribosomal RNA, RNA-Binding Proteins/genetics
ABSTRACT
Cancers have been associated with a diverse array of genomic alterations. To help mechanistically understand such alterations in breast-invasive carcinoma, we applied affinity purification-mass spectrometry to delineate comprehensive biophysical interaction networks for 40 frequently altered breast cancer (BC) proteins, with and without relevant mutations, across three human breast cell lines. These networks identify cancer-specific protein-protein interactions (PPIs), interconnected and enriched for common and rare cancer mutations, that are substantially rewired by the introduction of key BC mutations. Our analysis identified BPIFA1 and SCGB2A1 as PIK3CA-interacting proteins, which repress PI3K-AKT signaling, and uncovered USP28 and UBE2N as functionally relevant interactors of BRCA1. We also show that the protein phosphatase 1 regulatory subunit spinophilin interacts with and regulates dephosphorylation of BRCA1 to promote DNA double-strand break repair. Thus, PPI landscapes provide a powerful framework for mechanistically interpreting disease genomic data and can identify valuable therapeutic targets.
Subject(s)
Breast Neoplasms/metabolism, Neoplasm Proteins/metabolism, Protein Interaction Maps, Breast Neoplasms/genetics, Tumor Cell Line, Female, Humans, Mass Spectrometry, Mutation, Neoplasm Proteins/genetics, Neoplasm Proteins/isolation & purification, Tandem Affinity Purification
ABSTRACT
A major goal of cancer research is to understand how mutations distributed across diverse genes affect common cellular systems, including multiprotein complexes and assemblies. Two challenges (how to comprehensively map such systems and how to identify which are under mutational selection) have hindered this understanding. Accordingly, we created a comprehensive map of cancer protein systems integrating both new and published multi-omic interaction data at multiple scales of analysis. We then developed a unified statistical model that pinpoints 395 specific systems under mutational selection across 13 cancer types. This map, called NeST (Nested Systems in Tumors), incorporates canonical processes and notable discoveries, including a PIK3CA-actomyosin complex that inhibits phosphatidylinositol 3-kinase signaling and recurrent mutations in collagen complexes that promote tumor proliferation. These systems can be used as clinical biomarkers and implicate a total of 548 genes in cancer evolution and progression. This work shows how disparate tumor mutations converge on protein assemblies at different scales.
Subject(s)
Neoplasm Proteins/genetics, Neoplasm Proteins/metabolism, Neoplasms/genetics, Neoplasms/metabolism, Protein Interaction Maps/genetics, Neoplasm Genes, Humans, Mutation, Protein Interaction Mapping/methods
ABSTRACT
We outline a framework for elucidating tumor genetic complexity through multidimensional protein-protein interaction maps and apply it to enhancing our understanding of head and neck squamous cell carcinoma. This network uncovers 771 interactions from cancer and noncancerous cell states, including WT and mutant protein isoforms. Prioritization of cancer-enriched interactions reveals a previously unidentified association of the fibroblast growth factor receptor tyrosine kinase 3 with Daple, a guanine-nucleotide exchange factor, resulting in activation of Gαi- and p21-activated protein kinase 1/2 to promote cancer cell migration. Additionally, we observe mutation-enriched interactions between the human epidermal growth factor receptor 3 (HER3) receptor tyrosine kinase and PIK3CA (the alpha catalytic subunit of phosphatidylinositol 3-kinase) that can inform the response to HER3 inhibition in vivo. We anticipate that the application of this framework will be valuable for translating genetic alterations into a molecular and clinical understanding of the underlying biology of many disease areas.
Subject(s)
Squamous Cell Carcinoma/metabolism, Class I Phosphatidylinositol 3-Kinases/genetics, Class I Phosphatidylinositol 3-Kinases/metabolism, Neoplasm Drug Resistance/genetics, Head and Neck Neoplasms/metabolism, Protein Interaction Maps, Animals, Squamous Cell Carcinoma/genetics, Tumor Cell Line, Cell Movement, Female, Head and Neck Neoplasms/genetics, Humans, Intracellular Signaling Peptides and Proteins/metabolism, Mice, Nude Mice, Microfilament Proteins/metabolism, Mutation, Fibroblast Growth Factor Receptor Type 3/metabolism, Xenograft Model Antitumor Assays
ABSTRACT
Despite the growing constellation of genetic loci linked to common traits, these loci have yet to account for most heritable variation, and most act through poorly understood mechanisms. Recent machine learning (ML) systems have used hierarchical biological knowledge to associate genetic mutations with phenotypic outcomes, yielding substantial predictive power and mechanistic insight. Here, we use an ontology-guided ML system to map single nucleotide variants (SNVs) focusing on 6 classic phenotypic traits in natural yeast populations. The 29 identified loci are largely novel and account for ~17% of the phenotypic variance, versus <3% for standard genetic analysis. Representative results show that sensitivity to hydroxyurea is linked to SNVs in two alternative purine biosynthesis pathways, and that sensitivity to copper arises through failure to detoxify reactive oxygen species in fatty acid metabolism. This work demonstrates a knowledge-based approach to amplifying and interpreting signals in population genetic studies.
Subject(s)
Machine Learning, Genetic Models, Multifactorial Inheritance, Benomyl/toxicity, Chromosome Mapping/methods, Chromosome Mapping/statistics & numerical data, Computational Biology, Copper/toxicity, Gene Ontology, Genome-Wide Association Study, Glucose/metabolism, Glycine/metabolism, Hydroxyurea/pharmacology, Knowledge Bases, Metabolic Networks and Pathways/drug effects, Metabolic Networks and Pathways/genetics, Mutation, Computer Neural Networks, Nucleotidyltransferases/metabolism, Phenotype, Single Nucleotide Polymorphism, Saccharomyces cerevisiae/drug effects, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Systems Biology
ABSTRACT
Most drugs entering clinical trials fail, often related to an incomplete understanding of the mechanisms governing drug response. Machine learning techniques hold immense promise for better drug response predictions, but most have not reached clinical practice due to their lack of interpretability and their focus on monotherapies. We address these challenges by developing DrugCell, an interpretable deep learning model of human cancer cells trained on the responses of 1,235 tumor cell lines to 684 drugs. Tumor genotypes induce states in cellular subsystems that are integrated with drug structure to predict response to therapy and, simultaneously, learn biological mechanisms underlying the drug response. DrugCell predictions are accurate in cell lines and also stratify clinical outcomes. Analysis of DrugCell mechanisms leads directly to the design of synergistic drug combinations, which we validate systematically by combinatorial CRISPR, drug-drug screening in vitro, and patient-derived xenografts. DrugCell provides a blueprint for constructing interpretable models for predictive medicine.
Subject(s)
Antineoplastic Agents/therapeutic use, Computational Biology/methods, Neoplasms/drug therapy, Antineoplastic Agents/pharmacology, Tumor Cell Line, Factual Databases, Deep Learning, Antitumor Drug Screening Assays, Drug Synergism, Genotype, Humans, Neoplasms/genetics, Patient-Specific Modeling
ABSTRACT
All mammals progress through similar physiological stages throughout life, from early development to puberty, aging, and death. Yet, the extent to which this conserved physiology reflects underlying genomic events is unclear. Here, we map the common methylation changes experienced by mammalian genomes as they age, focusing on comparison of humans with dogs, an emerging model of aging. Using oligo-capture sequencing, we characterize methylomes of 104 Labrador retrievers spanning a 16-year age range, achieving >150× coverage within mammalian syntenic blocks. Comparison with human methylomes reveals a nonlinear relationship that translates dog-to-human years and aligns the timing of major physiological milestones between the two species, with extension to mice. Conserved changes center on developmental gene networks, which are sufficient to translate age and the effects of anti-aging interventions across multiple mammals. These results establish methylation not only as a diagnostic age readout but also as a cross-species translator of physiological aging milestones.
Subject(s)
Aging/genetics, DNA Methylation/genetics, Animals, Dogs, Humans
ABSTRACT
Systematic measurements of genetic interactions have been used to classify gene functions and to categorize genes into protein complexes, functional pathways and biological processes. This protocol describes how to perform a high-throughput genetic interaction screen in S. cerevisiae using a variant of epistatic miniarray profiles (E-MAP) in which the fitnesses of 6144 colonies are measured simultaneously. We also describe the computational methods to analyze the resulting data.
Subject(s)
Genetic Epistasis/genetics, Fungal Genome/genetics, Saccharomyces cerevisiae/genetics
ABSTRACT
We have mapped a global network of virus-host protein interactions by purification of the complete set of human papillomavirus (HPV) proteins in multiple cell lines followed by mass spectrometry analysis. Integration of this map with tumor genome atlases shows that the virus targets human proteins frequently mutated in HPV- but not HPV+ cancers, providing a unique opportunity to identify novel oncogenic events phenocopied by HPV infection. For example, we find that the NRF2 transcriptional pathway, which protects against oxidative stress, is activated by interaction of the NRF2 regulator KEAP1 with the viral protein E1. We also demonstrate that the L2 HPV protein physically interacts with the RNF20/40 histone ubiquitination complex and promotes tumor cell invasion in an RNF20/40-dependent manner. This combined proteomic and genetic approach provides a systematic means to study the cellular mechanisms hijacked by virally induced cancers. Significance: In this study, we created a protein-protein interaction network between HPV and human proteins. An integrative analysis of this network and 800 tumor mutation profiles identifies multiple oncogenesis pathways promoted by HPV interactions that phenocopy recurrent mutations in cancer, yielding an expanded definition of HPV oncogenic roles. Cancer Discov; 8(11); 1474-89. ©2018 AACR. This article is highlighted in the In This Issue feature, p. 1333.
Subject(s)
Tumor Biomarkers/metabolism, Carcinogenesis/pathology, Squamous Cell Carcinoma/pathology, Head and Neck Neoplasms/pathology, Host-Pathogen Interactions, Papillomaviridae/physiology, Papillomavirus Infections/complications, Tumor Biomarkers/genetics, Carcinogenesis/metabolism, Squamous Cell Carcinoma/metabolism, Squamous Cell Carcinoma/virology, Head and Neck Neoplasms/metabolism, Head and Neck Neoplasms/virology, Humans, Mutation, Papillomavirus Infections/virology, Protein Interaction Maps
ABSTRACT
A major ambition of artificial intelligence lies in translating patient data to successful therapies. Machine learning models face particular challenges in biomedicine, however, including handling of extreme data heterogeneity and lack of mechanistic insight into predictions. Here, we argue for "visible" approaches that guide model structure with experimental biology.
Subject(s)
Computational Biology/methods, Machine Learning, Algorithms, Biomedical Research
ABSTRACT
Human papillomavirus (HPV)-negative head and neck squamous cell carcinoma (HNSCC) represents a distinct classification of cancer with worse expected outcomes. Of the 11 genes recurrently mutated in HNSCC, we identify a singular and substantial survival advantage for mutations in the gene encoding Nuclear Set Domain Containing Protein 1 (NSD1), a histone methyltransferase altered in approximately 10% of patients. This effect, a 55% decrease in risk of death in NSD1-mutated versus non-mutated patients, can be validated in an independent cohort. NSD1 alterations are strongly associated with widespread genome hypomethylation in the same tumors, to a degree not observed for any other mutated gene. To address whether NSD1 plays a causal role in these associations, we use CRISPR-Cas9 to disrupt NSD1 in HNSCC cell lines and find that this leads to substantial CpG hypomethylation and sensitivity to cisplatin, a standard chemotherapy in head and neck cancer, with a 40% to 50% decrease in the IC50 value. Such results are reinforced by a survey of 1,001 cancer cell lines, in which loss-of-function NSD1 mutations have an average 23% decrease in cisplatin IC50 value compared with cell lines with wild-type NSD1. Significance: This study identifies a favorable subtype of HPV-negative HNSCC linked to NSD1 mutation, hypomethylation, and cisplatin sensitivity. Mol Cancer Ther; 17(7); 1585-94. ©2018 AACR.
Subject(s)
Squamous Cell Carcinoma/drug therapy, DNA Methylation/genetics, Head and Neck Neoplasms/drug therapy, Intracellular Signaling Peptides and Proteins/genetics, Nuclear Proteins/genetics, CRISPR-Cas Systems/genetics, Squamous Cell Carcinoma/genetics, Squamous Cell Carcinoma/pathology, Tumor Cell Line, Cisplatin/pharmacology, CpG Islands/drug effects, DNA Methylation/drug effects, Neoplasm Drug Resistance/genetics, Female, Neoplastic Gene Expression Regulation/drug effects, Head and Neck Neoplasms/genetics, Head and Neck Neoplasms/pathology, Histone Methyltransferases, Histone-Lysine N-Methyltransferase, Humans, Male, Mutation/drug effects, Papillomaviridae
ABSTRACT
Although cancer genomes are replete with noncoding mutations, the effects of these mutations remain poorly characterized. Here we perform an integrative analysis of 930 tumor whole genomes and matched transcriptomes, identifying a network of 193 noncoding loci in which mutations disrupt target gene expression. These 'somatic eQTLs' (expression quantitative trait loci) are frequently mutated in specific cancer tissues, and the majority can be validated in an independent cohort of 3,382 tumors. Among these, we find that the effects of noncoding mutations on DAAM1, MTG2 and HYI transcription are recapitulated in multiple cancer cell lines and that increasing DAAM1 expression leads to invasive cell migration. Collectively, the noncoding loci converge on a set of core pathways, permitting a classification of tumors into pathway-based subtypes. The somatic eQTL network is disrupted in 88% of tumors, suggesting widespread impact of noncoding mutations in cancer.
Subject(s)
Neoplasm Genes, Mutation, Neoplasms/genetics, Signal Transducing Adaptor Proteins/genetics, Aldose-Ketose Isomerases/genetics, Tumor Cell Line, Neoplastic Gene Expression Regulation, Gene Regulatory Networks, Humans, Microfilament Proteins, Monomeric GTP-Binding Proteins/genetics, Neoplasm Invasiveness/genetics, Neoplasms/metabolism, Quantitative Trait Loci, Messenger RNA/genetics, Messenger RNA/metabolism, Neoplasm RNA/genetics, Neoplasm RNA/metabolism, Untranslated RNA/genetics, Untranslated RNA/metabolism, Whole Genome Sequencing, rho GTP-Binding Proteins
ABSTRACT
Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall. A general tendency is that performance scales with network size, suggesting that new interaction discovery currently outweighs the detrimental effects of false positives. Correcting for size, we find that the DIP network provides the highest efficiency (value per interaction). Based on these results, we create a parsimonious composite network with both high efficiency and performance. This work provides a benchmark for selection of molecular networks in human disease research.
Subject(s)
Gene Regulatory Networks, Genetic Predisposition to Disease, Algorithms, Computational Biology, Human Genome, Humans
ABSTRACT
We developed a systematic approach to map human genetic networks by combinatorial CRISPR-Cas9 perturbations coupled to robust analysis of growth kinetics. We targeted all pairs of 73 cancer genes with dual guide RNAs in three cell lines, comprising 141,912 tests of interaction. Numerous therapeutically relevant interactions were identified, and these patterns replicated with combinatorial drugs at 75% precision. From these results, we anticipate that cellular context will be critical to synthetic-lethal therapies.