ABSTRACT
Despite its success in several clinical trials, cancer immunotherapy remains limited by the rarity of targetable tumor-specific antigens, tumor-mediated immune suppression, and toxicity triggered by systemic delivery of potent immunomodulators. Here, we present a proof-of-concept immunomodulatory gene circuit platform that enables tumor-specific expression of immunostimulators, which could potentially overcome these limitations. Our design comprised de novo synthetic cancer-specific promoters and, to enhance specificity, an RNA-based AND gate that generates combinatorial immunomodulatory outputs only when both promoters are mutually active. These outputs included an immunogenic cell-surface protein, a cytokine, a chemokine, and a checkpoint inhibitor antibody. The circuits triggered selective T cell-mediated killing of cancer cells, but not of normal cells, in vitro. In in vivo efficacy assays, lentiviral circuit delivery mediated significant tumor reduction and prolonged mouse survival. Our design could be adapted to drive additional immunomodulators, sense other cancers, and potentially treat other diseases that require precise immunological programming.
Subjects
Gene Regulatory Networks , Immunotherapy/methods , Ovarian Neoplasms/therapy , Animals , Female , Humans , Immunomodulation , Mice , Ovarian Neoplasms/immunology , Genetic Promoter Regions , T-Cell Antigen Receptors/metabolism , Cytotoxic T-Lymphocytes/immunology
ABSTRACT
SUMMARY: The exponential growth in available genomic data is expected to reach full sequencing of a million genomes in the coming decade. Improving and developing methods to analyze these genomes and to reveal their utility is of major interest in a wide variety of fields, such as comparative and functional genomics, evolution and bioinformatics. Phylogenetic profiling is an established method for predicting functional interactions between proteins based on similarities in their evolutionary patterns across species. Proteins that function together (i.e. generate complexes, interact in the same pathways or improve adaptation to environmental niches) tend to show coordinated evolution across the tree of life. The normalized phylogenetic profiling (NPP) method takes into account minute changes in proteins across species to identify protein co-evolution. Despite the success of this method, it is still not clear what set of parameters is required for optimal use of co-evolution in predicting functional interactions. Moreover, it is not clear if pathway evolution or function should direct parameter choice. Here, we create a reliable and usable NPP construction pipeline. We explore the effect of parameter selection on functional interaction prediction using NPP from 1028 genomes, both separately and in various value combinations. We identify several parameter sets that optimize performance for pathways with certain biological annotation. This work reveals the importance of choosing the right parameters for optimized function prediction based on a biological context. AVAILABILITY AND IMPLEMENTATION: Source code and documentation are available on GitHub: https://github.com/iditam/CompareNPPs. CONTACT: yuvaltab@ekmd.huji.ac.il. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
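The core idea of phylogenetic profiling described above can be illustrated with a minimal sketch. It assumes a simplified normalization (z-scoring each species column) and Pearson correlation as the similarity measure; the protein names, scores, and helper functions are hypothetical and are not taken from the CompareNPPs pipeline.

```python
from statistics import mean, pstdev

def zscore_columns(matrix):
    """Z-score each species column so every species contributes on the same scale."""
    normalized_cols = []
    for col in zip(*matrix):
        mu, sigma = mean(col), pstdev(col)
        normalized_cols.append([(v - mu) / sigma if sigma else 0.0 for v in col])
    # Transpose back to one row per protein.
    return [list(row) for row in zip(*normalized_cols)]

def pearson(x, y):
    """Pearson correlation between two equal-length profiles (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

# Hypothetical conservation scores: one row per protein, one column per species.
scores = {
    "protA": [0.9, 0.8, 0.1, 0.2, 0.9],
    "protB": [0.8, 0.9, 0.2, 0.1, 0.8],
    "protC": [0.1, 0.2, 0.9, 0.8, 0.3],
}
names = list(scores)
profiles = dict(zip(names, zscore_columns([scores[n] for n in names])))

# protA and protB are conserved in the same species; protC shows the opposite pattern.
similarity_ab = pearson(profiles["protA"], profiles["protB"])
similarity_ac = pearson(profiles["protA"], profiles["protC"])
print(f"A-B: {similarity_ab:.2f}, A-C: {similarity_ac:.2f}")
```

Under these assumptions, a high positive correlation between two profiles is read as a hint of a functional interaction; the choice of normalization and similarity measure is exactly the kind of parameter the abstract says must be tuned.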
Subjects
Genomics , Software , Genome , Phylogeny , Proteins
ABSTRACT
BACKGROUND: AI models have shown promise in performing many medical imaging tasks. However, our ability to explain what signals these models have learned is severely lacking. Explanations are needed in order to increase the trust of doctors in AI-based models, especially in domains where AI prediction capabilities surpass those of humans. Moreover, such explanations could enable novel scientific discovery by uncovering signals in the data that aren't yet known to experts. METHODS: In this paper, we present a workflow for generating hypotheses to understand which visual signals in images are correlated with a classification model's predictions for a given task. This approach leverages an automatic visual explanation algorithm followed by interdisciplinary expert review. We propose the following 4 steps: (i) Train a classifier to perform a given task to assess whether the imagery indeed contains signals relevant to the task; (ii) Train a StyleGAN-based image generator with an architecture that enables guidance by the classifier ("StylEx"); (iii) Automatically detect, extract, and visualize the top visual attributes that the classifier is sensitive towards. For visualization, we independently modify each of these attributes to generate counterfactual visualizations for a set of images (i.e., what the image would look like with the attribute increased or decreased); (iv) Formulate hypotheses for the underlying mechanisms, to stimulate future research. Specifically, present the discovered attributes and corresponding counterfactual visualizations to an interdisciplinary panel of experts so that hypotheses can account for social and structural determinants of health (e.g., whether the attributes correspond to known patho-physiological or socio-cultural phenomena, or could be novel discoveries). 
FINDINGS: To demonstrate the broad applicability of our approach, we present results on eight prediction tasks across three medical imaging modalities: retinal fundus photographs, external eye photographs, and chest radiographs. We showcase examples where many of the automatically-learned attributes clearly capture clinically known features (e.g., types of cataract, enlarged heart), and demonstrate automatically-learned confounders that arise from factors beyond physiological mechanisms (e.g., chest X-ray underexposure is correlated with the classifier predicting abnormality, and eye makeup is correlated with the classifier predicting low hemoglobin levels). We further show that our method reveals a number of physiologically plausible attributes that, based on the literature, were previously unknown (e.g., differences in the fundus associated with self-reported sex). INTERPRETATION: Our approach enables hypothesis generation via attribute visualizations and has the potential to enable researchers to better understand, improve their assessment of, and extract new knowledge from AI-based models, as well as debug and design better datasets. Importantly, though our framework is not designed to infer causality, we highlight that the attributes it generates can capture phenomena beyond physiology or pathophysiology, reflecting the real-world nature of healthcare delivery and socio-cultural factors, and hence interdisciplinary perspectives are critical in these investigations. Finally, we will release code to help researchers train their own StylEx models and analyze their predictive tasks of interest, and use the methodology presented in this paper for responsible interpretation of the revealed attributes. FUNDING: Google.
Subjects
Algorithms , Cataract , Humans , Cardiomegaly , Fundus Oculi , Artificial Intelligence
ABSTRACT
Conservation is a strong predictor for the pathogenicity of single-nucleotide variants (SNVs). However, some positions that present complex conservation patterns across vertebrates stray from this paradigm. Here, we analyzed the association between complex conservation patterns and the pathogenicity of SNVs in the 115 disease genes that had sufficient variant data. We show that conservation is not a one-rule-fits-all solution since its accuracy highly depends on the analyzed set of species and genes. For example, pairwise comparisons between human and 99 vertebrate species showed that species differ in their ability to predict the clinical outcomes of variants among different genes using conservation. Furthermore, certain genes were less amenable to conservation-based variant prediction, while others demonstrated species that optimize prediction. These insights led to developing EvoDiagnostics, which uses the conservation against each species as a feature within a random-forest machine-learning classification algorithm. EvoDiagnostics outperformed traditional conservation algorithms, deep-learning based methods and most ensemble tools in every prediction task, highlighting the strength of optimizing conservation analysis per species and per gene. Overall, we suggest a new and more biologically relevant approach for analyzing conservation, which improves prediction of variant pathogenicity.
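To make the notion of "conservation against each species as a feature" concrete, the following is a minimal sketch of how one variant might be turned into a per-species feature vector. The encoding (1, -1, 0), the function name, and the aligned bases are illustrative assumptions; the abstract does not specify how EvoDiagnostics encodes its features, and the random-forest classifier itself is not shown.

```python
def conservation_features(human_base, variant_base, aligned_bases):
    """Return one feature per species for a single variant position.

    Hypothetical encoding:
      1  -> the species carries the human reference base (position conserved)
     -1  -> the species carries the variant base
      0  -> any other base, or an alignment gap
    """
    features = {}
    for species, base in aligned_bases.items():
        if base == human_base:
            features[species] = 1
        elif base == variant_base:
            features[species] = -1
        else:
            features[species] = 0
    return features

# Hypothetical alignment column for a G>A variant ("-" marks an alignment gap).
aligned = {"chimpanzee": "G", "mouse": "G", "chicken": "A", "zebrafish": "-"}
features = conservation_features("G", "A", aligned)
print(features)
```

Keeping one feature per species, rather than collapsing the column into a single conservation score, is what would let a downstream classifier learn which species are informative for which genes.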
ABSTRACT
Over the next decade, more than a million eukaryotic species are expected to be fully sequenced. This has the potential to improve our understanding of genotype and phenotype crosstalk, gene function and interactions, and answer evolutionary questions. Here, we develop a machine-learning approach for utilizing phylogenetic profiles across 1154 eukaryotic species. This method integrates co-evolution across eukaryotic clades to predict functional interactions between human genes and the context for these interactions. We benchmark our approach showing a 14% performance increase (auROC) compared to previous methods. Using this approach, we predict functional annotations for less studied genes. We focus on DNA repair and verify that 9 of the top 50 predicted genes have been identified elsewhere, with others previously prioritized by high-throughput screens. Overall, our approach enables better annotation of function and functional interactions and facilitates the understanding of evolutionary processes underlying co-evolution. The manuscript is accompanied by a webserver available at: https://mlpp.cs.huji.ac.il.
Subjects
Machine Learning , DNA Repair/genetics , DNA Repair/physiology , Molecular Evolution , Humans , Phylogeny , DNA Sequence Analysis/methods
ABSTRACT
Mapping co-evolved genes via phylogenetic profiling (PP) is a powerful approach to uncover functional interactions between genes and to associate them with pathways. Despite many successful endeavors, the understanding of co-evolutionary signals in eukaryotes remains partial. Our hypothesis is that 'Clades', branches of the tree of life (e.g. primates and mammals), encompass signals that cannot be detected by PP using all eukaryotes. As such, integrating information from different clades should reveal local co-evolution signals and improve function prediction. Accordingly, we analyzed 1028 genomes in 66 clades and demonstrated that the co-evolutionary signal was scattered across clades. We showed that functionally related genes are frequently co-evolved in only parts of the eukaryotic tree and that clades are complementary in detecting functional interactions within pathways. We examined the non-homologous end joining pathway and the UFM1 ubiquitin-like protein pathway and showed that both demonstrated distinct co-evolution patterns in specific clades. Our research offers a different way to look at co-evolution across eukaryotes and points to the importance of modular co-evolution analysis. We developed the 'CladeOScope' PP method, which integrates information from 16 clades across over 1000 eukaryotic genomes and is accessible via an easy-to-use web server at http://cladeoscope.cs.huji.ac.il.
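The claim that a co-evolution signal can be confined to one clade can be illustrated with a toy example. This is a sketch under stated assumptions: the gene profiles, clade names, and column indices are invented, and plain Pearson correlation stands in for whatever scoring CladeOScope actually uses.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length profiles (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def cladewise_correlation(profile_x, profile_y, clades):
    """Correlate two profiles separately within each clade's species columns."""
    result = {}
    for clade, columns in clades.items():
        sub_x = [profile_x[i] for i in columns]
        sub_y = [profile_y[i] for i in columns]
        result[clade] = pearson(sub_x, sub_y)
    return result

# Hypothetical presence/absence profiles over 8 species (columns 0-3 and 4-7 form two clades).
gene_x = [1, 0, 1, 0, 1, 1, 0, 0]
gene_y = [1, 0, 1, 0, 1, 0, 1, 0]
clades = {"clade_1": [0, 1, 2, 3], "clade_2": [4, 5, 6, 7]}

overall = pearson(gene_x, gene_y)
per_clade = cladewise_correlation(gene_x, gene_y, clades)
print(overall, per_clade)
```

In this toy data the two genes co-vary perfectly in the first clade and not at all in the second, so the all-species correlation is diluted to an intermediate value, which is the situation a clade-aware analysis is meant to expose.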
ABSTRACT
Myotonic dystrophy type 1 is an autosomal-dominant inherited disorder caused by the expansion of CTG repeats in the 3' untranslated region of the DMPK gene. The RNAs bearing these expanded repeats have a range of toxic effects. Here we provide evidence from a Caenorhabditis elegans myotonic dystrophy type 1 model that the RNA interference (RNAi) machinery plays a key role in causing RNA toxicity and disease phenotypes. We show that the expanded repeats systematically affect a range of endogenous genes bearing short non-pathogenic repeats and that this mechanism is dependent on the small RNA pathway. Conversely, by perturbing the RNA interference machinery, we reversed the RNA toxicity effect and reduced the disease pathogenesis. Our results unveil a role for RNA repeats as templates (based on sequence homology) for moderate but constant gene silencing. Such a silencing effect affects the cell steady state over time, with diverse impacts depending on tissue, developmental stage, and the type of repeat. Importantly, such a mechanism may be common among repeats and similar in human cells with different expanded repeat diseases.
Subjects
Aging/genetics , Caenorhabditis elegans/genetics , Myotonic Dystrophy/genetics , RNA Interference , Double-Stranded RNA/genetics , Trinucleotide Repeats , 3' Untranslated Regions , Animals , Genetically Modified Animals , Caenorhabditis elegans/growth & development , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Animal Disease Models , Reporter Genes , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , HSP70 Heat-Shock Proteins/genetics , HSP70 Heat-Shock Proteins/metabolism , Heat-Shock Proteins/genetics , Heat-Shock Proteins/metabolism , Hot Temperature , Humans , Myotonic Dystrophy/metabolism , Myotonic Dystrophy/pathology , Protein Binding , Double-Stranded RNA/metabolism , Helminth RNA/genetics , Helminth RNA/metabolism
ABSTRACT
Cell state-specific promoters constitute essential tools for basic research and biotechnology because they activate gene expression only under certain biological conditions. Synthetic Promoters with Enhanced Cell-State Specificity (SPECS) can be superior to native ones, but the design of such promoters is challenging and frequently requires gene regulation or transcriptome knowledge that is not readily available. Here, to overcome this challenge, we use a next-generation sequencing approach combined with machine learning to screen a synthetic promoter library with 6107 designs for high-performance SPECS for potentially any cell state. We demonstrate the identification of multiple SPECS that exhibit distinct spatiotemporal activity during the programmed differentiation of induced pluripotent stem cells (iPSCs), as well as SPECS for breast cancer and glioblastoma stem-like cells. We anticipate that this approach could be used to create SPECS for gene therapies that are activated in specific cell states, as well as to study natural transcriptional regulatory networks.