ABSTRACT
Cytokines operate in concert to maintain immune homeostasis and coordinate immune responses. In cases of ER+ breast cancer, peripheral immune cells exhibit altered responses to several cytokines, and these alterations are correlated strongly with patient outcomes. To develop a systems-level understanding of this dysregulation, we measured a panel of cytokine responses and receptor abundances in the peripheral blood of healthy controls and ER+ breast cancer patients across immune cell types. Using tensor factorization to model this multidimensional data, we found that breast cancer patients exhibited widespread alterations in response, including drastically reduced response to IL-10 and heightened basal levels of pSmad2/3 and pSTAT4. ER+ patients also featured upregulation of PD-L1, IL6Rα, and IL2Rα, among other receptors. Despite this, alterations in response to cytokines were not explained by changes in receptor abundances. Thus, tensor factorization helped to reveal a coordinated reprogramming of the immune system that was consistent across our cohort.
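As a rough illustration of what tensor factorization does with patient × cytokine × cell-type data, the sketch below fits a single rank-1 component by alternating updates. It is a toy, not the decomposition used in the study: the function name `rank1_cp`, the toy factor values, and the choice of a rank-1 model are all assumptions made for illustration.

```python
import math

def rank1_cp(tensor, n_iter=50):
    """Fit T[i][j][k] ~ weight * a[i] * b[j] * c[k] by alternating updates.

    Assumes a non-zero tensor; a zero tensor would make the norms vanish.
    """
    I, J, K = len(tensor), len(tensor[0]), len(tensor[0][0])
    a, b, c = [1.0] * I, [1.0] * J, [1.0] * K

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v], norm

    weight = 0.0
    for _ in range(n_iter):
        # Update each factor with the other two held fixed, then rescale.
        a = [sum(tensor[i][j][k] * b[j] * c[k] for j in range(J) for k in range(K))
             for i in range(I)]
        a, _ = normalize(a)
        b = [sum(tensor[i][j][k] * a[i] * c[k] for i in range(I) for k in range(K))
             for j in range(J)]
        b, _ = normalize(b)
        c = [sum(tensor[i][j][k] * a[i] * b[j] for i in range(I) for j in range(J))
             for k in range(K)]
        c, weight = normalize(c)
    return weight, a, b, c

# Hypothetical toy data: one shared pattern across patients, cytokines, cell types.
patient = [1.0, 2.0]
cytokine = [0.5, 1.0, 1.5]
cell_type = [2.0, 1.0]
tensor = [[[p * q * r for r in cell_type] for q in cytokine] for p in patient]

weight, a, b, c = rank1_cp(tensor)
max_error = max(
    abs(tensor[i][j][k] - weight * a[i] * b[j] * c[k])
    for i in range(2) for j in range(3) for k in range(2)
)
```

Each recovered factor describes how strongly one patient, cytokine, or cell type participates in the shared pattern; real analyses typically fit several components and handle missing values, which this sketch does not attempt.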
Subjects
Breast Neoplasms , Cytokines , Signal Transduction , Humans , Breast Neoplasms/immunology , Female , Cytokines/blood , Cytokines/metabolism , Receptors, Estrogen/metabolism , Middle Aged , Systems Biology/methods
ABSTRACT
While there are currently over 40 replicated genes with mapped risk alleles for Late Onset Alzheimer's disease (LOAD), the Apolipoprotein E locus E4 haplotype is still the biggest driver of risk, with odds ratios for neuropathologically confirmed E44 carriers exceeding 30 (95% confidence interval 16.59-58.75). We sought to address whether the APOE E4 haplotype modifies expression globally through networks of expression to increase LOAD risk. We have used the Human Brainome data to build expression networks comparing APOE E4 carriers to non-carriers using scalable mixed-datatypes Bayesian network (BN) modeling. We have found that VGF had the greatest explanatory weight. High expression of VGF is a protective signal, even on the background of APOE E4 alleles. LOAD risk signals, considering an APOE background, include high levels of SPECC1L, HLA-DRA and RANBP3L. Our findings nominate several new transcripts, taking a combined approach to network building including known LOAD risk loci.
Subjects
Alzheimer Disease , Apolipoprotein E4 , Genetic Predisposition to Disease , Aged , Aged, 80 and over , Female , Humans , Male , Adaptor Proteins, Signal Transducing/genetics , Adaptor Proteins, Signal Transducing/metabolism , Alleles , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Apolipoprotein E4/genetics , Bayes Theorem , Haplotypes , HLA-DR alpha-Chains/genetics , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Risk Factors
ABSTRACT
Cooperative interactions in protein-protein interfaces demonstrate the interdependency or the linked network-like behavior of interface interactions and their effect on the coupling of proteins. Cooperative interactions can also cause ripple or allosteric effects at a distance in protein-protein interfaces. Although they are critically important in protein-protein interfaces, it is challenging to determine which amino acid pair interactions are cooperative. In this work, we have used Bayesian network modeling, an interpretable machine learning method, combined with molecular dynamics trajectories to identify the residue pairs that show high cooperativity and their allosteric effect in the interface of G protein-coupled receptor (GPCR) complexes with Gα subunits. Our results reveal six GPCR:Gα contacts that are common to the different Gα subtypes and show strong cooperativity in the formation of the interface. Both the C terminus helix5 and the core of the G protein are codependent entities and play an important role in GPCR coupling. We show that a promiscuous GPCR coupling to different Gα subtypes makes all the GPCR:Gα contacts that are specific to each Gα subtype (Gαs, Gαi, and Gαq). This work underscores the potential of data-driven Bayesian network modeling in elucidating the intricate dependencies and selectivity determinants in GPCR:G protein complexes, offering valuable insights into the dynamic nature of these essential cellular signaling components.
Subjects
Bayes Theorem , Receptors, G-Protein-Coupled , Receptors, G-Protein-Coupled/metabolism , Receptors, G-Protein-Coupled/chemistry , Humans , Molecular Dynamics Simulation , Protein Binding , GTP-Binding Protein alpha Subunits/metabolism , GTP-Binding Protein alpha Subunits/chemistry , GTP-Binding Protein alpha Subunits/genetics
ABSTRACT
Enhancers are fundamental to gene regulation. Post-translational modifications by the small ubiquitin-like modifiers (SUMO) modify chromatin regulation enzymes, including histone acetylases and deacetylases. However, it remains unclear whether SUMOylation regulates the enhancer mark H3K27Ac, that is, acetylation at the 27th lysine residue of the histone H3 protein. To investigate whether SUMOylation regulates H3K27Ac, we performed genome-wide ChIP-seq analyses and discovered that knockdown (KD) of the SUMO activating enzyme catalytic subunit UBA2 reduced H3K27Ac at most enhancers. Bioinformatic analysis revealed that TFAP2C-binding sites are enriched in enhancers whose H3K27Ac was reduced by UBA2 KD. ChIP-seq analysis in combination with molecular biological methods showed that TFAP2C binding to enhancers increased upon UBA2 KD or inhibition of SUMOylation by a small-molecule SUMOylation inhibitor. However, this is not due to the SUMOylation of TFAP2C itself. Proteomics analysis of the TFAP2C interactome on chromatin identified histone deacetylation (HDAC) and RNA splicing machineries that contain many SUMOylation targets. TFAP2C KD reduced HDAC1 binding to chromatin and increased H3K27Ac marks at enhancer regions, suggesting that TFAP2C is important in recruiting HDAC machinery. Taken together, our findings provide insights into the regulation of enhancer marks by SUMOylation and TFAP2C and suggest that SUMOylation of proteins in the HDAC machinery regulates their recruitment to enhancers.
ABSTRACT
Next-generation cancer and oncology research needs to take full advantage of the multimodal structured, or graph, information, with the graph data types ranging from molecular structures to spatially resolved imaging and digital pathology, biological networks, and knowledge graphs. Graph Neural Networks (GNNs) efficiently combine the graph structure representations with the high predictive performance of deep learning, especially on large multimodal datasets. In this review article, we survey the landscape of recent (2020-present) GNN applications in the context of cancer and oncology research, and delineate six currently predominant research areas. We then identify the most promising directions for future research. We compare GNNs with graphical models and "non-structured" deep learning, and devise guidelines for cancer and oncology researchers or physician-scientists, asking the question of whether they should adopt the GNN methodology in their research pipelines.
ABSTRACT
Cytokines mediate cell-to-cell communication across the immune system and therefore are critical to immunosurveillance in cancer and other diseases. Several cytokines show dysregulated abundance or signaling responses in breast cancer, associated with the disease and differences in survival and progression. Cytokines operate in a coordinated manner to affect immune surveillance and regulate one another, necessitating a systems approach for a complete picture of this dysregulation. Here, we profiled cytokine signaling responses of peripheral immune cells from breast cancer patients as compared to healthy controls in a multidimensional manner across ligands, cell populations, and responsive pathways. We find alterations in cytokine responsiveness across pathways and cell types that are best defined by integrated signatures across dimensions. Alterations in the abundance of a cytokine's cognate receptor do not explain differences in responsiveness. Rather, alterations in baseline signaling and receptor abundance suggesting immune cell reprogramming are associated with altered responses. These integrated features suggest a global reprogramming of immune cell communication in breast cancer.
ABSTRACT
Cooperative interactions in protein-protein interfaces demonstrate the interdependency or the linked network-like behavior of interface interactions and their effect on the coupling of proteins. Cooperative interactions can also cause ripple or allosteric effects at a distance in protein-protein interfaces. Although they are critically important in protein-protein interfaces, it is challenging to determine which amino acid pair interactions are cooperative. In this work, we have used Bayesian network modeling, an interpretable machine learning method, combined with molecular dynamics trajectories to identify the residue pairs that show high cooperativity and their allosteric effect in the interface of G protein-coupled receptor (GPCR) complexes with G proteins. Our results reveal a strong co-dependency in the formation of GPCR:G protein interface contacts. This observation indicates that cooperativity of GPCR:G protein interactions is necessary for the coupling and selectivity of G proteins and is thus critical for receptor function. We have identified subnetworks containing polar and hydrophobic interactions that are common among multiple GPCRs coupling to different G protein subtypes (Gs, Gi and Gq). These common subnetworks, along with G protein-specific subnetworks, together confer selectivity to the G protein coupling. This work underscores the potential of data-driven Bayesian network modeling in elucidating the intricate dependencies and selectivity determinants in GPCR:G protein complexes, offering valuable insights into the dynamic nature of these essential cellular signaling components.
ABSTRACT
The goal of oncology is to provide the longest possible survival outcomes with the therapeutics that are currently available without sacrificing patients' quality of life. In lung cancer, several data points over a patient's diagnostic and treatment course are relevant to optimizing outcomes in the form of precision medicine, and artificial intelligence (AI) provides the opportunity to use available data, from molecular information to radiomics, in combination with patient and tumor characteristics, to help clinicians provide individualized care. In doing so, AI can help create models to identify cancer early and deliver tailored therapy on the basis of available information, both at the time of diagnosis and in real time as patients undergo treatment. The purpose of this review is to summarize the current literature in AI specific to lung cancer and how it applies to the multidisciplinary team taking care of these complex patients.
Subjects
Artificial Intelligence , Lung Neoplasms , Humans , Quality of Life , Precision Medicine
ABSTRACT
Modern artificial neural networks (ANNs) have long been designed on foundations of mathematics as opposed to their original foundations of biomimicry. However, the structure and function of these modern ANNs are often analogous to real-life biological networks. We propose that the ubiquitous information-theoretic principles underlying the development of ANNs are similar to the principles guiding the macro-evolution of biological networks and that insights gained from one field can be applied to the other. We generate hypotheses on the bow-tie network structure of the Janus kinase - signal transducers and activators of transcription (JAK-STAT) pathway, additionally informed by evolutionary considerations, and carry out ANN simulation experiments to demonstrate that an increase in the network's input and output complexity does not necessarily require a more complex intermediate layer. This observation should guide novel biomarker discovery, namely by prioritizing sections of biological networks in which information is most compressed over biomarkers representing the periphery of the network.
ABSTRACT
Background and Objective: Machine learning (ML) models are increasingly being utilized in oncology research for use in the clinic. However, while more complicated models may provide improvements in predictive or prognostic power, a hurdle to their adoption is limited model interpretability, wherein the inner workings can be perceived as a "black box". Explainable artificial intelligence (XAI) frameworks including Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) are novel, model-agnostic approaches that aim to provide insight into the inner workings of the "black box" by producing quantitative visualizations of how model predictions are calculated. In doing so, XAI can transform complicated ML models into easily understandable charts and interpretable sets of rules, which can give providers an intuitive understanding of the knowledge generated, thus facilitating the deployment of such models in routine clinical workflows. Methods: We performed a comprehensive, non-systematic review of the latest literature to define use cases of model-agnostic XAI frameworks in oncologic research. The examined database was PubMed/MEDLINE. The last search was run on May 1, 2022. Key Content and Findings: In this review, we identified several fields in oncology research where ML models and XAI were utilized to improve interpretability, including prognostication, diagnosis, radiomics, pathology, treatment selection, radiation treatment workflows, and epidemiology. Within these fields, XAI facilitates determination of feature importance in the overall model, visualization of relationships and/or interactions, evaluation of how individual predictions are produced, feature selection, identification of prognostic and/or predictive thresholds, and overall confidence in the models, among other benefits. These examples provide a basis for future work to expand on, which can facilitate adoption in the clinic when the complexity of such modeling would otherwise be prohibitive. Conclusions: Model-agnostic XAI frameworks offer an intuitive and effective means of describing oncology ML models, with applications including prognostication and determination of optimal treatment regimens. Using such frameworks presents an opportunity to improve understanding of ML models, which is a critical step to their adoption in the clinic.
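To make the idea behind Shapley-based explanations concrete, the sketch below computes exact Shapley attributions for a single prediction by enumerating every feature coalition. It does not use the SHAP package, which relies on approximations for realistic models; the function `shapley_values`, the toy scoring function, and the all-zero baseline are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions for one prediction.

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += w * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical risk score with one interaction term (features 0 and 2).
def toy_model(x):
    return 2.0 * x[0] + 3.0 * x[1] + x[0] * x[2]

instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
```

In this toy case the attributions sum to the difference between the prediction for `instance` and the prediction for `baseline`, and the interaction term is split evenly between the two features that take part in it.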
ABSTRACT
Bayesian Network (BN) modeling is a prominent and increasingly popular computational systems biology method. It aims to construct network graphs from large heterogeneous biological datasets that reflect the underlying biological relationships. Currently, a variety of strategies exist for evaluating BN methodology performance, ranging from utilizing artificial benchmark datasets and models, to specialized biological benchmark datasets, to simulation studies that generate synthetic data from predefined network models. The last is arguably the most comprehensive approach; however, existing implementations often rely on explicit and implicit assumptions that may be unrealistic in a typical biological data analysis scenario, or are poorly equipped for automated arbitrary model generation. In this study, we develop a purely probabilistic simulation framework that addresses the demands of statistically sound simulation studies in an unbiased fashion. Additionally, we expand on our current understanding of the theoretical notions of causality and dependence / conditional independence in BNs and the Markov Blankets within.
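Generating synthetic data from a predefined network model can be sketched with ancestral sampling: visit nodes in topological order and draw each one from its conditional distribution given the parents already sampled. The three-node binary network, its probabilities, and the helper names below are hypothetical and are not taken from the framework described above.

```python
import random

# Hypothetical binary network: A -> B, and (A, B) -> C.
# Each table gives P(node = 1 | parent values).
network = {
    "A": {"parents": [], "cpt": {(): 0.3}},
    "B": {"parents": ["A"], "cpt": {(0,): 0.1, (1,): 0.8}},
    "C": {"parents": ["A", "B"],
          "cpt": {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}},
}

def topological_order(net):
    """Order nodes so that every parent precedes its children."""
    order, placed = [], set()
    while len(order) < len(net):
        progressed = False
        for node, spec in net.items():
            if node not in placed and all(p in placed for p in spec["parents"]):
                order.append(node)
                placed.add(node)
                progressed = True
        if not progressed:
            raise ValueError("network contains a cycle")
    return order

def sample(net, n, seed=0):
    """Draw n joint samples by ancestral sampling."""
    rng = random.Random(seed)
    order = topological_order(net)
    rows = []
    for _ in range(n):
        row = {}
        for node in order:
            spec = net[node]
            key = tuple(row[p] for p in spec["parents"])
            row[node] = 1 if rng.random() < spec["cpt"][key] else 0
        rows.append(row)
    return rows

rows = sample(network, 5000)
freq_a = sum(r["A"] for r in rows) / len(rows)
```

Because the generating structure is known, a dataset produced this way can be handed to a BN learning method and the recovered graph compared against `network`.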
Subjects
Systems Biology , Bayes Theorem , Computer Simulation
ABSTRACT
In a series of lectures given in 2003, soon after receiving the Fields Medal for his results in algebraic geometry, Vladimir Voevodsky (1966-2017) identifies two strategic goals for mathematics, which he plans to pursue in his further research. The first goal is to develop a "computerised library of mathematical knowledge" that supports automated proof verification. The second goal is to "bridge pure and applied mathematics." Voevodsky's research towards the first goal brought about the new Univalent foundations of mathematics. In view of the second goal, Voevodsky in 2004 started to develop a mathematical theory of Population Dynamics, which involved Categorical Probability theory. This latter project did not bring published results and was abandoned by Voevodsky in 2009, when he decided to focus his efforts on the Univalent foundations and closely related topics. In the present paper, which is based on Voevodsky's archival sources, I present Voevodsky's views of mathematics and its relationships with the natural sciences, critically discuss these views, and suggest how Voevodsky's ideas and approaches in applied mathematics can be further developed and pursued. Special attention is given to Voevodsky's original strategy to bridge the persisting gap between pure and applied mathematics, in which computers and computer-assisted mathematics play a major role.
Subjects
Mathematics , Population Dynamics , Probability Theory , Humans
ABSTRACT
Cancer immunotherapy, specifically immune checkpoint blockade, has been found to be effective in the treatment of metastatic cancers. However, only a subset of patients achieve clinical responses. Elucidating pretreatment biomarkers predictive of sustained clinical response is a major research priority. Another research priority is evaluating changes in the immune system before and after treatment in responders vs. nonresponders. Our group has been studying immune networks as an accurate reflection of the global immune state. Flow cytometry (FACS, fluorescence-activated cell sorting) data characterizing immune cell panels in peripheral blood mononuclear cells (PBMC) from gastroesophageal adenocarcinoma (GEA) patients were used to analyze changes in immune networks in this setting. Here, we describe a novel computational pipeline to perform secondary analyses of FACS data using systems biology/machine learning techniques and concepts. The pipeline is centered around comparative Bayesian network analyses of immune networks and is capable of detecting strong signals that conventional methods (such as FlowJo manual gating) might miss. Future studies are planned to validate and follow up the immune biomarkers (and combinations/interactions thereof) associated with clinical responses identified with this computational pipeline.
Subjects
Adenocarcinoma , Flow Cytometry , Gastrointestinal Neoplasms , Immunotherapy , Leukocytes, Mononuclear , Adenocarcinoma/blood , Adenocarcinoma/immunology , Adenocarcinoma/therapy , Gastrointestinal Neoplasms/blood , Gastrointestinal Neoplasms/immunology , Gastrointestinal Neoplasms/therapy , Humans , Leukocytes, Mononuclear/immunology , Leukocytes, Mononuclear/metabolism , Leukocytes, Mononuclear/pathology
ABSTRACT
Recent successes of immune-modulating therapies for cancer have stimulated research on information flow within the immune system and, in turn, clinical applications of concepts from information theory. Through information theory, one can describe and formalize, in a mathematically rigorous fashion, the function of interconnected components of the immune system in health and disease. Specifically, using concepts including entropy, mutual information, and channel capacity, one can quantify the storage, transmission, encoding, and flow of information within and between cellular components of the immune system on multiple temporal and spatial scales. To understand, at the quantitative level, immune signaling function and dysfunction in cancer, we present a methodology-oriented review of information-theoretic treatment of biochemical signal transduction and transmission coupled with mathematical modeling.
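The quantities named above can be illustrated on the simplest noisy channel. The sketch below computes entropy and mutual information in bits from a joint probability table and compares the result with the closed-form capacity of a binary symmetric channel; treating a signaling step as such a channel, and the flip probability of 0.1, are assumptions made purely for illustration.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * log2(pxy / (px[x] * py[y]))
        for x, row in enumerate(joint)
        for y, pxy in enumerate(row)
        if pxy > 0
    )

# Binary symmetric channel: the output differs from the input with probability eps.
eps = 0.1
joint = [
    [0.5 * (1 - eps), 0.5 * eps],   # input 0 (uniform input distribution)
    [0.5 * eps, 0.5 * (1 - eps)],   # input 1
]

mi = mutual_information(joint)
capacity = 1 - entropy([eps, 1 - eps])  # closed form for this channel
```

For this channel the uniform input distribution attains capacity, so `mi` and `capacity` agree at roughly 0.53 bits per use; for a general channel, capacity requires maximizing mutual information over input distributions.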
Subjects
Information Theory , Neoplasms/immunology , Allergy and Immunology , Animals , Humans , Medical Oncology , Signal Transduction
ABSTRACT
In this review, we aim to assess the current state of science in relation to the integration of patient-generated health data (PGHD) and patient-reported outcomes (PROs) into routine clinical care with a focus on surgical oncology populations. We will also describe the critical role of artificial intelligence and machine-learning methodology in the efficient translation of PGHD, PROs, and traditional outcome measures into meaningful patient care models.
Subjects
Artificial Intelligence , Electronic Health Records/statistics & numerical data , Machine Learning , Neoplasms/surgery , Patient Generated Health Data , Patient Reported Outcome Measures , Surgical Oncology , Humans , Neoplasms/pathology
ABSTRACT
We propose a novel two-stage analysis strategy to discover candidate genes associated with particular cancer outcomes in large multimodal cancer genomics databases, such as The Cancer Genome Atlas (TCGA). During the first stage, we use mixed mutual information to perform variable selection; during the second stage, we use scalable Bayesian network (BN) modeling to identify candidate genes and their interactions. Two crucial features of the proposed approach are (i) the ability to handle mixed data types (continuous and discrete, genomic, epigenomic, etc.) and (ii) a flexible boundary between the variable selection and network modeling stages, a boundary that can be adjusted in accordance with the investigators' BN software scalability and hardware implementation. These two aspects result in high generalizability of the proposed analytical framework. We apply the above strategy to three different TCGA datasets (LGG, Brain Lower Grade Glioma; HNSC, Head and Neck Squamous Cell Carcinoma; STES, Stomach and Esophageal Carcinoma), linking multimodal molecular information (SNPs, mRNA expression, DNA methylation) to two clinical outcome variables (tumor status and patient survival). We identify 11 candidate genes, of which 6 have already been directly implicated in the cancer literature. One novel LGG prognostic factor suggested by our analysis, methylation of TMPRSS11F type II transmembrane serine protease, presents an intriguing direction for follow-up studies.
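The first stage, variable selection by mutual information over mixed data types, might look roughly like the sketch below. It stands in for the mixed mutual information estimator with a simpler device, equal-frequency binning of continuous variables followed by an empirical mutual information estimate; the variable names, toy values, and the `top_k` cutoff are hypothetical.

```python
from collections import Counter
from math import log2

def quantile_bins(values, n_bins=2):
    """Map continuous values to equal-frequency bin indices by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

def empirical_mi(xs, ys):
    """Plug-in mutual information estimate in bits from paired discrete samples."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (cx[x] * cy[y])) for (x, y), c in cxy.items())

def select_variables(features, outcome, top_k):
    """Rank variables by mutual information with the outcome and keep the top_k."""
    scores = {}
    for name, values in features.items():
        # Floats are treated as continuous and binned; integers are used as-is.
        is_continuous = any(isinstance(v, float) for v in values)
        discrete = quantile_bins(values) if is_continuous else values
        scores[name] = empirical_mi(discrete, outcome)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores

# Hypothetical toy cohort of eight samples with a binary outcome.
outcome = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "expr_gene_a": [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4],  # continuous, informative
    "snp_b": [0, 1, 0, 1, 0, 1, 0, 1],                        # discrete, uninformative
    "meth_c": [0.5, 0.9, 0.1, 0.7, 0.6, 0.2, 0.8, 0.3],       # continuous, uninformative
}
selected, scores = select_variables(features, outcome, top_k=1)
```

Raising or lowering `top_k` is one way to move the boundary between the two stages: a larger value passes more variables to the network modeling step at greater computational cost.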
ABSTRACT
The challenges in recapitulating in vivo human T cell development in laboratory models have posed a barrier to understanding human thymopoiesis. Here, we used single-cell RNA sequencing (sRNA-seq) to interrogate the rare CD34+ progenitor and the more differentiated CD34- fractions in the human postnatal thymus. CD34+ thymic progenitors comprised a spectrum of specification and commitment states characterized by multilineage priming followed by gradual T cell commitment. The earliest progenitors in the differentiation trajectory were CD7- and expressed a stem-cell-like transcriptional profile, but had also initiated T cell priming. Clustering analysis identified a CD34+ subpopulation primed for the plasmacytoid dendritic lineage, suggesting an intrathymic dendritic specification pathway. CD2 expression defined T cell commitment stages where loss of B cell potential preceded that of myeloid potential. These datasets delineate gene expression profiles spanning key differentiation events in human thymopoiesis and provide a resource for the further study of human T cell development.