ABSTRACT
RNA-binding proteins (RBPs) control RNA metabolism to orchestrate gene expression and, when dysfunctional, underlie human diseases. Proteome-wide discovery efforts predict thousands of RBP candidates, many of which lack canonical RNA-binding domains (RBDs). Here, we present a hybrid ensemble RBP classifier (HydRA), which leverages information from both intermolecular protein interactions and internal protein sequence patterns to predict RNA-binding capacity with unparalleled specificity and sensitivity using support vector machines (SVMs), convolutional neural networks (CNNs), and Transformer-based protein language models. Occlusion mapping by HydRA robustly detects known RBDs and predicts hundreds of uncharacterized RNA-binding associated domains. Enhanced CLIP (eCLIP) for HydRA-predicted RBP candidates reveals transcriptome-wide RNA targets and confirms RNA-binding activity for HydRA-predicted RNA-binding associated domains. HydRA accelerates construction of a comprehensive RBP catalog and expands the diversity of RNA-binding associated domains.
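The occlusion mapping mentioned above can be illustrated with a minimal sketch. It is not HydRA's implementation: `toy_score` is a hypothetical stand-in for a trained classifier, and the window size and mask character are assumptions chosen for illustration. The idea shown is only the general technique of masking a sliding window and recording how much the prediction score drops.

```python
from typing import Callable, List


def occlusion_map(seq: str, score: Callable[[str], float],
                  window: int = 5, mask: str = "X") -> List[float]:
    """Average score drop per residue when windows covering it are masked."""
    base = score(seq)
    drops = [0.0] * len(seq)
    counts = [0] * len(seq)
    for start in range(0, len(seq) - window + 1):
        # Replace one window of residues with the mask character.
        occluded = seq[:start] + mask * window + seq[start + window:]
        delta = base - score(occluded)
        for i in range(start, start + window):
            drops[i] += delta
            counts[i] += 1
    # Residues never covered by a window (sequence shorter than window) get 0.
    return [d / c if c else 0.0 for d, c in zip(drops, counts)]


def toy_score(seq: str) -> float:
    """Hypothetical scorer: fraction of residues that are K or R."""
    return sum(ch in "KR" for ch in seq) / len(seq)


seq = "AAAAAKRKRKAAAAA"
profile = occlusion_map(seq, toy_score, window=3)
peak = max(range(len(seq)), key=profile.__getitem__)
```

Residues with the largest average drop are the ones the scorer depends on most; with a real classifier, contiguous high-drop regions would be the candidate domains.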
Subject(s)
Deep Learning , Hydra , Animals , Humans , RNA/metabolism , Protein Binding , Binding Sites/genetics , Hydra/genetics , Hydra/metabolism
ABSTRACT
Protein kinases are essential for signal transduction and control of most cellular processes, including metabolism, membrane transport, motility, and cell cycle. Despite the critical role of kinases in cells and their strong association with diseases, good coverage of their interactions is available for only a fraction of the 535 human kinases. Here, we present a comprehensive mass-spectrometry-based analysis of a human kinase interaction network covering more than 300 kinases. The interaction dataset is a high-quality resource with more than 5,000 previously unreported interactions. We extensively characterized the obtained network and were able to identify previously described, as well as predict new, kinase functional associations, including those of the less well-studied kinases PIM3 and protein O-mannose kinase (POMK). Importantly, the presented interaction map is a valuable resource for assisting biomedical studies. We uncover dozens of kinase-disease associations spanning from genetic disorders to complex diseases, including cancer.
Subject(s)
Gene Regulatory Networks , Genetic Diseases, Inborn/genetics , Neoplasms/genetics , Protein Kinases/genetics , Protein Serine-Threonine Kinases/genetics , Proto-Oncogene Proteins/genetics , Computational Biology/methods , Datasets as Topic , Gene Expression Regulation , Gene Ontology , Genetic Diseases, Inborn/enzymology , Genetic Diseases, Inborn/pathology , Humans , Metabolic Networks and Pathways/genetics , Molecular Sequence Annotation , Muscular Dystrophies/enzymology , Muscular Dystrophies/genetics , Muscular Dystrophies/pathology , Neoplasms/enzymology , Neoplasms/pathology , Neurodegenerative Diseases/enzymology , Neurodegenerative Diseases/genetics , Neurodegenerative Diseases/pathology , Protein Interaction Mapping/methods , Protein Kinases/chemistry , Protein Kinases/classification , Protein Kinases/metabolism , Protein Serine-Threonine Kinases/chemistry , Protein Serine-Threonine Kinases/metabolism , Proto-Oncogene Proteins/chemistry , Proto-Oncogene Proteins/metabolism , Signal Transduction
ABSTRACT
Nucleosomes represent hubs in chromatin organization and gene regulation and interact with a plethora of chromatin factors through different modes. In addition, alterations in histone proteins such as cancer mutations and post-translational modifications have profound effects on histone/nucleosome interactions. To elucidate the principles of histone interactions and the effects of those alterations, we developed histone interactomes for comprehensive mapping of histone-histone interactions (HHIs), histone-DNA interactions (HDIs), histone-partner interactions (HPIs) and DNA-partner interactions (DPIs) of 37 organisms, which contain a total of 3808 HPIs from 2544 binding proteins and 339 HHIs, 100 HDIs and 142 DPIs across 110 histone variants. With the developed networks, we explored histone interactions at different levels of granularity (protein, domain and residue level) and performed a systematic analysis of histone interactions at a large scale. Our analyses have characterized the preferred binding hotspots on both nucleosomal/linker DNA and the histone octamer and unraveled diverse binding modes between the nucleosome and different classes of binding partners. Last, to understand the impact of histone cancer-associated mutations on histone/nucleosome interactions, we compiled a comprehensive cancer mutation dataset including 7940 cancer-associated histone mutations and further mapped those mutations onto 419,125 histone interactions at the residue level. Our quantitative analyses point to strongly disruptive effects of histone cancer-associated mutations on HHIs, HDIs and HPIs. We have further predicted 57 recurrent histone cancer mutations that have large effects on histone/nucleosome interactions and may have driver status in oncogenesis.
Subject(s)
Neoplasms , Nucleosomes , Humans , Nucleosomes/genetics , Histones/genetics , Histones/metabolism , DNA/chemistry , Mutation , Neoplasms/genetics
ABSTRACT
The interaction networks formed by ectomycorrhizal fungi (EMF) and their tree hosts, which are important to both forest recruitment and ecosystem carbon and nutrient retention, may be particularly susceptible to climate change at the boreal-temperate forest ecotone where environmental conditions are changing rapidly. Here, we quantified the compositional and functional trait responses of EMF communities and their interaction networks with two boreal (Pinus banksiana and Betula papyrifera) and two temperate (Pinus strobus and Quercus macrocarpa) hosts to a factorial combination of experimentally elevated temperatures and reduced rainfall in a long-term open-air field experiment. The study was conducted at the B4WarmED (Boreal Forest Warming at an Ecotone in Danger) experiment in Minnesota, USA, where infrared lamps and buried heating cables elevate temperatures (ambient, +3.1 °C) and rain-out shelters reduce growing season precipitation (ambient, ~30% reduction). EMF communities were characterized and interaction networks inferred from metabarcoding of fungal-colonized root tips. Warming and rainfall reduction significantly altered EMF community composition, leading to an increase in the relative abundance of EMF with contact-short distance exploration types. These compositional changes, which likely limited the capacity for mycelial connections between trees, corresponded with shifts from highly redundant EMF interaction networks under ambient conditions to less redundant (more specialized) networks. Further, the observed changes in EMF communities and interaction networks were correlated with changes in soil moisture and host photosynthesis. Collectively, these results indicate that the projected changes in climate will likely lead to significant shifts in the traits, structure, and integrity of EMF communities as well as their interaction networks in forest ecosystems at the boreal-temperate ecotone.
Subject(s)
Mycorrhizae , Pinus , Ecosystem , Climate Change , Forests , Trees/physiology , Pinus/microbiology
ABSTRACT
Protein nanoparticles play pivotal roles in many areas of bionanotechnology, including drug delivery, vaccination, and diagnostics. These technologies require control over the distinct particle morphologies that protein nanocontainers can adopt during self-assembly from their constituent protein components. The geometric construction principle of virus-derived protein cages is by now fairly well understood by analogy to viral protein shells in terms of Caspar and Klug's quasi-equivalence principle. However, many artificial, or genetically modified, protein containers exhibit varying degrees of quasi-equivalence in the interactions between identical protein subunits. They can also contain a subset of protein subunits that do not participate in interactions with other assembly units, called capsomers, leading to gaps in the particle surface. We introduce a method that exploits information on the local interactions between the capsomers to infer the geometric construction principle of these nanoparticle architectures. The predictive power of this approach is demonstrated here for a prominent system in nanotechnology, the AaLS pentamer. Our method not only rationalises hitherto discovered cage structures but also predicts geometrically viable options that have not yet been observed. The classification of nanoparticle architecture based on the geometric properties of the interaction network closes a gap in our current understanding of protein container structure and can be widely applied in protein nanotechnology, paving the way to programmable control over particle polymorphism.
Subject(s)
Nanoparticles , Protein Subunits , Nanotechnology
ABSTRACT
ABC transporters are found in all organisms and almost every cellular compartment. They mediate the transport of various solutes across membranes, energized by ATP binding and hydrolysis. Dysfunctions can result in severe diseases, such as cystic fibrosis or antibiotic resistance. In type IV ABC transporters, each of the two nucleotide-binding domains is connected to a transmembrane domain by two coupling helices, which are part of cytosolic loops. Although there are many structural snapshots of different conformations, the interdomain communication is still enigmatic. Therefore, we analyzed the function of three conserved charged residues in the intracytosolic loop 1 of the human homodimeric, lysosomal peptide transporter TAPL (transporter associated with antigen processing-like). Substitution of D278 in coupling helix 1 by alanine interrupted peptide transport by impeding ATP hydrolysis. Alanine substitution of R288 and D292, both localized next to coupling helix 1 in the extension to transmembrane helix 3, reduced peptide transport but increased basal ATPase activity. Surprisingly, the ATPase activity of the R288A variant dropped in a peptide-dependent manner, whereas the ATPase activity of wild-type TAPL and D292A was unaffected. Interestingly, the R288A and D292A mutants did not differentiate between ATP and GTP with respect to hydrolysis. However, in contrast to wild-type TAPL, only ATP energized peptide transport. In sum, D278 seems to be involved in bidirectional interdomain communication mediated by a network of polar interactions, whereas the two residues in the cytosolic extension of transmembrane helix 3 are involved in regulation of ATP hydrolysis, most likely by stabilization of the outward-facing conformation.
Subject(s)
ATP-Binding Cassette Transporters , Adenosine Triphosphate , Protein Multimerization , ATP-Binding Cassette Transporters/metabolism , ATP-Binding Cassette Transporters/chemistry , ATP-Binding Cassette Transporters/genetics , Humans , Adenosine Triphosphate/metabolism , Adenosine Triphosphate/chemistry , Hydrolysis , Amino Acid Substitution , Protein Domains , Adenosine Triphosphatases/metabolism , Adenosine Triphosphatases/chemistry , Adenosine Triphosphatases/genetics
ABSTRACT
Frontotemporal dementia (FTD) is a primary cause of dementia encompassing a broad range of clinical phenotypes and cellular pathologies. Genetic discoveries in FTD have largely been driven by linkage studies in well-documented extended families, explaining most of the patients with a known pathogenic mutation. In the context of complex diseases, it is hypothesized that mutations with reduced penetrance or a combination of low-effect size variants with environmental factors drive disease. Furthermore, these genes are likely to be part of the interaction networks of known FTD genes, contributing to converging cellular processes. In this review, we examine gene discovery approaches in FTD and introduce network biology concepts as tools to assist gene identification studies in genetically complex disease.
Subject(s)
Frontotemporal Dementia , Frontotemporal Dementia/genetics , Frontotemporal Dementia/pathology , Genetic Linkage , Humans , Mutation , Phenotype
ABSTRACT
Silencers are repressive cis-regulatory elements that play crucial roles in transcriptional regulation. Experimental methods for identifying silencers are costly and time-consuming. Computational methods, which rely on genomic sequence features, have been introduced as alternative approaches. However, silencers do not have a significant epigenomic signature. Therefore, we explore a new way to computationally identify silencers by incorporating chromatin structural information. We propose the SilenceREIN method, which focuses on finding silencers on anchors of chromatin loops. Using graph neural networks, we extracted chromatin structural information from a regulatory element interaction network. SilenceREIN integrates the chromatin structural information with linear genomic signatures to find silencers. The predictive performance of SilenceREIN is comparable to or better than that of other state-of-the-art methods. We performed a genome-wide scan to systematically find silencers in the human genome. The results suggest that silencers are widespread on anchors of chromatin loops. In addition, enrichment analysis of transcription factor binding motifs supports our prediction results. As far as we can tell, this is the first attempt to incorporate chromatin structural information in finding silencers. All datasets and source code of SilenceREIN have been deposited in a GitHub repository (https://github.com/JianHPan/SilenceREIN).
Subject(s)
Chromatin , Silencer Elements, Transcriptional , Humans , Chromatin/genetics , Regulatory Sequences, Nucleic Acid , Genome, Human , Neural Networks, Computer
ABSTRACT
Identifying potential bacteriophage (phage) candidates to treat bacterial infections plays an essential role in the research of human pathogens. Computational approaches are recognized as a valid way to predict bacteria and their target phages. However, most current methods utilize only lower-order biological information without considering the higher-order connectivity patterns, which can help to improve predictive accuracy. Therefore, we developed a novel microbial heterogeneous interaction network (MHIN)-based model called PTBGRP to predict new phages for bacterial hosts. Specifically, PTBGRP first constructs an MHIN by integrating phage-bacteria interaction (PBI) and six bacteria-bacteria interaction networks with their biological attributes. Then, different representation learning methods are deployed to extract higher-level biological features and lower-level topological features from the MHIN. Finally, PTBGRP employs a deep neural network as the classifier to predict unknown PBI pairs based on the fused biological information. Experimental results demonstrated that PTBGRP achieves the best performance on the corresponding ESKAPE pathogens and PBI dataset when compared with state-of-the-art methods. In addition, case studies of Klebsiella pneumoniae and Staphylococcus aureus further indicate that the consideration of rich heterogeneous information enables PTBGRP to accurately predict PBI from a more comprehensive perspective. The webserver of the PTBGRP predictor is freely available at http://120.77.11.78/PTBGRP/.
Subject(s)
Bacteriophages , Staphylococcal Infections , Humans , Learning , Bacteria , Neural Networks, Computer
ABSTRACT
Gene essentiality is defined as the extent to which a gene is required for the survival and reproductive success of a living system. It can vary between genetic backgrounds and environments. Essential protein-coding genes have been well studied. However, the essentiality of non-coding regions is rarely reported, even though most regions of the human genome do not encode proteins. Methods for determining the essentiality of non-coding genes are therefore needed. We developed iEssLnc models, which can assign essentiality scores to lncRNA genes. As far as we know, this is the first direct quantitative estimation of the essentiality of lncRNA genes. By taking advantage of a graph neural network with meta-path-guided random walks on the lncRNA-protein interaction network, iEssLnc models can perform genome-wide screenings for essential lncRNA genes in a quantitative manner. We carried out validations and whole-genome screening in the context of human cancer cell lines and the mouse genome. In comparison with other methods, which are transferred from protein-coding genes, iEssLnc achieved better performance. Enrichment analysis indicated that iEssLnc essentiality scores clustered essential lncRNA genes at high ranks. With the screening results of iEssLnc models, we estimated the number of essential lncRNA genes in human and mouse. Functional analysis found that essential lncRNA genes interact significantly with microRNAs and cytoskeletal proteins, which may be of interest in experimental life sciences. All datasets and code of iEssLnc models have been deposited in GitHub (https://github.com/yyZhang14/iEssLnc).
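As a rough illustration of a meta-path-guided random walk on a lncRNA-protein network, the following sketch constrains each step to neighbors of the node type required by a cyclic meta-path. It is not the iEssLnc code: the node names, the type labels "L" (lncRNA) and "P" (protein), and the meta-path "LP" are all hypothetical choices made for this example.

```python
import random
from typing import Dict, List, Set


def metapath_walk(adj: Dict[str, Set[str]], node_type: Dict[str, str],
                  start: str, metapath: str, length: int,
                  rng: random.Random) -> List[str]:
    """Random walk whose i-th node must have type metapath[i % len(metapath)]."""
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path symbol")
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % len(metapath)]
        # Sort candidates so results are reproducible for a given seed.
        candidates = sorted(n for n in adj[walk[-1]] if node_type[n] == wanted)
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical toy network: two lncRNAs and two proteins.
node_type = {"lnc1": "L", "lnc2": "L", "protA": "P", "protB": "P"}
adj = {
    "lnc1": {"protA"},
    "lnc2": {"protA", "protB"},
    "protA": {"lnc1", "lnc2", "protB"},  # includes a protein-protein edge
    "protB": {"lnc2", "protA"},
}
walk = metapath_walk(adj, node_type, "lnc1", "LP", 6, random.Random(0))
```

Because the meta-path alternates L and P, the protein-protein edge between `protA` and `protB` is never traversed; walks of this kind are typically fed to an embedding model as node "sentences".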
Subject(s)
MicroRNAs , Neoplasms , RNA, Long Noncoding , Humans , Animals , Mice , Protein Interaction Maps , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , MicroRNAs/metabolism , Neural Networks, Computer
ABSTRACT
Cancer genomics is dedicated to elucidating the genes and pathways that contribute to cancer progression and development. Identifying cancer genes (CGs) associated with the initiation and progression of cancer is critical for characterization of molecular-level mechanisms in cancer research. In recent years, the growing availability of high-throughput molecular data and advancements in deep learning technologies have enabled the modelling of complex interactions and topological information within genomic data. Nevertheless, because of the limited labelled data, pinpointing CGs from a multitude of potential mutations remains an exceptionally challenging task. To address this, we propose a novel deep learning framework, termed self-supervised masked graph learning (SMG), which comprises SMG reconstruction (pretext task) and task-specific fine-tuning (downstream task). In the pretext task, the nodes of multi-omic featured protein-protein interaction (PPI) networks are randomly substituted with a defined mask token. The PPI networks are then reconstructed using a graph neural network (GNN)-based autoencoder, which explores the node correlations in a self-prediction manner. In the downstream tasks, the pre-trained GNN encoder embeds the input networks into feature graphs, whereas a task-specific layer proceeds with the final prediction. To assess the performance of the proposed SMG method, benchmarking experiments are performed on three node-level tasks (identification of CGs, essential genes and healthy driver genes) and one graph-level task (identification of disease subnetworks) across eight PPI networks. Benchmarking experiments and performance comparison with existing state-of-the-art methods demonstrate the superiority of SMG on multi-omic feature engineering.
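The masking step of the pretext task can be sketched as follows. This is an illustrative assumption of how random node masking and a masked-node reconstruction loss might look, not the SMG implementation: the gene names, feature values, mask rate and the all-zero mask token are hypothetical, and the GNN autoencoder itself is omitted.

```python
import random
from typing import Dict, List, Set, Tuple


def mask_nodes(features: Dict[str, List[float]], mask_rate: float,
               mask_token: List[float],
               rng: random.Random) -> Tuple[Dict[str, List[float]], Set[str]]:
    """Replace the feature vectors of randomly chosen nodes with a mask token."""
    nodes = sorted(features)
    n_mask = max(1, round(mask_rate * len(nodes)))
    masked = set(rng.sample(nodes, n_mask))
    corrupted = {n: list(mask_token) if n in masked else list(features[n])
                 for n in nodes}
    return corrupted, masked


def masked_mse(original: Dict[str, List[float]],
               reconstructed: Dict[str, List[float]],
               masked: Set[str]) -> float:
    """Mean squared reconstruction error computed on the masked nodes only."""
    errors = [(o - r) ** 2
              for n in masked
              for o, r in zip(original[n], reconstructed[n])]
    return sum(errors) / len(errors)


# Hypothetical two-dimensional multi-omic features for four genes.
features = {
    "TP53": [0.9, 0.1],
    "BRCA1": [0.7, 0.3],
    "GAPDH": [0.2, 0.8],
    "ACTB": [0.1, 0.6],
}
corrupted, masked = mask_nodes(features, 0.5, [0.0, 0.0], random.Random(0))
# An untrained "reconstruction" that just echoes the corrupted input.
loss = masked_mse(features, corrupted, masked)
```

In a full pipeline, an encoder-decoder would take `corrupted` together with the PPI edges and be trained to drive this masked-node loss toward zero.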
Subject(s)
Neoplasms , Oncogenes , Mutation , Benchmarking , Genes, Essential , Genomics , Neoplasms/genetics
ABSTRACT
Spatiotemporally controlled second messengers alter molecular interactions of central signaling nodes to ensure physiological signal transmission. One prototypical second messenger that modulates kinase signal transmission is cyclic adenosine monophosphate (cAMP). The main proteinogenic cellular effectors of cAMP are compartmentalized protein kinase A (PKA) complexes. Their cell-type-specific compositions precisely coordinate substrate phosphorylation and proper signal propagation, which is indispensable for numerous cell-type-specific functions. Here we present evidence that TAF15, which is implicated in the etiology of amyotrophic lateral sclerosis, represents a novel nuclear PKA substrate. In cross-linking and immunoprecipitation experiments (iCLIP) we showed that TAF15 phosphorylation alters the binding to target transcripts related to mRNA maturation, splicing and protein-binding related functions. TAF15 appears to be one of multiple PKA substrates that undergo RNA-binding dynamics upon phosphorylation. We observed that the activation of the cAMP-PKA signaling axis caused a change in the composition of a collection of RNA species that interact with TAF15. This observation appears to be a broader principle in the regulation of molecular interactions, as we identified a significant enrichment of RNA-binding proteins within endogenous PKA complexes. We assume that phosphorylation of RNA-binding domains adds another layer of regulation to binary protein-RNA interactions, with consequences for RNA features including binding specificities, localization, abundance and composition.
Subject(s)
Amyotrophic Lateral Sclerosis , TATA-Binding Protein Associated Factors , Humans , Cyclic AMP-Dependent Protein Kinases , Phosphorylation , Cyclic AMP , RNA
ABSTRACT
Analyzing the interactions of circular RNAs (circRNAs) is a crucial step in understanding their functional impacts. While there are numerous visualization tools available for investigating circRNA interaction networks, these tools are typically limited to known circRNAs from specific databases. Moreover, these existing tools usually require complex installation procedures which can be time-consuming and challenging for users. There is a lack of a user-friendly web application that facilitates interactive exploration and visualization of circRNA interaction networks. CircNetVis is an interactive online web application to enhance the analysis of human/mouse circRNA interactions. The tool allows three different input formats of circRNAs including circRNA IDs from CircBase, circRNA coordinates (chromosome, start position, end position), and circRNA sequences in the FASTA format. It integrates multiple interaction networks for visualization and investigation of the interplay between circRNA, microRNAs, mRNAs and RNA binding proteins. CircNetVis also enables users to interactively explore the interactions of unknown circRNAs which are not reported from previous databases. The tool can generate interactive plots and allows users to save results as output files for offline usage. CircNetVis is implemented as a web application using R-shiny and freely available for academic use at https://www.meb.ki.se/shiny/truvu/CircNetVis/ .
Subject(s)
MicroRNAs , RNA, Circular , Humans , Mice , Animals , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Messenger/genetics , Software , Databases, Factual , Gene Regulatory Networks
ABSTRACT
BACKGROUND: High-throughput experimental technologies can provide deeper insights into pathway perturbations in biomedical studies. Accordingly, their usage is central to the identification of molecular targets and the subsequent development of suitable treatments for various diseases. Classical interpretations of generated data, such as differential gene expression and pathway analyses, disregard interconnections between studied genes when looking for gene-disease associations. Given that these interconnections are central to cellular processes, there has been a recent interest in incorporating them in such studies. The latter allows the detection of gene modules that underlie complex phenotypes in gene interaction networks. Existing methods either impose radius-based restrictions or freely grow modules at the expense of a statistical bias towards large modules. We propose a heuristic method, inspired by Ant Colony Optimization, to apply gene-level scoring and module identification with distance-based search constraints and penalties, rather than radius-based constraints. RESULTS: We test and compare our results to other approaches using three datasets of different neurodegenerative diseases, namely Alzheimer's, Parkinson's, and Huntington's, over three independent experiments. We report the outcomes of enrichment analyses and concordance of gene-level scores for each disease. Results indicate that the proposed approach generally shows superior stability in comparison to existing methods. It produces stable and meaningful enrichment results in all three datasets which have different case to control proportions and sample sizes. CONCLUSION: The presented network-based gene expression analysis approach successfully identifies dysregulated gene modules associated with a certain disease. Using a heuristic based on Ant Colony Optimization, we perform a distance-based search with no radius constraints. Experimental results support the effectiveness and stability of our method in prioritizing modules of high relevance. Our tool is publicly available at github.com/GhadiElHasbani/ACOxGS.git.
Subject(s)
Gene Regulatory Networks , Gene Regulatory Networks/genetics , Humans , Algorithms , Neurodegenerative Diseases/genetics , Gene Expression Profiling/methods , Computational Biology/methods , Animals , Ants/genetics , Databases, Genetic
ABSTRACT
BACKGROUND: Driver genes play a vital role in the development of cancer. Identifying driver genes is critical for diagnosing and understanding cancer. However, challenges remain in identifying personalized driver genes due to tumor heterogeneity. Although many computational methods have been developed to solve this problem, few efforts have been undertaken to explore gene-patient associations to identify personalized driver genes. RESULTS: Here we propose a method called LPDriver to identify personalized cancer driver genes by employing a linear neighborhood propagation model on individual genetic data. LPDriver builds a personalized gene network based on the genetic data of individual patients, extracts the gene-patient associations from the bipartite graph of the personalized gene network and utilizes a linear neighborhood propagation model to mine gene-patient associations to detect personalized driver genes. The experimental results demonstrate that, compared to existing methods, our method shows competitive performance and can predict cancer driver genes more accurately. Furthermore, these results also show that besides revealing novel driver genes that have been reported to be related to cancer, LPDriver is also able to identify personalized cancer driver genes for individual patients by their network characteristics even if the mutation data of genes are hidden. CONCLUSIONS: LPDriver can provide an effective approach to predict personalized cancer driver genes, which could promote the diagnosis and treatment of cancer. The source code and data are freely available at https://github.com/hyr0771/LPDriver .
Subject(s)
Neoplasms , Oncogenes , Humans , Mutation , Gene Regulatory Networks , Linear Models , Patients , Neoplasms/genetics
ABSTRACT
BACKGROUND: The identification of essential proteins can help in understanding the minimum requirements for cell survival and development to discover drug targets and prevent disease. Nowadays, node ranking methods are a common way to identify essential proteins, but the poor data quality of the underlying PIN has somewhat hindered the identification accuracy of essential proteins for these methods in the PIN. Therefore, researchers constructed refinement networks by considering certain biological properties of interacting protein pairs to improve the performance of node ranking methods in the PIN. Studies show that proteins in a complex are more likely to be essential than proteins not present in the complex. However, the modularity is usually ignored for the refinement methods of the PINs. METHODS: Based on this, we proposed a network refinement method based on module discovery and biological information. The idea is, first, to extract the maximal connected subgraph in the PIN, and to divide it into different modules by using Fast-unfolding algorithm; then, to detect critical modules according to the orthologous information, subcellular localization information and topology information within each module; finally, to construct a more refined network (CM-PIN) by using the identified critical modules. RESULTS: To evaluate the effectiveness of the proposed method, we used 12 typical node ranking methods (LAC, DC, DMNC, NC, TP, LID, CC, BC, PR, LR, PeC, WDC) to compare the overall performance of the CM-PIN with those on the S-PIN, D-PIN and RD-PIN. The experimental results showed that the CM-PIN was optimal in terms of the identification number of essential proteins, precision-recall curve, Jackknifing method and other criteria, and can help to identify essential proteins more accurately.
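The first step of the refinement described above, extracting the maximal connected subgraph of the PIN, can be sketched with a breadth-first search. This is an illustrative example on a hypothetical edge list, not the authors' code; the subsequent module division (the Fast-unfolding step) and the critical-module scoring are not shown.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def largest_component(edges: Iterable[Edge]) -> Set[str]:
    """Return the node set of the largest connected component."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen: Set[str] = set()
    best: Set[str] = set()
    for source in adj:
        if source in seen:
            continue
        # Breadth-first search collects every node reachable from source.
        component = {source}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in component:
                    component.add(v)
                    queue.append(v)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def induced_edges(edges: Iterable[Edge], nodes: Set[str]) -> List[Edge]:
    """Keep only the edges whose two endpoints both lie in `nodes`."""
    return [(u, v) for u, v in edges if u in nodes and v in nodes]


# Hypothetical protein interaction list with two disconnected parts.
pin = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "E")]
core_nodes = largest_component(pin)
core_edges = induced_edges(pin, core_nodes)
```

The resulting `core_edges` would then be passed to a community detection routine to obtain the modules from which critical modules are selected.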
Subject(s)
Saccharomyces cerevisiae Proteins , Saccharomyces cerevisiae , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Protein Interaction Mapping/methods , Algorithms , Protein Interaction Maps , Computational Biology/methods
ABSTRACT
The COVID-19 pandemic caused by the SARS-CoV-2 virus infected more than 775,686,716 humans and was responsible for the death of more than 7,054,093 individuals. COVID-19 has taught us that the development of vaccines, repurposing of drugs, and understanding the mechanism of a disease can be done within a short time. COVID-19 proteomics and metabolomics have contributed to its diagnosis, to the understanding of its progression, host-virus interaction, and disease mechanism, and also to the search for suitable anti-COVID therapeutics. Mass spectrometry based proteomics was used to find potential biomarkers of different stages of COVID-19, including severe and nonsevere cases, in the blood serum. Notably, protein-protein interaction techniques to understand host-virus interactions were also significantly useful. Single-cell proteomics studies were carried out to ascertain the changes in immune cell composition and activation in mild COVID-19 patients versus severe COVID-19 patients using whole-blood and peripheral-blood mononuclear cells. Modern technologies were helpful in dealing with the pandemic; however, there is still scope for further development. Further, attempts were made to understand the protein-protein, metabolite-metabolite, and protein-metabolite interactomes, derived from protein and metabolite fingerprints of COVID-19 patients, by reanalysis of public COVID-19 mass spectrometry based proteomics and metabolomics studies. Some of these interactions were supported by the literature as validations in the COVID-19 studies.
Subject(s)
Biomarkers , COVID-19 , Metabolomics , Proteomics , SARS-CoV-2 , Humans , COVID-19/metabolism , COVID-19/virology , COVID-19/blood , Proteomics/methods , Metabolomics/methods , Biomarkers/blood , Mass Spectrometry/methods , Host-Pathogen Interactions , Pandemics
ABSTRACT
Sin3 is an evolutionarily conserved repressor protein complex mainly associated with histone deacetylase (HDAC) activity. Many proteins are part of Sin3/HDAC complexes, and the function of most of these members remains poorly understood. SAP25, a previously identified Sin3A associated protein of 25 kDa, has been proposed to participate in regulating gene expression programs involved in the immune response but the exact mechanism of this regulation is unclear. SAP25 is not expressed in HEK293 cells, which hence serve as a natural knockout system to decipher the molecular functions uniquely carried out by this Sin3/HDAC subunit. Using molecular, proteomic, protein engineering, and interaction network approaches, we show that SAP25 interacts with distinct enzymatic and regulatory protein complexes in addition to Sin3/HDAC. Additional proteins uniquely recovered from the Halo-SAP25 pull-downs included the SCF E3 ubiquitin ligase complex SKP1/FBXO3/CUL1 and the ubiquitin carboxyl-terminal hydrolase 11 (USP11). Furthermore, mutational analysis demonstrates that distinct regions of SAP25 participate in its interaction with USP11, OGT/TETs, and SCF(FBXO3). These results suggest that SAP25 may function as an adaptor protein to coordinate the assembly of different enzymatic complexes to control Sin3/HDAC-mediated gene expression. The data were deposited with the MASSIVE repository with the identifiers MSV000093576 and MSV000093553.
ABSTRACT
BACKGROUND: In cellular activities, essential proteins play a vital role and are instrumental in comprehending fundamental biological necessities and identifying pathogenic genes. Current deep learning approaches for predicting essential proteins underutilize the potential of gene expression data and are inadequate for the exploration of dynamic networks with limited evaluation across diverse species. RESULTS: We introduce ECDEP, an essential protein identification model based on evolutionary community discovery. ECDEP integrates temporal gene expression data with a protein-protein interaction (PPI) network and employs the 3-Sigma rule to eliminate outliers at each time point, constructing a dynamic network. Next, we utilize edge birth and death information to establish an interaction streaming source to feed into the evolutionary community discovery algorithm and then identify overlapping communities during the evolution of the dynamic network. SVM recursive feature elimination (RFE) is applied to extract the most informative communities, which are combined with subcellular localization data for classification predictions. We assess the performance of ECDEP by comparing it against ten centrality methods, four shallow machine learning methods with RFE, and two deep learning methods that incorporate multiple biological data sources on Saccharomyces cerevisiae (S. cerevisiae), Homo sapiens (H. sapiens), Mus musculus, and Caenorhabditis elegans. ECDEP achieves an AP value of 0.86 on the H. sapiens dataset and the contribution ratio of community features in classification reaches 0.54 on the S. cerevisiae (Krogan) dataset. CONCLUSIONS: Our proposed method adeptly integrates network dynamics and yields outstanding results across various datasets. Furthermore, the incorporation of evolutionary community discovery algorithms amplifies the capacity of gene expression data in classification.
Subject(s)
Protein Interaction Maps , Saccharomyces cerevisiae , Animals , Mice , Humans , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Algorithms , Proteins/metabolism , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism
ABSTRACT
Understanding the mechanisms underlying diversity-productivity relationships (DPRs) is crucial to mitigating the effects of forest biodiversity loss. Tree-tree interactions in diverse communities are fundamental in driving growth rates, potentially shaping the emergent DPRs, yet remain poorly explored. Here, using data from a large-scale forest biodiversity experiment in subtropical China, we demonstrated that changes in individual tree productivity were driven by species-specific pairwise interactions, with higher positive net pairwise interaction effects on trees in more diverse neighbourhoods. By perturbing the interaction strengths estimated from empirical data in simulations, we revealed that the positive differences between inter- and intra-specific interactions were the critical determinant for the emergence of positive DPRs. Surprisingly, the condition for positive DPRs corresponded to the condition for coexistence. Our results thus provide a novel insight into how pairwise tree interactions regulate DPRs, with implications for identifying the tree mixtures with maximized productivity to guide forest restoration and reforestation efforts.