ABSTRACT
Bacteria use a wide range of immune pathways to counter phage infection. A subset of these genes shares homology with components of eukaryotic immune systems, suggesting that eukaryotes horizontally acquired certain innate immune genes from bacteria. Here, we show that proteins containing a NACHT module, the central feature of the animal nucleotide-binding domain and leucine-rich repeat containing gene family (NLRs), are found in bacteria and defend against phages. NACHT proteins are widespread in bacteria, provide immunity against both DNA and RNA phages, and display the characteristic C-terminal sensor, central NACHT, and N-terminal effector modules. Some bacterial NACHT proteins have domain architectures similar to the human NLRs that are critical components of inflammasomes. Human disease-associated NLR mutations that cause stimulus-independent activation of the inflammasome also activate bacterial NACHT proteins, supporting a shared signaling mechanism. This work establishes that NACHT module-containing proteins are ancient mediators of innate immunity across the tree of life.
Subject(s)
Bacteria , Bacteriophages , NLR Proteins , Animals , Humans , Bacteria/genetics , Bacteria/metabolism , Bacteria/virology , Bacteriophages/genetics , Bacteriophages/metabolism , Immunity, Innate , Inflammasomes/metabolism , NLR Proteins/genetics , Bacterial Proteins
ABSTRACT
Over the past two decades, studies have revealed profound evolutionary connections between prokaryotic and eukaryotic immune systems, challenging the notion of their unrelatedness. Immune systems across the tree of life share an operational framework, shaping their biochemical logic and evolutionary trajectories. The diversification of immune genes in the prokaryotic superkingdoms, followed by lateral transfer to eukaryotes, was central to the emergence of innate immunity in the latter. These include protein domains related to nucleotide second messenger-dependent systems, NAD+/nucleotide degradation, and P-loop NTPase domains of the STAND and GTPase clades playing pivotal roles in eukaryotic immunity and inflammation. Moreover, several domains orchestrating programmed cell death, ultimately of prokaryotic provenance, suggest an intimate link between immunity and the emergence of multicellularity in eukaryotes such as animals. While eukaryotes directly adopted some proteins from bacterial immune systems, they repurposed others for new immune functions from bacterial interorganismal conflict systems. These emerging immune components hold substantial biotechnological potential.
ABSTRACT
Stalled ribosomes are rescued by pathways that recycle the ribosome and target the nascent polypeptide for degradation. In E. coli, these pathways are triggered by ribosome collisions through the recruitment of SmrB, a nuclease that cleaves the mRNA. In B. subtilis, the related protein MutS2 was recently implicated in ribosome rescue. Here we show that MutS2 is recruited to collisions by its SMR and KOW domains, and we reveal the interaction of these domains with collided ribosomes by cryo-EM. Using a combination of in vivo and in vitro approaches, we show that MutS2 uses its ABC ATPase activity to split ribosomes, targeting the nascent peptide for degradation through the ribosome quality control pathway. However, unlike SmrB, which cleaves mRNA in E. coli, we see no evidence that MutS2 mediates mRNA cleavage or promotes ribosome rescue by tmRNA. These findings clarify the biochemical and cellular roles of MutS2 in ribosome rescue in B. subtilis and raise questions about how these pathways function differently in diverse bacteria.
Subject(s)
Bacillus subtilis , Protein Biosynthesis , RNA, Messenger/metabolism , Bacillus subtilis/genetics , Bacillus subtilis/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Ribosomes/metabolism , Peptides/metabolism
ABSTRACT
Ribosome rescue pathways recycle stalled ribosomes and target problematic mRNAs and aborted proteins for degradation [1,2]. In bacteria, it remains unclear how rescue pathways distinguish ribosomes stalled in the middle of a transcript from actively translating ribosomes [3-6]. Here, using a genetic screen in Escherichia coli, we discovered a new rescue factor that has endonuclease activity. SmrB cleaves mRNAs upstream of stalled ribosomes, allowing the ribosome rescue factor tmRNA (which acts on truncated mRNAs [3]) to rescue upstream ribosomes. SmrB is recruited to ribosomes and is activated by collisions. Cryo-electron microscopy structures of collided disomes from E. coli and Bacillus subtilis show distinct and conserved arrangements of individual ribosomes and the composite SmrB-binding site. These findings reveal the underlying mechanisms by which ribosome collisions trigger ribosome rescue in bacteria.
Subject(s)
Escherichia coli , Ribosomes , Bacteria/genetics , Cryoelectron Microscopy , Escherichia coli/genetics , Escherichia coli/metabolism , Protein Biosynthesis , RNA, Bacterial/metabolism , RNA, Messenger/metabolism , Ribosomes/metabolism
ABSTRACT
While nucleic acid-targeting effectors are known to be central to biological conflicts and anti-selfish element immunity, recent findings have revealed immune effectors that target their building blocks and the cellular energy currency: free nucleotides. Through comparative genomics and sequence-structure analysis, we identified several distinct effector domains, which we named Calcineurin-CE, HD-CE, and PRTase-CE. These domains, along with specific versions of the ParB and MazG domains, are widely present in diverse prokaryotic immune systems and are predicted to degrade nucleotides by targeting phosphate or glycosidic linkages. Our findings unveil multiple potential immune systems associated with at least 17 different functional themes featuring these effectors. Some of these systems sense modified DNA/nucleotides from phages or operate downstream of novel enzymes generating signaling nucleotides. We also uncovered a class of systems utilizing HSP90- and HSP70-related modules as analogs of STAND and GTPase domains that are coupled to these nucleotide-targeting- or proteolysis-induced complex-forming effectors. While widespread in bacteria, only a limited subset of nucleotide-targeting effectors was integrated into eukaryotic immune systems, suggesting barriers to interoperability across subcellular contexts. This work establishes nucleotide-degrading effectors as an emerging immune paradigm and traces their origins back to homologous domains in housekeeping systems.
Subject(s)
Nucleic Acids , Nucleotides , Nucleotides/metabolism , Bacteria/metabolism , Prokaryotic Cells/metabolism , Genomics , Nucleic Acids/metabolism
ABSTRACT
Ribosomal surveillance pathways scan for ribosomes that are transiently paused or terminally stalled owing to structural elements in mRNAs or nascent chain sequences [1,2]. Some stalls in budding yeast are sensed by the GTPase Hbs1, which loads Dom34, a catalytically inactive member of the archaeo-eukaryotic release factor 1 superfamily. Hbs1-Dom34 and the ATPase Rli1 dissociate stalled ribosomes into 40S and 60S subunits. However, the 60S subunits retain the peptidyl-tRNA nascent chains, which recruit the ribosome quality control complex that consists of Rqc1-Rqc2-Ltn1-Cdc48-Ufd1-Npl4. Nascent chains ubiquitylated by the E3 ubiquitin ligase Ltn1 are extracted from the 60S subunit by the ATPase Cdc48-Ufd1-Npl4 and presented to the 26S proteasome for degradation [3-9]. Failure to degrade the nascent chains leads to protein aggregation and proteotoxic stress in yeast and neurodegeneration in mice [10-14]. Despite intensive investigations on the ribosome quality control pathway, it is not known how the tRNA is hydrolysed from the ubiquitylated nascent chain before its degradation. Here we show that the Cdc48 adaptor Vms1 is a peptidyl-tRNA hydrolase. Similar to classical eukaryotic release factor 1, Vms1 activity is dependent on a conserved catalytic glutamine. Evolutionary analysis indicates that yeast Vms1 is the founding member of a clade of eukaryotic release factor 1 homologues that we designate the Vms1-like release factor 1 clade.
Subject(s)
Carboxylic Ester Hydrolases/metabolism , Carrier Proteins/metabolism , Ribosomes/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/enzymology , Amino Acid Sequence , Biocatalysis , Carboxylic Ester Hydrolases/chemistry , Carboxylic Ester Hydrolases/genetics , Carrier Proteins/chemistry , Carrier Proteins/genetics , Catalytic Domain/genetics , Glutamine/genetics , Glutamine/metabolism , Humans , Nucleocytoplasmic Transport Proteins/metabolism , Point Mutation , Proteasome Endopeptidase Complex/metabolism , RNA, Transfer/metabolism , RNA-Binding Proteins/metabolism , Ribosome Subunits, Large, Eukaryotic/metabolism , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/genetics , Staphylococcal Protein A/metabolism , Ubiquitin-Protein Ligases/metabolism , Ubiquitination , Valosin Containing Protein/metabolism , Vesicular Transport Proteins/metabolism
ABSTRACT
Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.
Subject(s)
Databases, Genetic , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/genetics , Transcriptome/genetics , Cells, Cultured , Conserved Sequence/genetics , Datasets as Topic , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Gene Expression Profiling , Gene Expression Regulation , Genome, Human/genetics , Genome-Wide Association Study , Genomics , Humans , Internet , Molecular Sequence Annotation , Organ Specificity/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Quantitative Trait Loci/genetics , RNA Stability , RNA, Messenger/genetics
ABSTRACT
ABC ATPases form one of the largest clades of P-loop NTPase fold enzymes that catalyze ATP-hydrolysis and utilize its free energy for a staggering range of functions from transport to nucleoprotein dynamics. Using sensitive sequence and structure analysis with comparative genomics, for the first time we provide a comprehensive classification of the ABC ATPase superfamily. ABC ATPases developed structural hallmarks that unambiguously distinguish them from other P-loop NTPases such as an alternative to arginine-finger-based catalysis. At least five and up to eight distinct clades of ABC ATPases are reconstructed as being present in the last universal common ancestor. They underwent distinct phases of structural innovation with the emergence of inserts constituting conserved binding interfaces for proteins or nucleic acids and the adoption of a unique dimeric toroidal configuration for DNA-threading. Specifically, several clades have also extensively radiated in counter-invader conflict systems where they serve as nodal nucleotide-dependent sensory and energetic components regulating a diversity of effectors (including some previously unrecognized) acting independently or together with restriction-modification systems. We present a unified mechanism for ABC ATPase function across disparate systems like RNA editing, translation, metabolism, DNA repair, and biological conflicts, and some unexpected recruitments, such as MutS ATPases in secondary metabolism.
Subject(s)
ATP-Binding Cassette Transporters , Adenosine Triphosphatases , Evolution, Molecular , ATP-Binding Cassette Transporters/chemistry , ATP-Binding Cassette Transporters/classification , ATP-Binding Cassette Transporters/physiology , Adenosine Triphosphatases/chemistry , Adenosine Triphosphatases/classification , Adenosine Triphosphatases/physiology , Bacteria/enzymology , Eukaryota/enzymology , Nucleoproteins/metabolism
ABSTRACT
Nucleotide-activated effector deployment, prototyped by interferon-dependent immunity, is a common mechanistic theme shared by immune systems of several animals and prokaryotes. Prokaryotic versions include CRISPR-Cas with the CRISPR polymerase domain, their minimal variants, and systems with second messenger oligonucleotide or dinucleotide synthetase (SMODS). Cyclic or linear oligonucleotide signals in these systems help set a threshold for the activation of potentially deleterious downstream effectors in response to invader detection. We establish such a regulatory mechanism to be a more general principle of immune systems, which can also operate independently of such messengers. Using sensitive sequence analysis and comparative genomics, we identify 12 new prokaryotic immune systems, which we unify by this principle of threshold-dependent effector activation. These display regulatory mechanisms paralleling physiological signaling based on 3'-5' cyclic mononucleotides, NAD+-derived messengers, two- and one-component signaling that includes histidine kinase-based signaling, and proteolytic activation. Furthermore, these systems allowed the identification of multiple new signal-sensing components, such as a tetratricopeptide repeat (TPR) scaffold predicted to recognize NAD+-derived signals, unreported versions of the STING domain, prokaryotic YEATS domains, and a predicted nucleotide sensor related to receiver domains. We also identify previously unrecognized invader detection components and effector components, such as prokaryotic versions of the Wnt domain. Finally, we show that there have been multiple acquisitions of unidentified STING domains in eukaryotes, while the TPR scaffold was incorporated into the animal immunity/apoptosis signal-regulating kinase (ASK) signalosome.
IMPORTANCE
Both prokaryotic and eukaryotic immune systems face the dangers of premature activation of effectors and degradation of self-molecules in the absence of an invader. To mitigate this, they have evolved threshold-setting regulatory mechanisms for the triggering of effectors only upon the detection of a sufficiently strong invader signal. This work defines general templates for such regulation in effector-based immune systems. Using this, we identify several previously uncharacterized prokaryotic immune mechanisms that accomplish the regulation of downstream effector deployment by using nucleotide, NAD+-derived, two-component, and one-component signals paralleling physiological homeostasis. This study has also helped identify several previously unknown sensor and effector modules in these systems. Our findings also augment the growing evidence for the emergence of key animal immunity and chromatin regulatory components from prokaryotic progenitors.
Subject(s)
Bacteria/genetics , Bacteria/immunology , Bacterial Proteins/immunology , Eukaryota/immunology , Amino Acid Sequence , Bacteria/chemistry , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Eukaryota/genetics , Genomics , Immune System , Nucleotides/chemistry , Nucleotides/immunology , Sequence Alignment
ABSTRACT
A diverse collection of enzymes comprising the protocatechuate dioxygenases (PCADs) has been characterized in several extradiol aromatic compound degradation pathways. Structural studies have shown a relationship between PCADs and the more broadly-distributed, functionally enigmatic Memo domain linked to several human diseases. To better understand the evolution of this PCAD-Memo protein superfamily, we explored their structural and functional determinants to establish a unified evolutionary framework, identifying 15 clearly-delineable families, including a previously-underappreciated diversity in five Memo clade families. We place the superfamily's origin within the greater radiation of the nucleoside phosphorylase/hydrolase-peptide/amidohydrolase fold prior to the last universal common ancestor of all extant organisms. In addition to identifying active-site residues across the superfamily, we describe three distinct, structurally-variable regions emanating from the core scaffold often housing conserved residues specific to individual families. These were predicted to contribute to the active-site pocket, potentially in substrate specificity and allosteric regulation. We also identified several previously-undescribed conserved genome contexts, providing insight into potentially novel substrates in PCAD clade families. We extend known conserved contextual associations for the Memo clade beyond previously-described associations with the AMMECR1 domain and a radical S-adenosylmethionine family domain. These observations point to two distinct yet potentially overlapping contexts wherein the elusive molecular function of the Memo domain could be finally resolved, thereby linking it to nucleotide base and aliphatic isoprenoid modification. In total, this report throws light on the functions of large swaths of the experimentally-uncharacterized PCAD-Memo families.
Subject(s)
Dioxygenases/chemistry , Dioxygenases/metabolism , Multigene Family , S-Adenosylmethionine/metabolism , Amino Acid Sequence , Catalytic Domain , Dioxygenases/genetics , Humans , Models, Molecular , Oxidation-Reduction , Protein Conformation , Sequence Homology , Substrate Specificity
ABSTRACT
Over the past decade, modifications to microRNAs (miRNAs) via 3' end nucleotide addition have gone from a deep-sequencing curiosity to experimentally confirmed drivers of a range of regulatory activities. Here we overview the methods that have been deployed by researchers seeking to untangle these diverse functional roles and include characterizing not only the nucleotidyl transferases catalyzing the additions but also the nucleotides being added, and the timing of their addition during the miRNA pathway. These methods and their further development are key to clarifying the diverse and sometimes contradictory functional findings presently attributed to these nucleotide additions.
Subject(s)
MicroRNAs/chemistry , Computational Biology , Genome, Human , Humans , MicroRNAs/physiology , RNA 3' End Processing , Sequence Analysis, RNA
ABSTRACT
Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
Subject(s)
Atlases as Topic , Molecular Sequence Annotation , Promoter Regions, Genetic/genetics , Transcriptome/genetics , Animals , Cell Line , Cells, Cultured , Cluster Analysis , Conserved Sequence/genetics , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Genes, Essential/genetics , Genome/genetics , Humans , Mice , Open Reading Frames/genetics , Organ Specificity , RNA, Messenger/analysis , RNA, Messenger/genetics , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic/genetics
ABSTRACT
Enhancers control the correct temporal and cell-type-specific activation of gene expression in multicellular eukaryotes. Knowing their properties, regulatory activity and targets is crucial to understand the regulation of differentiation and homeostasis. Here we use the FANTOM5 panel of samples, covering the majority of human tissues and cell types, to produce an atlas of active, in vivo-transcribed enhancers. We show that enhancers share properties with CpG-poor messenger RNA promoters but produce bidirectional, exosome-sensitive, relatively short unspliced RNAs, the generation of which is strongly related to enhancer activity. The atlas is used to compare regulatory programs between different cells at unprecedented depth, to identify disease-associated regulatory single nucleotide polymorphisms, and to classify cell-type-specific and ubiquitous enhancers. We further explore the utility of enhancer redundancy, which explains gene expression strength rather than expression patterns. The online FANTOM5 enhancer atlas represents a unique resource for studies on cell-type-specific enhancers and gene regulation.
Subject(s)
Atlases as Topic , Enhancer Elements, Genetic/genetics , Gene Expression Regulation/genetics , Molecular Sequence Annotation , Organ Specificity , Cell Line , Cells, Cultured , Cluster Analysis , Genetic Predisposition to Disease/genetics , HeLa Cells , Humans , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Transcription Initiation Site , Transcription Initiation, Genetic
ABSTRACT
Spliceostatin A (SSA) is a methyl ketal derivative of FR901464, a potent antitumor compound isolated from a culture broth of Pseudomonas sp no. 2663. These compounds selectively bind to the essential spliceosome component SF3b, a subcomplex of the U2 snRNP, to inhibit pre-mRNA splicing. However, the mechanism of SSA's antitumor activity is unknown. It is noteworthy that SSA causes accumulation of a truncated form of the CDK inhibitor protein p27 translated from CDKN1B pre-mRNA, which is involved in SSA-induced cell-cycle arrest. However, it is still unclear whether pre-mRNAs are uniformly exported from the nucleus following SSA treatment. We performed RNA-seq analysis on nuclear and cytoplasmic fractions of SSA-treated cells. Our statistical analyses showed that intron retention is the major consequence of SSA treatment, and a small number of intron-containing pre-mRNAs leak into the cytoplasm. Using a series of reporter plasmids to investigate the roles of intronic sequences in the pre-mRNA leakage, we showed that the strength of the 5' splice site affects pre-mRNA leakage. Additionally, we found that the level of pre-mRNA leakage is related to transcript length. These results suggest that the strength of the 5' splice site and the length of the transcripts are determinants of the pre-mRNA leakage induced by SF3b inhibitors.
Subject(s)
Cyclin-Dependent Kinase Inhibitor p27/genetics , Neoplasms/genetics , Pyrans/pharmacology , Sequence Analysis, RNA/methods , Spiro Compounds/pharmacology , Cell Nucleus/genetics , Cytoplasm/genetics , Gene Expression Regulation, Neoplastic/drug effects , HeLa Cells , Humans , RNA Precursors/genetics , RNA Splicing
ABSTRACT
The evolution of release factors catalyzing the hydrolysis of the final peptidyl-tRNA bond and the release of the polypeptide from the ribosome has been a longstanding paradox. While the components of the translation apparatus are generally well-conserved across extant life, structurally unrelated release factor peptidyl hydrolases (RF-PHs) emerged in the stems of the bacterial and archaeo-eukaryotic lineages. We analyze the diversification of RF-PH domains within the broader evolutionary framework of the translation apparatus. Thus, we reconstruct the possible state of translation termination in the Last Universal Common Ancestor with possible tRNA-like terminators. Further, evolutionary trajectories of the several auxiliary release factors in ribosome quality control (RQC) and rescue pathways point to multiple independent solutions to this problem and frequent transfers between superkingdoms, including the recently characterized ArfT, which is more widely distributed across life than previously appreciated. The eukaryotic RQC system was pieced together from components with disparate provenance, which include the long-sought-after Vms1/ANKZF1 RF-PH of bacterial origin. We also uncover an under-appreciated evolutionary driver of innovation in rescue pathways: effectors deployed in biological conflicts that target the ribosome. At least three rescue pathways (centered on the prfH/RFH, baeRF-1, and C12orf65 RF-PH domains) were likely innovated in response to such conflicts.
Subject(s)
Carboxylic Ester Hydrolases/genetics , Peptide Chain Termination, Translational , Peptide Termination Factors/genetics , Ribosomes/genetics , Amino Acid Sequence , Animals , Carboxylic Ester Hydrolases/chemistry , Carboxylic Ester Hydrolases/metabolism , Evolution, Molecular , Humans , Models, Molecular , Peptide Termination Factors/chemistry , Peptide Termination Factors/metabolism , Phylogeny , Protein Biosynthesis , Protein Domains , Ribosomes/metabolism
ABSTRACT
RNA is targeted in biological conflicts by enzymatic toxins or effectors. A vast diversity of systems which repair or 'heal' this damage has only recently become apparent. Here, we summarize the known effectors, their modes of action, and RNA targets before surveying the diverse systems which counter this damage from a comparative genomics viewpoint. RNA-repair systems show a modular organization with extensive shuffling and displacement of the constituent domains; however, a general 'syntax' is strongly maintained whereby systems typically contain: an RNA ligase (either ATP-grasp or RtcB superfamilies), nucleotidyltransferases, enzymes modifying RNA-termini for ligation (phosphatases and kinases) or protection (methylases), and scaffold or cofactor proteins. We highlight poorly-understood or previously-uncharacterized repair systems and components, e.g. potential scaffolding cofactors (Rot/TROVE and SPFH/Band-7 modules) with their respective cognate non-coding RNAs (YRNAs and a novel tRNA-like molecule) and a novel nucleotidyltransferase associating with diverse ligases. These systems have been extensively disseminated by lateral transfer between distant prokaryotic and microbial eukaryotic lineages, consistent with intense inter-organismal conflict. Components have also often been 'institutionalized' for non-conflict roles, e.g. in RNA-splicing and in RNAi systems (e.g. in kinetoplastids) which combine a distinct family of RNA-acting prim-pol domains with DICER-like proteins.
Subject(s)
RNA/metabolism , Animals , Base Sequence , Biocatalysis , Evolution, Molecular , Humans , Ligases/metabolism , Protein Domains
ABSTRACT
Enzymatic effectors targeting nucleic acids, proteins and other cellular components are the mainstay of conflicts across life forms. Using comparative genomics we identify a large class of eukaryotic proteins, which include effectors from oomycetes, fungi and other parasites. The majority of these proteins have a characteristic domain architecture with one of several N-terminal 'Header' domains, which are predicted to play a role in trafficking of these effectors, including a novel version of the Ubiquitin fold. The Headers are followed by one or more diverse C-terminal domains, such as restriction endonuclease (REase), protein kinase, HNH endonuclease, LK-nuclease (a RNase) and multiple distinct peptidase domains, which are predicted to carry their toxicity determinants. The most common types of these proteins appear to have originated from prokaryotic transposases (e.g. TN7 and Mu) and combine a CDC6/ORC1-STAND clade NTPase domain with a C-terminal REase domain. Other than the so-called Crinkler effectors of oomycetes and fungi, these effectors are encoded by other eukaryotic parasites such as trypanosomatids (the RHS proteins) and the rhizarian Plasmodiophora, and symbionts like Capsaspora. Remarkably, we also find these proteins in free-living eukaryotes, including several viridiplantae, fungi, amoebozoans and animals. These versions might either still be transposons or function in other poorly understood eukaryote-specific inter-organismal and inter-genomic conflicts. These include the Medea1 selfish element of Tribolium that spreads via post-zygotic killing. We present a unified mechanism for the recombination-dependent diversification and action of this widespread class of molecular weaponry deployed across diverse conflicts ranging from parasitic to free-living forms.
Subject(s)
Eukaryota/enzymology , Protein Domains/genetics , Protein Transport/genetics , Proteins/metabolism , Toxins, Biological/chemistry , Amoebozoa/enzymology , Animals , DNA Restriction Enzymes/metabolism , Fungi/enzymology , Genomics/methods , Oomycetes/enzymology , Proteins/ultrastructure , Tribolium/enzymology
ABSTRACT
Intense biological conflicts between prokaryotic genomes and their genomic parasites have resulted in an arms race in terms of the molecular "weaponry" deployed on both sides. Using a recursive computational approach, we uncovered a remarkable class of multidomain proteins with 2 to 15 domains in the same polypeptide deployed by viruses and plasmids in such conflicts. Domain architectures and genomic contexts indicate that they are part of a widespread conflict strategy involving proteins injected into the host cell along with parasite DNA during the earliest phase of infection. Their unique feature is the combination of domains with highly disparate biochemical activities in the same polypeptide; accordingly, we term them polyvalent proteins. Of the 131 domains in polyvalent proteins, a large fraction are enzymatic domains predicted to modify proteins, target nucleic acids, alter nucleotide signaling/metabolism, and attack peptidoglycan or cytoskeletal components. They further contain nucleic acid-binding domains, virion structural domains, and 40 novel uncharacterized domains. Analysis of their architectural network reveals both pervasive common themes and specialized strategies for conjugative elements and plasmids or (pro)phages. The themes include likely processing of multidomain polypeptides by zincin-like metallopeptidases and mechanisms to counter restriction or CRISPR/Cas systems and jump-start transcription or replication. DNA-binding domains acquired by eukaryotes from such systems have been reused in XPC/RAD4-dependent DNA repair and mitochondrial genome replication in kinetoplastids. 
Characterization of the novel domains discovered here, such as RNases and peptidases, is likely to aid in the development of new reagents and elucidation of the spread of antibiotic resistance.
IMPORTANCE
This is the first report of the widespread presence of large proteins, termed polyvalent proteins, predicted to be transmitted by genomic parasites such as conjugative elements, plasmids, and phages during the initial phase of infection along with their DNA. They are typified by the presence of multiple domains with disparate activities combined in the same protein. While some of these domains are predicted to assist the invasive element in replication, transcription, or protection of their DNA, several are likely to target various host defense systems or modify the host to favor the parasite's life cycle. Notably, DNA-binding domains from these systems have been transferred to eukaryotes, where they have been incorporated into DNA repair and mitochondrial genome replication systems.
Subject(s)
Bacteriophages/genetics , Peptides/genetics , Peptides/metabolism , Plasmids , Computational Biology , Evolution, Molecular , Protein Domains
ABSTRACT
RNA interference (RNAi) pathways have evolved as important modulators of gene expression that operate in the cytoplasm by degrading RNA target molecules through the activity of short (21-30 nucleotide) RNAs. RNAi components have been reported to have a role in the nucleus, as they are involved in epigenetic regulation and heterochromatin formation. However, although RNAi-mediated post-transcriptional gene silencing is well documented, the mechanisms of RNAi-mediated transcriptional gene silencing and, in particular, the role of RNAi components in chromatin dynamics, especially in animal multicellular organisms, are elusive. Here we show that the key RNAi components Dicer 2 (DCR2) and Argonaute 2 (AGO2) associate with chromatin (with a strong preference for euchromatic, transcriptionally active, loci) and interact with the core transcription machinery. Notably, loss of function of DCR2 or AGO2 showed that transcriptional defects are accompanied by the perturbation of RNA polymerase II positioning on promoters. Furthermore, after heat shock, both Dcr2 and Ago2 null mutations, as well as missense mutations that compromise the RNAi activity, impaired the global dynamics of RNA polymerase II. Finally, the deep sequencing of the AGO2-associated small RNAs (AGO2 RIP-seq) revealed that AGO2 is strongly enriched in small RNAs that encompass the promoter regions and other regions of heat-shock and other genetic loci on both the sense and antisense DNA strands, but with a strong bias for the antisense strand, particularly after heat shock. Taken together, our results show that DCR2 and AGO2 are globally associated with transcriptionally active loci and may have a pivotal role in shaping the transcriptome by controlling the processivity of RNA polymerase II.
Subject(s)
Argonaute Proteins/metabolism , Chromatin/genetics , Drosophila Proteins/metabolism , Drosophila melanogaster/genetics , Gene Expression Regulation , RNA Helicases/metabolism , RNA Interference , Ribonuclease III/metabolism , Transcription, Genetic , Animals , Argonaute Proteins/deficiency , Argonaute Proteins/genetics , Chromatin/metabolism , Drosophila Proteins/deficiency , Drosophila Proteins/genetics , HSP70 Heat-Shock Proteins/genetics , Heat-Shock Response/genetics , MicroRNAs/genetics , MicroRNAs/metabolism , Promoter Regions, Genetic/genetics , Protein Binding , RNA Helicases/deficiency , RNA Helicases/genetics , RNA Polymerase II/metabolism , RNA, Double-Stranded/genetics , RNA, Double-Stranded/metabolism , RNA-Binding Proteins/metabolism , Ribonuclease III/deficiency , Ribonuclease III/genetics , Transcription Factors
ABSTRACT
Cyclic di- and linear oligo-nucleotide signals activate defenses against invasive nucleic acids in animal immunity; however, their evolutionary antecedents are poorly understood. Using comparative genomics, sequence and structure analysis, we uncovered a vast network of systems defined by conserved prokaryotic gene-neighborhoods, which encode enzymes generating such nucleotides or alternatively processing them to yield potential signaling molecules. The nucleotide-generating enzymes include several clades of the DNA-polymerase β-like superfamily (including Vibrio cholerae DncV), a minimal version of the CRISPR polymerase and DisA-like cyclic-di-AMP synthetases. Nucleotide-binding/processing domains include TIR domains and members of a superfamily prototyped by Smf/DprA proteins and base (cytokinin)-releasing LOG enzymes. They are combined in conserved gene-neighborhoods with genes for a plethora of protein superfamilies, which we predict to function as nucleotide-sensors and effectors targeting nucleic acids, proteins or membranes (pore-forming agents). These systems are sometimes combined with other biological conflict-systems such as restriction-modification and CRISPR/Cas. Interestingly, several are coupled in mutually exclusive neighborhoods with either a prokaryotic ubiquitin-system or a HORMA domain-PCH2-like AAA+ ATPase dyad. The latter are potential precursors of equivalent proteins in eukaryotic chromosome dynamics. Further, components from these nucleotide-centric systems have been utilized in several other systems including a novel diversity-generating system with a reverse transcriptase. We also found the Smf/DprA/LOG domain from these systems to be recruited as a predicted nucleotide-binding domain in eukaryotic TRPM channels. These findings point to evolutionary and mechanistic links, which bring together CRISPR/Cas, animal interferon-induced immunity, and several other systems that combine nucleic-acid-sensing and nucleotide-dependent signaling.