Results 1 - 20 of 29
1.
bioRxiv ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38617247

ABSTRACT

Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. While advances in deep learning enable the prediction of accurate protein structural models, RNA structure prediction is not possible at present due to a lack of abundant high-quality reference data. Furthermore, available sequence data are generally not associated with organismal phenotypes that could inform RNA function. We created GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB). GARNET links RNA sequences derived from GTDB genomes to experimental and predicted optimal growth temperatures of GTDB reference organisms. This enables construction of deep and diverse RNA sequence alignments to be used for machine learning. Using GARNET, we define the minimal requirements for a sequence- and structure-aware RNA generative model. We also develop a GPT-like language model for RNA in which triplet tokenization provides optimal encoding. Leveraging hyperthermophilic RNAs in GARNET and these RNA generative models, we identified mutations in ribosomal RNA that confer increased thermostability to the Escherichia coli ribosome. The GTDB-derived data and deep learning models presented here provide a foundation for understanding the connections between RNA sequence, structure, and function.
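The triplet tokenization mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say whether triplets overlap, how a trailing remainder is handled, or what the vocabulary looks like, so the non-overlapping split, the `<unk>` token, and the names `VOCAB`, `triplet_tokenize`, and `encode` are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Hypothetical vocabulary: one fallback token plus all 4**3 = 64 RNA triplets.
VOCAB = {"<unk>": 0}
VOCAB.update(
    {"".join(t): i + 1 for i, t in enumerate(product(NUCLEOTIDES, repeat=3))}
)


def triplet_tokenize(seq: str) -> list[str]:
    """Split an RNA sequence into non-overlapping 3-mers (assumed scheme).

    A trailing remainder of one or two nucleotides is kept as a short token.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def encode(seq: str) -> list[int]:
    """Map each token to an integer id; anything outside VOCAB becomes <unk>."""
    return [VOCAB.get(tok, VOCAB["<unk>"]) for tok in triplet_tokenize(seq)]


if __name__ == "__main__":
    print(triplet_tokenize("GGCAGAUCUGA"))
    print(encode("GGCAGAUCUGA"))
```

With this scheme a sequence of length N becomes roughly N/3 tokens drawn from a 64-word vocabulary, which shortens the context a GPT-like model has to attend over compared with single-nucleotide tokens.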

2.
Nucleic Acids Res ; 52(D1): D590-D596, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37889041

ABSTRACT

CRISPR-Cas enzymes enable RNA-guided bacterial immunity and are widely used for biotechnological applications including genome editing. In particular, the Class 2 CRISPR-associated enzymes (Cas9, Cas12 and Cas13 families) have been deployed for numerous research, clinical and agricultural applications. However, the immense genetic and biochemical diversity of these proteins in the public domain poses a barrier for researchers seeking to leverage their activities. We present CasPEDIA (http://caspedia.org), the Cas Protein Effector Database of Information and Assessment, a curated encyclopedia that integrates enzymatic classification for hundreds of different Cas enzymes across 27 phylogenetic groups spanning the Cas9, Cas12 and Cas13 families, as well as evolutionarily related IscB and TnpB proteins. All enzymes in CasPEDIA were annotated with a standard workflow based on their primary nuclease activity, target requirements and guide-RNA design constraints. Our functional classification scheme, CasID, is described alongside current phylogenetic classification, allowing users to search related orthologs by enzymatic function and sequence similarity. CasPEDIA is a comprehensive data portal that summarizes and contextualizes enzymatic properties of widely used Cas enzymes, equipping users with valuable resources to foster biotechnological development. CasPEDIA complements phylogenetic Cas nomenclature and enables researchers to leverage the multi-faceted nucleic-acid targeting rules of diverse Class 2 Cas enzymes.


Subject(s)
CRISPR-Associated Proteins , CRISPR-Cas Systems , Databases, Genetic , Endodeoxyribonucleases , CRISPR-Cas Systems/genetics , Phylogeny , CRISPR-Associated Proteins/chemistry , CRISPR-Associated Proteins/classification , CRISPR-Associated Proteins/genetics , Endodeoxyribonucleases/chemistry , Endodeoxyribonucleases/classification , Endodeoxyribonucleases/genetics , Encyclopedias as Topic
3.
Nat Commun ; 14(1): 8432, 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38114465

ABSTRACT

Sparse and short-lived excited RNA conformational states are essential players in cell physiology, disease, and therapeutic development, yet determining their 3D structures remains challenging. Combining mutagenesis, NMR spectroscopy, and computational modeling, we determined the 3D structural ensemble formed by a short-lived (lifetime ~2.1 ms) lowly-populated (~0.4%) conformational state in HIV-1 TAR RNA. Through a strand register shift, the excited conformational state completely remodels the 3D structure of the ground state (RMSD from the ground state = 7.2 ± 0.9 Å), forming a surprisingly more ordered conformational ensemble rich in non-canonical mismatches. The structure impedes the formation of the motifs recognized by Tat and the super elongation complex, explaining why this alternative TAR conformation cannot activate HIV-1 transcription. The ability to determine the 3D structures of fleeting RNA states using the presented methodology holds great promise for our understanding of RNA biology, disease mechanisms, and the development of RNA-targeting therapeutics.


Subject(s)
RNA, Viral , RNA, Viral/genetics , RNA, Viral/chemistry , Nucleic Acid Conformation , Magnetic Resonance Spectroscopy , Mutagenesis
4.
Nucleic Acids Res ; 51(22): 12414-12427, 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-37971304

ABSTRACT

RNA-guided endonucleases form the crux of diverse biological processes and technologies, including adaptive immunity, transposition, and genome editing. Some of these enzymes are components of insertion sequences (IS) in the IS200/IS605 and IS607 transposon families. Both IS families encode a TnpA transposase and a TnpB nuclease, an RNA-guided enzyme ancestral to CRISPR-Cas12s. In eukaryotes, TnpB homologs occur as two distinct types, Fanzor1s and Fanzor2s. We analyzed the evolutionary relationships between prokaryotic TnpBs and eukaryotic Fanzors, which revealed that both Fanzor1s and Fanzor2s stem from a single lineage of IS607 TnpBs with unusual active site arrangement. The widespread nature of Fanzors implies that the properties of this particular lineage of IS607 TnpBs were particularly suited to adaptation in eukaryotes. Biochemical analysis of an IS607 TnpB and Fanzor1s revealed common strategies employed by TnpBs and Fanzors to co-evolve with their cognate transposases. Collectively, our results provide a new model of sequential evolution from IS607 TnpBs to Fanzor2s, and Fanzor2s to Fanzor1s that details how genes of prokaryotic origin evolve to give rise to new protein families in eukaryotes.


Subject(s)
Bacteria , Endonucleases , Evolution, Molecular , Bacteria/enzymology , Bacteria/genetics , DNA Transposable Elements , Endonucleases/genetics , Endonucleases/metabolism , Prokaryotic Cells/enzymology , Transposases/metabolism , Eukaryotic Cells/enzymology
5.
J Am Chem Soc ; 145(42): 22964-22978, 2023 Oct 25.
Article in English | MEDLINE | ID: mdl-37831584

ABSTRACT

Knowing the 3D structures formed by the various conformations populating the RNA free-energy landscape, their relative abundance, and kinetic interconversion rates is required to obtain a quantitative and predictive understanding of how RNAs fold and function at the atomic level. While methods integrating ensemble-averaged experimental data with computational modeling are helping define the most abundant conformations in RNA ensembles, elucidating their kinetic rates of interconversion and determining the 3D structures of sparsely populated short-lived RNA excited conformational states (ESs) remains challenging. Here, we developed an approach integrating Rosetta-FARFAR RNA structure prediction with NMR residual dipolar couplings and relaxation dispersion that simultaneously determines the 3D structures formed by the ground-state (GS) and ES subensembles, their relative abundance, and kinetic rates of interconversion. The approach is demonstrated on HIV-1 TAR, whose six-nucleotide apical loop was previously shown to form a sparsely populated (∼13%) short-lived (lifetime ∼ 45 µs) ES. In the GS, the apical loop forms a broad distribution of open conformations interconverting on the pico-to-nanosecond time scale. Most residues are unpaired and preorganized to bind the Tat-superelongation protein complex. The apical loop zips up in the ES, forming a narrow distribution of closed conformations, which sequester critical residues required for protein recognition. Our work introduces an approach for determining the 3D ensemble models formed by sparsely populated RNA conformational states, provides a rare atomic view of an RNA ES, and kinetically resolves the atomic 3D structures of RNA conformational substates, interchanging on time scales spanning 6 orders of magnitude, from picoseconds to microseconds.


Subject(s)
Proteins , RNA , RNA/chemistry , Nuclear Magnetic Resonance, Biomolecular , Magnetic Resonance Spectroscopy , Nucleic Acid Conformation , Proteins/genetics
6.
bioRxiv ; 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37609353

ABSTRACT

RNA-guided endonucleases form the crux of diverse biological processes and technologies, including adaptive immunity, transposition, and genome editing. Some of these enzymes are components of insertion sequences (IS) in the IS200/IS605 and IS607 transposon families. Both IS families encode a TnpA transposase and a TnpB nuclease, an RNA-guided enzyme ancestral to CRISPR-Cas12. In eukaryotes and their viruses, TnpB homologs occur as two distinct types, Fanzor1 and Fanzor2. We analyzed the evolutionary relationships between prokaryotic TnpBs and eukaryotic Fanzors, revealing that a clade of IS607 TnpBs with unusual active site arrangement found primarily in Cyanobacteriota likely gave rise to both types of Fanzors. The widespread nature of Fanzors implies that the properties of this particular group of IS607 TnpBs were particularly suited to adaptation and evolution in eukaryotes and their viruses. Experimental characterization of a prokaryotic IS607 TnpB and virally encoded Fanzor1s uncovered features that may have fostered coevolution between TnpBs/Fanzors and their cognate transposases. Our results provide insight into the evolutionary origins of a ubiquitous family of RNA-guided proteins that shows remarkable conservation across domains of life.

7.
Nat Chem Biol ; 19(7): 900-910, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37095237

ABSTRACT

Replicative errors contribute to the genetic diversity needed for evolution but in high frequency can lead to genomic instability. Here, we show that DNA dynamics determine the frequency of misincorporating the A•G mismatch, and altered dynamics explain the high frequency of 8-oxoguanine (8OG) A•8OG misincorporation. NMR measurements revealed that Aanti•Ganti (population (pop.) of >91%) transiently forms sparsely populated and short-lived Aanti+•Gsyn (pop. of ~2% and kex = kforward + kreverse of ~137 s⁻¹) and Asyn•Ganti (pop. of ~6% and kex of ~2,200 s⁻¹) Hoogsteen conformations. 8OG redistributed the ensemble, rendering Aanti•8OGsyn the dominant state. A kinetic model in which Aanti+•Gsyn is misincorporated quantitatively predicted the dA•dGTP misincorporation kinetics by human polymerase β, the pH dependence of misincorporation and the impact of the 8OG lesion. Thus, 8OG increases replicative errors relative to G because oxidation of guanine redistributes the ensemble in favor of the mutagenic Aanti•8OGsyn Hoogsteen state, which exists transiently and in low abundance in the A•G mismatch.
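The exchange parameters quoted in this abstract (a minor-state population and kex = kforward + kreverse) can be unpacked into individual rate constants with a short calculation. The sketch below assumes each minor state exchanges with the major state as an independent two-state process, which is a simplification of the three-state ensemble the abstract describes; the function name `two_state_rates` is hypothetical.

```python
def two_state_rates(p_minor: float, kex: float) -> tuple[float, float, float]:
    """Split kex into forward and reverse rates for two-state exchange.

    Assumes major <-> minor exchange at equilibrium, where
    p_minor = k_forward / (k_forward + k_reverse) and kex = k_forward + k_reverse.
    Returns (k_forward, k_reverse, minor-state lifetime in seconds).
    """
    if not 0.0 < p_minor < 1.0:
        raise ValueError("p_minor must be a fraction between 0 and 1")
    k_forward = p_minor * kex          # major -> minor
    k_reverse = (1.0 - p_minor) * kex  # minor -> major
    lifetime_minor = 1.0 / k_reverse   # mean lifetime of the minor state
    return k_forward, k_reverse, lifetime_minor


if __name__ == "__main__":
    # Approximate values from the abstract: (population, kex in s^-1).
    for label, pop, kex in [("Aanti+-Gsyn", 0.02, 137.0), ("Asyn-Ganti", 0.06, 2200.0)]:
        kf, kr, tau = two_state_rates(pop, kex)
        print(f"{label}: kf = {kf:.2f} s^-1, kr = {kr:.2f} s^-1, lifetime = {tau * 1e3:.2f} ms")
```

Because the populations are small, kreverse dominates kex in both cases, so the lifetime of each minor state is close to 1/kex.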


Subject(s)
DNA Damage , DNA , Humans , Base Pairing , DNA/chemistry , Mutagenesis
8.
Nucleic Acids Res ; 50(22): 12689-12701, 2022 Dec 09.
Article in English | MEDLINE | ID: mdl-36537251

ABSTRACT

CRISPR-Cas12a is an RNA-guided, programmable genome editing enzyme found within bacterial adaptive immune pathways. Unlike CRISPR-Cas9, Cas12a uses only a single catalytic site to both cleave target double-stranded DNA (dsDNA) (cis-activity) and indiscriminately degrade single-stranded DNA (ssDNA) (trans-activity). To investigate how the relative potency of cis- versus trans-DNase activity affects Cas12a-mediated genome editing, we first used structure-guided engineering to generate variants of Lachnospiraceae bacterium Cas12a that selectively disrupt trans-activity. The resulting engineered mutant with the biggest differential between cis- and trans-DNase activity in vitro showed minimal genome editing activity in human cells, motivating a second set of experiments using directed evolution to generate additional mutants with robust genome editing activity. Notably, these engineered and evolved mutants had enhanced ability to induce homology-directed repair (HDR) editing by 2-18-fold compared to wild-type Cas12a when using HDR donors containing mismatches with crRNA at the PAM-distal region. Finally, a site-specific reversion mutation produced improved Cas12a (iCas12a) variants with superior genome editing efficiency at genomic sites that are difficult to edit using wild-type Cas12a. This strategy establishes a pipeline for creating improved genome editing tools by combining structural insights with randomization and selection. The available structures of other CRISPR-Cas enzymes will enable this strategy to be applied to improve the efficacy of other genome-editing proteins.


Subject(s)
CRISPR-Cas Systems , Gene Editing , Humans , Bacterial Proteins/metabolism , CRISPR-Cas Systems/genetics , DNA , DNA, Single-Stranded/genetics , Gene Editing/methods , CRISPR-Associated Proteins , Endodeoxyribonucleases
9.
Proc Natl Acad Sci U S A ; 119(30): e2200681119, 2022 Jul 26.
Article in English | MEDLINE | ID: mdl-35857870

ABSTRACT

The majority of base pairs in double-stranded DNA exist in the canonical Watson-Crick geometry. However, they can also adopt alternate Hoogsteen conformations in various complexes of DNA with proteins and small molecules, which are key for biological function and mechanism. While detection of Hoogsteen base pairs in large DNA complexes and assemblies poses considerable challenges for traditional structural biology techniques, we show here that multidimensional dynamic nuclear polarization-enhanced solid-state NMR can serve as a unique spectroscopic tool for observing and distinguishing Watson-Crick and Hoogsteen base pairs in a broad range of DNA systems based on characteristic NMR chemical shifts and internuclear dipolar couplings. We illustrate this approach using a model 12-mer DNA duplex, free and in complex with the antibiotic echinomycin, which features two central adenine-thymine base pairs with Watson-Crick and Hoogsteen geometry, respectively, and subsequently extend it to the ∼200 kDa Widom 601 DNA nucleosome core particle.


Subject(s)
Base Pairing , DNA , Magnetic Resonance Spectroscopy , Adenine/chemistry , Adenine/metabolism , DNA/chemistry , Echinomycin/chemistry , Magnetic Resonance Spectroscopy/methods , Thymine/chemistry
10.
Proc Natl Acad Sci U S A ; 119(24): e2112496119, 2022 Jun 14.
Article in English | MEDLINE | ID: mdl-35671421

ABSTRACT

Thermodynamic preferences to form non-native conformations are crucial for understanding how nucleic acids fold and function. However, they are difficult to measure experimentally because this requires accurately determining the population of minor low-abundance (<10%) conformations in a sea of other conformations. Here, we show that melting experiments enable facile measurements of thermodynamic preferences to adopt non-native conformations in DNA and RNA. The key to this "delta-melt" approach is to use chemical modifications to render specific minor non-native conformations the major state. The validity and robustness of delta-melt is established for four different non-native conformations under various physiological conditions and sequence contexts through independent measurements of thermodynamic preferences using NMR. Delta-melt is faster than NMR, simple, and cost-effective and enables thermodynamic preferences to be measured for exceptionally low-populated conformations. Using delta-melt, we obtained rare insights into conformational cooperativity, obtaining evidence for significant cooperativity (1.0 to 2.5 kcal/mol) when simultaneously forming two adjacent Hoogsteen base pairs. We also measured the thermodynamic preferences to form G-C+ and A-T Hoogsteen and A-T base open states for nearly all 16 trinucleotide sequence contexts and found distinct sequence-specific variations on the order of 2 to 3 kcal/mol. This rich landscape of sequence-specific non-native minor conformations in the DNA double helix may help shape the sequence specificity of DNA biochemistry. Thus, melting experiments can now be used to access thermodynamic information regarding regions of the free energy landscape of biomolecules beyond the native folded and unfolded conformations.
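To relate the populations and free energies quoted in this abstract, the standard two-state Boltzmann relation can be sketched as follows. This is not the delta-melt analysis itself; it only converts between a minor-state population and a free-energy difference relative to the major state. The temperature of 25 °C and the function names are assumptions for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def population_to_dg(p_minor: float, temp_k: float = 298.15) -> float:
    """Free energy (kcal/mol) of the minor state relative to the major state.

    Two-state assumption: K = p_minor / (1 - p_minor), dG = -RT ln K.
    """
    if not 0.0 < p_minor < 1.0:
        raise ValueError("p_minor must be a fraction between 0 and 1")
    return -R_KCAL * temp_k * math.log(p_minor / (1.0 - p_minor))


def dg_to_population(dg: float, temp_k: float = 298.15) -> float:
    """Inverse relation: minor-state population for a given dG (kcal/mol)."""
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temp_k)))


if __name__ == "__main__":
    for p in (0.10, 0.01, 0.001):
        print(f"population {p:.3f} -> dG = {population_to_dg(p):.2f} kcal/mol")
```

Under these assumptions a 10% population corresponds to about 1.3 kcal/mol at 25 °C, so sequence-specific variations of 2 to 3 kcal/mol imply changes in population of well over an order of magnitude.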


Subject(s)
DNA , Nucleic Acid Conformation , RNA , Base Sequence , DNA/chemistry , Freezing , RNA/chemistry , Thermodynamics , Ultraviolet Rays
11.
Nucleic Acids Res ; 49(21): 12540-12555, 2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34792150

ABSTRACT

Watson-Crick base pairs (bps) are the fundamental unit of genetic information and the building blocks of the DNA double helix. However, A-T and G-C can also form alternative 'Hoogsteen' bps, expanding the functional complexity of DNA. We developed 'Hoog-finder', which uses structural fingerprints to rapidly screen Hoogsteen bps, which may have been mismodeled as Watson-Crick in crystal structures of protein-DNA complexes. We uncovered 17 Hoogsteen bps, 7 of which were in complex with 6 proteins never before shown to bind Hoogsteen bps. The Hoogsteen bps occur near mismatches, nicks and lesions and some appear to participate in recognition and damage repair. Our results suggest a potentially broad role for Hoogsteen bps in stressed regions of the genome and call for a community-wide effort to identify these bps in current and future crystal structures of DNA and its complexes.


Subject(s)
Base Pairing , DNA-Binding Proteins/chemistry , DNA/chemistry , Nucleic Acid Conformation , Protein Domains , Base Sequence , Binding Sites/genetics , Computational Biology/methods , Crystallography, X-Ray , DNA/genetics , DNA/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Databases, Genetic , Hydrogen Bonding , Models, Molecular , Mutation , Protein Binding , Thermodynamics
12.
Nat Commun ; 12(1): 5201, 2021 Aug 31.
Article in English | MEDLINE | ID: mdl-34465779

ABSTRACT

N6-methyladenosine (m6A) is a post-transcriptional modification that controls gene expression by recruiting proteins to RNA sites. The modification also slows biochemical processes through mechanisms that are not understood. Using temperature-dependent (20°C-65°C) NMR relaxation dispersion, we show that m6A pairs with uridine with the methylamino group in the anti conformation to form a Watson-Crick base pair that transiently exchanges on the millisecond timescale with a singly hydrogen-bonded low-populated (1%) mismatch-like conformation in which the methylamino group is syn. This ability to rapidly interchange between Watson-Crick or mismatch-like forms, combined with different syn:anti isomer preferences when paired (~1:100) versus unpaired (~10:1), explains how m6A robustly slows duplex annealing without affecting melting at elevated temperatures via two pathways in which isomerization occurs before or after duplex annealing. Our model quantitatively predicts how m6A reshapes the kinetic landscape of nucleic acid hybridization and conformational transitions, and provides an explanation for why the modification robustly slows diverse cellular processes.


Subject(s)
Adenosine/analogs & derivatives , DNA/chemistry , DNA/metabolism , Adenosine/chemistry , Adenosine/genetics , Adenosine/metabolism , Base Pairing , DNA/genetics , Hydrogen Bonding , Kinetics , Models, Molecular , Nucleic Acid Conformation , Nucleic Acid Hybridization , RNA Processing, Post-Transcriptional , Uridine/chemistry , Uridine/genetics , Uridine/metabolism
13.
J Phys Chem B ; 125(28): 7613-7627, 2021 Jul 22.
Article in English | MEDLINE | ID: mdl-34236202

ABSTRACT

Measuring the strength of the hydrogen bonds between DNA base pairs is of vital importance for understanding how our genetic code is physically accessed and recognized in cells, particularly during replication and transcription. Therefore, it is important to develop probes for these key hydrogen bonds (H-bonds) that dictate events critical to cellular function, such as the localized melting of DNA. The vibrations of carbonyl bonds are well-known probes of their H-bonding environment, and their signals can be observed with infrared (IR) spectroscopy. Yet, pinpointing a single bond of interest in the complex IR spectrum of DNA is challenging due to the large number of carbonyl signals that overlap with each other. Here, we develop a method using isotope editing and infrared (IR) spectroscopy to isolate IR signals from the thymine (T) C2═O carbonyl. We use solvatochromatic studies to show that the TC2═O signal's position in the IR spectrum is sensitive to the H-bonding capacity of the solvent. Our results indicate that C2═O of a single T base within DNA duplexes experiences weak H-bonding interactions. This finding is consistent with the existence of a third, noncanonical CH···O H-bond between adenine and thymine in both Watson-Crick and Hoogsteen base pairs in DNA.


Subject(s)
DNA , Isotopes , Hydrogen , Hydrogen Bonding , Spectrum Analysis
14.
Curr Opin Struct Biol ; 70: 16-25, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33836446

ABSTRACT

Nucleic acids do not fold into a single conformation, and dynamic ensembles are needed to describe their propensities to cycle between different conformations when performing cellular functions. We review recent advances in solution-state nuclear magnetic resonance (NMR) methods and their integration with computational techniques that are improving the ability to probe the dynamic ensembles of DNA and RNA. These include computational approaches for predicting chemical shifts from structure and generating conformational libraries from sequence, measurements of exact nuclear Overhauser effects, development of new probes to study chemical exchange using relaxation dispersion, faster and more sensitive real-time NMR techniques, and new NMR approaches to tackle large nucleic acid assemblies. We discuss how these advances are leading to new mechanistic insights into gene expression and regulation.


Subject(s)
Nucleic Acids , DNA , Magnetic Resonance Spectroscopy , Nuclear Magnetic Resonance, Biomolecular , RNA
15.
Magn Reson (Gott) ; 2(2): 715-731, 2021.
Article in English | MEDLINE | ID: mdl-37905209

ABSTRACT

In duplex DNA, Watson-Crick A-T and G-C base pairs (bp's) exist in dynamic equilibrium with an alternative Hoogsteen conformation, which is low in abundance and short-lived. Measuring how the Hoogsteen dynamics varies across different DNA sequences, structural contexts and physiological conditions is key for identifying potential Hoogsteen hot spots and for understanding the potential roles of Hoogsteen base pairs in DNA recognition and repair. However, such studies are hampered by the need to prepare 13C or 15N isotopically enriched DNA samples for NMR relaxation dispersion (RD) experiments. Here, using SELective Optimized Proton Experiments (SELOPE) 1H CEST experiments employing high-power radiofrequency fields (B1 > 250 Hz) targeting imino protons, we demonstrate accurate and robust characterization of Watson-Crick to Hoogsteen exchange, without the need for isotopic enrichment of the DNA. For 13 residues in three DNA duplexes under different temperature and pH conditions, the exchange parameters deduced from high-power imino 1H CEST were in very good agreement with counterparts measured using off-resonance 13C / 15N spin relaxation in the rotating frame (R1ρ). It is shown that 1H-1H NOE effects which typically introduce artifacts in 1H-based measurements of chemical exchange can be effectively suppressed by selective excitation, provided that the relaxation delay is short (≤ 100 ms). The 1H CEST experiment can be performed with ∼ 10× higher throughput and ∼ 100× lower cost relative to 13C / 15N R1ρ and enabled Hoogsteen chemical exchange measurements undetectable by R1ρ. The results reveal an increased propensity to form Hoogsteen bp's near terminal ends and a diminished propensity within A-tract motifs. The 1H CEST experiment provides a basis for rapidly screening Hoogsteen breathing in duplex DNA, enabling identification of unusual motifs for more in-depth characterization.

16.
RNA ; 27(1): 12-26, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33028652

ABSTRACT

Identifying small molecules that selectively bind an RNA target while discriminating against all other cellular RNAs is an important challenge in RNA-targeted drug discovery. Much effort has been directed toward identifying drug-like small molecules that minimize electrostatic and stacking interactions that lead to nonspecific binding of aminoglycosides and intercalators to many stem-loop RNAs. Many such compounds have been reported to bind RNAs and inhibit their cellular activities. However, target engagement and cellular selectivity assays are not routinely performed, and it is often unclear whether functional activity directly results from specific binding to the target RNA. Here, we examined the propensities of three drug-like compounds, previously shown to bind and inhibit the cellular activities of distinct stem-loop RNAs, to bind and inhibit the cellular activities of two unrelated HIV-1 stem-loop RNAs: the transactivation response element (TAR) and the rev response element stem IIB (RREIIB). All compounds bound TAR and RREIIB in vitro, and two inhibited TAR-dependent transactivation and RRE-dependent viral export in cell-based assays while also exhibiting off-target interactions consistent with nonspecific activity. A survey of X-ray and NMR structures of RNA-small molecule complexes revealed that aminoglycosides and drug-like molecules form hydrogen bonds with functional groups commonly accessible in canonical stem-loop RNA motifs, in contrast to ligands that specifically bind riboswitches. Our results demonstrate that drug-like molecules can nonspecifically bind stem-loop RNAs most likely through hydrogen bonding and electrostatic interactions and reinforce the importance of assaying for off-target interactions and RNA selectivity in vitro and in cells when assessing novel RNA-binders.


Subject(s)
Aminoglycosides/pharmacology , Genes, env/drug effects , HIV Long Terminal Repeat/drug effects , RNA, Viral/antagonists & inhibitors , Small Molecule Libraries/pharmacology , Aminoglycosides/chemistry , Aminoglycosides/metabolism , Base Pairing , Base Sequence , Binding Sites , Biological Assay , Drug Discovery , HIV-1/drug effects , HIV-1/genetics , HIV-1/metabolism , Humans , Hydrogen Bonding , Isoquinolines/chemistry , Isoquinolines/metabolism , Isoquinolines/pharmacology , Nucleic Acid Conformation , Pentamidine/chemistry , Pentamidine/metabolism , Pentamidine/pharmacology , RNA, Viral/genetics , RNA, Viral/metabolism , Small Molecule Libraries/chemistry , Small Molecule Libraries/metabolism , Static Electricity , Transcriptional Activation/drug effects , Yohimbine/chemistry , Yohimbine/metabolism , Yohimbine/pharmacology
17.
Nat Commun ; 11(1): 5531, 2020 Nov 02.
Article in English | MEDLINE | ID: mdl-33139729

ABSTRACT

Biomolecules form dynamic ensembles of many inter-converting conformations which are key for understanding how they fold and function. However, determining ensembles is challenging because the information required to specify atomic structures for thousands of conformations far exceeds that of experimental measurements. We addressed this data gap and dramatically simplified and accelerated RNA ensemble determination by using structure prediction tools that leverage the growing database of RNA structures to generate a conformation library. Refinement of this library with NMR residual dipolar couplings provided an atomistic ensemble model for HIV-1 TAR, and the model accuracy was independently supported by comparisons to quantum-mechanical calculations of NMR chemical shifts, comparison to a crystal structure of a substate, and through designed ensemble redistribution via atomic mutagenesis. Applications to TAR bulge variants and more complex tertiary RNAs support the generality of this approach and the potential to make the determination of atomic-resolution RNA ensembles routine.


Subject(s)
Cheminformatics/methods , HIV-1/chemistry , RNA Folding , RNA, Viral/ultrastructure , HIV Long Terminal Repeat , HIV-1/genetics , HIV-1/ultrastructure , Models, Chemical , Molecular Dynamics Simulation , Nuclear Magnetic Resonance, Biomolecular , RNA, Viral/chemistry , RNA, Viral/genetics
18.
Nature ; 587(7833): 291-296, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33087930

ABSTRACT

Transcription factors recognize specific genomic sequences to regulate complex gene-expression programs. Although it is well-established that transcription factors bind to specific DNA sequences using a combination of base readout and shape recognition, some fundamental aspects of protein-DNA binding remain poorly understood. Many DNA-binding proteins induce changes in the structure of the DNA outside the intrinsic B-DNA envelope. However, how the energetic cost that is associated with distorting the DNA contributes to recognition has proven difficult to study, because the distorted DNA exists in low abundance in the unbound ensemble. Here we use a high-throughput assay that we term SaMBA (saturation mismatch-binding assay) to investigate the role of DNA conformational penalties in transcription factor-DNA recognition. In SaMBA, mismatched base pairs are introduced to pre-induce structural distortions in the DNA that are much larger than those induced by changes in the Watson-Crick sequence. Notably, approximately 10% of mismatches increased transcription factor binding, and for each of the 22 transcription factors that were examined, at least one mismatch was found that increased the binding affinity. Mismatches also converted non-specific sites into high-affinity sites, and high-affinity sites into 'super sites' that exhibit stronger affinity than any known canonical binding site. Determination of high-resolution X-ray structures, combined with nuclear magnetic resonance measurements and structural analyses, showed that many of the DNA mismatches that increase binding induce distortions that are similar to those induced by protein binding, thus prepaying some of the energetic cost incurred from deforming the DNA. Our work indicates that conformational penalties are a major determinant of protein-DNA recognition, and reveals mechanisms by which mismatches can recruit transcription factors and thus modulate replication and repair activities in the cell.


Subject(s)
DNA-Binding Proteins/chemistry , Molecular Conformation , Nucleic Acid Heteroduplexes/chemistry , Arabidopsis Proteins/chemistry , Base Pairing , Binding Sites , Crystallography, X-Ray , Humans , Models, Molecular , Mutation , Nuclear Magnetic Resonance, Biomolecular , Protein Binding , Saccharomyces cerevisiae Proteins/chemistry , Thermodynamics , Transcription Factors/chemistry
19.
Nucleic Acids Res ; 48(21): 12365-12379, 2020 Dec 02.
Article in English | MEDLINE | ID: mdl-33104789

ABSTRACT

2'-O-Methyl (Nm) is a highly abundant post-transcriptional RNA modification that plays important biological roles through mechanisms that are not entirely understood. There is evidence that Nm can alter the biological activities of RNAs by biasing the ribose sugar pucker equilibrium toward the C3'-endo conformation formed in canonical duplexes. However, little is known about how Nm might more broadly alter the dynamic ensembles of flexible RNAs containing bulges and internal loops. Here, using NMR and the HIV-1 transactivation response (TAR) element as a model system, we show that Nm preferentially stabilizes alternative secondary structures in which the Nm-modified nucleotides are paired, increasing both the abundance and lifetime of low-populated short-lived excited states by up to 10-fold. The extent of stabilization increased with number of Nm modifications and was also dependent on Mg2+. Through phi-value analysis, the Nm modification also provided rare insights into the structure of the transition state for conformational exchange. Our results suggest that Nm could alter the biological activities of Nm-modified RNAs by modulating their secondary structural ensembles as well as establish the utility of Nm as a tool for the discovery and characterization of RNA excited state conformations.


Subject(s)
HIV Long Terminal Repeat , Magnesium/chemistry , RNA Processing, Post-Transcriptional , RNA, Viral/chemistry , Base Pairing , Cations, Divalent , Density Functional Theory , HIV-1/chemistry , Magnesium/metabolism , Magnetic Resonance Spectroscopy , Methylation , Nucleic Acid Conformation , RNA Stability , RNA, Viral/genetics , RNA, Viral/metabolism , Thermodynamics
20.
J Biol Chem ; 295(47): 15933-15947, 2020 Nov 20.
Article in English | MEDLINE | ID: mdl-32913127

ABSTRACT

As the Watson-Crick faces of nucleobases are protected in dsDNA, it is commonly assumed that deleterious alkylation damage to the Watson-Crick faces of nucleobases predominantly occurs when DNA becomes single-stranded during replication and transcription. However, damage to the Watson-Crick faces of nucleobases has been reported in dsDNA in vitro through mechanisms that are not understood. In addition, the extent of protection from methylation damage conferred by dsDNA relative to ssDNA has not been quantified. Watson-Crick base pairs in dsDNA exist in dynamic equilibrium with Hoogsteen base pairs that expose the Watson-Crick faces of purine nucleobases to solvent. Whether this can influence the damage susceptibility of dsDNA remains unknown. Using dot-blot and primer extension assays, we measured the susceptibility of adenine-N1 to methylation by dimethyl sulfate (DMS) when in an A-T Watson-Crick versus Hoogsteen conformation. Relative to unpaired adenines in a bulge, Watson-Crick A-T base pairs in dsDNA only conferred ∼130-fold protection against adenine-N1 methylation, and this protection was reduced to ∼40-fold for A(syn)-T Hoogsteen base pairs embedded in a DNA-drug complex. Our results indicate that Watson-Crick faces of nucleobases are accessible to alkylating agents in canonical dsDNA and that Hoogsteen base pairs increase this accessibility. Given the higher abundance of dsDNA relative to ssDNA, these results suggest that dsDNA could be a substantial source of cytotoxic damage. The work establishes DMS probing as a method for characterizing A(syn)-T Hoogsteen base pairs in vitro and also lays the foundation for a sequencing approach to map A(syn)-T Hoogsteen and unpaired adenines genome-wide in vivo.
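The two fold-protection values in this abstract share the same baseline (unpaired adenines in a bulge), so they can be combined into a single relative reactivity. The short sketch below does only that arithmetic; the function name is hypothetical and the inputs are the approximate values quoted above.

```python
def relative_reactivity(protection_reference: float, protection_other: float) -> float:
    """Reactivity of 'other' relative to 'reference'.

    Both arguments are fold-protection values measured against the same
    unpaired baseline, so reactivity scales as 1 / protection.
    """
    if protection_reference <= 0 or protection_other <= 0:
        raise ValueError("fold-protection values must be positive")
    return protection_reference / protection_other


if __name__ == "__main__":
    watson_crick = 130.0  # ~130-fold protection (approximate, from the abstract)
    hoogsteen = 40.0      # ~40-fold protection (approximate, from the abstract)
    ratio = relative_reactivity(watson_crick, hoogsteen)
    print(f"Hoogsteen adenine-N1 is ~{ratio:.2f}x more reactive than Watson-Crick")
```

With the quoted values, an A(syn)-T Hoogsteen base pair comes out roughly three times more susceptible to adenine-N1 methylation than a Watson-Crick A-T pair, which matches the abstract's statement that Hoogsteen base pairs increase accessibility.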


Subject(s)
Base Pairing , DNA Methylation , DNA/chemistry , Sulfuric Acid Esters/chemistry