1 - 20 of 82
1.
Hum Mol Genet ; 2024 Apr 03.
Article En | MEDLINE | ID: mdl-38569558

While many disease-associated single nucleotide polymorphisms (SNPs) are expression quantitative trait loci (eQTLs), a large proportion of genome-wide association study (GWAS) variants are of unknown function. Alternative polyadenylation (APA) plays an important role in posttranscriptional regulation by allowing genes to shorten or extend 3' untranslated regions (UTRs). We hypothesized that genetic variants that affect APA in lung tissue may lend insight into the function of respiratory associated GWAS loci. We generated alternative polyadenylation (apa) QTLs using RNA sequencing and whole genome sequencing on 1241 subjects from the Lung Tissue Research Consortium (LTRC) as part of the NHLBI TOPMed project. We identified 56 179 APA sites corresponding to 13 582 unique genes after filtering out APA sites with low usage. We found that a total of 8831 APA sites were associated with at least one SNP with q-value < 0.05. The genomic distribution of lead APA SNPs indicated that the majority are intronic variants (33%), followed by downstream gene variants (26%), 3' UTR variants (17%), and upstream gene variants (within 1 kb region upstream of transcriptional start site, 10%). APA sites in 193 genes colocalized with GWAS data for at least one phenotype. Genes containing the top APA sites associated with GWAS variants include membrane associated ring-CH-type finger 2 (MARCHF2), nectin cell adhesion molecule 2 (NECTIN2), and butyrophilin subfamily 3 member A2 (BTN3A2). Overall, these findings suggest that APA may be an important mechanism for genetic variants in lung function and chronic obstructive pulmonary disease (COPD).

2.
bioRxiv ; 2024 Mar 29.
Article En | MEDLINE | ID: mdl-38585719

The SARS-CoV-2 frameshifting element (FSE) has been intensely studied and explored as a therapeutic target for coronavirus diseases including COVID-19. Besides the intriguing virology, this small RNA is known to adopt many length-dependent conformations, as verified by multiple experimental and computational approaches. However, the role these alternative conformations play in the frameshifting mechanism and how to quantify this structural abundance have been ongoing challenges. Here, we show by DMS and dual-luciferase functional assays that previously predicted FSE mutants (using the RAG graph theory approach) suppress structural transitions and abolish frameshifting. Furthermore, correlated mutation analysis of DMS data by three programs (DREEM, DRACO, and DANCE-MaP) reveals important differences in their estimation of specific RNA conformations, suggesting caution in the interpretation of such complex conformational landscapes. Overall, the abolished frameshifting in three different mutants confirms that all alternative conformations play a role in the pathways of ribosomal transition.

3.
bioRxiv ; 2023 Jul 27.
Article En | MEDLINE | ID: mdl-37546801

Regulation of codon optimality is an increasingly appreciated layer of cell- and tissue-specific protein expression control. Here, we use codon-modified reporters to show that differentiation of Drosophila neural stem cells into neurons enables protein expression from rare-codon-enriched genes. From a candidate screen, we identify the cytoplasmic polyadenylation element binding (CPEB) protein Orb2 as a positive regulator of rare-codon-dependent expression in neurons. Using RNA sequencing, we reveal that Orb2-upregulated mRNAs in the brain with abundant Orb2 binding sites have a rare-codon bias. From these Orb2-regulated mRNAs, we demonstrate that rare-codon enrichment is important for expression control and social behavior function of the metabotropic glutamate receptor (mGluR). Our findings reveal a molecular mechanism by which neural stem cell differentiation shifts genetic code regulation to enable critical mRNA and protein expression.

4.
Mol Biol Cell ; 34(12): ar118, 2023 Nov 01.
Article En | MEDLINE | ID: mdl-37647143

Production of large amounts of histone proteins during S phase is critical for proper chromatin formation and genome integrity. This process is achieved in part by the presence of multiple copies of replication dependent (RD) histone genes that occur in one or more clusters in metazoan genomes. In addition, RD histone gene clusters are associated with a specialized nuclear body, the histone locus body (HLB), which facilitates efficient transcription and 3' end-processing of RD histone mRNA. How all five RD histone genes within these clusters are coordinately regulated such that neither too few nor too many histones are produced, a process referred to as histone homeostasis, is not fully understood. Here, we explored the mechanisms of coordinate regulation between multiple RD histone loci in Drosophila melanogaster and Drosophila virilis. We provide evidence for functional competition between endogenous and ectopic transgenic histone arrays located at different chromosomal locations in D. melanogaster that helps maintain proper histone mRNA levels. Consistent with this model, in both species we found that individual histone gene arrays can independently assemble an HLB that results in active histone transcription. Our findings suggest a role for HLB assembly in coordinating RD histone gene expression to maintain histone homeostasis.


Drosophila Proteins , Drosophila , Animals , Drosophila/metabolism , Histones/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Homeostasis , RNA, Messenger/genetics , RNA, Messenger/metabolism
5.
RNA ; 29(5): 691-704, 2023 05.
Article En | MEDLINE | ID: mdl-36792358

Although not canonically polyadenylated, the long noncoding RNA MALAT1 (metastasis-associated lung adenocarcinoma transcript 1) is stabilized by a highly conserved 76-nt triple helix structure on its 3' end. The entire MALAT1 transcript is over 8000 nt long in humans. The strongest structural conservation signal in MALAT1 (as measured by covariation of base pairs) is in the triple helix structure. Primary sequence analysis of covariation alone does not reveal the degree of structural conservation of the entire full-length transcript, however. Furthermore, RNA structure is often context dependent; RNA binding proteins that are differentially expressed in different cell types may alter structure. We investigate here the in-cell and cell-free structures of the full-length human and green monkey (Chlorocebus sabaeus) MALAT1 transcripts in multiple tissue-derived cell lines using SHAPE chemical probing. Our data reveal levels of uniform structural conservation in different cell lines, in cells and cell-free, and even between species, despite significant differences in primary sequence. The uniformity of the structural conservation across the entire transcript suggests that, despite seeing covariation signals only in the triple helix junction of the lncRNA, the rest of the transcript's structure is remarkably conserved, at least in primates and across multiple cell types and conditions.


RNA, Long Noncoding , Animals , Humans , Chlorocebus aethiops , RNA, Long Noncoding/metabolism , Base Pairing , Cell Line , RNA Stability , Cell Proliferation , Cell Line, Tumor
6.
Nucleic Acids Res ; 50(17): 10078-10092, 2022 09 23.
Article En | MEDLINE | ID: mdl-36062555

Due to genome segmentation, rotaviruses must co-package eleven distinct genomic RNAs. The packaging is mediated by virus-encoded RNA chaperones, such as the rotavirus NSP2 protein. While the activities of distinct RNA chaperones are well studied on smaller RNAs, little is known about their global effect on the entire viral transcriptome. Here, we used Selective 2'-hydroxyl Acylation Analyzed by Primer Extension and Mutational Profiling (SHAPE-MaP) to examine the secondary structure of the rotavirus transcriptome in the presence of increasing amounts of NSP2. SHAPE-MaP data reveals that despite the well-documented helix-unwinding activity of NSP2 in vitro, its incubation with cognate rotavirus transcripts does not induce a significant change in the SHAPE reactivities. However, a quantitative analysis of mutation rates measured by mutational profiling reveals a global 5-fold rate increase in the presence of NSP2. We demonstrate that the normalization procedure used in deriving SHAPE reactivities from mutation rates can mask an important global effect of an RNA chaperone. Analysis of the mutation rates reveals a larger effect on stems rather than loops. Together, these data provide the first experimentally derived secondary structure model of the rotavirus transcriptome and reveal that NSP2 acts by globally increasing RNA backbone flexibility in a concentration-dependent manner.


Rotavirus , Molecular Chaperones/genetics , Molecular Chaperones/metabolism , Protein Structure, Secondary , RNA, Viral/genetics , RNA, Viral/metabolism , Rotavirus/genetics , Transcriptome/genetics , Viral Nonstructural Proteins/metabolism
7.
Nucleic Acids Res ; 50(17): 9689-9704, 2022 09 23.
Article En | MEDLINE | ID: mdl-36107773

SERPINA1 mRNAs encode the protease inhibitor α-1-antitrypsin and are regulated through post-transcriptional mechanisms. α-1-antitrypsin deficiency leads to chronic obstructive pulmonary disease (COPD) and liver cirrhosis, and specific variants in the 5'-untranslated region (5'-UTR) are associated with COPD. The NM_000295.4 transcript is well expressed and translated in lung and blood and features an extended 5'-UTR that does not contain a competing upstream open reading frame (uORF). We show that the 5'-UTR of NM_000295.4 folds into a well-defined multi-helix structural domain. We systematically destabilized mRNA structure across the NM_000295.4 5'-UTR, and measured changes in (SHAPE quantified) RNA structure and cap-dependent translation relative to a native-sequence reporter. Surprisingly, despite destabilizing local RNA structure, most mutations either had no effect on or decreased translation. Most structure-destabilizing mutations retained native, global 5'-UTR structure. However, those mutations that disrupted the helix that anchors the 5'-UTR domain yielded three groups of non-native structures. Two of these non-native structure groups refolded to create a stable helix near the translation initiation site that decreases translation. Thus, in contrast to the conventional model that RNA structure in 5'-UTRs primarily inhibits translation, complex folding of the NM_000295.4 5'-UTR creates a translation-optimized message by promoting accessibility at the translation initiation site.


Protein Biosynthesis , Pulmonary Disease, Chronic Obstructive , alpha 1-Antitrypsin/genetics , 5' Untranslated Regions , Humans , Protease Inhibitors , Pulmonary Disease, Chronic Obstructive/genetics , RNA, Messenger/metabolism
8.
Elife ; 11, 2022 06 13.
Article En | MEDLINE | ID: mdl-35695373

Splicing is highly regulated and is modulated by numerous factors. Quantitative predictions for how a mutation will affect precursor mRNA (pre-mRNA) structure and downstream function are particularly challenging. Here, we use a novel chemical probing strategy to visualize endogenous precursor and mature MAPT mRNA structures in cells. We used these data to estimate Boltzmann suboptimal structural ensembles, which were then analyzed to predict consequences of mutations on pre-mRNA structure. Further analysis of recent cryo-EM structures of the spliceosome at different stages of the splicing cycle revealed that the footprint of the Bact complex with pre-mRNA best predicted alternative splicing outcomes for exon 10 inclusion of the alternatively spliced MAPT gene, achieving 74% accuracy. We further developed a β-regression weighting framework that incorporates splice site strength, RNA structure, and exonic/intronic splicing regulatory elements capable of predicting, with 90% accuracy, the effects of 47 known and 6 newly discovered mutations on inclusion of exon 10 of MAPT. This combined experimental and computational framework represents a path forward for accurate prediction of splicing-related disease-causing variants.


Alternative Splicing , RNA Precursors , Exons , Introns , Mutation , RNA Precursors/genetics , RNA Precursors/metabolism , RNA Splice Sites , RNA Splicing , RNA, Messenger/genetics
9.
Elife ; 11, 2022 05 06.
Article En | MEDLINE | ID: mdl-35522036

Codon usage bias has long been appreciated to influence protein production. Yet, relatively few studies have analyzed the impacts of codon usage on tissue-specific mRNA and protein expression. Here, we use codon-modified reporters to perform an organism-wide screen in Drosophila melanogaster for distinct tissue responses to codon usage bias. These reporters reveal a cliff-like decline of protein expression near the limit of rare codon usage in endogenously expressed Drosophila genes. Near the edge of this limit, however, we find the testis and brain are uniquely capable of expressing rare codon-enriched reporters. We define a new metric of tissue-specific codon usage, the tissue-apparent Codon Adaptation Index (taCAI), to reveal a conserved enrichment for rare codon usage in the endogenously expressed genes of both Drosophila and human testis. We further demonstrate a role for rare codons in an evolutionarily young testis-specific gene, RpL10Aa. Optimizing RpL10Aa codons disrupts female fertility. Our work highlights distinct responses to rarely used codons in select tissues, revealing a critical role for codon bias in tissue biology.


Drosophila melanogaster , Drosophila , Animals , Codon/genetics , Codon Usage , Drosophila/genetics , Drosophila melanogaster/genetics , Female , Humans , Male , Testis
10.
Nucleic Acids Res ; 50(7): 4068-4082, 2022 04 22.
Article En | MEDLINE | ID: mdl-35380695

Zinc finger protein 36 like 2 (ZFP36L2) is an RNA-binding protein that destabilizes transcripts containing adenine-uridine rich elements (AREs). The overlap between ZFP36L2 targets in different tissues is minimal, suggesting that ZFP36L2-targeting is highly tissue specific. We developed a novel Zfp36l2-lacking mouse model (L2-fKO) to identify factors governing this tissue specificity. We found 549 upregulated genes in the L2-fKO spleen by RNA-seq. These upregulated genes were enriched in ARE motifs in the 3'UTRs, which suggests that they are ZFP36L2 targets; however, the precise sequence requirement for targeting was not evident from motif analysis alone. We therefore used gel-shift mobility assays on 12 novel putative targets and established that ZFP36L2 requires a 7-mer (UAUUUAU) motif to bind. We observed a statistically significant enrichment of 7-mer ARE motifs in upregulated genes and determined that ZFP36L2 targets are enriched for multiple 7-mer motifs. Elavl2 mRNA, which has three 7-mer (UAUUUAU) motifs, was also upregulated in L2-fKO spleens. Overexpression of ZFP36L2, but not a ZFP36L2(C176S) mutant, reduced Elavl2 mRNA expression, suggesting a direct negative effect. Additionally, a reporter assay demonstrated that the ZFP36L2 effect on Elavl2 decay is dependent on the Elavl2-3'UTR and requires the 7-mer AREs. Our data indicate that Elavl2 mRNA is a novel target of ZFP36L2, specific to the spleen. Likely, ZFP36L2 combined with other RNA binding proteins, such as ELAVL2, governs tissue specificity.


ELAV-Like Protein 2 , RNA-Binding Proteins , Tristetraprolin/metabolism , 3' Untranslated Regions/genetics , Animals , Mice , Organ Specificity , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , RNA-Seq
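
The 7-mer ARE motif (UAUUUAU) named in the abstract above can be counted in a 3'UTR sequence with a few lines of Python. This is only a minimal sketch for illustration, not the authors' analysis pipeline: the function name `count_are_7mers`, the choice to count overlapping occurrences, and the acceptance of DNA-alphabet input (T converted to U) are all assumptions made here.

```python
import re

# Core 7-mer ARE motif as stated in the abstract.
ARE_7MER = "UAUUUAU"


def count_are_7mers(utr: str, motif: str = ARE_7MER) -> int:
    """Count occurrences of `motif` in an RNA sequence.

    Hypothetical helper: overlapping matches are counted, and DNA-style
    input is accepted by converting T to U (both are assumptions).
    """
    seq = utr.upper().replace("T", "U")
    # A zero-width lookahead matches at every start position of the motif,
    # so overlapping occurrences are each counted once.
    return len(re.findall(f"(?={re.escape(motif)})", seq))


if __name__ == "__main__":
    # Two overlapping copies of the motif sharing the central "UAU".
    print(count_are_7mers("UAUUUAUUUAU"))
```

Counting overlapping matches matters for AU-rich stretches, where adjacent copies of the motif often share nucleotides; a plain `str.count` would report only non-overlapping hits.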
11.
Hum Genet ; 141(10): 1659-1672, 2022 Oct.
Article En | MEDLINE | ID: mdl-34741198

Disease-associated variants (DAVs) are commonly considered either through a genomic lens that describes variant function at the DNA level, or at the protein function level if the variant is translated. Although the genomic and proteomic effects of variation are well-characterized, disruption of post-transcriptional regulation by genetic variants is another mechanism of disease that remains understudied. Specific RNA sequence motifs mediate post-transcriptional regulation both in the nucleus and cytoplasm of eukaryotic cells, often by binding to RNA-binding proteins or other RNAs. However, many DAVs map far from these motifs, which suggests deeper layers of post-transcriptional mechanistic control. Here, we consider a transcriptomic framework to outline the importance of post-transcriptional regulation as a mechanism of disease-causing single-nucleotide variation in the human genome. We first describe the composition of the human transcriptome and the importance of abundant yet overlooked components such as introns and untranslated regions (UTRs) of messenger RNAs (mRNAs). We present an analysis of Human Gene Mutation Database variants mapping to mRNAs and examine the distribution of causative disease-associated variation across the transcriptome. Although our analysis confirms the importance of post-transcriptional regulatory motifs, a majority of DAVs do not directly map to known regulatory motifs. Therefore, we review evidence that regions outside these well-characterized motifs can regulate function by RNA structure-mediated mechanisms in all four elements of an mRNA: exons, introns, 5' and 3' UTRs. To this end, we review published examples of riboSNitches, which are single-nucleotide variants that result in a change in RNA structure that is causative of the disease phenotype. In this review, we present the current state of knowledge of how DAVs act at the transcriptome level, both through altering post-transcriptional regulatory motifs and by the effects of RNA structure.


Proteomics , RNA-Binding Proteins , 3' Untranslated Regions , Genetic Variation , Humans , Nucleotides/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics
12.
Biophys J ; 121(1): 7-10, 2022 01 04.
Article En | MEDLINE | ID: mdl-34896370

RNA research is advancing at an ever increasing pace. The newest and most state-of-the-art instruments and techniques have made possible the discoveries of new RNAs, and they have carried the field to new frontiers of disease research, vaccine development, therapeutics, and architectonics. Like proteins, RNAs show a marked relationship between structure and function. A deeper grasp of RNAs requires a finer understanding of their elaborate structures. In pursuit of this, cutting-edge experimental and computational structure-probing techniques output several candidate geometries for a given RNA, each of which is perfectly aligned with experimentally determined parameters. Identifying which structure is the most accurate, however, remains a major obstacle. In recent years, several algorithms have been developed for ranking candidate RNA structures in order from most to least probable, though their levels of accuracy and transparency leave room for improvement. Most recently, advances in both areas are demonstrated by rsRNASP, a novel algorithm proposed by Tan et al. rsRNASP is a residue-separation-based statistical potential for three-dimensional structure evaluation, and it outperforms the leading algorithms in the field.


Algorithms , RNA , Nucleic Acid Conformation , Proteins , RNA/chemistry , RNA/genetics , Sequence Analysis, RNA
13.
PLoS Comput Biol ; 17(12): e1009632, 2021 12.
Article En | MEDLINE | ID: mdl-34905538

SHAPE-JuMP is a concise strategy for identifying close-in-space interactions in RNA molecules. Nucleotides in close three-dimensional proximity are crosslinked with a bi-reactive reagent that covalently links the 2'-hydroxyl groups of the ribose moieties. The identities of crosslinked nucleotides are determined using an engineered reverse transcriptase that jumps across crosslinked sites, resulting in a deletion in the cDNA that is detected using massively parallel sequencing. Here we introduce ShapeJumper, a bioinformatics pipeline to process SHAPE-JuMP sequencing data and to accurately identify through-space interactions, as observed in complex JuMP datasets. ShapeJumper identifies proximal interactions with near-nucleotide resolution using an alignment strategy that is optimized to tolerate the unique non-templated reverse-transcription profile of the engineered crosslink-traversing reverse-transcriptase. JuMP-inspired strategies are now poised to replace adapter-ligation for detecting RNA-RNA interactions in most crosslinking experiments.


DNA, Complementary/chemistry , RNA/chemistry , Software , Algorithms , Binding Sites , Computational Biology , Cross-Linking Reagents , DNA, Complementary/genetics , Genetic Engineering , Models, Molecular , Nucleic Acid Conformation , RNA/genetics , Sequence Alignment/statistics & numerical data
14.
Nucleic Acids Res ; 49(21): 12445-12466, 2021 12 02.
Article En | MEDLINE | ID: mdl-34850114

Telomerase is a unique ribonucleoprotein (RNP) reverse transcriptase that utilizes its cognate RNA molecule as a template for telomere DNA repeat synthesis. Telomerase contains the reverse transcriptase protein, TERT, and the template RNA, TR, as its core components. The 5'-half of TR forms a highly conserved catalytic core comprising the template region and adjacent domains necessary for telomere synthesis. However, how telomerase RNA folding takes place in vivo has not been fully understood due to low abundance of the native RNP. Here, using the unicellular pathogen Trypanosoma brucei as a model, we reveal important regional folding information of the native telomerase RNA core domains, i.e. TR template, template boundary element, template proximal helix and Helix IV (eCR4-CR5) domain. For this purpose, we uniquely combined in-cell probing with targeted high-throughput RNA sequencing and mutational mapping under three conditions: in vivo (in WT and TERT-/- cells), in an immunopurified catalytically active telomerase RNP complex and ex vivo (deproteinized). We discover that TR forms at least two different conformers with distinct folding topologies in the insect and mammalian developmental stages of T. brucei. Also, TERT does not significantly affect the RNA folding in vivo, suggesting that the telomerase RNA in T. brucei exists in a conformationally preorganized stable structure. Our observed differences in RNA (TR) folding at two distinct developmental stages of T. brucei suggest that important conformational changes are a key component of T. brucei development.


Catalytic Domain , Protozoan Proteins/genetics , RNA, Protozoan/genetics , RNA/genetics , Telomerase/genetics , Trypanosoma brucei brucei/genetics , Base Sequence , Biocatalysis , Enzyme Assays/methods , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , Mutation , Nucleic Acid Conformation , Protein Binding , Protozoan Proteins/chemistry , Protozoan Proteins/metabolism , RNA/chemistry , RNA/metabolism , RNA Folding , RNA, Protozoan/chemistry , RNA, Protozoan/metabolism , Telomerase/chemistry , Telomerase/metabolism , Thermodynamics , Trypanosoma brucei brucei/metabolism
15.
Genomics ; 113(6): 4184-4195, 2021 11.
Article En | MEDLINE | ID: mdl-34763026

Cigarette smoking induces a profound transcriptomic and systemic inflammatory response. Previous studies have focused on gene level differential expression of smoking, but the genome-wide effects of smoking on alternative isoform regulation have not yet been described. We conducted RNA sequencing in whole-blood samples of 454 current and 767 former smokers in the COPDGene Study, and we analyzed the effects of smoking on differential usage of isoforms and exons. At 10% FDR, we detected 3167 differentially expressed genes, 945 differentially used isoforms and 160 differentially used exons. Isoform switch analysis revealed widespread 3' UTR lengthening associated with cigarette smoking. The lengthening of these 3' UTRs was consistent with alternative usage of distal polyadenylation sites, and these extended 3' UTR regions were significantly enriched with functional sequence elements including microRNA and RNA-protein binding sites. These findings warrant further studies on alternative polyadenylation events as potential biomarkers and novel therapeutic targets for smoking-related diseases.


Cigarette Smoking , Polyadenylation , 3' Untranslated Regions , Cigarette Smoking/adverse effects , Cigarette Smoking/genetics , Protein Isoforms/genetics , Smoking/adverse effects , Smoking/genetics
16.
PLoS Genet ; 17(11): e1009912, 2021 11.
Article En | MEDLINE | ID: mdl-34784346

α1-anti-trypsin (A1AT), encoded by SERPINA1, is a neutrophil elastase inhibitor that controls the inflammatory response in the lung. Severe A1AT deficiency increases risk for Chronic Obstructive Pulmonary Disease (COPD); however, the role of A1AT in COPD in non-deficient individuals is not well known. We identify a 2.1-fold increase (p = 2.5 × 10^-6) in the use of a distal polyadenylation site in primary lung tissue RNA-seq in 82 COPD cases when compared to 64 controls and replicate this in an independent study of 376 COPD and 267 controls. This alternative polyadenylation event involves two sites, a proximal and distal site, 61 and 1683 nucleotides downstream of the A1AT stop codon. To characterize this event, we measured the distal ratio in human primary tissue short read RNA-seq data and corroborated our results with long read RNA-seq data. Integrating these results with 3' end RNA-seq and nanoluciferase reporter assay experiments, we show that use of the distal site yields mRNA transcripts with over 50-fold decreased translation efficiency and A1AT expression. We identified seven RNA binding proteins using enhanced CrossLinking and ImmunoPrecipitation (eCLIP) with one or more binding sites in the SERPINA1 3' UTR. We combined these data with measurements of the distal ratio in shRNA knockdown experiments, nuclear and cytoplasmic fractionation, and chemical RNA structure probing. We identify Quaking Homolog (QKI) as a modulator of SERPINA1 mRNA translation and confirm the role of QKI in SERPINA1 translation with luciferase reporter assays. Analysis of single-cell RNA-seq showed differences in the distribution of the SERPINA1 distal ratio among hepatocytes, macrophages, αβ T cells and plasma cells in the liver. Alveolar type 1 and type 2 cells, dendritic cells and macrophages also vary in their distal ratio in the lung. Our work reveals a complex post-transcriptional mechanism that regulates alternative polyadenylation and A1AT expression in COPD.


Lung/metabolism , Pulmonary Disease, Chronic Obstructive/genetics , alpha 1-Antitrypsin/genetics , Cell Line , Codon, Terminator/genetics , Gene Expression Regulation/genetics , Hepatocytes/metabolism , Humans , Liver/metabolism , Lung/pathology , Macrophages/metabolism , Polyadenylation/genetics , Proteinase Inhibitory Proteins, Secretory/genetics , Proteinase Inhibitory Proteins, Secretory/metabolism , Pulmonary Disease, Chronic Obstructive/pathology , RNA-Seq , Single-Cell Analysis , T-Lymphocytes/metabolism
17.
J Am Chem Soc ; 143(30): 11404-11422, 2021 08 04.
Article En | MEDLINE | ID: mdl-34283611

The SARS-CoV-2 frameshifting RNA element (FSE) is an excellent target for therapeutic intervention against Covid-19. This small gene element employs a shifting mechanism to pause and backtrack the ribosome during translation between Open Reading Frames 1a and 1b, which code for viral polyproteins. Any interference with this process has a profound effect on viral replication and propagation. Pinpointing the structures adapted by the FSE and associated structural transformations involved in frameshifting has been a challenge. Using our graph-theory-based modeling tools for representing RNA secondary structures, "RAG" (RNA-As-Graphs), and chemical structure probing experiments, we show that the 3-stem H-type pseudoknot (3_6 dual graph), long assumed to be the dominant structure, has a viable alternative, an HL-type 3-stem pseudoknot (3_3) for longer constructs. In addition, an unknotted 3-way junction RNA (3_5) emerges as a minor conformation. These three conformations share Stems 1 and 3, while the different Stem 2 may be involved in a conformational switch and possibly associations with the ribosome during translation. For full-length genomes, a stem-loop motif (2_2) may compete with these forms. These structural and mechanistic insights advance our understanding of the SARS-CoV-2 frameshifting process and concomitant virus life cycle, and point to three avenues of therapeutic intervention.


RNA, Viral/chemistry , SARS-CoV-2/chemistry , Base Sequence , Inverted Repeat Sequences , Models, Molecular , Nucleic Acid Conformation , RNA, Viral/genetics
18.
Biochemistry ; 60(25): 1971-1982, 2021 06 29.
Article En | MEDLINE | ID: mdl-34121404

Higher-order structure governs function for many RNAs. However, discerning this structure for large RNA molecules in solution is an unresolved challenge. Here, we present SHAPE-JuMP (selective 2'-hydroxyl acylation analyzed by primer extension and juxtaposed merged pairs) to interrogate through-space RNA tertiary interactions. A bifunctional small molecule is used to chemically link proximal nucleotides in an RNA structure. The RNA cross-link site is then encoded into complementary DNA (cDNA) in a single, direct step using an engineered reverse transcriptase that "jumps" across cross-linked nucleotides. The resulting cDNAs contain a deletion relative to the native RNA sequence, which can be detected by sequencing, that indicates the sites of cross-linked nucleotides. SHAPE-JuMP measures RNA tertiary structure proximity concisely across large RNA molecules at nanometer resolution. SHAPE-JuMP is especially effective at measuring interactions in multihelix junctions and loop-to-helix packing, enables modeling of the global fold for RNAs up to several hundred nucleotides in length, facilitates ranking of structural models by consistency with through-space restraints, and is poised to enable solution-phase structural interrogation and modeling of complex RNAs.


RNA/chemistry , Acylation , Cross-Linking Reagents/chemistry , DNA, Complementary/chemistry , Nucleic Acid Conformation , Oxazines/chemistry , RNA/genetics , RNA-Directed DNA Polymerase/chemistry , RNA-Directed DNA Polymerase/genetics , Sequence Analysis, DNA
19.
bioRxiv ; 2021 Jul 05.
Article En | MEDLINE | ID: mdl-33821274

The SARS-CoV-2 frameshifting RNA element (FSE) is an excellent target for therapeutic intervention against Covid-19. This small gene element employs a shifting mechanism to pause and backtrack the ribosome during translation between Open Reading Frames 1a and 1b, which code for viral polyproteins. Any interference with this process has profound effect on viral replication and propagation. Pinpointing the structures adapted by the FSE and associated structural transformations involved in frameshifting has been a challenge. Using our graph-theory-based modeling tools for representing RNA secondary structures, "RAG" (RNA-As-Graphs), and chemical structure probing experiments, we show that the 3-stem H-type pseudoknot (3_6 dual graph), long assumed to be the dominant structure has a viable alternative, an HL-type 3-stem pseudoknot (3_3) for longer constructs. In addition, an unknotted 3-way junction RNA (3_5) emerges as a minor conformation. These three conformations share Stems 1 and 3, while the different Stem 2 may be involved in a conformational switch and possibly associations with the ribosome during translation. For full-length genomes, a stem-loop motif (2_2) may compete with these forms. These structural and mechanistic insights advance our understanding of the SARS-CoV-2 frameshifting process and concomitant virus life cycle, and point to three avenues of therapeutic intervention.

20.
medRxiv ; 2020 Nov 03.
Article En | MEDLINE | ID: mdl-33173926

Chronic obstructive pulmonary disease (COPD) is a leading cause of death worldwide. Genome-wide association studies (GWAS) have identified over 80 loci that are associated with COPD and emphysema; however, for most of these loci the causal variant and gene are unknown. Here, we utilize lung splice quantitative trait loci (sQTL) data from the Genotype-Tissue Expression project (GTEx) and short read sequencing data from the Lung Tissue Research Consortium (LTRC) to characterize a locus in nephronectin (NPNT) associated with COPD case-control status and lung function. We found that the rs34712979 variant is associated with alternative splice junction use in NPNT, specifically for the junction connecting the 2nd and 4th exons (chr4:105898001-105927336) (p = 4.02 × 10^-38). This association colocalized with GWAS data for COPD and lung spirometry measures with a posterior probability of 94%, indicating that the same causal genetic variants in NPNT underlie the associations with COPD risk, spirometric measures of lung function, and splicing. Investigation of NPNT short read sequencing revealed that rs34712979 creates a cryptic splice acceptor site which results in the inclusion of a 3-nucleotide exon extension, coding for a serine residue near the N-terminus of the protein. Using Oxford Nanopore Technologies (ONT) long read sequencing, we identified 13 NPNT isoforms, 6 of which are predicted to be protein coding. Two of these are full length isoforms which differ only in the 3-nucleotide exon extension whose occurrence differs by genotype. Overall, our data indicate that rs34712979 modulates COPD risk and lung function by creating a novel splice acceptor which results in the inclusion of a 3-nucleotide sequence coding for a serine in the nephronectin protein sequence. Our findings implicate NPNT splicing in contributing to COPD risk, and identify a novel serine insertion in the nephronectin protein that warrants further study.

...