Results 1 - 20 of 2,618
1.
Nat Commun ; 11(1): 527, 2020 Jan 27.
Article in English | MEDLINE | ID: mdl-31988292

ABSTRACT

G-quadruplex (G4) sequences are abundant in untranslated regions (UTRs) of human messenger RNAs, but their functional importance remains unclear. By integrating multiple sources of genetic and genomic data, we show that putative G-quadruplex forming sequences (pG4) in 5' and 3' UTRs are selectively constrained, and enriched for cis-eQTLs and RNA-binding protein (RBP) interactions. Using over 15,000 whole-genome sequences, we find that negative selection acting on central guanines of UTR pG4s is comparable to that of missense variation in protein-coding sequences. At multiple GWAS-implicated SNPs within pG4 UTR sequences, we find robust allelic imbalance in gene expression across diverse tissue contexts in GTEx, suggesting that variants affecting G-quadruplex formation within UTRs may also contribute to phenotypic variation. Our results establish UTR G4s as important cis-regulatory elements and point to a link between disruption of UTR pG4 and disease.
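
As a rough illustration of what scanning a UTR for putative G-quadruplex forming sequences can look like, the sketch below uses a commonly cited sequence pattern: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The abstract does not state which pG4 definition the authors used, so the pattern, the function name `find_pg4`, and the example sequences are all assumptions made for illustration.

```python
import re

# Assumed motif (not taken from the abstract): four G-runs of >= 3 guanines,
# separated by three loops of 1-7 nucleotides each.
PG4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_pg4(seq):
    """Return (start, end, matched_sequence) for each non-overlapping pG4 hit."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PG4_PATTERN.finditer(seq)]


# Hypothetical UTR fragment containing one pG4.
utr = "AAUGGGAGGGCUGGGAAGGGUUU"
print(find_pg4(utr))

# Hypothetical variant: the central guanine of the third G-run is replaced by A,
# so the sequence no longer matches the assumed motif.
variant = "AAUGGGAGGGCUGAGAAGGGUUU"
print(find_pg4(variant))
```

Under this assumed pattern, substituting a central guanine in a three-guanine run removes the match entirely, which gives some intuition for why such positions might be under stronger constraint than loop positions.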


Subjects
G-Quadruplexes, RNA-Binding Proteins/metabolism, Untranslated Regions, Genetic Association Studies, Genetic Variation, Humans, Nucleotide Motifs, RNA Folding, RNA-Binding Proteins/physiology
2.
Comput Biol Chem ; 84: 107171, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31931434

ABSTRACT

Recent advances in high-throughput experimental technologies have generated a huge amount of data on interactions between proteins and nucleic acids. Motivated by the big experimental data, several computational methods have been developed either to predict binding sites in a sequence or to determine if an interaction exists between protein and nucleic acid sequences. However, most of the methods cannot be used to discover new nucleic acid sequences that bind to a target protein because they are classifiers rather than generators. In this paper we propose a generative model for constructing protein-binding RNA sequences and motifs using a long short-term memory (LSTM) neural network. Testing the model for several target proteins showed that RNA sequences generated by the model have high binding affinity and specificity for their target proteins and that the protein-binding motifs derived from the generated RNA sequences are comparable to the motifs from experimentally validated protein-binding RNA sequences. The results are promising and we believe this approach will help design more efficient in vitro or in vivo experiments by suggesting potential RNA aptamers for a target protein.


Subjects
Biological Models, RNA-Binding Proteins/metabolism, RNA/metabolism, Binding Sites, Computational Biology/methods, Nucleotide Motifs
3.
Nucleic Acids Res ; 48(4): 1681-1690, 2020 02 28.
Article in English | MEDLINE | ID: mdl-31950160

ABSTRACT

I-motif DNAs have been widely employed as robust modulating components to construct reconfigurable DNA nanodevices that function well in acidic cellular environments. However, they generally display poor interactivity with fluorescent ligands under these complex conditions, illustrating a major difficulty in utilizing i-motifs as the light-up system for label-free DNA nanoassemblies and bioimaging. Towards addressing this challenge, here we devise new types of i-motif/miniduplex hybrid structures that display an unprecedentedly high interactivity with commonly-used benzothiazole dyes (e.g. thioflavin T). A well-chosen tetranucleotide, whose optimal sequence depends on the used ligand, is appended to the 5'-terminals of diverse i-motifs and forms a minimal parallel duplex thereby creating a preferential site for binding ligands, verified by molecular dynamics simulation. In this way, the fluorescence of ligands can be dramatically enhanced by the i-motif/miniduplex hybrids under complex physiological conditions. This provides a generic light-up system with a high signal-to-background ratio for programmable DNA nanoassemblies, illustrated through utilizing it for a pH-driven framework nucleic acid nanodevice manipulated in acidic cellular membrane microenvironments. It enables label-free fluorescence bioimaging in response to extracellular pH change.


Subjects
Biosensing Techniques, DNA/isolation & purification, Nucleic Acids/genetics, Nucleotide Motifs/genetics, Benzothiazoles/chemistry, DNA/chemistry, Fluorescent Dyes/chemistry, G-Quadruplexes, Humans, Ligands, Molecular Dynamics Simulation, Nucleic Acids/chemistry, Fluorescence Spectrometry
4.
Nucleic Acids Res ; 48(4): 2026-2034, 2020 02 28.
Article in English | MEDLINE | ID: mdl-31943070

ABSTRACT

Type II CRISPR-Cas9 RNA-guided nucleases are widely used for genome engineering. Type II-A SpCas9 protein from Streptococcus pyogenes is the most investigated and highly used enzyme of its class. Nevertheless, it has some drawbacks, including a relatively big size, imperfect specificity and restriction to DNA targets flanked by an NGG PAM sequence. Cas9 orthologs from other bacterial species may provide a rich and largely untapped source of biochemical diversity, which can help to overcome the limitations of SpCas9. Here, we characterize CcCas9, a Type II-C CRISPR nuclease from Clostridium cellulolyticum H10. We show that CcCas9 is an active endonuclease of comparatively small size that recognizes a novel two-nucleotide PAM sequence. The CcCas9 can potentially broaden the existing scope of biotechnological applications of Cas9 nucleases and may be particularly advantageous for genome editing of C. cellulolyticum H10, a bacterium considered to be a promising biofuel producer.


Subjects
CRISPR-Associated Protein 9/chemistry, CRISPR-Cas Systems/genetics, Clostridium cellulolyticum/enzymology, DNA/chemistry, CRISPR-Associated Protein 9/genetics, X-Ray Crystallography, DNA/genetics, Gene Editing, Mutation, Nucleotide Motifs/genetics, Guide RNA/genetics, Streptococcus pyogenes/enzymology, Substrate Specificity
5.
Nucleic Acids Res ; 48(3): 1164-1174, 2020 02 20.
Article in English | MEDLINE | ID: mdl-31889193

ABSTRACT

Solution nuclear magnetic resonance (NMR) experiments allow RNA dynamics to be determined in an aqueous environment. However, when a limited number of peaks are assigned, it is difficult to obtain structural information. Here we show a protocol based on the combination of experimental data (Nuclear Overhauser Effect, NOE) and molecular dynamics simulations with enhanced sampling methods. This protocol makes it possible to (a) obtain a maximum entropy ensemble compatible with NMR restraints and (b) obtain a minimal set of metastable conformations compatible with the experimental data (maximum parsimony). The method is applied to a hairpin of 29 nt from an inverted SINEB2, which is part of the SINEUP family and has been shown to enhance protein translation. A clustering procedure is introduced where the annotation of base-base interactions and glycosidic bond angles is used as a metric. By reweighting the contributions of the clusters, minimal sets of four conformations could be found which are compatible with the experimental data. A motif search on the structural database showed that some identified low-population states are present in experimental structures of other RNA transcripts. The introduced method can be applied to characterize RNA dynamics in systems where a limited amount of NMR information is available.


Subjects
RNA/chemistry, Cluster Analysis, Molecular Dynamics Simulation, Biomolecular Nuclear Magnetic Resonance, Nucleic Acid Conformation, Nucleotide Motifs
7.
Nat Struct Mol Biol ; 26(12): 1114-1122, 2019 12.
Article in English | MEDLINE | ID: mdl-31792448

ABSTRACT

T-box riboswitches are modular bacterial noncoding RNAs that sense and regulate amino acid availability through direct interactions with tRNAs. Between the 5' anticodon-binding stem I domain and the 3' amino acid sensing domains of most T-boxes lies the stem II domain of unknown structure and function. Here, we report a 2.8-Å cocrystal structure of the Nocardia farcinica ileS T-box in complex with its cognate tRNA(Ile). The structure reveals a perpendicularly arranged ultrashort stem I containing a K-turn and an elongated stem II bearing an S-turn. Both stems rest against a compact pseudoknot, dock via an extended ribose zipper and jointly create a binding groove specific to the anticodon of its cognate tRNA. Contrary to proposed distal contacts to the tRNA elbow region, stem II locally reinforces the codon-anticodon interactions between stem I and tRNA, achieving low-nanomolar affinity. This study illustrates how mRNA junctions can create specific binding sites for interacting RNAs of prescribed sequence and structure.


Subjects
Bacterial Proteins/genetics, Bacterial Gene Expression Regulation, Isoleucine-tRNA Ligase/genetics, Nocardia/genetics, Nucleotide Motifs, Bacterial RNA/chemistry, Isoleucine Transfer RNA/chemistry, Riboswitch/genetics, Binding Sites, X-Ray Crystallography, Molecular Models, Bacterial RNA/metabolism, Isoleucine Transfer RNA/metabolism, Structure-Activity Relationship
8.
BMC Bioinformatics ; 20(Suppl 23): 646, 2019 Dec 27.
Article in English | MEDLINE | ID: mdl-31881831

ABSTRACT

BACKGROUND: There are many different types of microRNAs (miRNAs) and elucidating their functions is still under intensive research. A fundamental step in functional annotation of a new miRNA is to classify it into characterized miRNA families, such as those in Rfam and miRBase. With the accumulation of annotated miRNAs, it becomes possible to use deep learning-based models to classify different types of miRNAs. In this work, we investigate several key issues associated with successful application of deep learning models for miRNA classification. First, as secondary structure conservation is a prominent feature for noncoding RNAs including miRNAs, we examine whether secondary structure-based encoding improves classification accuracy. Second, as there are many more non-miRNA sequences than miRNAs, instead of assigning a negative class for all non-miRNA sequences, we test whether using softmax output can distinguish in-distribution and out-of-distribution samples. Finally, we investigate whether deep learning models can correctly classify sequences from small miRNA families. RESULTS: We present our trained convolutional neural network (CNN) models for classifying miRNAs using different types of feature learning and encoding methods. In the first method, we explicitly encode the predicted secondary structure in a matrix. In the second method, we use only the primary sequence information and one-hot encoding matrix. In addition, in order to reject sequences that should not be classified into targeted miRNA families, we use a threshold derived from softmax layer to exclude out-of-distribution sequences, which is an important feature to make this model useful for real transcriptomic data. The comparison with the state-of-the-art ncRNA classification tools such as Infernal shows that our method can achieve comparable sensitivity and accuracy while being significantly faster. 
CONCLUSION: Automatic feature learning in CNN can lead to better classification accuracy and sensitivity for miRNA classification and annotation. The trained models and also associated codes are freely available at https://github.com/HubertTang/DeepMir.
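
The two ideas named in the abstract, one-hot encoding of the primary sequence and rejecting out-of-distribution inputs with a threshold on the softmax output, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the DeepMir implementation: the function names, the `ACGU` alphabet ordering, and the threshold value of 0.9 are hypothetical, and the logits would in practice come from a trained CNN.

```python
import math

ALPHABET = "ACGU"  # assumed base ordering for the encoding matrix


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element rows (one row per base).

    Bases outside ALPHABET (for example N) become an all-zero row.
    """
    return [[1 if base == a else 0 for a in ALPHABET] for base in seq.upper()]


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_rejection(logits, threshold=0.9):
    """Return the index of the top class, or None if its probability is
    below the threshold (treated here as out-of-distribution)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


print(one_hot("ACGU"))
print(classify_with_rejection([10.0, 0.0, 0.0]))  # confident: class 0
print(classify_with_rejection([1.0, 1.0, 1.0]))   # ambiguous: rejected
```

The rejection step avoids training an explicit negative class: a sequence is simply left unassigned when no family receives a sufficiently high probability.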


Subjects
MicroRNAs/genetics, Algorithms, Base Pairing/genetics, Base Sequence, Nucleotide Motifs/genetics, Probability, Transfer RNA/genetics
9.
J Chem Theory Comput ; 15(12): 7004-7014, 2019 Dec 10.
Article in English | MEDLINE | ID: mdl-31670957

ABSTRACT

N6-Methyladenosine (m6A) is the most prevalent chemical modification in human mRNAs. Its recognition by reader proteins enables many cellular functions, including splicing and translation of mRNAs. However, the binding mechanisms of m6A-containing RNAs to their readers are still elusive due to the unclear roles of m6A-flanking ribonucleotides. Here, we use a model system, YTHDC1 with its RNA motif 5'-G(-2)G(-1)(m6A)C(+1)U(+2)-3', to investigate the binding mechanisms by atomistic simulations, X-ray crystallography, and isothermal titration calorimetry. The experimental data and simulation results show that m6A is captured by an aromatic cage of YTHDC1 and the 3' terminus nucleotides are stabilized by cation-π-π interactions, while the 5' terminus remains flexible. Notably, simulations of unbound RNA motifs reveal that the methyl group of m6A and the 5' terminus shift the conformational preferences of the oligoribonucleotide to the bound-like conformation, thereby facilitating the association process. The binding mechanisms may help in the discovery of chemical probes against m6A reader proteins.


Subjects
Nerve Tissue Proteins/chemistry, Nucleotide Motifs, RNA Splicing Factors/chemistry, Messenger RNA/chemistry, Binding Sites, X-Ray Crystallography, Humans, Molecular Models, Nerve Tissue Proteins/isolation & purification, RNA Splicing Factors/isolation & purification
10.
PLoS Pathog ; 15(10): e1008147, 2019 10.
Article in English | MEDLINE | ID: mdl-31644572

ABSTRACT

Potato spindle tuber viroid (PSTVd) is a circular non-coding RNA of 359 nucleotides that replicates and spreads systemically in host plants, thus all functions required to establish an infection are mediated by sequence and structural elements in the genome. The PSTVd secondary structure contains 26 Watson-Crick base-paired stems and 27 loops. Most of the loops are believed to form three-dimensional (3D) structural motifs through non-Watson-Crick base pairing, base stacking, and other local interactions. Homology-based prediction using the JAR3D online program revealed that loop 27 (nucleotides 177-182) most likely forms a 3D structure similar to the loop of a conserved hairpin located in the 3' untranslated region of histone mRNAs in animal cells. This stem-loop, which is involved in 3'-end maturation, is not found in polyadenylated plant histone mRNAs. Mutagenesis showed that PSTVd genomes containing base substitutions in loop 27 predicted by JAR3D to disrupt the 3D structure were unable to replicate in Nicotiana benthamiana leaves following mechanical rub inoculation, with one exception: a U178G/U179G double mutant was replication-competent and able to spread within the upper epidermis of inoculated leaves, but was confined to this cell layer. Remarkably, direct delivery of the U178G/U179G mutant into the vascular system by needle puncture inoculation allowed it to spread systemically and enter mesophyll cells and epidermal cells of upper leaves. These findings highlight the importance of RNA 3D structure for PSTVd replication and intercellular trafficking and indicate that loop 27 is required for epidermal exit, but not epidermal entry or transit between other cell types. Thus, requirements for RNA trafficking between epidermal and underlying palisade mesophyll cells are unique and directional. Our findings further suggest that 3D structure and RNA-protein interactions constrain RNA sequence evolution, and validate JAR3D as a tool to predict RNA 3D structure.


Subjects
Nucleic Acid Conformation, Nucleotide Motifs/genetics, Viral RNA/genetics, Solanum tuberosum/virology, Tobacco/virology, Viroids/genetics, Plant Diseases/virology, Solanum tuberosum/genetics, Tobacco/genetics
11.
Molecules ; 24(19)2019 Oct 08.
Article in English | MEDLINE | ID: mdl-31597270

ABSTRACT

G-quadruplexes (G4s) and i-motifs (iMs) are tetraplex DNA structures. Sequences capable of forming G4/iMs are abundant near the transcription start sites (TSS) of several genes. G4/iMs affect gene expression in vitro. Depending on the gene, the presence of G4/iMs can enhance or suppress expression, making it challenging to discern the underlying mechanism by which they operate. Factors affecting G4/iM structures can provide additional insight into their mechanism of regulation. One such factor is epigenetic modification. The 5-hydroxymethylated cytosines (5hmCs) are epigenetic modifications that occur abundantly in human embryonic stem cells (hESC). The 5hmCs, like G4/iMs, are known to participate in gene regulation and are also enriched near the TSS. We investigated genomic co-localization to assess the possibility that these two elements may play an interdependent role in regulating genes in hESC. Our results indicate that amongst 15,760 G4/iM-forming locations, only 15% have 5hmCs associated with them. A detailed analysis of G4/iM-forming locations enriched in 5hmC indicates that most of these locations are in genes that are associated with cell differentiation, proliferation, apoptosis and embryogenesis. The library generated from our analysis is an important resource for investigators exploring the interdependence of these DNA features in regulating expression of selected genes in hESC.


Subjects
5-Methylcytosine/analogs & derivatives, G-Quadruplexes, Human Embryonic Stem Cells/metabolism, Nanostructures/chemistry, Nucleic Acid Conformation, Nucleotide Motifs, 5-Methylcytosine/chemistry, Base Composition, Cell Differentiation/genetics, Cell Proliferation/genetics, CpG Islands, DNA Methylation, Genetic Epigenesis, Human Embryonic Stem Cells/cytology, Humans, Transcription Initiation Site
12.
Int J Mol Sci ; 20(20)2019 Oct 16.
Article in English | MEDLINE | ID: mdl-31623139

ABSTRACT

The vacuolar H+-ATPase (V-ATPase) plays many important roles in cell growth and in response to stresses in plants. The V-ATPase subunit H (VHA-H) is required to form a stable and active V-ATPase. Genome-wide analyses of VHA-H genes in crops contribute significantly to a systematic understanding of their functions. A total of 22 VHA-H genes were identified from 11 plants representing major crops including cotton, rice, millet, sorghum, rapeseed, maize, wheat, soybean, barley, potato, and beet. All of these VHA-H genes shared exon-intron structures similar to those of Arabidopsis thaliana. The C-terminal domain of VHA-H was shorter and more conserved than the N-terminal domain. The VHA-H gene was effectively used as a genetic marker to infer the phylogenetic relationships among plants, which were congruent with currently accepted taxonomic groupings. The VHA-H genes from six species of crops (Gossypium raimondii, Brassica napus, Glycine max, Solanum tuberosum, Triticum aestivum, and Zea mays) showed high gene structural diversity. This resulted from the gains and losses of introns. Seven VHA-H genes in six species of crops (Gossypium raimondii, Hordeum vulgare, Solanum tuberosum, Setaria italica, Triticum aestivum, and Zea mays) contained multiple transcript isoforms arising from alternative splicing. The study of cis-acting elements of gene promoters and RNA-seq gene expression patterns confirms the role of VHA-H genes as eco-enzymes. The gene structural diversity and proteomic diversity of VHA-H genes in our crop sampling facilitate understanding of their functional diversity, including stress responses and traits important for crop improvement.


Subjects
Agricultural Crops/genetics, Plant Genome, Genome-Wide Association Study, Multigene Family, Protein Subunits/genetics, Vacuolar Proton-Translocating ATPases/genetics, Alternative Splicing, Amino Acid Sequence, Agricultural Crops/classification, Genomics/methods, Nucleotide Motifs, Phylogeny, Genetic Promoter Regions, Vacuolar Proton-Translocating ATPases/chemistry, Vacuolar Proton-Translocating ATPases/metabolism
13.
PLoS Genet ; 15(10): e1008279, 2019 10.
Article in English | MEDLINE | ID: mdl-31603892

ABSTRACT

Muscle development and lipid accumulation in muscle critically affect meat quality of livestock. However, the genetic factors underlying myofiber-type specification and intramuscular fat (IMF) accumulation remain to be elucidated. Using two independent intercrosses between Western commercial breeds and Korean native pigs (KNPs) and a joint linkage-linkage disequilibrium analysis, we identified a 488.1-kb region on porcine chromosome 12 that affects both reddish meat color (a*) and IMF. In this critical region, only the MYH3 gene, encoding myosin heavy chain 3, was found to be preferentially overexpressed in the skeletal muscle of KNPs. Subsequently, MYH3-transgenic mice demonstrated that this gene controls both myofiber-type specification and adipogenesis in skeletal muscle. We discovered a structural variant in the promoter/regulatory region of MYH3 for which Q allele carriers exhibited significantly higher values of a* and IMF than q allele carriers. Furthermore, chromatin immunoprecipitation and cotransfection assays showed that the structural variant in the 5'-flanking region of MYH3 abrogated the binding of the myogenic regulatory factors (MYF5, MYOD, MYOG, and MRF4). The allele distribution of MYH3 among pig populations worldwide indicated that the MYH3 Q allele is of Asian origin and likely predates domestication. In conclusion, we identified a functional regulatory sequence variant in porcine MYH3 that provides novel insights into the genetic basis of the regulation of myofiber type ratios and associated changes in IMF in pigs. The MYH3 variant can play an important role in improving pork quality in current breeding programs.


Subjects
Adipogenesis/genetics, Cytoskeletal Proteins/genetics, Skeletal Muscle Fibers/metabolism, Skeletal Muscle/growth & development, Myosins/genetics, Adipose Tissue/growth & development, Adipose Tissue/metabolism, Animals, Breeding, Gene Expression Regulation, Genome-Wide Association Study, Genotype, Meat, Mice, Transgenic Mice, Skeletal Muscle/metabolism, Myosin Heavy Chains/genetics, Nucleotide Motifs, Sus scrofa/genetics, Sus scrofa/metabolism, Swine
14.
PLoS Biol ; 17(10): e3000496, 2019 10.
Article in English | MEDLINE | ID: mdl-31603896

ABSTRACT

Clustered regularly interspaced short palindromic repeats (CRISPR)-Cas systems have been harnessed as powerful genome editing tools in diverse organisms. However, the off-target effects and the protospacer adjacent motif (PAM) compatibility restrict the therapeutic applications of these systems. Recently, a Streptococcus pyogenes Cas9 (SpCas9) variant, xCas9, was evolved to possess both broad PAM compatibility and high DNA fidelity. Through determination of multiple xCas9 structures, which are all in complex with single-guide RNA (sgRNA) and double-stranded DNA containing different PAM sequences (TGG, CGG, TGA, and TGC), we decipher the molecular mechanisms of the PAM expansion and fidelity enhancement of xCas9. xCas9 follows a unique two-mode PAM recognition mechanism. For non-NGG PAM recognition, xCas9 triggers a notable structural rearrangement in the DNA recognition domains and a rotation in the key PAM-interacting residue R1335; such mechanism has not been observed in the wild-type (WT) SpCas9. For NGG PAM recognition, xCas9 applies a strategy similar to WT SpCas9. Moreover, biochemical and cell-based genome editing experiments pinpointed the critical roles of the E1219V mutation for PAM expansion and the R324L, S409I, and M694I mutations for fidelity enhancement. The molecular-level characterizations of the xCas9 nuclease provide critical insights into the mechanisms of the PAM expansion and fidelity enhancement of xCas9 and could further facilitate the engineering of SpCas9 and other Cas9 orthologs.
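
To make the NGG versus non-NGG distinction concrete, the short sketch below sorts the four PAM sequences named in the abstract by whether they fit the NGG pattern (any base followed by two guanines). It only classifies by sequence; the helper name `matches_ngg` is hypothetical, and nothing here models the structural mechanism itself.

```python
def matches_ngg(pam):
    """True if a 3-nt PAM fits the NGG pattern: any base, then G, then G."""
    pam = pam.upper()
    return len(pam) == 3 and pam[1:] == "GG"


# PAM sequences listed in the abstract for the solved xCas9 complexes.
pams = ["TGG", "CGG", "TGA", "TGC"]

ngg = [p for p in pams if matches_ngg(p)]
non_ngg = [p for p in pams if not matches_ngg(p)]

print("NGG PAMs:    ", ngg)
print("non-NGG PAMs:", non_ngg)
```

According to the abstract, the first group is recognized in a way similar to wild-type SpCas9, while the second group involves the structural rearrangement and the rotation of R1335.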


Subjects
CRISPR-Associated Protein 9/genetics, CRISPR-Cas Systems, Clustered Regularly Interspaced Short Palindromic Repeats, DNA/genetics, Guide RNA/genetics, Amino Acid Substitution, CRISPR-Associated Protein 9/chemistry, CRISPR-Associated Protein 9/metabolism, Molecular Cloning, X-Ray Crystallography, DNA/chemistry, DNA/metabolism, Escherichia coli/genetics, Escherichia coli/metabolism, Gene Editing, Gene Expression, Genetic Vectors/chemistry, Genetic Vectors/metabolism, Isoenzymes/chemistry, Isoenzymes/genetics, Isoenzymes/metabolism, Klebsiella pneumoniae/genetics, Klebsiella pneumoniae/metabolism, Molecular Models, Site-Directed Mutagenesis/methods, Mutation, Nucleotide Motifs, Protein Binding, Protein Engineering/methods, Guide RNA/chemistry, Guide RNA/metabolism, Recombinant Proteins/chemistry, Recombinant Proteins/genetics, Recombinant Proteins/metabolism, Streptococcus pyogenes/genetics, Streptococcus pyogenes/metabolism
15.
Nat Commun ; 10(1): 4327, 2019 09 23.
Article in English | MEDLINE | ID: mdl-31548547

ABSTRACT

Synthetic RNA-based genetic devices dynamically control a wide range of gene-regulatory processes across diverse cell types. However, the limited throughput of quantitative assays in mammalian cells has hindered fast iteration and interrogation of sequence space needed to identify new RNA devices. Here we report developing a quantitative, rapid and high-throughput mammalian cell-based RNA-Seq assay to efficiently engineer RNA devices. We identify new ribozyme-based RNA devices that respond to theophylline, hypoxanthine, cyclic-di-GMP, and folinic acid from libraries of ~22,700 sequences in total. The small molecule responsive devices exhibit low basal expression and high activation ratios, significantly expanding our toolset of highly functional ribozyme switches. The large datasets obtained further provide conserved sequence and structure motifs that may be used for rationally guided design. The RNA-Seq approach offers a generally applicable strategy for developing broad classes of RNA devices, thereby advancing the engineering of genetic devices for mammalian systems.


Subjects
Mammals/genetics, Catalytic RNA/chemistry, Synthetic Biology/methods, Animals, Gene Regulatory Networks, Genetic Engineering, HEK293 Cells, Humans, Nucleotide Motifs, Catalytic RNA/metabolism, Catalytic RNA/physiology
16.
Nucleic Acids Res ; 47(17): 8950-8960, 2019 09 26.
Article in English | MEDLINE | ID: mdl-31504757

ABSTRACT

Template-directed RNA ligation catalyzed by an RNA enzyme (ribozyme) is a plausible and important reaction that could have been involved in transferring genetic information during prebiotic evolution. Laboratory evolution experiments have yielded several classes of ligase ribozymes, but their minimal sequence requirements remain largely unexplored. Because selection experiments strongly favor highly active sequences, less active but smaller catalytic motifs may have been overlooked in these experiments. We used large-scale DNA synthesis and high-throughput ribozyme assay enabled by deep sequencing to systematically minimize a previously laboratory-evolved ligase ribozyme. After designing and evaluating >10 000 sequences, we identified catalytic cores as small as 18 contiguous bases that catalyze template-directed regiospecific RNA ligation. The fact that such a short sequence can catalyze this critical reaction suggests that similarly simple or even simpler motifs may populate the RNA sequence space which could have been accessible to the prebiotic ribozymes.


Subjects
Directed Molecular Evolution, RNA Ligase (ATP)/chemistry, RNA Ligase (ATP)/genetics, Catalytic RNA/chemistry, Catalytic RNA/genetics, Catalysis, Catalytic Domain, DNA/biosynthesis, High-Throughput Nucleotide Sequencing, Molecular Models, Nucleotide Motifs, RNA/genetics, RNA Ligase (ATP)/metabolism, Catalytic RNA/metabolism, Substrate Specificity
17.
Nucleic Acids Res ; 47(18): 9480-9494, 2019 10 10.
Article in English | MEDLINE | ID: mdl-31504786

ABSTRACT

Small endonucleolytic ribozymes promote the self-cleavage of their own phosphodiester backbone at a specific linkage. The structures of and the reactions catalysed by members of individual families have been studied in great detail in the past decades. In recent years, bioinformatics studies have uncovered a considerable number of new examples of known catalytic RNA motifs. Importantly, entirely novel ribozyme classes were also discovered, for most of which both structural and biochemical information became rapidly available. However, for the majority of the new ribozymes, which are found in the genomes of a variety of species, a biological function remains elusive. Here, we concentrate on the different approaches to find catalytic RNA motifs in sequence databases. We summarize the emerging principles of RNA catalysis as observed for small endonucleolytic ribozymes. Finally, we address the biological functions of those ribozymes, where relevant information is available and common themes on their cellular activities are emerging. We conclude by speculating on the possibility that the identification and characterization of proteins that we hypothesize to be endogenously associated with catalytic RNA might help in answering the ever-present question of the biological function of the growing number of genomically encoded, small endonucleolytic ribozymes.


Subjects
Computational Biology/methods, Nucleotide Motifs/genetics, Catalytic RNA/genetics, RNA Sequence Analysis/methods, Catalysis, Molecular Models, Nucleic Acid Conformation, Catalytic RNA/chemistry, Catalytic RNA/isolation & purification
18.
RNA ; 25(12): 1714-1730, 2019 12.
Article in English | MEDLINE | ID: mdl-31506380

ABSTRACT

The origin of the genetic code remains enigmatic five decades after it was elucidated, although there is growing evidence that the code coevolved progressively with the ribosome. A number of primordial codes were proposed as ancestors of the modern genetic code, including comma-free codes such as the RRY, RNY, or GNC codes (R = G or A, Y = C or T, N = any nucleotide), and the X circular code, an error-correcting code that also allows identification and maintenance of the reading frame. It was demonstrated previously that motifs of the X circular code are significantly enriched in the protein-coding genes of most organisms, from bacteria to eukaryotes. Here, we show that imprints of this code also exist in the ribosomal RNA (rRNA). In a large-scale study involving 133 organisms representative of the three domains of life, we identified 32 universal X motifs that are conserved in the rRNA of >90% of the organisms. Intriguingly, most of the universal X motifs are located in rRNA regions involved in important ribosome functions, notably in the peptidyl transferase center and the decoding center that form the original "proto-ribosome." Building on the existing accretion models for ribosome evolution, we propose that error-correcting circular codes represented an important step in the emergence of the modern genetic code. Thus, circular codes would have allowed the simultaneous coding of amino acids and synchronization of the reading frame in primitive translation systems, prior to the emergence of more sophisticated start codon recognition and translation initiation mechanisms.
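
The degenerate-base notation used for the RRY, RNY, and GNC codes (R = G or A, Y = C or T, N = any nucleotide) can be expanded into explicit codon sets with a small helper. This is only an illustration of the notation given in the abstract; the function name `expand` is hypothetical, and the sketch does not attempt to reproduce the X circular code, whose codon set is not listed in the abstract.

```python
from itertools import product

# Degenerate symbols as defined in the abstract, plus the four plain bases.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "GA",    # purine: G or A
    "Y": "CT",    # pyrimidine: C or T
    "N": "ACGT",  # any nucleotide
}


def expand(pattern):
    """List every concrete codon matching a degenerate pattern such as 'RNY'."""
    return ["".join(bases) for bases in product(*(SYMBOLS[s] for s in pattern))]


for code in ("RRY", "RNY", "GNC"):
    codons = expand(code)
    print(code, len(codons), codons)
```

The counts follow directly from the definitions: RRY gives 2 x 2 x 2 = 8 codons, RNY gives 2 x 4 x 2 = 16, and GNC gives 4.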


Subjects
Molecular Evolution, Genetic Code, Nucleotide Motifs, Protein Biosynthesis, Ribosomes/genetics, Ribosomes/metabolism, Biological Models, Molecular Models, Molecular Conformation, Nucleic Acid Conformation, Ribosomal RNA/chemistry, Ribosomal RNA/genetics, Ribosomes/chemistry, Structure-Activity Relationship
19.
BMC Genomics ; 20(1): 718, 2019 Sep 18.
Article in English | MEDLINE | ID: mdl-31533632

ABSTRACT

BACKGROUND: The work of the FANTOM5 Consortium has brought forth a new level of understanding of the regulation of gene transcription and the cellular processes involved in creating diversity of cell types. In this study, we extended the analysis of the FANTOM5 Cap Analysis of Gene Expression (CAGE) transcriptome data to focus on understanding the genetic regulators involved in mouse cerebellar development. RESULTS: We used the HeliScopeCAGE library sequencing on cerebellar samples over 8 embryonic and 4 early postnatal times. This study showcases temporal expression pattern changes during cerebellar development. Through a bioinformatics analysis that focused on transcription factors, their promoters and binding sites, we identified genes that appear as strong candidates for involvement in cerebellar development. We selected several candidate transcriptional regulators for validation experiments including qRT-PCR and shRNA transcript knockdown. We observed marked and reproducible developmental defects in Atf4, Rfx3, and Scrt2 knockdown embryos, which support the role of these genes in cerebellar development. CONCLUSIONS: The successful identification of these novel gene regulators in cerebellar development demonstrates that the FANTOM5 cerebellum time series is a high-quality transcriptome database for functional investigation of gene regulatory networks in cerebellar development.


Subjects
Cerebellum/growth & development, Gene Expression Profiling, Nucleotide Motifs/genetics, Genetic Transcription/genetics, Activating Transcription Factor 4/deficiency, Activating Transcription Factor 4/genetics, Activating Transcription Factor 4/metabolism, Animals, Cerebellum/embryology, Cerebellum/metabolism, Developmental Gene Expression Regulation, Gene Knockdown Techniques, Mice, Inbred C57BL Mice, Genetic Promoter Regions/genetics, Regulatory Factor X Transcription Factors/deficiency, Regulatory Factor X Transcription Factors/genetics, Regulatory Factor X Transcription Factors/metabolism, Transcription Factors/deficiency, Transcription Factors/genetics, Transcription Factors/metabolism
20.
Phys Chem Chem Phys ; 21(38): 21549-21560, 2019 Oct 02.
Article in English | MEDLINE | ID: mdl-31536074

ABSTRACT

Repetitive cytosine rich i-motif forming sequences are abundant in the telomere, centromere and promoters of several oncogenes and in some instances are known to regulate transcription and gene expression. The in vivo existence of i-motif structures demands further insight into the factors affecting their formation and stability and development of better understanding of their gene regulatory functions. Most prior studies characterizing the conformational dynamics of i-motifs are based on i-motif forming synthetic constructs. Here, we present a systematic study on the stability and structural properties of biologically relevant i-motifs of telomeric and centromeric repeat fragments. Our results based on molecular dynamics simulations and quantum chemical calculations indicate that along with base pairing interactions within the i-motif core the overall folded conformation is associated with the stable C-H···O sugar "zippers" in the narrow grooves and structured water molecules along the wide grooves. The stacked geometry of the hemi-protonated cytosine pairs within the i-motif core is mainly governed by the repulsive base stacking interaction. The loop sequence can affect the structural dynamics of the i-motif by altering the loop motion and backbone conformation. Overall this study provides microscopic insight into the i-motif structure that will be helpful to understand the structural aspect of mechanisms of gene regulation by i-motif DNA.


Subjects
DNA/chemistry, Intercalating Agents/chemistry, Nucleotide Motifs, Solvents/chemistry, Telomere/chemistry, Base Pairing, Cytosine/chemistry, Hydrogen Bonding, Molecular Dynamics Simulation