1.
bioRxiv ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38617247

ABSTRACT

Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. While advances in deep learning enable the prediction of accurate protein structural models, RNA structure prediction is not possible at present due to a lack of abundant high-quality reference data. Furthermore, available sequence data are generally not associated with organismal phenotypes that could inform RNA function. We created GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB). GARNET links RNA sequences derived from GTDB genomes to experimental and predicted optimal growth temperatures of GTDB reference organisms. This enables construction of deep and diverse RNA sequence alignments to be used for machine learning. Using GARNET, we define the minimal requirements for a sequence- and structure-aware RNA generative model. We also develop a GPT-like language model for RNA in which triplet tokenization provides optimal encoding. Leveraging hyperthermophilic RNAs in GARNET and these RNA generative models, we identified mutations in ribosomal RNA that confer increased thermostability to the Escherichia coli ribosome. The GTDB-derived data and deep learning models presented here provide a foundation for understanding the connections between RNA sequence, structure, and function.
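
The triplet tokenization mentioned in the abstract above can be illustrated with a minimal sketch. It assumes non-overlapping triplets over the alphabet A/C/G/U, a hypothetical padding symbol `N` for a trailing partial triplet, and a hypothetical out-of-vocabulary id of 64; the abstract does not specify the authors' actual vocabulary or handling of special tokens.

```python
from itertools import product

ALPHABET = "ACGU"
PAD = "N"      # hypothetical padding symbol for a trailing partial triplet
UNK_ID = 64    # hypothetical id for any triplet outside the 64-entry vocabulary


def build_vocab():
    """Map each of the 64 triplets over A/C/G/U to an integer id."""
    triplets = ("".join(p) for p in product(ALPHABET, repeat=3))
    return {t: i for i, t in enumerate(triplets)}


def tokenize_triplets(seq, vocab):
    """Split an RNA (or DNA) sequence into non-overlapping triplet token ids."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    remainder = len(seq) % 3
    if remainder:
        seq += PAD * (3 - remainder)     # pad so the length is a multiple of 3
    tokens = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return [vocab.get(t, UNK_ID) for t in tokens]


vocab = build_vocab()
ids = tokenize_triplets("AAAUUU", vocab)
```

Compared with single-nucleotide tokens, triplets shorten the token sequence threefold at the cost of a larger vocabulary (64 entries instead of 4).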

2.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36511586

ABSTRACT

SUMMARY: Codetta is a Python program for predicting the genetic code table of an organism from nucleotide sequences. Codetta can analyze an arbitrary nucleotide sequence and needs no sequence annotation or taxonomic placement. The most likely amino acid decoding for each of the 64 codons is inferred from alignments of profile hidden Markov models of conserved proteins to the input sequence. AVAILABILITY AND IMPLEMENTATION: Codetta 2.0 is implemented as a Python 3 program for MacOS and Linux and is available from http://eddylab.org/software/codetta/codetta2.tar.gz and at http://github.com/kshulgina/codetta. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genetic Code, Software, Base Sequence
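
The idea of inferring a codon's decoding from alignments to conserved proteins can be sketched in simplified form. The example below is not Codetta's actual method or API: it replaces the profile-HMM-based inference with a plain majority vote over hypothetical (codon, aligned amino acid) pairs, and the function name and thresholds (`infer_code`, `min_support`, `min_fraction`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def infer_code(aligned_pairs, min_support=3, min_fraction=0.8):
    """Majority-vote stand-in for codon decoding inference.

    aligned_pairs: iterable of (codon, amino_acid) observations, e.g. taken
    from positions where a conserved protein model aligns to the input DNA.
    Returns a dict codon -> amino acid, or '?' when evidence is too weak.
    """
    counts = defaultdict(Counter)
    for codon, aa in aligned_pairs:
        counts[codon.upper()][aa] += 1

    code = {}
    for codon, tally in counts.items():
        best_aa, best_n = tally.most_common(1)[0]
        total = sum(tally.values())
        # Call a decoding only with enough observations and a clear majority.
        if total >= min_support and best_n / total >= min_fraction:
            code[codon] = best_aa
        else:
            code[codon] = "?"
    return code


# Hypothetical observations: AGG mostly aligned to methionine (M).
pairs = [("AGG", "M")] * 9 + [("AGG", "R")] + [("CGA", "R")] * 2
code = infer_code(pairs)
```

Leaving weakly supported codons as `?` rather than forcing a call reflects that rare codons may offer too little evidence in any single genome.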
3.
Science ; 376(6593): 630-635, 2022 05 06.
Article in English | MEDLINE | ID: mdl-35511982

ABSTRACT

Epistasis can markedly affect evolutionary trajectories. In recent decades, protein-level fitness landscapes have revealed extensive idiosyncratic epistasis among specific mutations. By contrast, other work has found ubiquitous and apparently nonspecific patterns of global diminishing-returns and increasing-costs epistasis among mutations across the genome. Here, we used a hierarchical CRISPR gene drive system to construct all combinations of 10 missense mutations from across the genome in budding yeast and measured their fitness in six environments. We show that the resulting fitness landscapes exhibit global fitness-correlated trends but that these trends emerge from specific idiosyncratic interactions. We thus provide experimental validation of recent theoretical work arguing that fitness-correlated trends can emerge as the generic consequence of idiosyncratic epistasis.


Subject(s)
Biological Evolution, Genetic Epistasis, Genetic Fitness, Genetic Models, Mutation
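
To make the scale of a complete combinatorial landscape concrete: all combinations of 10 biallelic mutations give 2^10 = 1024 genotypes. The sketch below enumerates them and computes pairwise epistasis on a given background, using the common convention that epistasis is the deviation from additivity of (log) fitness. That convention and the toy fitness values are assumptions for illustration, not the paper's data or necessarily its exact definition.

```python
from itertools import product


def all_genotypes(n_loci):
    """Enumerate every combination of n biallelic mutations (0 = WT, 1 = mutant)."""
    return list(product((0, 1), repeat=n_loci))


def pairwise_epistasis(fitness, background, i, j):
    """Deviation from additivity for loci i and j on a fixed background.

    fitness: dict mapping genotype tuples to (log) fitness.
    Returns f_11 - f_10 - f_01 + f_00, with other loci held at `background`.
    """
    def with_alleles(a, b):
        g = list(background)
        g[i], g[j] = a, b
        return tuple(g)

    return (fitness[with_alleles(1, 1)] - fitness[with_alleles(1, 0)]
            - fitness[with_alleles(0, 1)] + fitness[with_alleles(0, 0)])


genotypes = all_genotypes(10)  # 1024 genotypes

# Toy two-locus landscape with diminishing returns (hypothetical values).
toy = {(0, 0): 0.0, (1, 0): 0.1, (0, 1): 0.2, (1, 1): 0.25}
eps = pairwise_epistasis(toy, background=(0, 0), i=0, j=1)
```

A negative value here means the double mutant is less fit than the sum of the single-mutant effects would predict.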
4.
Elife ; 10, 2021 11 09.
Article in English | MEDLINE | ID: mdl-34751130

ABSTRACT

The genetic code has been proposed to be a 'frozen accident,' but the discovery of alternative genetic codes over the past four decades has shown that it can evolve to some degree. Since most examples were found anecdotally, it is difficult to draw general conclusions about the evolutionary trajectories of codon reassignment and why some codons are affected more frequently. To fill in the diversity of genetic codes, we developed Codetta, a computational method to predict the amino acid decoding of each codon from nucleotide sequence data. We surveyed the genetic code usage of over 250,000 bacterial and archaeal genome sequences in GenBank and discovered five new reassignments of arginine codons (AGG, CGA, and CGG), representing the first sense codon changes in bacteria. In a clade of uncultivated Bacilli, the reassignment of AGG to become the dominant methionine codon likely evolved by a change in the amino acid charging of an arginine tRNA. The reassignments of CGA and/or CGG were found in genomes with low GC content, an evolutionary force that likely helped drive these codons to low frequency and enable their reassignment.


All life forms rely on a 'code' to translate their genetic information into proteins. This code relies on limited permutations of three nucleotides, the building blocks that form DNA and other types of genetic information. Each 'triplet' of nucleotides, or codon, encodes a specific amino acid, the basic component of proteins. Reading the sequence of codons in the right order will let the cell know which amino acid to assemble next on a growing protein. For instance, the codon CGG, formed of the nucleotides guanine (G) and cytosine (C), codes for the amino acid arginine. From bacteria to humans, most life forms rely on the same genetic code. Yet certain organisms have evolved to use slightly different codes, where one or several codons have an altered meaning. To better understand how alternative genetic codes have evolved, Shulgina and Eddy set out to find more organisms featuring these altered codons, creating a new software called Codetta that can analyze the genome of a microorganism and predict the genetic code it uses. Codetta was then used to sift through the genetic information of 250,000 microorganisms. This was made possible by the sequencing, in recent years, of the genomes of hundreds of thousands of bacteria and other microorganisms, including many never studied before. These analyses revealed five groups of bacteria with alternative genetic codes, all of which had changes in the codons that code for arginine. Amongst these, four had genomes with a low proportion of guanine and cytosine nucleotides. This may have made some guanine and cytosine-rich arginine codons very rare in these organisms and, therefore, easier to be reassigned to encode another amino acid. The work by Shulgina and Eddy demonstrates that Codetta is a new, useful tool that scientists can use to understand how genetic codes evolve. In addition, it can also help to ensure the accuracy of widely used protein databases, which assume which genetic code organisms use to predict protein sequences from their genomes.


Subject(s)
Computational Biology/methods, Molecular Evolution, Genetic Code, Genetic Techniques/instrumentation, Archaeal Genome, Bacterial Genome, Codon/genetics
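
The link drawn above between low GC content and rare GC-rich arginine codons can be checked on any coding sequence with two small measurements. This is a minimal sketch under stated assumptions: the input is a single in-frame coding sequence, and the function names and example sequence are hypothetical and unrelated to the Codetta software.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C nucleotides in a sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(seq.count(base) for base in "GC") / len(seq)


def codon_frequencies(cds):
    """Relative frequency of each in-frame codon; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return {}
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


# Hypothetical AT-rich coding fragment containing a single CGG codon.
cds = "ATGAAAAAACGG"
gc = gc_content(cds)
freqs = codon_frequencies(cds)
cgg_share = freqs.get("CGG", 0.0)
```

Applied genome-wide, a low `gc_content` together with a near-zero share for CGA or CGG would be consistent with the pattern the abstract describes.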
5.
Elife ; 8, 2019 06 21.
Article in English | MEDLINE | ID: mdl-31223115

ABSTRACT

Developmental enhancers integrate graded concentrations of transcription factors (TFs) to create sharp gene expression boundaries. Here we examine the hunchback P2 (HbP2) enhancer, which drives a sharp expression pattern in the Drosophila blastoderm embryo in response to the transcriptional activator Bicoid (Bcd). We systematically interrogate cis and trans factors that influence the shape and position of expression driven by HbP2, and find that the prevailing model, based on pairwise cooperative binding of Bcd to HbP2, is not adequate. We demonstrate that other proteins, such as pioneer factors, Mediator and histone modifiers, influence the shape and position of the HbP2 expression pattern. Comparing our results to theory reveals how higher-order cooperativity and energy expenditure impact boundary location and sharpness. Our results emphasize that the bacterial view of transcription regulation, where pairwise interactions between regulatory proteins dominate, must be reexamined in animals, where multiple molecular mechanisms collaborate to shape the gene regulatory function.


Subject(s)
DNA-Binding Proteins/metabolism, Drosophila Proteins/metabolism, Drosophila, Developmental Gene Expression Regulation, Homeodomain Proteins/metabolism, Trans-Activators/metabolism, Transcription Factors/metabolism, Animals, Gene Expression Profiling, Genetic Models, Genetic Transcription
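
How cooperativity relates to boundary sharpness and position can be illustrated with a generic textbook model: an exponentially decaying activator gradient read out by a Hill function. This is not the model analyzed in the paper; the functional forms, parameter names, and default values below are assumptions chosen for illustration.

```python
import math


def bcd_concentration(x, c0=1.0, decay_length=0.2):
    """Exponential activator gradient along the axis (x in fractional embryo length)."""
    return c0 * math.exp(-x / decay_length)


def hill_response(c, K=0.1, n=5):
    """Hill-function readout: fraction of maximal expression at concentration c."""
    return c**n / (K**n + c**n)


def boundary_position(c0=1.0, K=0.1, decay_length=0.2):
    """Position where expression is half-maximal, i.e. where c(x) equals K."""
    return decay_length * math.log(c0 / K)


x_half = boundary_position()
# Just above the threshold concentration, a higher Hill coefficient is closer to fully on.
shallow = hill_response(0.12, n=2)
steep = hill_response(0.12, n=10)
```

In this toy model the boundary position depends only on `c0`, `K`, and `decay_length`, while the Hill coefficient `n` controls how sharply expression switches around that position.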
6.
Curr Genet ; 64(2): 327-333, 2018 Apr.
Article in English | MEDLINE | ID: mdl-28983660

ABSTRACT

Full genome recoding, or rewriting codon meaning, through chemical synthesis of entire bacterial chromosomes has become feasible in the past several years. Recoding an organism can impart new properties including non-natural amino acid incorporation, virus resistance, and biocontainment. The estimated cost of construction that includes DNA synthesis, assembly by recombination, and troubleshooting, is now comparable to costs of early stage development of drugs or other high-tech products. Here, we discuss several recently published assembly methods and provide some thoughts on the future, including how synthetic efforts might benefit from the analysis of natural recoding processes and organisms that use alternative genetic codes.


Subject(s)
DNA/biosynthesis, Molecular Evolution, Synthetic Genes/genetics, Genetic Code/genetics, Codon/genetics, DNA/genetics, Escherichia coli/genetics, Genetic Engineering, Bacterial Genome/genetics