Results 1 - 6 of 6
1.
bioRxiv ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38617247

ABSTRACT

Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. While advances in deep learning enable the prediction of accurate protein structural models, RNA structure prediction is not possible at present due to a lack of abundant high-quality reference data. Furthermore, available sequence data are generally not associated with organismal phenotypes that could inform RNA function. We created GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB). GARNET links RNA sequences derived from GTDB genomes to experimental and predicted optimal growth temperatures of GTDB reference organisms. This enables construction of deep and diverse RNA sequence alignments to be used for machine learning. Using GARNET, we define the minimal requirements for a sequence- and structure-aware RNA generative model. We also develop a GPT-like language model for RNA in which triplet tokenization provides optimal encoding. Leveraging hyperthermophilic RNAs in GARNET and these RNA generative models, we identified mutations in ribosomal RNA that confer increased thermostability to the Escherichia coli ribosome. The GTDB-derived data and deep learning models presented here provide a foundation for understanding the connections between RNA sequence, structure, and function.
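As a rough illustration of the triplet tokenization mentioned in the abstract, the sketch below splits an RNA sequence into non-overlapping three-nucleotide tokens and maps each to an integer ID. The function names, the use of non-overlapping windows, and the handling of incomplete trailing chunks are assumptions made for illustration; the abstract does not describe how the authors' tokenizer treats these details.

```python
from itertools import product

NUCLEOTIDES = "ACGU"


def build_vocab():
    """Map each of the 64 possible RNA triplets to an integer ID (hypothetical scheme)."""
    triplets = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]
    return {triplet: idx for idx, triplet in enumerate(triplets)}


def tokenize_triplets(seq, vocab):
    """Split a sequence into non-overlapping triplets and return their IDs.

    Chunks that are incomplete or contain unexpected symbols are mapped to a
    single 'unknown' ID equal to len(vocab). This is an illustrative choice.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    unk_id = len(vocab)
    return [vocab.get(seq[i:i + 3], unk_id) for i in range(0, len(seq), 3)]


vocab = build_vocab()
token_ids = tokenize_triplets("ACGUUU", vocab)
```

Compared with single-nucleotide tokens, a triplet vocabulary shortens the token sequence by a factor of three at the cost of a 64-entry vocabulary.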

2.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36511586

ABSTRACT

SUMMARY: Codetta is a Python program for predicting the genetic code table of an organism from nucleotide sequences. Codetta can analyze an arbitrary nucleotide sequence and needs no sequence annotation or taxonomic placement. The most likely amino acid decoding for each of the 64 codons is inferred from alignments of profile hidden Markov models of conserved proteins to the input sequence. AVAILABILITY AND IMPLEMENTATION: Codetta 2.0 is implemented as a Python 3 program for MacOS and Linux and is available from http://eddylab.org/software/codetta/codetta2.tar.gz and at http://github.com/kshulgina/codetta. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genetic Code, Software, Base Sequence
3.
Science ; 376(6593): 630-635, 2022 05 06.
Article in English | MEDLINE | ID: mdl-35511982

ABSTRACT

Epistasis can markedly affect evolutionary trajectories. In recent decades, protein-level fitness landscapes have revealed extensive idiosyncratic epistasis among specific mutations. By contrast, other work has found ubiquitous and apparently nonspecific patterns of global diminishing-returns and increasing-costs epistasis among mutations across the genome. Here, we used a hierarchical CRISPR gene drive system to construct all combinations of 10 missense mutations from across the genome in budding yeast and measured their fitness in six environments. We show that the resulting fitness landscapes exhibit global fitness-correlated trends but that these trends emerge from specific idiosyncratic interactions. We thus provide experimental validation of recent theoretical work arguing that fitness-correlated trends can emerge as the generic consequence of idiosyncratic epistasis.
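To make the scale of the combinatorial design concrete: all combinations of 10 biallelic mutations give 2^10 = 1024 genotypes, and each mutation can be placed on 2^9 = 512 distinct genetic backgrounds. The sketch below only enumerates these genotypes and background pairs; the function names are hypothetical and no fitness values from the study are used or implied.

```python
from itertools import product

N_MUTATIONS = 10


def all_genotypes(n):
    """Return every combination of n mutations as tuples of 0 (absent) / 1 (present)."""
    return list(product((0, 1), repeat=n))


def background_pairs(genotypes, locus):
    """Pair each genotype lacking the mutation at `locus` with the same
    genotype carrying it. The fitness difference within a pair would be the
    effect of that mutation on that particular background."""
    pairs = []
    for genotype in genotypes:
        if genotype[locus] == 0:
            mutant = genotype[:locus] + (1,) + genotype[locus + 1:]
            pairs.append((genotype, mutant))
    return pairs


genotypes = all_genotypes(N_MUTATIONS)
pairs_for_first_mutation = background_pairs(genotypes, 0)
```

Comparing a mutation's effect across its background pairs against the fitness of each background is one way a fitness-correlated trend could be examined, given measured fitness values.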


Subjects
Biological Evolution, Genetic Epistasis, Genetic Fitness, Genetic Models, Mutation
4.
Elife ; 10, 2021 11 09.
Article in English | MEDLINE | ID: mdl-34751130

ABSTRACT

The genetic code has been proposed to be a 'frozen accident,' but the discovery of alternative genetic codes over the past four decades has shown that it can evolve to some degree. Since most examples were found anecdotally, it is difficult to draw general conclusions about the evolutionary trajectories of codon reassignment and why some codons are affected more frequently. To fill in the diversity of genetic codes, we developed Codetta, a computational method to predict the amino acid decoding of each codon from nucleotide sequence data. We surveyed the genetic code usage of over 250,000 bacterial and archaeal genome sequences in GenBank and discovered five new reassignments of arginine codons (AGG, CGA, and CGG), representing the first sense codon changes in bacteria. In a clade of uncultivated Bacilli, the reassignment of AGG to become the dominant methionine codon likely evolved by a change in the amino acid charging of an arginine tRNA. The reassignments of CGA and/or CGG were found in genomes with low GC content, an evolutionary force that likely helped drive these codons to low frequency and enable their reassignment.


All life forms rely on a 'code' to translate their genetic information into proteins. This code relies on limited permutations of three nucleotides, the building blocks that form DNA and other types of genetic information. Each 'triplet' of nucleotides, or codon, encodes a specific amino acid, the basic component of proteins. Reading the sequence of codons in the right order will let the cell know which amino acid to assemble next on a growing protein. For instance, the codon CGG, formed of the nucleotides guanine (G) and cytosine (C), codes for the amino acid arginine. From bacteria to humans, most life forms rely on the same genetic code. Yet certain organisms have evolved to use slightly different codes, where one or several codons have an altered meaning. To better understand how alternative genetic codes have evolved, Shulgina and Eddy set out to find more organisms featuring these altered codons, creating new software called Codetta that can analyze the genome of a microorganism and predict the genetic code it uses. Codetta was then used to sift through the genetic information of 250,000 microorganisms. This was made possible by the sequencing, in recent years, of the genomes of hundreds of thousands of bacteria and other microorganisms, including many never studied before. These analyses revealed five groups of bacteria with alternative genetic codes, all of which had changes in the codons that code for arginine. Amongst these, four had genomes with a low proportion of guanine and cytosine nucleotides. This may have made some guanine and cytosine-rich arginine codons very rare in these organisms and, therefore, easier to reassign to another amino acid. The work by Shulgina and Eddy demonstrates that Codetta is a new, useful tool that scientists can use to understand how genetic codes evolve. In addition, it can help to ensure the accuracy of widely used protein databases, which must assume a genetic code for each organism in order to predict protein sequences from its genome.
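The following sketch illustrates what a codon reassignment means in practice: the same nucleotide sequence translates to a different protein once one codon changes meaning. It builds the standard genetic code, applies the AGG-to-methionine reassignment reported in the abstract, and computes GC content. This is not Codetta's inference method (which relies on profile hidden Markov model alignments); the function names and the example sequence are made up for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def make_code(reassignments=None):
    """Return a codon -> amino acid table: the standard code plus optional overrides."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    table = dict(zip(codons, STANDARD_AAS))
    table.update(reassignments or {})
    return table


def translate(dna, table):
    """Translate complete codons in frame 0; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


def gc_content(dna):
    """Fraction of G and C nucleotides in a non-empty sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


standard = make_code()
agg_to_met = make_code({"AGG": "M"})  # reassignment described in the abstract

example = "ATGAGGCGG"  # made-up sequence for illustration
protein_standard = translate(example, standard)
protein_reassigned = translate(example, agg_to_met)
```

A database that translated such a genome with the standard table would record arginine wherever the organism actually inserts methionine, which is the accuracy concern raised in the text.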


Subjects
Computational Biology/methods, Molecular Evolution, Genetic Code, Genetic Techniques/instrumentation, Archaeal Genome, Bacterial Genome, Codon/genetics
5.
Elife ; 8, 2019 06 21.
Article in English | MEDLINE | ID: mdl-31223115

ABSTRACT

Developmental enhancers integrate graded concentrations of transcription factors (TFs) to create sharp gene expression boundaries. Here we examine the hunchback P2 (HbP2) enhancer which drives a sharp expression pattern in the Drosophila blastoderm embryo in response to the transcriptional activator Bicoid (Bcd). We systematically interrogate cis and trans factors that influence the shape and position of expression driven by HbP2, and find that the prevailing model, based on pairwise cooperative binding of Bcd to HbP2, is not adequate. We demonstrate that other proteins, such as pioneer factors, Mediator and histone modifiers influence the shape and position of the HbP2 expression pattern. Comparing our results to theory reveals how higher-order cooperativity and energy expenditure impact boundary location and sharpness. Our results emphasize that the bacterial view of transcription regulation, where pairwise interactions between regulatory proteins dominate, must be reexamined in animals, where multiple molecular mechanisms collaborate to shape the gene regulatory function.
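One common textbook idealization of how a graded input can produce a sharp boundary is a Hill function applied to an exponentially decaying activator gradient. The sketch below uses that idealization only to show how a larger Hill coefficient steepens the response while leaving the boundary position (where output is half-maximal) unchanged. It is not the authors' model, and all parameter values and function names are arbitrary assumptions.

```python
import math


def activator_gradient(x, c0=1.0, length_scale=0.2):
    """Idealized exponential gradient along the embryo axis, x in [0, 1] (assumed form)."""
    return c0 * math.exp(-x / length_scale)


def hill_response(c, k, n):
    """Hill function: fractional expression at activator concentration c,
    with half-maximal concentration k and Hill coefficient n."""
    return c ** n / (k ** n + c ** n)


def expression_profile(n, k=0.1, points=11):
    """Expression along the axis for a given Hill coefficient (illustrative values)."""
    xs = [i / (points - 1) for i in range(points)]
    return [(x, hill_response(activator_gradient(x), k, n)) for x in xs]


shallow = expression_profile(n=1)
steep = expression_profile(n=5)
```

In this idealization the boundary sits where the gradient equals k regardless of n, while n controls how abruptly expression switches across that position.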


Subjects
DNA-Binding Proteins/metabolism, Drosophila Proteins/metabolism, Drosophila, Developmental Gene Expression Regulation, Homeodomain Proteins/metabolism, Trans-Activators/metabolism, Transcription Factors/metabolism, Animals, Gene Expression Profiling, Genetic Models, Genetic Transcription
6.
Curr Genet ; 64(2): 327-333, 2018 Apr.
Article in English | MEDLINE | ID: mdl-28983660

ABSTRACT

Full genome recoding, or rewriting codon meaning, through chemical synthesis of entire bacterial chromosomes has become feasible in the past several years. Recoding an organism can impart new properties including non-natural amino acid incorporation, virus resistance, and biocontainment. The estimated cost of construction, which includes DNA synthesis, assembly by recombination, and troubleshooting, is now comparable to the cost of early-stage development of drugs or other high-tech products. Here, we discuss several recently published assembly methods and provide some thoughts on the future, including how synthetic efforts might benefit from the analysis of natural recoding processes and organisms that use alternative genetic codes.


Subjects
DNA/biosynthesis, Molecular Evolution, Synthetic Genes/genetics, Genetic Code/genetics, Codon/genetics, DNA/genetics, Escherichia coli/genetics, Genetic Engineering, Bacterial Genome/genetics