Results 1 - 15 of 15
1.
Genome Biol Evol ; 16(6), 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38805023

ABSTRACT

The genetic code consists of 61 codons coding for 20 amino acids. These codons are recognized by transfer RNAs (tRNAs) that bind to specific codons during protein synthesis. All organisms utilize less than all 61 possible anticodons due to base pair wobble: the ability to have a mismatch with a codon at its third nucleotide. Previous studies observed a correlation between the tRNA pool of bacteria and the temperature of their respective environments. However, it is unclear if these patterns represent biological adaptations to maintain the efficiency and accuracy of protein synthesis in different environments. A mechanistic mathematical model of mRNA translation is used to quantify the expected elongation rates and error rate for each codon based on an organism's tRNA pool. A comparative analysis across a range of bacteria that accounts for covariance due to shared ancestry is performed to quantify the impact of environmental temperature on the evolution of the tRNA pool. We find that thermophiles generally have more anticodons represented in their tRNA pool than mesophiles or psychrophiles. Based on our model, this increased diversity is expected to lead to increased missense errors. The implications of this for protein evolution in thermophiles are discussed.


Subject(s)
Bacteria, Molecular Evolution, Transfer RNA, Temperature, Transfer RNA/genetics, Bacteria/genetics, Codon, Bacterial RNA/genetics, Anticodon/genetics, Protein Biosynthesis, Genetic Models, Genetic Code
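
As a quick check of the codon count quoted in the abstract above, the following minimal sketch enumerates all 64 nucleotide triplets and removes the three stop codons of the standard genetic code (UAA, UAG, UGA). The variable names are illustrative and are not taken from the paper.

```python
from itertools import product

# Stop codons of the standard genetic code (RNA alphabet).
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Every possible triplet over the four RNA bases.
all_codons = ["".join(triplet) for triplet in product("ACGU", repeat=3)]

# Sense codons are the triplets that code for an amino acid.
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons), len(sense_codons))
```

The count covers only the standard code; organisms with alternative genetic codes can differ.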
2.
Genome Biol Evol ; 16(5), 2024 May 02.
Article in English | MEDLINE | ID: mdl-38619010

ABSTRACT

Rosenberg AA, Marx A, Bronstein AM (Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon. Nat Commun. 2022:13:2815) recently found a surprising correlation between synonymous codon usage and the dihedral bond angles of the resulting amino acid. However, their analysis did not account for the strongest known correlate of codon usage: gene expression. We re-examined the relationship between bond angles and codon usage by applying the approach of Rosenberg et al. to simulated protein-coding sequences that (i) have random codon usage, (ii) have codon usage determined by mutation biases, and (iii) maintain the general relationship between codon usage and gene expression via the assumption of selection-mutation-drift equilibrium. We observed correlations between dihedral bond angle and codon usage when codon usage is entirely random, indicating possible conflation of noise with differences in bond angle distributions between synonymous codons. More relevant to the general analysis of codon usage patterns, we found surprisingly good agreement between the analysis of the real sequences and the analysis of sequences simulated assuming selection-mutation-drift equilibrium, with 91% of significant synonymous codon pairs detected in the former also detected in the latter. We believe the correlation between codon usage and dihedral bond angles resulted from the variation in codon usage across genes due to the interplay between mutation bias, natural selection for translation efficiency, and gene expression, further underscoring that these factors must be controlled for when looking for novel patterns related to codon usage.


Subject(s)
Codon Usage, Escherichia coli, Escherichia coli/genetics, Genetic Selection, Escherichia coli Proteins/genetics, Codon, Silent Mutation, Mutation
3.
bioRxiv ; 2023 Oct 09.
Article in English | MEDLINE | ID: mdl-37873246

ABSTRACT

The genetic code consists of 61 codons coding for 20 amino acids. These codons are recognized by transfer RNAs (tRNAs) that bind to specific codons during protein synthesis. Most organisms utilize less than all 61 possible anticodons due to base pair wobble: the ability to have a mismatch with a codon at its third nucleotide. Previous studies observed a correlation between the tRNA pool of bacteria and the temperature of their respective environments. However, it is unclear if these patterns represent biological adaptations to maintain the efficiency and accuracy of protein synthesis in different environments. A mechanistic mathematical model of mRNA translation is used to quantify the expected elongation rates and error rate for each codon based on an organism's tRNA pool. A comparative analysis across a range of bacteria that accounts for covariance due to shared ancestry is performed to quantify the impact of environmental temperature on the evolution of the tRNA pool. We find that thermophiles generally have more anticodons represented in their tRNA pool than mesophiles or psychrophiles. Based on our model, this increased diversity is expected to lead to increased missense errors. The implications of this for protein evolution in thermophiles are discussed.

4.
Mol Biol Evol ; 40(8), 2023 08 03.
Article in English | MEDLINE | ID: mdl-37498582

ABSTRACT

Variation in gene expression across lineages is thought to explain much of the observed phenotypic variation and adaptation. The protein is closer to the target of natural selection but gene expression is typically measured as the amount of mRNA. The broad assumption that mRNA levels are good proxies for protein levels has been undermined by a number of studies reporting moderate or weak correlations between the two measures across species. One biological explanation for this discrepancy is that there has been compensatory evolution between the mRNA level and regulation of translation. However, we do not understand the evolutionary conditions necessary for this to occur nor the expected strength of the correlation between mRNA and protein levels. Here, we develop a theoretical model for the coevolution of mRNA and protein levels and investigate the dynamics of the model over time. We find that compensatory evolution is widespread when there is stabilizing selection on the protein level; this observation held true across a variety of regulatory pathways. When the protein level is under directional selection, the mRNA level of a gene and the translation rate of the same gene were negatively correlated across lineages but positively correlated across genes. These findings help explain results from comparative studies of gene expression and potentially enable researchers to disentangle biological and statistical hypotheses for the mismatch between transcriptomic and proteomic data.


Subject(s)
Molecular Evolution, Proteins, Messenger RNA, Messenger RNA/genetics, Messenger RNA/metabolism, Proteins/genetics, Proteins/metabolism, Genetic Transcription, Protein Biosynthesis, Genes, Genetic Selection, Proteomics, Gene Expression Profiling
5.
bioRxiv ; 2023 Apr 08.
Article in English | MEDLINE | ID: mdl-37066157

ABSTRACT

Variation in gene expression across lineages is thought to explain much of the observed phenotypic variation and adaptation. The protein is closer to the target of natural selection but gene expression is typically measured as the amount of mRNA. The broad assumption that mRNA levels are good proxies for protein levels has been undermined by a number of studies reporting moderate or weak correlations between the two measures across species. One biological explanation for this discrepancy is that there has been compensatory evolution between the mRNA level and regulation of translation. However, we do not understand the evolutionary conditions necessary for this to occur nor the expected strength of the correlation between mRNA and protein levels. Here we develop a theoretical model for the coevolution of mRNA and protein levels and investigate the dynamics of the model over time. We find that compensatory evolution is widespread when there is stabilizing selection on the protein level, which is true across a variety of regulatory pathways. When the protein level is under directional selection, the mRNA level of a gene and the translation rate of the same gene were negatively correlated across lineages but positively correlated across genes. These findings help explain results from comparative studies of gene expression and potentially enable researchers to disentangle biological and statistical hypotheses for the mismatch between transcriptomic and proteomic studies.

6.
Elife ; 11, 2022 Oct 10.
Article in English | MEDLINE | ID: mdl-36214449

ABSTRACT

Organisms can adapt to an environment by taking multiple mutational paths. This redundancy at the genetic level, where many mutations have similar phenotypic and fitness effects, can make untangling the molecular mechanisms of complex adaptations difficult. Here, we use the Escherichia coli long-term evolution experiment (LTEE) as a model to address this challenge. To understand how different genomic changes could lead to parallel fitness gains, we characterize the landscape of transcriptional and translational changes across 12 replicate populations evolving in parallel for 50,000 generations. By quantifying absolute changes in mRNA abundances, we show that not only do all evolved lines have more mRNAs but that this increase in mRNA abundance scales with cell size. We also find that despite few shared mutations at the genetic level, clones from replicate populations in the LTEE are remarkably similar in their gene expression patterns at both the transcriptional and translational levels. Furthermore, we show that the majority of the expression changes are due to changes at the transcriptional level with very few translational changes. Finally, we show how mutations in transcriptional regulators lead to consistent and parallel changes in the expression levels of downstream genes. These results deepen our understanding of the molecular mechanisms underlying complex adaptations and provide insights into the repeatability of evolution.


We look like our parents because we inherit their genes. Genes carry the instructions for our cells to make messenger RNAs (mRNAs), which our cells then translate into proteins. Proteins, in turn, determine many of our features. This is true for all living organisms. Any changes (or mutations) in an organism's genes can lead to variations in its proteins, which can alter the organism's traits. This is the basis for evolution: mutations can lead to changes that allow an organism to better adapt to a new environment. This increases the organism's chances of survival and reproduction (its evolutionary 'fitness') and makes it more likely that the mutation that generated the new trait in the first place will be passed on to the organism's descendants. However, just because two organisms have evolved similar traits to adapt to similar environments, it does not mean that the genetic basis for the adaptation is the same. For example, many animals share similar coloring to warn off predators, but the way that coloring is coded genetically is completely different. In species that are related (which share many of the same genes), this type of evolution is called 'parallel evolution', and it can make it difficult for scientists to understand how an organism evolved and pinpoint exactly which mutations are linked to which features. In 1988, scientists established the 'long-term evolution experiment' to tackle questions about how evolution works. The experiment, which has been running for over 30 years, consists of tracking the evolution of 12 populations of Escherichia coli bacteria grown in separate flasks containing the same low-nutrient medium. The initial 12 populations were genetically identical, making this an ideal system to study parallel evolution, since all the populations had to evolve to adapt to the same environment, whilst isolated from each other.
In previous experiments, scientists had already noted that while the different bacterial populations grew in similar ways, they had mostly different mutations. To better understand parallel evolution, Favate et al. analyzed the synthesis rates of RNA and proteins in the E. coli populations used in the long-term evolution experiment. They found that 22 years after the start of the experiment, all 12 populations produced more RNA, grew faster and were bigger. Additionally, while the different populations had accumulated few shared mutations after 22 years, they all shared similar patterns of RNA levels and protein synthesis rates. Further probing revealed that parallel evolution may be linked to how genes are regulated: mutations in regulators of related groups of genes involved in the same processes inside the cell can amplify the degree of parallel changes in organisms. This means that mutations in these genes may lead to similar traits. These findings provide insight into how parallel evolution arises in the long-term evolution experiment, and provide clues as to how the same traits can evolve several times.


Subject(s)
Escherichia coli Proteins, Escherichia coli, Escherichia coli/genetics, Escherichia coli/metabolism, Physiological Adaptation/genetics, Escherichia coli Proteins/genetics, Escherichia coli Proteins/metabolism, Bacteria/genetics, Mutation, Messenger RNA/metabolism
7.
PLoS Genet ; 18(6): e1010256, 2022 06.
Article in English | MEDLINE | ID: mdl-35714134

ABSTRACT

Patterns of non-uniform usage of synonymous codons vary across genes in an organism and between species across all domains of life. This codon usage bias (CUB) is due to a combination of non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most models quantify the effects of mutation bias and selection on CUB assuming uniform mutational and other non-adaptive forces across the genome. However, non-adaptive nucleotide biases can vary within a genome due to processes such as biased gene conversion (BGC), potentially obfuscating signals of selection on codon usage. Moreover, genome-wide estimates of non-adaptive nucleotide biases are lacking for non-model organisms. We combine an unsupervised learning method with a population genetics model of synonymous coding sequence evolution to assess the impact of intragenomic variation in non-adaptive nucleotide bias on quantification of natural selection on synonymous codon usage across 49 Saccharomycotina yeasts. We find that in the absence of a priori information, unsupervised learning can be used to identify genes evolving under different non-adaptive nucleotide biases. We find that the impact of intragenomic variation in non-adaptive nucleotide bias varies widely, even among closely-related species. We show that the overall strength and direction of translational selection can be underestimated by failing to account for intragenomic variation in non-adaptive nucleotide biases. Interestingly, genes falling into clusters identified by machine learning are also physically clustered across chromosomes. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable non-adaptive nucleotide biases on codon frequencies.


Subject(s)
Codon Usage, Nucleotides, Bias, Codon/genetics, Codon Usage/genetics, Molecular Evolution, Mutation, Nucleotides/genetics, Genetic Selection
8.
BMC Genomics ; 23(1): 408, 2022 May 30.
Article in English | MEDLINE | ID: mdl-35637464

ABSTRACT

BACKGROUND: Codon usage bias (CUB), the non-uniform usage of synonymous codons, occurs across all domains of life. Adaptive CUB is hypothesized to result from various selective pressures, including selection for efficient ribosome elongation, accurate translation, mRNA secondary structure, and/or protein folding. Given the critical link between protein folding and protein function, numerous studies have analyzed the relationship between codon usage and protein structure. The results from these studies have often been contradictory, likely reflecting the differing methods used for measuring codon usage and the failure to appropriately control for confounding factors, such as differences in amino acid usage between protein structures and changes in the frequency of different structures with gene expression. RESULTS: Here we take an explicit population genetics approach to quantify codon-specific shifts in natural selection related to protein structure in S. cerevisiae and E. coli. Unlike other metrics of codon usage, our approach explicitly separates the effects of natural selection, scaled by gene expression, and mutation bias while naturally accounting for a region's amino acid usage. Bayesian model comparisons suggest selection on codon usage varies only slightly between helix, sheet, and coil secondary structures and, similarly, between structured and intrinsically-disordered regions. Similarly, in contrast to previous findings, we find selection on codon usage only varies slightly at the termini of helices in E. coli. Using simulated data, we show that the previous finding that "non-optimal" codons are enriched at the beginning of helices in S. cerevisiae was due to a failure to control for various confounding factors (e.g. amino acid biases, gene expression), rather than selection to modulate cotranslational folding.
CONCLUSIONS: Our results reveal a weak relationship between codon usage and protein structure, indicating that differences in selection on codon usage between structures are slight. In addition to the magnitude of differences in selection between protein structures being slight, the observed shifts appear to be idiosyncratic and largely codon-specific rather than systematic reversals in the nature of selection. Overall, our work demonstrates the statistical power and benefits of studying selective shifts on codon usage or other genomic features from an explicitly evolutionary approach. Limitations of this approach and future potential research avenues are discussed.


Subject(s)
Codon Usage, Saccharomyces cerevisiae, Amino Acids/genetics, Bayes Theorem, Codon/genetics, Escherichia coli/genetics, Population Genetics, Saccharomyces cerevisiae/genetics, Genetic Selection
9.
Curr Biol ; 32(9): 1924-1936.e6, 2022 05 09.
Article in English | MEDLINE | ID: mdl-35334227

ABSTRACT

Extracellular vesicles (EVs) may mediate intercellular communication by carrying protein and RNA cargo. The composition, biology, and roles of EVs in physiology and pathology have been primarily studied in the context of biofluids and in cultured mammalian cells. The experimental tractability of C. elegans makes for a powerful in vivo animal system to identify and study EV cargo from its cellular source. We developed an innovative method to label, track, and profile EVs using genetically encoded, fluorescent-tagged EV cargo and conducted a large-scale isolation and proteomic profiling. Nucleic acid binding proteins (∼200) are overrepresented in our dataset. By integrating our EV proteomic dataset with single-cell transcriptomic data, we identified and validated ciliary EV cargo: CD9-like tetraspanin (TSP-6), ectonucleotide pyrophosphatase/phosphodiesterase (ENPP-1), minichromosome maintenance protein (MCM-3), and double-stranded RNA transporter SID-2. C. elegans EVs also harbor RNA, suggesting that EVs may play a role in extracellular RNA-based communication.


Subject(s)
Caenorhabditis elegans, Extracellular Vesicles, Animals, Caenorhabditis elegans/genetics, Cell Communication, Extracellular Vesicles/metabolism, Mammals/genetics, Proteomics, RNA
10.
Bioinformatics ; 38(8): 2358-2360, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35157051

ABSTRACT

MOTIVATION: Ribosome profiling, or Ribo-seq, is the state-of-the-art method for quantifying protein synthesis in living cells. Computational analysis of Ribo-seq data remains challenging due to the complexity of the procedure, as well as variations introduced for specific organisms or specialized analyses. RESULTS: We present riboviz 2, an updated riboviz package, for the comprehensive transcript-centric analysis and visualization of Ribo-seq data. riboviz 2 includes an analysis workflow built on the Nextflow workflow management system for end-to-end processing of Ribo-seq data. riboviz 2 has been extensively tested on diverse species and library preparation strategies, including multiplexed samples. riboviz 2 is flexible and uses open, documented file formats, allowing users to integrate new analyses with the pipeline. AVAILABILITY AND IMPLEMENTATION: riboviz 2 is freely available at github.com/riboviz/riboviz.


Subject(s)
Ribosome Profiling, Ribosomes, Ribosomes/genetics, Ribosomes/metabolism, Workflow, Messenger RNA/metabolism, Data Analysis, RNA Sequence Analysis/methods
11.
Methods Mol Biol ; 2404: 83-110, 2022.
Article in English | MEDLINE | ID: mdl-34694605

ABSTRACT

The emergence of ribosome profiling as a tool for measuring the translatome has provided researchers with valuable insights into the post-transcriptional regulation of gene expression. Despite the biological insights and technical improvements made since the technique was initially described by Ingolia et al. (Science 324(5924):218-223, 2009), ribosome profiling measurements and subsequent data analysis remain challenging. Here, we describe our lab's protocol for performing ribosome profiling in bacteria, yeast, and mammalian cells. This protocol has integrated elements from three published ribosome profiling methods. In addition, we describe a tool called RiboViz (Carja et al., BMC Bioinformatics 18:461, 2017) ( https://github.com/riboviz/riboviz ) for the analysis and visualization of ribosome profiling data. Given raw sequencing reads and transcriptome information (e.g., FASTA, GFF) for a species, RiboViz performs the necessary pre-processing and mapping of the raw sequencing reads. RiboViz also provides the user with various quality control visualizations.


Subject(s)
Ribosomes, Gene Expression Profiling, Gene Expression Regulation, Protein Biosynthesis, Quality Control, Messenger RNA/metabolism, Ribosomes/genetics, Ribosomes/metabolism, RNA Sequence Analysis, Transcriptome
12.
Proc Natl Acad Sci U S A ; 118(27), 2021 07 06.
Article in English | MEDLINE | ID: mdl-34187896

ABSTRACT

Chemical modifications of RNA 5'-ends enable "epitranscriptomic" regulation, influencing multiple aspects of RNA fate. In transcription initiation, a large inventory of substrates compete with nucleoside triphosphates for use as initiating entities, providing an ab initio mechanism for altering the RNA 5'-end. In Escherichia coli cells, RNAs with a 5'-end hydroxyl are generated by use of dinucleotide RNAs as primers for transcription initiation, "primer-dependent initiation." Here, we use massively systematic transcript end readout (MASTER) to detect and quantify RNA 5'-ends generated by primer-dependent initiation for ∼$4^{10}$ (∼1,000,000) promoter sequences in E. coli. The results show primer-dependent initiation in E. coli involves any of the 16 possible dinucleotide primers and depends on promoter sequences in, upstream, and downstream of the primer binding site. The results yield a consensus sequence for primer-dependent initiation, $Y_{TSS-2}N_{TSS-1}N_{TSS}W_{TSS+1}$, where TSS is the transcription start site, $N_{TSS-1}N_{TSS}$ is the primer binding site, Y is pyrimidine, and W is A or T. Biochemical and structure-determination studies show that the base pair (nontemplate-strand base:template-strand base) immediately upstream of the primer binding site ($Y{:}R_{TSS-2}$, where R is purine) exerts its effect through the base on the DNA template strand ($R_{TSS-2}$) through interchain base stacking with the RNA primer. Results from analysis of a large set of natural, chromosomally encoded E. coli promoters support the conclusions from MASTER. Our findings provide a mechanistic and structural description of how TSS-region sequence hard-codes not only the TSS position but also the potential for epitranscriptomic regulation through primer-dependent transcription initiation.


Subject(s)
DNA Primers/metabolism, Escherichia coli/genetics, Genetic Promoter Regions, Genetic Transcription Initiation, Base Sequence, Binding Sites, Bacterial Chromosomes/genetics, Bacterial Gene Expression Regulation, Messenger RNA/genetics, Messenger RNA/metabolism, Transcription Initiation Site
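
The consensus described in the abstract above (pyrimidine at TSS-2, any base at TSS-1 and TSS, and A or T at TSS+1) can be written as a four-position pattern on the nontemplate strand. The sketch below is only an illustration: the function name and the idea of testing a 4-nt window are assumptions, not the authors' analysis code. It also checks that ten fully randomized positions give a library on the order of the roughly one million sequences quoted.

```python
import re

# IUPAC codes from the abstract: Y = pyrimidine (C/T), N = any base, W = A/T.
# Window positions, left to right: TSS-2, TSS-1, TSS, TSS+1.
CONSENSUS = re.compile(r"[CT][ACGT][ACGT][AT]")


def matches_consensus(window: str) -> bool:
    """Return True if a 4-nt nontemplate-strand window fits Y-N-N-W."""
    return CONSENSUS.fullmatch(window.upper()) is not None


# Number of distinct sequences when 10 positions are fully randomized.
library_size = 4 ** 10

print(library_size)
print(matches_consensus("TGAT"), matches_consensus("AGAT"))
```

A window such as TGAT fits the pattern, while AGAT does not because it has a purine at TSS-2.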
13.
Biotechnol Biofuels ; 14(1): 116, 2021 May 10.
Article in English | MEDLINE | ID: mdl-33971924

ABSTRACT

BACKGROUND: Mass spectrometry-based proteomics can identify and quantify thousands of proteins from individual microbial species, but a significant percentage of these proteins are unannotated and hence classified as proteins of unknown function (PUFs). Due to the difficulty in extracting meaningful metabolic information, PUFs are often overlooked or discarded during data analysis, even though they might be critically important in functional activities, in particular for metabolic engineering research. RESULTS: We optimized and employed a pipeline integrating various "guilt-by-association" (GBA) metrics, including differential expression and co-expression analyses of high-throughput mass spectrometry proteome data and phylogenetic coevolution analysis, and sequence homology-based approaches to determine putative functions for PUFs in Clostridium thermocellum. Our various analyses provided putative functional information for over 95% of the PUFs detected by mass spectrometry in a wild-type and/or an engineered strain of C. thermocellum. In particular, we validated a predicted acyltransferase PUF (WP_003519433.1) with functional activity towards 2-phenylethyl alcohol, consistent with our GBA and sequence homology-based predictions. CONCLUSIONS: This work demonstrates the value of leveraging sequence homology-based annotations with empirical evidence based on the concept of GBA to broadly predict putative functions for PUFs, opening avenues to further interrogation via targeted experiments.

14.
BMC Genomics ; 21(1): 370, 2020 May 20.
Article in English | MEDLINE | ID: mdl-32434474

ABSTRACT

BACKGROUND: Researchers often measure changes in gene expression across conditions to better understand the shared functional roles and regulatory mechanisms of different genes. Analogous to this is comparing gene expression across species, which can improve our understanding of the evolutionary processes shaping the evolution of both individual genes and functional pathways. One area of interest is determining genes showing signals of coevolution, which can also indicate potential functional similarity, analogous to co-expression analysis often performed across conditions for a single species. However, as with any trait, comparing gene expression across species can be confounded by the non-independence of species due to shared ancestry, making standard hypothesis testing inappropriate. RESULTS: We compared RNA-Seq data across 18 fungal species using a multivariate Brownian Motion phylogenetic comparative method (PCM), which allowed us to quantify coevolution between protein pairs while directly accounting for the shared ancestry of the species. Our work indicates that proteins that physically interact show stronger signals of coevolution than randomly-generated pairs. Interactions with stronger empirical and computational evidence also show stronger signals of coevolution. We examined the effects of number of protein interactions and gene expression levels on coevolution, finding both factors are overall poor predictors of the strength of coevolution between a protein pair. Simulations further demonstrate the potential issues of analyzing gene expression coevolution without accounting for shared ancestry in a standard hypothesis testing framework.
Furthermore, our simulations indicate the use of a randomly-generated null distribution as a means of determining statistical significance for detecting coevolving genes with phylogenetically-uncorrected correlations, as has previously been done, is less accurate than PCMs, although it is a significant improvement over standard hypothesis testing. These methods are further improved by using a phylogenetically-corrected correlation metric. CONCLUSIONS: Our work highlights potential benefits of using PCMs to detect gene expression coevolution from high-throughput omics scale data. This framework can be built upon to investigate other evolutionary hypotheses, such as changes in transcription regulatory mechanisms across species.


Subject(s)
Molecular Evolution, Fungal Proteins/genetics, Fungi/genetics, Gene Expression, Fungal Proteins/metabolism, Fungi/classification, Fungi/metabolism, Genetic Models, Phenotype, Phylogeny, Protein Binding
15.
Biochim Biophys Acta Biomembr ; 1860(12): 2479-2485, 2018 12.
Article in English | MEDLINE | ID: mdl-30279149

ABSTRACT

The Sec secretion pathway is found across all domains of life. A critical feature of Sec secreted proteins is the signal peptide, a short peptide with distinct physicochemical properties located at the N-terminus of the protein. Previous work indicates signal peptides are biased towards translationally inefficient codons, which is hypothesized to be an adaptation driven by selection to improve the efficacy and efficiency of the protein secretion mechanisms. We investigate codon usage in the signal peptides of E. coli using the Codon Adaptation Index (CAI), the tRNA Adaptation Index (tAI), and the ribosomal overhead cost formulation of the stochastic evolutionary model of protein production rates (ROC-SEMPPR). Comparisons between signal peptides and the 5'-ends of cytoplasmic proteins using CAI and tAI are consistent with a preference for inefficient codons in signal peptides. Simulations reveal these differences are due to amino acid usage and gene expression: we find these differences disappear when accounting for both factors. In contrast, ROC-SEMPPR, a mechanistic population genetics model capable of separating the effects of selection and mutation bias, shows codon usage bias (CUB) of the signal peptides is indistinguishable from that of the 5'-ends of cytoplasmic proteins. Additionally, we find CUB at the 5'-ends is weaker than in later segments of the gene. Results illustrate the value in using models grounded in population genetics to interpret genetic data. We show failure to account for mutation bias and the effects of gene expression on the efficacy of selection against translation inefficiency can lead to a misinterpretation of codon usage patterns.


Subject(s)
Amino Acids/metabolism, Codon, Escherichia coli/genetics, Gene Expression, Protein Sorting Signals/genetics, Bacterial Genes, Mutation, Protein Biosynthesis, Transfer RNA/genetics
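
For readers unfamiliar with the Codon Adaptation Index mentioned in the abstract above, the sketch below shows the usual textbook formulation: each codon receives a weight equal to its count in a reference gene set divided by the count of the most used synonymous codon, and the CAI of a gene is the geometric mean of those weights. This is a minimal sketch, not the code used in the study. The toy reference sequence, the function names, and the 0.5 pseudocount for unobserved codons are assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(seq):
    """Split an in-frame DNA sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs):
    """Weight each codon by its count relative to the most used synonymous codon."""
    counts = defaultdict(int)
    for seq in reference_seqs:
        for codon in split_codons(seq):
            counts[codon] += 1
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    weights = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue  # skip stop codons and single-codon amino acids (Met, Trp)
        best = max(counts[c] for c in codons)
        if best == 0:
            continue  # amino acid absent from the reference set
        for c in codons:
            # Assumption: unobserved codons get a pseudocount of 0.5.
            weights[c] = max(counts[c], 0.5) / best
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the codons in seq that have a weight."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Hypothetical reference set: alanine is encoded by GCT three times and GCC once.
toy_reference = ["GCTGCTGCTGCC"]
weights = relative_adaptiveness(toy_reference)
print(cai("GCTGCT", weights), cai("GCTGCC", weights))
```

As the abstract notes, an index of this kind does not separate selection from amino acid usage or gene expression, which is why the authors compare it against a population genetics model.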