Hidden patterns of codon usage bias across kingdoms.

Deng, Yun; de Lima Hedayioglu, Fabio; Kalfon, Jeremie; Chu, Dominique; von der Haar, Tobias


Affiliation

Deng Y; School of Computing, University of Kent, Canterbury CT2 7NF, UK.
de Lima Hedayioglu F; Kent Fungal Group, School of Biosciences, University of Kent, Canterbury CT2 7NJ, UK.
Kalfon J; Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.
Chu D; School of Computing, University of Kent, Canterbury CT2 7NF, UK.
von der Haar T; Kent Fungal Group, School of Biosciences, University of Kent, Canterbury CT2 7NJ, UK.

J R Soc Interface ; 17(163): 20190819, 2020 02.

Article in English | MEDLINE | ID: mdl-32070219

ABSTRACT

The genetic code is necessarily degenerate, with 64 possible nucleotide triplets being translated into 20 amino acids. Eighteen out of the 20 amino acids are encoded by multiple synonymous codons. While synonymous codons are clearly equivalent in terms of the information they carry, it is now well established that they are used in a biased fashion. There is currently no consensus as to the origin of this bias. Drawing on ideas from stochastic thermodynamics, we derive from first principles a mathematical model describing the statistics of codon usage bias. We show that the model accurately describes the distribution of codon usage bias of genomes in the fungal and bacterial kingdoms. Based on it, we derive a new computational measure of codon usage bias, the distance D, which captures two aspects of codon usage bias: (i) differences in the genome-wide frequency of codons and (ii) apparent non-random distributions of codons across mRNAs. By means of large-scale computational analysis of over 900 species across two kingdoms of life, we demonstrate that our measure provides novel biological insights. Specifically, we show that while codon usage bias is clearly based on heritable traits and closely related species show similar degrees of bias, there is considerable variation in the magnitude of D within taxonomic classes, suggesting that the contribution of sequence-level selection to codon bias varies substantially within relatively confined taxonomic groups. Interestingly, commonly used model organisms are near the median for values of D for their taxonomic class, suggesting that they may not be good representative models for species with more extreme D, which comprise organisms of medical and agricultural interest. We also demonstrate that amino-acid-specific patterns of codon usage are themselves quite variable between branches of the tree of life, and that some of this variability correlates with organismal tRNA content.
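As an illustration of the degeneracy described above, the following minimal Python sketch builds the standard genetic code table, confirms that 18 of the 20 amino acids have more than one synonymous codon, and tabulates the relative frequency of each synonymous codon in a set of coding sequences. It is not the distance D of the article, whose definition is not given in this abstract. The names `CODON_TABLE`, `SYNONYMS`, and `synonymous_codon_frequencies` are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the sense codons by the amino acid they encode ("*" marks stop codons).
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        SYNONYMS[aa].append(codon)


def synonymous_codon_frequencies(coding_sequences):
    """Return, per amino acid, the fraction of each synonymous codon.

    Hypothetical helper for illustration only: it counts in-frame codons
    across all sequences and normalises within each amino acid.
    """
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            # Skip stop codons and anything that is not a valid triplet.
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1

    frequencies = {}
    for aa, codons in SYNONYMS.items():
        total = sum(counts[c] for c in codons)
        if total:
            frequencies[aa] = {c: counts[c] / total for c in codons}
    return frequencies


if __name__ == "__main__":
    multi = sorted(aa for aa, codons in SYNONYMS.items() if len(codons) > 1)
    print(len(CODON_TABLE), len(SYNONYMS), len(multi))
    print(synonymous_codon_frequencies(["ATGGCTGCTGCCTAA"]))
```

In an unbiased genome the frequencies within each amino acid would be roughly equal; departures from that are the genome-wide component of the bias the abstract refers to.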

Subject(s)

Codon Usage; Genetic Code; Amino Acids/genetics; Bacteria/genetics; Codon/genetics

Keywords

bacteria; codon usage bias; fungi; protists; stochastic thermodynamics


Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Codon Usage / Genetic Code Type of study: Prognostic_studies Language: En Journal: J R Soc Interface Year: 2020 Document type: Article
