Differentially used codons among essential genes in bacteria identified by machine learning-based analysis.
Kurmi, Annushree; Sen, Piyali; Dash, Madhusmita; Ray, Suvendra Kumar; Satapathy, Siddhartha Sankar.
Affiliation
  • Kurmi A; Department of Computer Science and Engineering, Tezpur University, Napaam, Assam, 784028, India.
  • Sen P; Department of Computer Science and Engineering, The Assam Kaziranga University, Jorhat, Assam, 785006, India.
  • Dash M; Department of Computer Science and Engineering, Tezpur University, Napaam, Assam, 784028, India.
  • Ray SK; Department of Electronics and Communication Engineering, NIT, Jote, Arunachal Pradesh, 791113, India.
  • Satapathy SS; Department of Molecular Biology and Biotechnology, Tezpur University, Napaam, Assam, 784028, India.
Mol Genet Genomics ; 299(1): 72, 2024 Jul 27.
Article in En | MEDLINE | ID: mdl-39060647
ABSTRACT
Codon usage bias (CUB), the uneven usage of synonymous codons encoding the same amino acid, differs among genes within and across bacterial genomes. CUB is known to be influenced by gene expression, and accordingly it differs between high-expression and low-expression genes in several bacteria. In this article, we extend the study of codon usage by considering gene essentiality as a feature. Using machine learning (ML) based approaches, we analysed Relative Synonymous Codon Usage (RSCU) values of essential and non-essential genes in Escherichia coli and thirty-four other bacterial genomes whose gene essentiality features were available in public databases. We observed significant differences in codon usage patterns between essential and non-essential genes for the majority of the bacterial genomes, and accordingly ML based classifiers achieved high area under curve (AUC) scores, with a minimum score of 70.0 across twenty-eight organisms. Further, the importance of individual codons for classifying genes was found to differ within each genome. Arg codon CGT and Gly codon GGT were observed to be the most preferred codons among essential genes in Escherichia coli. Interestingly, some codons, such as CGT, ATA, GGT and GGG, were observed to contribute consistently towards classifying essential genes across the thirty-five bacterial genomes studied. On the other hand, codons TGY and CAY, encoding the amino acids Cys and His respectively, were among the least contributing codons towards classification in all these bacteria. This study demonstrates gene essentiality based differences in synonymous codon usage in bacterial genomes and presents a common codon usage pattern across bacteria.
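The abstract names RSCU as the per-codon feature but does not spell out how it is computed. As a minimal sketch, assuming the commonly used definition (a codon's observed count divided by the mean count over its synonymous family) and the standard genetic code, the calculation for a single coding sequence might look like the following. The function name `rscu` and the helper tables are illustrative and are not taken from the authors' pipeline.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group sense codons into synonymous families keyed by amino acid.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":  # skip stop codons
        FAMILIES[aa].append(codon)


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    values = {}
    for aa, family in FAMILIES.items():
        if len(family) < 2:  # Met and Trp have no synonymous alternatives
            continue
        total = sum(counts[c] for c in family)
        if total == 0:  # amino acid absent from this gene
            continue
        for c in family:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(family) / total
    return values
```

Under this definition a value of 1.0 means a codon is used as often as expected if all synonyms were equally likely, and values above 1.0 indicate preference. A vector of such values per gene is the kind of input a classifier of essential versus non-essential genes could be trained on, though the abstract does not state which classifiers or preprocessing the authors used.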
Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Genes, Essential / Escherichia coli / Machine Learning / Codon Usage Language: En Journal: Mol Genet Genomics Journal subject: BIOLOGIA MOLECULAR / GENETICA Year: 2024 Document type: Article