Computational identification of promoters in Klebsiella aerogenes by using support vector machine.
Lin, Yan; Sun, Meili; Zhang, Junjie; Li, Mingyan; Yang, Keli; Wu, Chengyan; Zulfiqar, Hasan; Lai, Hongyan.
Affiliation
  • Lin Y; Key Laboratory for Animal Disease-Resistance Nutrition of the Ministry of Agriculture, Animal Nutrition Institute, Sichuan Agricultural University, Chengdu, China.
  • Sun M; Beidahuang Industry Group General Hospital, Harbin, China.
  • Zhang J; Key Laboratory for Animal Disease-Resistance Nutrition of the Ministry of Agriculture, Animal Nutrition Institute, Sichuan Agricultural University, Chengdu, China.
  • Li M; Chifeng Product Quality Inspection and Testing Centre, Chifeng, China.
  • Yang K; Nonlinear Research Institute, Baoji University of Arts and Sciences, Baoji, China.
  • Wu C; Baotou Teacher's College, Inner Mongolia University of Science and Technology, Baotou, China.
  • Zulfiqar H; Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, Zhejiang, China.
  • Lai H; Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China.
Front Microbiol ; 14: 1200678, 2023.
Article in English | MEDLINE | ID: mdl-37250059
ABSTRACT
Promoters are the basic functional cis-elements to which RNA polymerase binds to initiate gene transcription. A comprehensive understanding of gene expression and regulation depends on the precise identification of promoters, as they are the most important component of gene expression. This study aimed to develop a machine learning-based model to predict promoters in Klebsiella aerogenes (K. aerogenes). In the prediction model, the promoter sequences in the K. aerogenes genome were encoded by pseudo k-tuple nucleotide composition (PseKNC) and a position-correlation scoring function (PCSF). Numerical features were obtained and then optimized using minimum redundancy maximum relevance (mRMR) feature selection combined with a support vector machine (SVM) and 5-fold cross-validation (CV). Subsequently, these optimized features were input into an SVM-based classifier to discriminate promoter sequences from non-promoter sequences in K. aerogenes. Results of 10-fold CV showed that the model achieved an overall accuracy of 96.0% and an area under the ROC curve (AUC) of 0.990. We hope that this model will help the study of promoters and gene regulation in K. aerogenes.
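As an illustration of the sequence-encoding step, the sketch below computes a plain k-tuple nucleotide composition vector, which is the frequency part that PseKNC builds on. It is a minimal sketch, not the authors' implementation: the function name `ktuple_composition`, the choice of k = 2, and the example sequence are assumptions made here, and the pseudo (physicochemical correlation) components of PseKNC, the PCSF features, mRMR selection and the SVM classifier are all omitted.

```python
from itertools import product


def ktuple_composition(seq, k=2):
    """Return normalized frequencies of all 4**k k-tuples in a DNA sequence.

    Hypothetical helper for illustration only; it covers the composition
    part of a PseKNC-style encoding, not the pseudo components.
    """
    seq = seq.upper()
    # Fixed feature order: AA, AC, AG, AT, CA, ... for k = 2.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k over the sequence.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Example: dinucleotide composition of an arbitrary short sequence.
features = ktuple_composition("TATAAT", k=2)
```

Each sequence becomes a fixed-length numeric vector (16 values for k = 2), which is the kind of input a feature-selection step and an SVM classifier can then consume.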
Full text: 1 Database: MEDLINE Study type: Diagnostic_studies / Prognostic_studies Language: En Journal: Front Microbiol Year: 2023 Document type: Article Country of affiliation: China
