Integrating LASSO Feature Selection and Soft Voting Classifier to Identify Origins of Replication Sites.
Yao, Yingying; Zhang, Shengli; Xue, Tian.
Affiliation
  • Yao Y; School of Mathematics and Statistics, Xidian University, Xi'an 710071, P.R. China.
  • Zhang S; School of Mathematics and Statistics, Xidian University, Xi'an 710071, P.R. China.
  • Xue T; School of Mathematics and Statistics, Xidian University, Xi'an 710071, P.R. China.
Curr Genomics; 23(2): 83-93, 2022 Jun 10.
Article in English | MEDLINE | ID: mdl-36778978
ABSTRACT

Background:

DNA replication plays an indispensable role in the transmission of genetic information. It is considered the basis of biological inheritance and the most fundamental process in all living organisms. Because DNA replication initiates at a specific location, namely the origin of replication, accurate prediction of origin of replication sites (ORIs) is essential for gaining insight into their relationship with gene expression.

Objective:

In this study, we develop an efficient predictor, called iORI-LAVT, for ORI identification.

Methods:

This work extracts feature information from three aspects: mono-nucleotide encoding, k-mer, and ring-function-hydrogen chemical properties. The least absolute shrinkage and selection operator (LASSO) is then applied as a feature selection method to select the optimal features. After comparing the results of soft voting classifiers built from different combinations of base classifiers, the soft voting classifier based on GaussianNB and Logistic Regression is adopted as the final classifier.
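
The steps described here can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: only the k-mer features are shown (the mono-nucleotide and ring-function-hydrogen encodings are omitted), and the helper names `kmer_frequencies` and `build_model`, the value of `alpha`, the standardization step, and the toy sequences are all assumptions made for illustration.

```python
import random
from itertools import product

import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def kmer_frequencies(seq, k=2):
    """Normalized k-mer frequencies of a DNA sequence (hypothetical helper)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing ambiguous bases such as N
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def build_model(alpha=0.01):
    """LASSO-based feature selection followed by a soft voting classifier.

    alpha is an assumed value, not one reported in the abstract.
    """
    # Keep only the features whose LASSO coefficient is non-zero.
    selector = SelectFromModel(Lasso(alpha=alpha, max_iter=10000))
    # Soft voting averages the class probabilities of the base classifiers.
    voter = VotingClassifier(
        estimators=[
            ("gnb", GaussianNB()),
            ("lr", LogisticRegression(max_iter=1000)),
        ],
        voting="soft",
    )
    return make_pipeline(StandardScaler(), selector, voter)


# Toy data (not from the paper): AT-rich versus GC-rich random sequences.
rng = random.Random(0)
pos = ["".join(rng.choices("ACGT", weights=[4, 1, 1, 4], k=100)) for _ in range(40)]
neg = ["".join(rng.choices("ACGT", weights=[1, 4, 4, 1], k=100)) for _ in range(40)]
X = np.array([kmer_frequencies(s, k=2) for s in pos + neg])
y = np.array([1] * 40 + [0] * 40)

model = build_model().fit(X, y)
proba = model.predict_proba(X)  # averaged class probabilities, one row per sequence
```

If `alpha` is set too high, LASSO may shrink every coefficient to zero and leave no features for the classifier, so in practice the value would need tuning on the real data.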

Results:

In 10-fold cross-validation, the prediction accuracies on the two benchmark datasets are 90.39% and 95.96%, respectively. On the independent dataset, our method achieves an accuracy of 91.3%.
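
As a rough illustration of how a 10-fold cross-validation accuracy of this kind is typically computed, the sketch below uses scikit-learn. The helper name `ten_fold_accuracy`, the stratified shuffled splits, the stand-in classifier, and the synthetic data are assumptions; the abstract does not specify how the folds were formed, and the numbers this code produces are unrelated to those reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score


def ten_fold_accuracy(estimator, X, y, seed=0):
    """Mean accuracy over 10 stratified folds (hypothetical helper)."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    # Each fold is held out once while the estimator is trained on the other nine.
    scores = cross_val_score(estimator, X, y, cv=cv, scoring="accuracy")
    return scores.mean(), scores


# Synthetic stand-in data, not the benchmark datasets from the paper.
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=0)
mean_acc, fold_scores = ten_fold_accuracy(LogisticRegression(max_iter=1000), X_demo, y_demo)
```

Any feature selection step should sit inside the estimator passed to this function (for example in a pipeline) so that it is refit on each training fold and does not see the held-out samples.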

Conclusion:

iORI-LAVT outperforms existing predictors. The iORI-LAVT predictor is believed to be a promising alternative for further research on identifying ORIs.

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Curr Genomics Publication year: 2022 Document type: Article