Towards a better detection of horizontally transferred genes by combining unusual properties effectively.
Xiong, Dapeng; Xiao, Fen; Liu, Li; Hu, Kai; Tan, Yanping; He, Shunmin; Gao, Xieping.
Affiliation
  • Xiong D; Key Laboratory of Intelligent Computing & Information Processing of Ministry of Education, Xiangtan University, Xiangtan, Hunan, People's Republic of China.
PLoS One ; 7(8): e43126, 2012.
Article in En | MEDLINE | ID: mdl-22905214
BACKGROUND: Horizontal gene transfer (HGT) is one of the major mechanisms contributing to microbial genome diversification. A number of computational methods for finding horizontally transferred genes have been proposed in the past decades; however, none of them has yet provided a reliable detector. In existing parametric approaches, either only a single compositional property participates in the detection process, or the results obtained from each single property are simply combined. Different properties capture different information, so a single property cannot sufficiently represent the information encoded in gene sequences. In addition, the class imbalance problem in the datasets, which also causes large errors in gene detection, has not been considered by the published methods. Here we developed an effective classifier system (Hgtident) that uses a support vector machine (SVM) and combines unusual properties effectively for HGT detection.

RESULTS: Our approach, Hgtident, includes the introduction of more representative datasets, optimization of the SVM model, feature selection, handling of the class imbalance problem in the datasets, and extensive performance evaluation via systematic cross-validation methods. Through feature selection, we found that JS-DN and JS-CB have higher discriminating power for HGT detection, while GC1-GC3 and k-mer (k = 1, 2, …, 7) make the least contribution. Extensive experiments indicated that the new classifier could reduce Mean error dramatically and also improve Recall to some extent. On the testing genomes, compared with the popular existing multiple-threshold approach, our Recall improved by 2.81% and our Mean error fell by 26.32% on average, which means that numerous false positives were identified correctly.

CONCLUSIONS: Hgtident, introduced here, is an effective approach for better detection of HGT. Combining multiple features of HGT is also essential for detecting a wider range of HGT events.
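
The abstract does not define its features, so the following is only an illustrative sketch of what a compositional feature such as JS-DN might look like, under the assumption that it denotes the Jensen-Shannon divergence between the dinucleotide frequencies of a gene and those of its host genome. The function names `kmer_freqs` and `js_divergence` and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def kmer_freqs(seq, k=2):
    """Relative frequencies of overlapping k-mers, skipping any with non-ACGT symbols."""
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency dictionaries."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2
    m = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in keys}

    def kl(a):
        # Kullback-Leibler divergence of a from m; zero-probability terms contribute 0
        return sum(a[x] * log2(a[x] / m[x]) for x in a if a[x] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


if __name__ == "__main__":
    # Toy sequences, purely illustrative (not real genomic data)
    genome = "ATGCGCGATATCGCGCTAGCGCGATCGCGCATGCGC"
    gene = "ATATATTTAATATAAATTTATAT"
    score = js_divergence(kmer_freqs(gene), kmer_freqs(genome))
    print(f"JS divergence (dinucleotides): {score:.3f}")
```

With base-2 logarithms the score lies between 0 (identical composition) and 1 (no shared dinucleotides), so a gene with atypical composition for its genome scores higher. In a setting like the one described, such a score would be one input among several to the classifier rather than a detector on its own.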
Full text: 1 Database: MEDLINE Main subject: Gene Transfer, Horizontal Language: En Publication year: 2012 Document type: Article
