A novel machine learning approach (svmSomatic) to distinguish somatic and germline mutations using next-generation sequencing data.
Mao, Yu-Fang; Yuan, Xi-Guo; Cun, Yu-Peng.
Affiliation
  • Mao YF; School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China.
  • Yuan XG; School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China. E-mail: xiguoyuan@mail.xidian.edu.cn.
  • Cun YP; iFlora Bioinformatics Center, Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201, China. E-mail: cunyupeng@mail.kib.ac.cn.
Zool Res; 42(2): 246-249, 2021 Mar 18.
Article in English | MEDLINE | ID: mdl-33709636
Somatic mutations are a large category of genetic variation and play an essential role in tumorigenesis. Detecting somatic single nucleotide variants (SNVs) can facilitate downstream analysis of tumorigenesis. Many computational methods have been developed to detect SNVs, but most require matched normal samples to differentiate somatic SNVs from the normal state, and such samples can be difficult to obtain. Therefore, developing new approaches for detecting somatic SNVs without matched samples is crucial. In this work, we detected somatic mutations from individual tumor samples using a novel machine learning approach, svmSomatic, applied to next-generation sequencing (NGS) data. Because somatic SNV detection can be affected by other mutations, and because germline mutations and co-occurring copy number variations (CNVs) are common in organisms, we used the approach to distinguish somatic from germline mutations based on NGS data from individual tumor samples. In summary, svmSomatic: (1) considers the influence of CNV co-occurrence in detecting somatic mutations; and (2) trains a support vector machine algorithm to distinguish between somatic and germline mutations, without requiring matched normal samples. We further tested svmSomatic and compared it with other common methods. The results showed that svmSomatic's performance, as measured by F1-score, was significantly better than that of the other methods on both simulated and real NGS data.
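For readers unfamiliar with the evaluation metric named in the abstract, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch of how it could be computed for a somatic-versus-germline classification. The function name `f1_score`, the label strings, and the toy call sets are hypothetical illustrations and are not taken from the svmSomatic paper or its software.

```python
def f1_score(truth, predicted, positive="somatic"):
    """F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive calls, so F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up labels for six variant sites (illustration only).
truth = ["somatic", "somatic", "germline", "germline", "somatic", "germline"]
predicted = ["somatic", "germline", "germline", "somatic", "somatic", "germline"]

score = f1_score(truth, predicted)
print(f"F1 = {score:.3f}")
```

In this toy case two somatic sites are called correctly, one is missed, and one germline site is miscalled as somatic, so precision and recall are both 2/3.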

Full text: 1 Database: MEDLINE Main subject: Machine Learning / Mutation / Neoplasms Limits: Animals / Humans Language: En Year of publication: 2021 Document type: Article
