A Bayesian framework to identify methylcytosines from high-throughput bisulfite sequencing data.
Xie, Qing; Liu, Qi; Mao, Fengbiao; Cai, Wanshi; Wu, Honghu; You, Mingcong; Wang, Zhen; Chen, Bingyu; Sun, Zhong Sheng; Wu, Jinyu.
Affiliation
  • Xie Q; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China.
  • Liu Q; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China.
  • Mao F; Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China.
  • Cai W; Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China.
  • Wu H; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China.
  • You M; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China.
  • Wang Z; Institute of Molecular Medicine, Department of blood transfusion, Zhejiang Provincial People's Hospital, Hangzhou, Zhejiang, China.
  • Chen B; Institute of Molecular Medicine, Department of blood transfusion, Zhejiang Provincial People's Hospital, Hangzhou, Zhejiang, China.
  • Sun ZS; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China; Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China.
  • Wu J; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China.
PLoS Comput Biol; 10(9): e1003853, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25255082
High-throughput bisulfite sequencing technologies have provided a comprehensive and well-fitted way to investigate DNA methylation at single-base resolution. However, there are substantial bioinformatic challenges to distinguish precisely methylcytosines from unconverted cytosines based on bisulfite sequencing data. The challenges arise, at least in part, from cell heterozygosis caused by multicellular sequencing and the still limited number of statistical methods that are available for methylcytosine calling based on bisulfite sequencing data. Here, we present an algorithm, termed Bycom, a new Bayesian model that can perform methylcytosine calling with high accuracy. Bycom considers cell heterozygosis along with sequencing errors and bisulfite conversion efficiency to improve calling accuracy. Bycom performance was compared with the performance of Lister, the method most widely used to identify methylcytosines from bisulfite sequencing data. The results showed that the performance of Bycom was better than that of Lister for data with high methylation levels. Bycom also showed higher sensitivity and specificity for low methylation level samples (<1%) than Lister. A validation experiment based on reduced representation bisulfite sequencing data suggested that Bycom had a false positive rate of about 4% while maintaining an accuracy of close to 94%. This study demonstrated that Bycom had a low false calling rate at any methylation level and accurate methylcytosine calling at high methylation levels. Bycom will contribute significantly to studies aimed at recalibrating the methylation level of genomic regions based on the presence of methylcytosines.
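To illustrate the kind of reasoning the abstract describes, here is a minimal sketch of a Bayesian methylcytosine call at a single site. It is not the Bycom model, whose details the abstract does not give. It assumes a binomial read model, a symmetric C/T sequencing error, and a uniform prior over the fraction of methylated cells. The function names, the parameters (`conversion`, `error`, `prior`, `grid`), and their default values are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_read_c(m, conversion, error):
    """Probability that one read shows C at a site methylated in a fraction m of cells.

    Assumptions (illustrative only): methylated cytosines are never converted,
    unmethylated ones are converted to T with probability `conversion`, and a
    C is misread as T (or T as C) with probability `error`.
    """
    q = m + (1 - m) * (1 - conversion)  # molecule still carries C after bisulfite treatment
    return q * (1 - error) + (1 - q) * error


def posterior_methylated(k, n, conversion=0.99, error=0.01, prior=0.5, grid=100):
    """Posterior probability that a site is methylated, given k C-reads out of n.

    The unmethylated hypothesis fixes m = 0. The methylated hypothesis averages
    the likelihood over a uniform grid of m in (0, 1), which is one simple way
    to allow for a mixture of methylated and unmethylated cells.
    """
    like_unmeth = binom_pmf(k, n, prob_read_c(0.0, conversion, error))
    levels = [(i + 0.5) / grid for i in range(grid)]
    like_meth = sum(binom_pmf(k, n, prob_read_c(m, conversion, error)) for m in levels) / grid
    return prior * like_meth / (prior * like_meth + (1 - prior) * like_unmeth)


if __name__ == "__main__":
    # C-read counts out of 20 reads covering the same cytosine
    for k in (0, 1, 5, 15):
        print(k, round(posterior_methylated(k, 20), 4))
```

Under these assumptions, a site with no C-reads receives a low posterior, and the posterior rises quickly as C-reads accumulate beyond what incomplete conversion and sequencing error can explain.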
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Sulfites / Algorithms / Sequence Analysis, DNA / 5-Methylcytosine / High-Throughput Nucleotide Sequencing Type of study: Diagnostic_studies / Prognostic_studies Limits: Humans Language: En Year of publication: 2014 Document type: Article
