BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing.
Vigorito, Elena; Barton, Anne; Pitzalis, Costantino; Lewis, Myles J; Wallace, Chris.
Affiliations
  • Vigorito E; MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, United Kingdom.
  • Barton A; Division of Musculoskeletal and Dermatological Sciences, University of Manchester, Manchester M13 9PL, United Kingdom.
  • Pitzalis C; Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, United Kingdom.
  • Lewis MJ; Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, United Kingdom.
  • Wallace C; MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, United Kingdom.
Bioinformatics; 39(7); 2023-07-01.
Article in English | MEDLINE | ID: mdl-37338536
MOTIVATION: While many pipelines have been developed for calling genotypes from RNA-sequencing (RNA-Seq) data, they all adapt DNA genotype callers that do not model biases specific to RNA-Seq, such as allele-specific expression (ASE).
RESULTS: Here, we present BBmix, a Bayesian beta-binomial mixture model that first learns the expected distribution of read counts for each genotype and then uses those learned parameters to call genotypes probabilistically. We benchmarked our model on a wide variety of datasets and showed that our method generally performed better than competitors, mainly due to an increase of up to 1.4% in the accuracy of heterozygous calls. This gain may substantially reduce the false positive rate in applications that are sensitive to genotyping error, such as ASE. Moreover, BBmix can be easily incorporated into standard pipelines for calling genotypes. We further show that parameters are generally transferable within datasets, such that a single learning run of less than 1 h is sufficient to call genotypes in a large number of samples.
AVAILABILITY AND IMPLEMENTATION: We implemented BBmix as an R package that is available for free under a GPL-2 licence at https://gitlab.com/evigorito/bbmix and https://cran.r-project.org/package=bbmix, with an accompanying pipeline at https://gitlab.com/evigorito/bbmix_pipeline.
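The calling step described in the abstract (scoring observed read counts against one beta-binomial component per genotype) can be illustrated with a minimal Python sketch. This is not the BBmix package or its API: the function names, the component parameters and the mixture weights below are hypothetical values chosen for illustration, and the learning stage that BBmix performs is skipped entirely by fixing the parameters by hand.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_betabinom_pmf(k, n, a, b):
    """Log-probability of k successes in n trials under a beta-binomial(a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


# Hypothetical (alpha, beta) per genotype; a real run would learn these from data.
COMPONENTS = {
    "hom_ref": (1.0, 99.0),   # alternative-allele fraction near 0
    "het": (10.0, 10.0),      # fraction near 0.5, with overdispersion
    "hom_alt": (99.0, 1.0),   # alternative-allele fraction near 1
}

# Hypothetical mixture weights (prior genotype frequencies); they sum to 1.
WEIGHTS = {"hom_ref": 0.6, "het": 0.3, "hom_alt": 0.1}


def genotype_posteriors(alt_reads, total_reads):
    """Posterior probability of each genotype given alternative-allele read counts."""
    log_joint = {
        g: log(WEIGHTS[g]) + log_betabinom_pmf(alt_reads, total_reads, a, b)
        for g, (a, b) in COMPONENTS.items()
    }
    # Normalise in log space (log-sum-exp) to avoid underflow at high depth.
    top = max(log_joint.values())
    log_norm = top + log(sum(exp(v - top) for v in log_joint.values()))
    return {g: exp(v - log_norm) for g, v in log_joint.items()}


def call_genotype(alt_reads, total_reads):
    """Return the genotype with the highest posterior probability."""
    post = genotype_posteriors(alt_reads, total_reads)
    return max(post, key=post.get)
```

For example, `call_genotype(10, 20)` scores a site with 10 alternative-allele reads out of 20 against all three components, and `genotype_posteriors(10, 20)` returns the probabilities behind that call. Using a beta-binomial rather than a plain binomial lets the heterozygous component tolerate allelic ratios that drift away from 0.5, which is the kind of RNA-Seq-specific bias the abstract refers to.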
Full text: 1 Collection: 01-international Database: MEDLINE Main subject: RNA / High-Throughput Nucleotide Sequencing Study type: Prognostic_studies Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year: 2023 Document type: Article Affiliation country: United Kingdom
