Integrated variant allele frequency analysis pipeline and R package: easyVAF.
Mol Carcinog; 62(12): 1877-1887, 2023 Dec.
Article in En | MEDLINE | ID: mdl-37606183
Somatic sequence variants are associated with cancer diagnosis, prognostic stratification, and treatment response. Variant allele frequency (VAF), the percentage of sequence reads carrying a specific DNA variant out of the read depth at that locus, has been used as a metric to quantify mutation rates in these applications. VAF can also support feature detection by reflecting changes in tumor clonal composition across treatments or time points. Although several packages, including the Genome Analysis Toolkit and VarScan, are designed for variant calling and rare mutation identification, there is no readily available package for comparing VAFs among and between groups to identify loci of interest. To this end, we have developed the R package easyVAF, which includes parametric and nonparametric tests to compare VAFs among multiple groups and is accompanied by an interactive R Shiny app. With easyVAF, the investigator can choose among three statistical tests to maximize power while maintaining an acceptable type I error rate. This paper presents our proposed pipeline for VAF analysis, from quality checking to group comparison. We evaluate our method in a wide range of simulated scenarios and show that choosing the appropriate test to limit the type I error rate is critical. For situations where data are sparse, we recommend comparing VAFs with the beta-binomial likelihood ratio test rather than Fisher's exact test or Pearson's χ² test.
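To make the VAF definition and the idea of a group comparison concrete, here is a minimal Python sketch. It is not the easyVAF API (easyVAF is an R package, and none of its functions are reproduced here). The function names and read counts are hypothetical, and the comparison pools reads per group into a 2x2 table for a two-sided Fisher's exact test. That pooling ignores sample-to-sample variability, so the example only illustrates one of the simpler tests named in the abstract, not the recommended beta-binomial likelihood ratio test.

```python
from math import comb


def vaf(variant_reads, depth):
    """Variant allele frequency: variant-supporting reads divided by read depth."""
    if depth <= 0:
        raise ValueError("read depth must be positive")
    return variant_reads / depth


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are (variant reads, non-variant reads).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k variant reads in the first group
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical pooled counts at a single locus: (variant reads, read depth)
control = (12, 200)
treated = (30, 210)

control_vaf = vaf(*control)
treated_vaf = vaf(*treated)
p_value = fisher_exact_two_sided(
    control[0], control[1] - control[0],
    treated[0], treated[1] - treated[0],
)
print(f"control VAF={control_vaf:.3f}, treated VAF={treated_vaf:.3f}, p={p_value:.4g}")
```

In a real analysis the per-sample counts would be kept separate, which is where a model that allows for overdispersion, such as the beta-binomial, becomes relevant.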
Full text: 1
Database: MEDLINE
Main subject: Neoplasms
Language: En
Publication year: 2023
Document type: Article