Normalization benchmark of ATAC-seq datasets shows the importance of accounting for GC-content effects.
Van den Berge, Koen; Chou, Hsin-Jung; Roux de Bézieux, Hector; Street, Kelly; Risso, Davide; Ngai, John; Dudoit, Sandrine.
Affiliation
  • Van den Berge K; Department of Statistics, University of California, Berkeley, Berkeley, CA, USA.
  • Chou HJ; Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium.
  • Roux de Bézieux H; Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium.
  • Street K; Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA.
  • Risso D; Division of Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, CA, USA.
  • Ngai J; Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA.
  • Dudoit S; Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
Cell Rep Methods; 2(11): 100321, 2022 Nov 21.
Article in En | MEDLINE | ID: mdl-36452861
ABSTRACT
The assay for transposase-accessible chromatin using sequencing (ATAC-seq) allows the study of epigenetic regulation of gene expression by assessing chromatin configuration for an entire genome. Despite its popularity, there have been limited studies investigating the analytical challenges related to ATAC-seq data, with most studies leveraging tools developed for bulk transcriptome sequencing. Here, we show that GC-content effects are omnipresent in ATAC-seq datasets. Since the GC-content effects are sample specific, they can bias downstream analyses such as clustering and differential accessibility analysis. We introduce a normalization method based on smooth-quantile normalization within GC-content bins and evaluate it together with 11 different normalization procedures on 8 public ATAC-seq datasets. Accounting for GC-content effects in the normalization is crucial for common downstream ATAC-seq data analyses, improving accuracy and interpretability. Through case studies, we show that exploratory data analysis is essential to guide the choice of an appropriate normalization method for a given dataset.
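The idea of normalizing within GC-content bins can be illustrated with a minimal sketch. It is not the authors' method: it substitutes plain full-quantile normalization for the smooth-quantile normalization named in the abstract, and the function names, the equal-frequency binning, and the default of 4 bins are all assumptions made for illustration. The input is assumed to be a peaks-by-samples count matrix with one GC-content value per peak.

```python
import numpy as np


def quantile_normalize(counts):
    """Full-quantile normalization across samples (columns).

    Each column is mapped onto a shared reference distribution: the mean of
    the sorted columns. Ties are broken by position rather than averaged.
    """
    counts = np.asarray(counts, dtype=float)
    order = np.argsort(counts, axis=0, kind="stable")
    reference = np.take_along_axis(counts, order, axis=0).mean(axis=1)
    normalized = np.empty_like(counts)
    # The k-th smallest value in every column is replaced by reference[k].
    np.put_along_axis(normalized, order, reference[:, None], axis=0)
    return normalized


def gc_bin_quantile_normalize(counts, gc_content, n_bins=4):
    """Hypothetical helper: quantile-normalize separately within GC bins.

    Peaks (rows) are grouped into equal-frequency bins by GC content, and
    samples are quantile-normalized within each bin, so that sample-specific
    GC-content effects are equalized across samples.
    """
    counts = np.asarray(counts, dtype=float)
    gc_content = np.asarray(gc_content, dtype=float)
    edges = np.quantile(gc_content, np.linspace(0.0, 1.0, n_bins + 1))
    # Interior edges only, so bin ids run from 0 to n_bins - 1.
    bin_ids = np.digitize(gc_content, edges[1:-1], right=True)
    normalized = np.empty_like(counts)
    for b in np.unique(bin_ids):
        rows = bin_ids == b
        normalized[rows] = quantile_normalize(counts[rows])
    return normalized
```

After this step, every sample has the same count distribution within each GC-content bin, which is the property that removes sample-specific GC trends in this simplified setting. Forcing identical distributions is a strong assumption, which is one reason a smoothed variant may be preferable when global differences between biological groups are expected.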

Full text: 1 Database: MEDLINE Main subject: Benchmarking / Chromatin Immunoprecipitation Sequencing Language: En Publication year: 2022 Document type: Article