Effective normalization for copy number variation in Hi-C data.
Servant, Nicolas; Varoquaux, Nelle; Heard, Edith; Barillot, Emmanuel; Vert, Jean-Philippe.
Affiliations
  • Servant N; Institut Curie, PSL Research University, Paris, F-75005, France. nicolas.servant@curie.fr.
  • Varoquaux N; INSERM, U900, Paris, F-75005, France. nicolas.servant@curie.fr.
  • Heard E; Mines ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, F-75006, France. nicolas.servant@curie.fr.
  • Barillot E; Department of Statistics, University of California, Berkeley, USA.
  • Vert JP; Berkeley Institute for Data Science, Berkeley, USA.
BMC Bioinformatics ; 19(1): 313, 2018 Sep 06.
Article in English | MEDLINE | ID: mdl-30189838
ABSTRACT

BACKGROUND:

Normalization is essential to ensure accurate analysis and proper interpretation of sequencing data, and chromosome conformation capture data such as Hi-C pose particular challenges. Although several methods have been proposed, the most widely used type of Hi-C normalization casts the estimation of unwanted effects as a matrix balancing problem, relying on the assumption that all genomic regions interact equally with each other.
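The matrix balancing idea mentioned above can be illustrated with a minimal sketch. It assumes a dense, symmetric, non-negative contact matrix with at least one non-empty bin, and rescales bins iteratively until every row sum is the same ("equal visibility"). The function name `balance` and its parameters are hypothetical, chosen for illustration; this is not the authors' implementation, nor the LOIC or CAIC method.

```python
import numpy as np


def balance(counts, n_iter=200, tol=1e-10):
    """Rescale a symmetric contact matrix so that all row sums are equal.

    Illustrative sketch only: returns the balanced matrix and the
    per-bin bias vector b such that counts ~= balanced * outer(b, b).
    """
    m = np.asarray(counts, dtype=float).copy()
    bias = np.ones(m.shape[0])
    for _ in range(n_iter):
        row_sums = m.sum(axis=1)
        # Relative visibility of each bin; empty bins are left untouched.
        s = row_sums / row_sums[row_sums > 0].mean()
        s[s == 0] = 1.0
        m /= np.outer(s, s)
        bias *= s
        if np.abs(s - 1.0).max() < tol:
            break
    return m, bias


if __name__ == "__main__":
    # Toy matrix with equal row sums, distorted by a per-bin bias.
    base = np.ones((4, 4)) + np.eye(4)
    true_bias = np.array([0.5, 1.0, 2.0, 4.0])
    observed = base * np.outer(true_bias, true_bias)
    balanced, est_bias = balance(observed)
    print(balanced.sum(axis=1))
```

Because every bin is forced to the same total, any signal that changes a region's overall contact count, including a genuine copy-number change, is treated as a bias to be equalized under this assumption.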

RESULTS:

To explore the effect of copy-number variations on Hi-C data normalization, we first propose a simulation model that predicts the effects of large copy-number changes on a diploid Hi-C contact map. We then show that the standard approaches relying on equal visibility fail to correct for unwanted effects in the presence of copy-number variations. We thus propose a simple extension to matrix balancing methods that models these effects. Our approach can either retain the copy-number variation effects (LOIC) or remove them (CAIC). We show that this leads to better downstream analysis of the three-dimensional organization of rearranged genomes.

CONCLUSIONS:

Taken together, our results highlight the importance of using dedicated methods for the analysis of Hi-C cancer data. Both CAIC and LOIC methods perform well on simulated and real Hi-C data sets, each fulfilling different needs.

Full text: 1
Databases: MEDLINE
Main subject: Genome, Human / Chromosome Aberrations / Chromosome Mapping / Computational Biology / Genomics / DNA Copy Number Variations / Neoplasms
Type of study: Prognostic_studies
Limits: Humans
Language: En
Journal: BMC Bioinformatics
Journal subject: Medical Informatics
Year of publication: 2018
Document type: Article
Country of affiliation: France