Enhancing Hi-C contact matrices for loop detection with Capricorn: a multiview diffusion model.
Fang, Tangqi; Liu, Yifeng; Woicik, Addie; Lu, Minsi; Jha, Anupama; Wang, Xiao; Li, Gang; Hristov, Borislav; Liu, Zixuan; Xu, Hanwen; Noble, William S; Wang, Sheng.
Affiliation
  • Fang T; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, United States.
  • Liu Y; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, United States.
  • Woicik A; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, United States.
  • Lu M; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, United States.
  • Jha A; Department of Genome Sciences, University of Washington, Seattle, WA 98195, United States.
  • Wang X; Department of Computer Science, Purdue University, West Lafayette, IN 47907, United States.
  • Li G; Department of Genome Sciences, University of Washington, Seattle, WA 98195, United States.
  • Hristov B; eScience Institute, University of Washington, Seattle, WA 98195, United States.
  • Liu Z; Department of Genome Sciences, University of Washington, Seattle, WA 98195, United States.
  • Xu H; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, United States.
  • Noble WS; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, United States.
  • Wang S; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, United States.
Bioinformatics; 40(Supplement_1): i471-i480, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940142
ABSTRACT
MOTIVATION:

High-resolution Hi-C contact matrices reveal the detailed three-dimensional architecture of the genome, but high-coverage experimental Hi-C data are expensive to generate. At the same time, chromatin structure analyses struggle with extremely sparse contact matrices. Computational methods have been developed to enhance low-coverage contact matrices, but most are adapted from resolution enhancement methods for natural images. As a result, they often employ models that do not distinguish biologically meaningful contacts, such as loops, from other stochastic contacts.

RESULTS:

We present Capricorn, a machine learning model for Hi-C resolution enhancement that incorporates small-scale chromatin features as additional views of the input Hi-C contact matrix and leverages a diffusion probability model backbone to generate a high-coverage matrix. We show that Capricorn outperforms the state of the art in a cross-cell-line setting, improving on existing methods by 17% in mean squared error and 26% in F1 score for chromatin loop identification from the generated high-coverage data. We also demonstrate that Capricorn performs well in the cross-chromosome setting and in the cross-chromosome, cross-cell-line setting, improving the downstream loop F1 score by 14% relative to existing methods. We further show that our multiview idea can also be used to improve several existing methods, HiCARN and HiCNN, indicating the wide applicability of this approach. Finally, we use DNA sequence to validate discovered loops and find that the fraction of CTCF-supported loops from Capricorn is similar to the fraction among loops identified from the high-coverage data. Capricorn is a powerful Hi-C resolution enhancement method that enables scientists to find chromatin features that cannot be identified in the low-coverage contact matrix.

AVAILABILITY AND IMPLEMENTATION:

Implementation of Capricorn and source code for reproducing all figures in this paper are available at https://github.com/CHNFTQ/Capricorn.
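As an illustration of the two evaluation measures named in the results (mean squared error against the high-coverage matrix, and F1 score for loop identification), the following is a minimal sketch and not the authors' evaluation code. The function names, the representation of loops as pairs of bin indices, the `tolerance` parameter, and the greedy one-to-one matching rule are all assumptions made for this example; the abstract does not specify how predicted loops are matched to reference loops.

```python
def mean_squared_error(enhanced, target):
    """MSE between two equally shaped contact matrices given as lists of rows."""
    total = 0.0
    count = 0
    for row_e, row_t in zip(enhanced, target):
        for e, t in zip(row_e, row_t):
            total += (e - t) ** 2
            count += 1
    return total / count


def loop_f1(predicted, reference, tolerance=0):
    """F1 score between predicted and reference loops.

    Loops are (bin_i, bin_j) pairs. Assumption for this sketch: a predicted
    loop counts as a true positive when both anchors lie within `tolerance`
    bins of an unmatched reference loop; matching is greedy and one-to-one.
    """
    ref = set(reference)
    unmatched = set(ref)
    true_pos = 0
    for i, j in predicted:
        # Pick the first still-unmatched reference loop within tolerance.
        hit = next(
            (r for r in sorted(unmatched)
             if abs(r[0] - i) <= tolerance and abs(r[1] - j) <= tolerance),
            None,
        )
        if hit is not None:
            unmatched.remove(hit)
            true_pos += 1
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(ref)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a lower `mean_squared_error` and a higher `loop_f1` both indicate that an enhanced matrix is closer to the high-coverage reference. The loops themselves would come from a separate loop caller run on each matrix.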
Subject(s)

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Chromatin / Machine Learning | Limits: Humans | Language: En | Journal: Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2024 | Document type: Article | Country of affiliation: United States
...