Adjustment of spurious correlations in co-expression measurements from RNA-Sequencing data.
Hsieh, Ping-Han; Lopes-Ramos, Camila Miranda; Zucknick, Manuela; Sandve, Geir Kjetil; Glass, Kimberly; Kuijjer, Marieke Lydia.
Affiliation
  • Hsieh PH; Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo 0318, Norway.
  • Lopes-Ramos CM; Department of Informatics, University of Oslo, Oslo 0316, Norway.
  • Zucknick M; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States.
  • Sandve GK; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA.
  • Glass K; Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, United States.
  • Kuijjer ML; Oslo Centre for Biostatistics and Epidemiology, Institute of Basic Medical Sciences, University of Oslo, Oslo 0317, Norway.
Bioinformatics; 39(10), 2023-10-03.
Article in English | MEDLINE | ID: mdl-37802917
MOTIVATION: Gene co-expression measurements are widely used in computational biology to identify coordinated expression patterns across a group of samples. Coordinated expression of genes may indicate that they are controlled by the same transcriptional regulatory program or are involved in common biological processes. Gene co-expression is generally estimated from RNA-Sequencing data, which are commonly normalized to remove technical variability. Here, we demonstrate that certain normalization methods, in particular quantile-based methods, can introduce false-positive associations between genes. These false-positive associations can consequently hamper downstream co-expression network analysis. Quantile-based normalization can, however, be extremely powerful. In particular, when preprocessing large-scale heterogeneous data, quantile-based normalization methods such as smooth quantile normalization can be applied to remove technical variability while maintaining global differences in expression for samples with different biological attributes.
RESULTS: We developed SNAIL (Smooth-quantile Normalization Adaptation for the Inference of co-expression Links), a normalization method based on smooth quantile normalization and specifically designed for modeling co-expression measurements. We show that SNAIL avoids the formation of false-positive associations in co-expression as well as in downstream network analyses. Using SNAIL, one can avoid arbitrary gene filtering and retain associations with genes that are expressed only in small subgroups of samples. This highlights the method's potential future impact on network modeling and other association-based approaches in large-scale heterogeneous data.
AVAILABILITY AND IMPLEMENTATION: The implementation of the SNAIL algorithm and the code to reproduce the analyses described in this work can be found in the GitHub repository https://github.com/kuijjerlab/PySNAIL.

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: RNA / Gene Expression Profiling | Language: English | Journal: Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2023 | Document type: Article | Country of affiliation: Norway