Cosbin: cosine score-based iterative normalization of biologically diverse samples.
Wu, Chiung-Ting; Shen, Minjie; Du, Dongping; Cheng, Zuolin; Parker, Sarah J; Lu, Yingzhou; Van Eyk, Jennifer E; Yu, Guoqiang; Clarke, Robert; Herrington, David M; Wang, Yue.
Affiliation
  • Wu CT; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
  • Shen M; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
  • Du D; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
  • Cheng Z; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
  • Parker SJ; Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA.
  • Lu Y; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
  • Van Eyk JE; Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA.
  • Yu G; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
  • Clarke R; The Hormel Institute, University of Minnesota, Austin, MN 55912, USA.
  • Herrington DM; Department of Internal Medicine, Wake Forest University, Winston-Salem, NC 27157, USA.
  • Wang Y; Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
Bioinform Adv ; 2(1): vbac076, 2022.
Article in English | MEDLINE | ID: mdl-36330358
ABSTRACT
Motivation:

Data normalization is essential to ensure accurate inference and comparability of gene expression measures across samples or conditions. Ideally, gene expression data should be rescaled based on consistently expressed reference genes. However, across biologically diverse samples, the most commonly used reference genes exhibit striking expression variability, and size-factor or distribution-based normalization methods can be problematic when the amount of asymmetry in differential expression is significant.

Results:

We report an efficient and accurate data-driven method, Cosine score-based iterative normalization (Cosbin), to normalize biologically diverse samples. Based on the Cosine scores of cross-condition expression patterns, the Cosbin pipeline iteratively eliminates asymmetric differentially expressed genes, identifies consistently expressed genes, and calculates sample-wise normalization factors. We demonstrate the superior performance and enhanced utility of Cosbin compared with six representative peer methods using both simulation and real multi-omics expression datasets. Implemented in open-source R scripts and specifically designed to address normalization bias due to significant asymmetry in differential expression across multiple conditions, the Cosbin tool complements rather than replaces the existing methods and will allow biologists to more accurately detect true molecular signals among diverse phenotypic groups.

Availability and implementation:

The R scripts of the Cosbin pipeline are freely available at https://github.com/MinjieSh/Cosbin.

Supplementary information:

Supplementary data are available at Bioinformatics Advances online.
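The cosine-scoring idea described in the Results can be illustrated with a minimal Python sketch. This is not the authors' R implementation, and it omits the iterative elimination step. It assumes, for illustration only, that a gene's cross-condition pattern is its vector of per-condition mean expression, that "consistently expressed" means a high cosine with the all-ones vector, and that a normalization factor is a sample's total over the selected genes relative to the across-sample average. The function names and the 0.99 threshold are hypothetical.

```python
import math


def cosine_to_flat(pattern):
    """Cosine between a cross-condition pattern and the all-ones vector.

    A value near 1 means the gene is expressed at nearly the same level
    in every condition (assumed definition, for illustration).
    """
    norm = math.sqrt(sum(x * x for x in pattern))
    if norm == 0:
        return 0.0
    return sum(pattern) / (norm * math.sqrt(len(pattern)))


def select_consistent(group_means, threshold=0.99):
    """Return genes whose pattern is nearly flat across conditions.

    group_means maps gene name -> list of per-condition mean expression.
    The threshold is an arbitrary illustrative choice.
    """
    return [g for g, p in group_means.items() if cosine_to_flat(p) >= threshold]


def scale_factors(samples, reference_genes):
    """Per-sample factor: total over reference genes / across-sample average.

    samples is a list of dicts mapping gene name -> expression value.
    Dividing each sample by its factor equalizes the reference-gene totals.
    """
    totals = [sum(s[g] for g in reference_genes) for s in samples]
    avg = sum(totals) / len(totals)
    return [t / avg for t in totals]


# Hypothetical toy data: three conditions, three genes.
group_means = {
    "geneA": [10.0, 10.0, 10.0],  # flat across conditions
    "geneB": [1.0, 1.0, 30.0],    # strongly up in one condition
    "geneC": [5.0, 5.2, 4.9],     # nearly flat
}
reference = select_consistent(group_means)

# Two hypothetical samples; the second was measured at twice the depth.
samples = [
    {"geneA": 10.0, "geneB": 1.0, "geneC": 5.0},
    {"geneA": 20.0, "geneB": 2.0, "geneC": 10.0},
]
factors = scale_factors(samples, reference)
```

In this toy example, geneB is excluded from the reference set because its pattern is dominated by one condition, so it cannot bias the sample-wise factors.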

Full text: 1 Collection: 01-internacional Database: MEDLINE Study type: Prognostic_studies Language: En Journal: Bioinform Adv Year: 2022 Document type: Article Affiliation country: United States of America