Addressing noise in co-expression network construction.
Burns, Joshua J R; Shealy, Benjamin T; Greer, Mitchell S; Hadish, John A; McGowan, Matthew T; Biggs, Tyler; Smith, Melissa C; Feltus, F Alex; Ficklin, Stephen P.
Affiliations
  • Burns JJR; Department of Horticulture, 149 Johnson Hall. Washington State University, Pullman, WA 99164. USA.
  • Shealy BT; Department of Electrical & Computer Engineering, 105 Riggs Hall. Clemson University, Clemson, SC 29631. USA.
  • Greer MS; School of Electrical Engineering and Computer Science, EME 102. Washington State University, Pullman, WA 99164. USA.
  • Hadish JA; Molecular Plant Sciences Program, French Ad 324g. Washington State University, Pullman, WA 99164. USA.
  • McGowan MT; Molecular Plant Sciences Program, French Ad 324g. Washington State University, Pullman, WA 99164. USA.
  • Biggs T; Department of Horticulture, 149 Johnson Hall. Washington State University, Pullman, WA 99164. USA.
  • Smith MC; Department of Electrical & Computer Engineering, 105 Riggs Hall. Clemson University, Clemson, SC 29631. USA.
  • Feltus FA; Department of Genetics and Biochemistry, 130 McGinty Court. Clemson University, Clemson, SC 29634. USA.
  • Ficklin SP; Biomedical Data Science & Informatics Program, 100 McAdams Hall. Clemson University, Clemson, SC 29634. USA.
Brief Bioinform; 23(1), 2022-01-17.
Article in English | MEDLINE | ID: mdl-34850822
ABSTRACT
Gene co-expression networks (GCNs) provide multiple benefits to molecular research, including hypothesis generation and biomarker discovery. Transcriptome profiles serve as input for GCN construction and are derived from increasingly large studies with samples spanning multiple experimental conditions, treatments, time points, genotypes, etc. Such experiments, with their larger numbers of variables, confound discovery of true network edges, exclude edges and inhibit discovery of context-specific (or condition-specific) network edges. To demonstrate this problem, a 475-sample dataset is used to show that up to 97% of GCN edges can be misleading because correlations are false or incorrect. False and incorrect correlations can occur when tests are applied without ensuring that their assumptions are met, and pairwise gene expression may not meet test assumptions if the expression of at least one gene in the pairwise comparison is a function of multiple confounding variables. The 'one-size-fits-all' approach to GCN construction is therefore problematic for large, multivariable datasets. Recently, the Knowledge Independent Network Construction toolkit has been used in multiple studies to provide a dynamic approach to GCN construction that ensures statistical tests meet assumptions and confounding variables are addressed. Additionally, it can associate experimental context with each edge of the network, resulting in context-specific GCNs (csGCNs). To help researchers recognize such challenges in GCN construction and in the creation of csGCNs, we provide a review of the workflow.
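
The confounding problem described in the abstract can be illustrated with a minimal sketch. It is not the Knowledge Independent Network Construction toolkit's algorithm; the gene names, condition labels and expression values are hypothetical and chosen only to show how pooling samples from different conditions can produce a strong correlation between two genes that are uncorrelated within each condition.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for two genes under two conditions.
# Within each condition the genes are uncorrelated by construction;
# the treatment simply shifts both genes upward.
samples = {
    "control": {"gene_a": [1, 2, 3, 4], "gene_b": [2, 1, 1, 2]},
    "treated": {"gene_a": [11, 12, 13, 14], "gene_b": [12, 11, 11, 12]},
}

# Correlation computed separately inside each condition.
per_condition = {
    condition: pearson(values["gene_a"], values["gene_b"])
    for condition, values in samples.items()
}

# Correlation computed over all samples pooled together,
# ignoring the condition each sample came from.
pooled_a = [x for values in samples.values() for x in values["gene_a"]]
pooled_b = [y for values in samples.values() for y in values["gene_b"]]
pooled_r = pearson(pooled_a, pooled_b)

print("per-condition r:", per_condition)
print("pooled r:", round(pooled_r, 3))
```

In this toy data the pooled coefficient is about 0.97 while both per-condition coefficients are zero, so an edge drawn from the pooled test would reflect the condition shift rather than a relationship between the two genes.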

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Gene Regulatory Networks / Transcriptome Language: En Journal: Brief Bioinform Journal subject: BIOLOGY / MEDICAL INFORMATICS Year: 2022 Document type: Article
