Búsqueda | Biblioteca Virtual en Salud

SGCP: a spectral self-learning method for clustering genes in co-expression networks.

Aghaieabiane, Niloofar; Koutis, Ioannis.

BMC Bioinformatics ; 25(1): 230, 2024 Jul 02.

Artículo en Inglés | MEDLINE | ID: mdl-38956463

RESUMEN

BACKGROUND: A widely used approach for extracting information from gene expression data employs the construction of a gene co-expression network and the subsequent computational detection of gene clusters, called modules. WGCNA and related methods are the de facto standard for module detection. The purpose of this work is to investigate the applicability of more sophisticated algorithms toward the design of an alternative method with enhanced potential for extracting biologically meaningful modules. RESULTS: We present self-learning gene clustering pipeline (SGCP), a spectral method for detecting modules in gene co-expression networks. SGCP incorporates multiple features that differentiate it from previous work, including a novel step that leverages gene ontology (GO) information in a self-leaning step. Compared with widely used existing frameworks on 12 real gene expression datasets, we show that SGCP yields modules with higher GO enrichment. Moreover, SGCP assigns highest statistical importance to GO terms that are mostly different from those reported by the baselines. CONCLUSION: Existing frameworks for discovering clusters of genes in gene co-expression networks are based on relatively simple algorithmic components. SGCP relies on newer algorithmic techniques that enable the computation of highly enriched modules with distinctive characteristics, thus contributing a novel alternative tool for gene co-expression analysis.

Asunto(s)

Algoritmos , Redes Reguladoras de Genes , Análisis por Conglomerados , Redes Reguladoras de Genes/genética , Perfilación de la Expresión Génica/métodos , Biología Computacional/métodos , Humanos , Ontología de Genes , Familia de Multigenes , Bases de Datos Genéticas

A Novel Calibration Step in Gene Co-Expression Network Construction.

Aghaieabiane, Niloofar; Koutis, Ioannis.

Front Bioinform ; 1: 704817, 2021.

Artículo en Inglés | MEDLINE | ID: mdl-36303738

RESUMEN

High-throughput technologies such as DNA microarrays and RNA-sequencing are used to measure the expression levels of large numbers of genes simultaneously. To support the extraction of biological knowledge, individual gene expression levels are transformed to Gene Co-expression Networks (GCNs). In a GCN, nodes correspond to genes, and the weight of the connection between two nodes is a measure of similarity in the expression behavior of the two genes. In general, GCN construction and analysis includes three steps; 1) calculating a similarity value for each pair of genes 2) using these similarity values to construct a fully connected weighted network 3) finding clusters of genes in the network, commonly called modules. The specific implementation of these three steps can significantly impact the final output and the downstream biological analysis. GCN construction is a well-studied topic. Existing algorithms rely on relatively simple statistical and mathematical tools to implement these steps. Currently, software package WGCNA appears to be the most widely accepted standard. We hypothesize that the raw features provided by sequencing data can be leveraged to extract modules of higher quality. A novel preprocessing step of the gene expression data set is introduced that in effect calibrates the expression levels of individual genes, before computing pairwise similarities. Further, the similarity is computed as an inner-product of positive vectors. In experiments, this provides a significant improvement over WGCNA, as measured by aggregate p-values of the gene ontology term enrichment of the computed modules.

RESUMEN

Asunto(s)

RESUMEN

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

Detalles de la búsqueda