DeClUt: Decluttering differentially expressed genes through clustering of their expression profiles.
Zanfardino, Mario; Franzese, Monica; Geraci, Filippo.
Affiliation
  • Zanfardino M; IRCCS Synlab SDN, Via E. Gianturco, 113, Naples, 80143, Italy.
  • Franzese M; IRCCS Synlab SDN, Via E. Gianturco, 113, Naples, 80143, Italy. Electronic address: monica.franzese@synlab.it.
  • Geraci F; Institute for Informatics and Telematics, CNR, Via G. Moruzzi 1, Pisa, 56124, Italy.
Comput Methods Programs Biomed ; 254: 108258, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38851122
ABSTRACT
BACKGROUND AND OBJECTIVE:

Differential expression analysis is one of the most common tasks in transcriptomic studies based on next-generation sequencing technologies. Differentially expressed genes (DEGs) between two conditions represent ideal candidate prognostic and diagnostic biomarkers for many pathologies. As a result, several algorithms, such as DESeq2 and edgeR, have been developed to identify DEGs. Despite their widespread use, there is no consensus on which model performs best for different types of data, and many existing methods suffer from high False Discovery Rates (FDR).

METHODS:

We present a new algorithm, DeClUt, based on the intuition that the expression profile of a differentially expressed gene should form two reasonably compact and well-separated clusters. This, in turn, implies that the bipartition induced by the two conditions being compared should overlap with the clustering. The clustering algorithm underlying DeClUt was designed to be robust to the outliers typical of RNA-seq data. In particular, we used the average silhouette function to enforce membership assignment of samples to the most appropriate condition.
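
The intuition above can be illustrated with a minimal sketch. This is not the DeClUt implementation: the abstract does not specify the clustering procedure, the outlier handling, or any threshold, so the function name `average_silhouette`, the use of absolute differences as the distance, and the toy expression values are all assumptions made for illustration. The sketch only computes the standard average silhouette width of the partition induced by the two condition labels for a single gene.

```python
from statistics import mean


def average_silhouette(values, labels):
    """Average silhouette width of a two-group partition of 1-D values.

    values: expression values of ONE gene, one per sample (hypothetical input).
    labels: condition label of each sample (exactly two distinct labels).
    Distance between samples is the absolute difference (an assumption).
    """
    if len(set(labels)) != 2:
        raise ValueError("expected exactly two conditions")
    samples = list(zip(values, labels))
    scores = []
    for i, (x, g) in enumerate(samples):
        # Distances to the other samples of the same condition.
        same = [abs(x - y) for j, (y, h) in enumerate(samples) if h == g and j != i]
        # Distances to the samples of the other condition.
        other = [abs(x - y) for y, h in samples if h != g]
        if not same:
            scores.append(0.0)  # usual convention for a singleton group
            continue
        a, b = mean(same), mean(other)
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return mean(scores)


# Toy data (invented for illustration): three samples per condition.
conditions = ["A", "A", "A", "B", "B", "B"]
separated_gene = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]  # conditions match the clusters
mixed_gene = [1.0, 5.0, 1.1, 5.1, 0.9, 4.9]      # conditions cut across clusters

print(average_silhouette(separated_gene, conditions))
print(average_silhouette(mixed_gene, conditions))
```

A gene whose condition labels coincide with two compact, well-separated groups scores close to 1, while a gene whose values are mixed across conditions scores near or below 0. How DeClUt turns such a score into a call, and how it reassigns samples, is described in the full paper rather than in this abstract.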

RESULTS:

DeClUt was tested on real RNA-seq datasets and benchmarked against four of the most widely used methods (edgeR, DESeq2, NOISeq, and SAMseq). Experiments showed higher self-consistency of results than the competitors, as well as a significantly lower False Positive Rate (FPR). Moreover, when tested on a real prostate cancer RNA-seq dataset, DeClUt highlighted 8 DE genes, linked to neoplastic processes according to the DisGeNET database, that none of the other methods had identified.

CONCLUSIONS:

Our work presents a novel algorithm that builds upon basic concepts of data clustering and exhibits greater consistency and a significantly lower False Positive Rate than state-of-the-art methods. Additionally, DeClUt is able to highlight relevant differentially expressed genes not identified by other tools, helping to improve the efficacy of differential expression analyses in various biological applications.

Full text: 1 Database: MEDLINE Main subject: Algorithms / Gene Expression Profiling Limits: Humans / Male Language: En Year of publication: 2024 Document type: Article
