Doppelgänger spotting in biomedical gene expression data.
Wang, Li Rong; Choy, Xin Yun; Goh, Wilson Wen Bin.
Affiliation
  • Wang LR; School of Computer Science and Engineering, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore.
  • Choy XY; School of Computer Science and Engineering, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore.
  • Goh WWB; School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore.
iScience ; 25(8): 104788, 2022 Aug 19.
Article in English | MEDLINE | ID: mdl-35992056
ABSTRACT
Doppelgänger effects (DEs) occur when samples exhibit chance similarities such that, when split across training and validation sets, they inflate the performance of the trained machine learning (ML) model. This inflationary effect creates misleading confidence in the deployability of the model. Thus far, there are no tools for doppelgänger identification or standard practices to manage their confounding implications. We present doppelgangerIdentifier, a software suite for doppelgänger identification and verification. Applying doppelgangerIdentifier across a multitude of diseases and data types, we show the pervasive nature of DEs in biomedical gene expression data. We also provide guidelines toward proper doppelgänger identification by exploring how lingering batch effects from batch imbalances affect the sensitivity of our doppelgänger identification algorithm. We suggest doppelgänger verification as a useful procedure to establish baselines for model evaluation that may indicate whether feature selection and ML on the data set are likely to yield meaningful insights.
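The notion of chance similarity between samples in different splits can be illustrated with a minimal sketch. It is not the doppelgangerIdentifier interface, and the abstract does not specify the similarity measure; the sketch assumes Pearson correlation between expression profiles and a fixed, hypothetical cutoff of 0.95. The function names and the samples-by-genes matrix layout are likewise assumptions made for illustration.

```python
import numpy as np


def cross_split_correlations(train, valid):
    """Pearson correlation between every training and validation sample.

    train: array of shape (n_train, n_genes)
    valid: array of shape (n_valid, n_genes)
    Returns an array of shape (n_train, n_valid).
    """
    def zscore(x):
        # Standardise each sample (row) across its genes.
        x = np.asarray(x, dtype=float)
        return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)

    zt, zv = zscore(train), zscore(valid)
    # The mean product of z-scores equals the Pearson correlation.
    return zt @ zv.T / zt.shape[1]


def flag_candidate_pairs(train, valid, threshold=0.95):
    """List (train_index, valid_index, correlation) for pairs above threshold.

    The threshold is a hypothetical fixed cutoff chosen for illustration.
    """
    corr = cross_split_correlations(train, valid)
    rows, cols = np.where(corr > threshold)
    return [(int(i), int(j), float(corr[i, j])) for i, j in zip(rows, cols)]
```

Pairs flagged this way are only candidates: a high cross-split correlation suggests that a validation sample may be predicted well simply because a near-copy was seen in training, which is the inflation the abstract describes. Whether such pairs actually inflate performance would still need to be verified on the model in question.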
Full text: 1 Collection: 01-international Database: MEDLINE Study type: Guideline Language: En Journal: IScience Year: 2022 Document type: Article Country of affiliation: Singapore