Variable selection for nonlinear dimensionality reduction of biological datasets through bootstrapping of correlation networks.
Aragones, David G; Palomino-Segura, Miguel; Sicilia, Jon; Crainiciuc, Georgiana; Ballesteros, Iván; Sánchez-Cabo, Fátima; Hidalgo, Andrés; Calvo, Gabriel F.
Affiliation
  • Aragones DG; Department of Mathematics & MOLAB-Mathematical Oncology Laboratory, Universidad de Castilla-La Mancha, Ciudad Real, Spain.
  • Palomino-Segura M; Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain; Immunophysiology Research Group, Instituto Universitario de Investigación Biosanitaria de Extremadura (INUBE), Badajoz, Spain; Department of Physiology, Faculty of Sciences, Univers
  • Sicilia J; Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain.
  • Crainiciuc G; Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain.
  • Ballesteros I; Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain.
  • Sánchez-Cabo F; Bioinformatics Unit, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain.
  • Hidalgo A; Vascular Biology and Therapeutics Program and Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA.
  • Calvo GF; Department of Mathematics & MOLAB-Mathematical Oncology Laboratory, Universidad de Castilla-La Mancha, Ciudad Real, Spain. Electronic address: gabriel.fernandez@uclm.es.
Comput Biol Med; 168: 107827, 2024 Jan.
Article in En | MEDLINE | ID: mdl-38086138
ABSTRACT
Identifying the most relevant variables or features in massive datasets for dimensionality reduction can lead to improved and more informative display, faster computation times, and more explainable models of complex systems. Despite significant advances and available algorithms, this task generally remains challenging, especially in unsupervised settings. In this work, we propose a method that constructs correlation networks using all intervening variables and then selects the most informative ones based on network bootstrapping. The method can be applied in both supervised and unsupervised scenarios. We demonstrate its functionality by applying Uniform Manifold Approximation and Projection for dimensionality reduction to several high-dimensional biological datasets, derived from 4D live imaging recordings of hundreds of morpho-kinetic variables, describing the dynamics of thousands of individual leukocytes at sites of prominent inflammation. We compare our method with other standard ones in the field, such as Principal Component Analysis and Elastic Net, showing that it outperforms them. The proposed method can be employed in a wide range of applications, encompassing data analysis and machine learning.
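The abstract does not spell out how the networks are built or how variables are scored, so the following is only a minimal sketch of the general idea (a thresholded correlation network, resampled by bootstrap, with variables ranked by how connected they stay). The function names, the absolute-correlation threshold, and the "mean degree" score are assumptions made for illustration and should not be read as the authors' algorithm.

```python
import numpy as np


def bootstrap_degree_scores(X, threshold=0.6, n_boot=200, seed=0):
    """Mean degree of each variable in bootstrapped correlation networks.

    Assumption (not from the paper): an edge links two variables when the
    absolute Pearson correlation reaches `threshold`. Columns of X are
    variables, rows are observations; constant columns are not handled.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    degree_sum = np.zeros(n_vars)
    for _ in range(n_boot):
        # Resample observations with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        corr = np.corrcoef(X[idx], rowvar=False)
        adj = np.abs(corr) >= threshold
        np.fill_diagonal(adj, False)  # ignore self-correlation
        degree_sum += adj.sum(axis=0)
    return degree_sum / n_boot


def select_variables(X, k, **kwargs):
    """Return indices of the k highest-scoring variables and all scores."""
    scores = bootstrap_degree_scores(X, **kwargs)
    order = np.argsort(scores)[::-1]
    return order[:k], scores


if __name__ == "__main__":
    # Synthetic data: four variables share a latent factor, four are noise.
    rng = np.random.default_rng(1)
    z = rng.normal(size=(300, 1))
    X = np.hstack([z + 0.3 * rng.normal(size=(300, 4)),
                   rng.normal(size=(300, 4))])
    selected, scores = select_variables(X, k=4, n_boot=50)
    print(sorted(selected.tolist()), np.round(scores, 2))
```

The selected columns could then be passed to a dimensionality reduction step such as Uniform Manifold Approximation and Projection; that step is omitted here to keep the sketch dependency-free.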

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Machine Learning | Language: En | Journal: Comput Biol Med | Year of publication: 2024 | Document type: Article | Country of affiliation: Spain