Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data.
Amin, Md Ruhul; Hasan, Mahmudul; Arnab, Sandipan Paul; DeGiorgio, Michael.
Affiliation
  • Amin MR; Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA.
  • Hasan M; Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA.
  • Arnab SP; Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA.
  • DeGiorgio M; Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA.
Mol Biol Evol; 40(10), 2023 Oct 04.
Article in English | MEDLINE | ID: mdl-37772983
ABSTRACT
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
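
The general idea in the abstract (decompose a stack of haplotype images, then classify on the extracted factors) can be illustrated with a minimal sketch. This is not the T-REx implementation: the abstract does not say which decomposition or classifier is used, so the sketch assumes a rank-2 CP decomposition fitted by alternating least squares and a least-squares linear classifier as stand-ins. The data are synthetic, and the names `khatri_rao`, `cp_als`, `simulate`, and `run_demo` are hypothetical.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product of P (m x R) and Q (n x R), shape (m*n, R)."""
    return np.einsum("mr,nr->mnr", P, Q).reshape(-1, P.shape[1])

def cp_als(X, rank, n_iter=100, seed=0):
    """CP decomposition of a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
    # Unfold the tensor along each mode once, up front.
    X0 = X.reshape(I, J * K)
    X1 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X2 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        # Each update is the least-squares solution for one factor
        # with the other two held fixed.
        A = X0 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X1 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X2 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def simulate(n=60, H=32, W=32, seed=1):
    """Toy binary 'haplotype images' (assumption: not a population-genetic model).

    Label 0 ("neutral") has uniform minor-allele probability 0.5; label 1
    ("sweep") has reduced diversity near the central columns, strongest
    in the top rows.
    """
    rng = np.random.default_rng(seed)
    y = np.arange(n) % 2
    r = np.linspace(1.0, 0.0, H)                                # row profile
    c = np.exp(-0.5 * ((np.arange(W) - W / 2) / 3.0) ** 2)      # column bump
    p_neutral = np.full((H, W), 0.5)
    p_sweep = 0.5 - 0.45 * np.outer(r, c)
    P = np.where(y[:, None, None] == 1, p_sweep, p_neutral)
    X = (rng.random((n, H, W)) < P).astype(float)
    return X, y

def run_demo():
    X, y = simulate()
    # Simplification: training and test samples are decomposed jointly.
    A, B, C = cp_als(X, rank=2)
    X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
    # Rows of the sample-mode factor A serve as per-sample features.
    F = np.column_stack([A, np.ones(len(A))])
    train, test = slice(0, 40), slice(40, None)
    w, *_ = np.linalg.lstsq(F[train], 2.0 * y[train] - 1.0, rcond=None)
    pred = (F[test] @ w > 0).astype(int)
    acc = float(np.mean(pred == y[test]))
    return A, rel_err, acc

if __name__ == "__main__":
    _, rel_err, acc = run_demo()
    print(f"relative reconstruction error: {rel_err:.3f}, test accuracy: {acc:.2f}")
```

In this sketch the row and column factors (`B`, `C`) indicate where in the image each component is concentrated, which is one way a decomposition-based pipeline could support the kind of feature-importance visualization the abstract mentions.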

Full text: 1 | Database: MEDLINE | Main subject: Artificial Intelligence / Genomics | Language: English | Year of publication: 2023 | Document type: Article
