High-dimension to high-dimension screening for detecting genome-wide epigenetic and noncoding RNA regulators of gene expression.
Bioinformatics; 38(17): 4078-4087, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35856716
ABSTRACT
MOTIVATION: Advances in high-throughput technology have characterized a wide variety of epigenetic modifications and noncoding RNAs across the genome that contribute to disease pathogenesis by regulating gene expression. The high dimensionality of both epigenetic/noncoding RNA and gene expression data makes it challenging to identify the important regulators of genes. Conducting a univariate test for each possible regulator-gene pair incurs a serious multiple-comparison burden, and direct application of regularization methods to select regulator-gene pairs is computationally infeasible. Applying fast screening to reduce the dimension before regularization is more efficient and stable than applying regularization methods alone.
RESULTS: We propose a novel screening method based on robust partial correlation to detect epigenetic and noncoding RNA regulators of gene expression over the whole genome, a problem that involves both high-dimensional predictors and high-dimensional responses. Compared to existing screening methods, our method is conceptually innovative in that it reduces the dimension of both predictor and response, and screens at both the node level (regulators or genes) and the edge level (regulator-gene pairs). We develop data-driven procedures to determine the conditional sets and the optimal screening threshold, and implement a fast iterative algorithm. Simulations and applications to long noncoding RNA and microRNA regulation in kidney cancer and DNA methylation regulation in glioblastoma multiforme illustrate the validity and advantage of our method.
AVAILABILITY AND IMPLEMENTATION: The R package, related source code and real datasets used in this article are provided at https://github.com/kehongjie/rPCor.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
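As a rough illustration of edge-level screening by partial correlation, the sketch below keeps a regulator-gene pair only when the absolute partial correlation, given a conditioning variable, exceeds a threshold. It is not the rPCor algorithm. It assumes ordinary Pearson correlation in place of the robust estimator, a single pre-specified conditioning variable in place of the data-driven conditional sets, and a fixed threshold in place of the data-driven optimum. All names (`pearson`, `partial_corr`, `screen_edges`, `simulate`, `reg_a`, `reg_b`, `gene_1`) are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given one variable z.

    Assumption: a single conditioning variable; perfectly collinear
    inputs (|r| = 1 with z) are not handled.
    """
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def screen_edges(regulators, genes, cond, threshold):
    """Return (regulator, gene, partial correlation) for pairs that pass."""
    kept = []
    for r_name, r_vals in regulators.items():
        for g_name, g_vals in genes.items():
            pc = partial_corr(r_vals, g_vals, cond)
            if abs(pc) > threshold:
                kept.append((r_name, g_name, pc))
    return kept


def simulate(n=500, seed=0):
    """Toy data: z drives both regulators; only reg_a drives gene_1."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    reg_a = [zi + rng.gauss(0, 1) for zi in z]
    reg_b = [zi + rng.gauss(0, 1) for zi in z]
    gene_1 = [a + rng.gauss(0, 0.5) for a in reg_a]
    return {"reg_a": reg_a, "reg_b": reg_b}, {"gene_1": gene_1}, z


if __name__ == "__main__":
    regulators, genes, z = simulate()
    for r_name, g_name, pc in screen_edges(regulators, genes, z, threshold=0.3):
        print(r_name, g_name, round(pc, 3))
```

In this toy setup `reg_b` is marginally correlated with `gene_1` only through the shared variable `z`, so conditioning on `z` is what lets the screen drop that pair while keeping the direct `reg_a`-`gene_1` edge.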
Full text: 1
Database: MEDLINE
Main subject: Genome / Long Noncoding RNA
Study type: Diagnostic_studies / Screening_studies
Language: En
Publication year: 2022
Document type: Article