Your browser doesn't support javascript.
loading
: 20 | 50 | 100
1 - 3 de 3
1.
Nat Methods ; 21(5): 835-845, 2024 May.
Article En | MEDLINE | ID: mdl-38374265

Modern multiomic technologies can generate deep multiscale profiles. However, differences in data modalities, multicollinearity of the data, and large numbers of irrelevant features make analyses and integration of high-dimensional omic datasets challenging. Here we present Significant Latent Factor Interaction Discovery and Exploration (SLIDE), a first-in-class interpretable machine learning technique for identifying significant interacting latent factors underlying outcomes of interest from high-dimensional omic datasets. SLIDE makes no assumptions regarding data-generating mechanisms, comes with theoretical guarantees regarding identifiability of the latent factors/corresponding inference, and has rigorous false discovery rate control. Using SLIDE on single-cell and spatial omic datasets, we uncovered significant interacting latent factors underlying a range of molecular, cellular and organismal phenotypes. SLIDE outperforms/performs at least as well as a wide range of state-of-the-art approaches, including other latent factor approaches. More importantly, it provides biological inference beyond prediction that other methods do not afford. Thus, SLIDE is a versatile engine for biological discovery from modern multiomic datasets.


Machine Learning , Humans , Computational Biology/methods , Animals , Single-Cell Analysis/methods , Algorithms
2.
Patterns (N Y) ; 3(5): 100473, 2022 May 13.
Article En | MEDLINE | ID: mdl-35607614

High-dimensional cellular and molecular profiling of biological samples highlights the need for analytical approaches that can integrate multi-omic datasets to generate prioritized causal inferences. Current methods are limited by high dimensionality of the combined datasets, the differences in their data distributions, and their integration to infer causal relationships. Here, we present Essential Regression (ER), a novel latent-factor-regression-based interpretable machine-learning approach that addresses these problems by identifying latent factors and their likely cause-effect relationships with system-wide outcomes/properties of interest. ER can integrate many multi-omic datasets without structural or distributional assumptions regarding the data. It outperforms a range of state-of-the-art methods in terms of prediction. ER can be coupled with probabilistic graphical modeling, thereby strengthening the causal inferences. The utility of ER is demonstrated using multi-omic system immunology datasets to generate and validate novel cellular and molecular inferences in a wide range of contexts including immunosenescence and immune dysregulation.

3.
BMC Genomics ; 16: 263, 2015 Apr 03.
Article En | MEDLINE | ID: mdl-25887568

BACKGROUND: With the explosion of genomic data over the last decade, there has been a tremendous amount of effort to understand the molecular basis of cancer using informatics approaches. However, this has proven to be extremely difficult primarily because of the varied etiology and vast genetic heterogeneity of different cancers and even within the same cancer. One particularly challenging problem is to predict prognostic outcome of the disease for different patients. RESULTS: Here, we present ENCAPP, an elastic-net-based approach that combines the reference human protein interactome network with gene expression data to accurately predict prognosis for different human cancers. Our method identifies functional modules that are differentially expressed between patients with good and bad prognosis and uses these to fit a regression model that can be used to predict prognosis for breast, colon, rectal, and ovarian cancers. Using this model, ENCAPP can also identify prognostic biomarkers with a high degree of confidence, which can be used to generate downstream mechanistic and therapeutic insights. CONCLUSION: ENCAPP is a robust method that can accurately predict prognostic outcome and identify biomarkers for different human cancers.


Biomarkers, Tumor/metabolism , Neoplasms/diagnosis , Neoplasms/metabolism , Software , Computational Biology , Gene Expression , Humans , Neoplasms/genetics , Prognosis , Protein Interaction Maps
...