Results 1 - 4 of 4
1.
bioRxiv ; 2024 Jan 21.
Article in English | MEDLINE | ID: mdl-38293135

ABSTRACT

Dimensionality reduction-based data visualization is pivotal in comprehending complex biological data. The most common methods, such as PHATE, t-SNE, and UMAP, are unsupervised and therefore reflect the dominant structure in the data, which may be independent of expert-provided labels. Here we introduce a supervised data visualization method called RF-PHATE, which integrates expert knowledge for further exploration of the data. RF-PHATE leverages random forests to capture intricate feature-label relationships. Extracting information from the forest, RF-PHATE generates low-dimensional visualizations that highlight relevant data relationships while disregarding extraneous features. This approach scales to large datasets and applies to classification and regression. We illustrate RF-PHATE's prowess through three case studies. In a multiple sclerosis study using longitudinal clinical and imaging data, RF-PHATE unveils a sub-group of patients with non-benign relapsing-remitting multiple sclerosis, demonstrating its aptitude for time-series data. In the context of Raman spectral data, RF-PHATE effectively showcases the impact of antioxidants on diesel exhaust-exposed lung cells, highlighting its proficiency in noisy environments. Furthermore, RF-PHATE aligns established geometric structures with COVID-19 patient outcomes, enriching interpretability in a hierarchical manner. RF-PHATE bridges expert insights and visualizations, promising knowledge generation. Its adaptability, scalability, and noise tolerance underscore its potential for widespread adoption.
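
Illustrative sketch (not taken from the paper): the abstract does not spell out how forest information becomes a visualization. Assuming a forest-derived proximity matrix is used as the affinity kernel of a PHATE-style diffusion, the core steps could look roughly like the following. The toy matrix `K`, the fixed diffusion time `t=3`, and all function names are assumptions for illustration; the final multidimensional-scaling step that produces 2-D coordinates is omitted.

```python
import math
from typing import List

Matrix = List[List[float]]

def row_normalize(kernel: Matrix) -> Matrix:
    """Turn an affinity matrix into a row-stochastic (Markov) matrix."""
    return [[v / sum(row) for v in row] for row in kernel]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def diffuse(p: Matrix, t: int) -> Matrix:
    """t-step diffusion operator P^t (t >= 1)."""
    out = p
    for _ in range(t - 1):
        out = matmul(out, p)
    return out

def potential_distances(p_t: Matrix, eps: float = 1e-7) -> Matrix:
    """Euclidean distances between rows of the log-potential -log(P^t)."""
    u = [[-math.log(v + eps) for v in row] for row in p_t]
    return [[math.dist(a, b) for b in u] for a in u]

# Hypothetical proximities for four samples forming two label-driven groups:
# samples 0 and 1 are close, samples 2 and 3 are close.
K = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
]
P_t = diffuse(row_normalize(K), t=3)
D = potential_distances(P_t)
```

In this toy case the potential distance between samples 0 and 1 is much smaller than between samples 0 and 2, so an embedding of `D` would keep the two groups apart.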

2.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 10947-10959, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37015125

ABSTRACT

Random forests are considered one of the best out-of-the-box classification and regression algorithms due to their high level of predictive performance with relatively little tuning. Pairwise proximities can be computed from a trained random forest and measure the similarity between data points relative to the supervised task. Random forest proximities have been used in many applications including the identification of variable importance, data imputation, outlier detection, and data visualization. However, existing definitions of random forest proximities do not accurately reflect the data geometry learned by the random forest. In this paper, we introduce a novel definition of random forest proximities called Random Forest-Geometry- and Accuracy-Preserving proximities (RF-GAP). We prove that the proximity-weighted sum (regression) or majority vote (classification) using RF-GAP exactly matches the out-of-bag random forest prediction, thus capturing the data geometry learned by the random forest. We empirically show that this improved geometric representation outperforms traditional random forest proximities in tasks such as data imputation and provides outlier detection and visualization results consistent with the learned data geometry.
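
Illustrative sketch (not taken from the paper): the abstract states the key property, that a proximity-weighted sum reproduces the out-of-bag prediction, but not the formula. One proximity with that property, written here as an assumption, gives sample i a weight on each in-bag sample j sharing its leaf in trees where i is out-of-bag, proportional to j's bootstrap multiplicity and normalised by the in-bag leaf size and by the number of such trees. The toy forest below (leaf assignments and bootstrap counts) is hypothetical and hand-made, not produced by a fitted model.

```python
from typing import List

def in_bag_leaf_mates(leaves, counts, t: int, i: int) -> List[int]:
    """In-bag samples of tree t that share a terminal node with sample i."""
    return [j for j in range(len(leaves[t]))
            if counts[t][j] > 0 and leaves[t][j] == leaves[t][i]]

def gap_proximities(leaves, counts) -> List[List[float]]:
    """leaves[t][i]: leaf id of sample i in tree t.
    counts[t][i]: bootstrap multiplicity of sample i in tree t (0 = out-of-bag)."""
    n_trees, n = len(leaves), len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        oob_trees = [t for t in range(n_trees) if counts[t][i] == 0]
        for t in oob_trees:
            mates = in_bag_leaf_mates(leaves, counts, t, i)
            leaf_size = sum(counts[t][j] for j in mates)
            for j in mates:
                prox[i][j] += counts[t][j] / (leaf_size * len(oob_trees))
    return prox

def oob_regression(leaves, counts, y, i: int) -> float:
    """Average, over trees where i is out-of-bag, of the in-bag leaf mean."""
    tree_preds = []
    for t in range(len(leaves)):
        if counts[t][i] > 0:
            continue
        mates = in_bag_leaf_mates(leaves, counts, t, i)
        leaf_size = sum(counts[t][j] for j in mates)
        tree_preds.append(sum(counts[t][j] * y[j] for j in mates) / leaf_size)
    return sum(tree_preds) / len(tree_preds)

# Hypothetical forest: 4 trees, 4 samples.
leaves = [[0, 1, 0, 1], [0, 0, 1, 0], [1, 1, 0, 1], [0, 1, 1, 0]]
counts = [[2, 1, 0, 1], [0, 2, 1, 1], [1, 0, 2, 1], [1, 3, 0, 0]]
y = [1.0, 2.0, 4.0, 8.0]

prox = gap_proximities(leaves, counts)
weighted = [sum(p * yj for p, yj in zip(row, y)) for row in prox]
```

For every sample in this toy forest, `weighted[i]` agrees with `oob_regression(leaves, counts, y, i)`, which is the regression form of the equivalence the abstract describes.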

3.
Biomed Opt Express ; 11(11): 6197-6210, 2020 Nov 01.
Article in English | MEDLINE | ID: mdl-33282484

ABSTRACT

We developed a hyperspectral imaging tool based on surface-enhanced Raman spectroscopy (SERS) probes to determine the expression level and visualize the distribution of PD-L1 in individual cells. Electron-microscopic analysis of PD-L1 antibody-gold nanorod conjugates demonstrated binding to the cell surface and internalization into endosomal vesicles. Stimulation of cells with IFN-γ or metformin was used to confirm the ability of SERS probes to report treatment-induced changes. The multivariate curve resolution-alternating least squares (MCR-ALS) analysis of spectra provided a greater signal-to-noise ratio than single peak mapping. However, single peak mapping allowed a systematic subtraction of background and the removal of non-specific binding and endocytic SERS signals. The mean or maximum peak height in the cell or the mean peak height in the area of specific PD-L1-positive pixels was used to estimate the PD-L1 expression levels in single cells. The PD-L1 levels were significantly up-regulated by IFN-γ and inhibited by metformin in human lung cancer cells from the A549 cell line. In conclusion, the method of analyzing hyperspectral SERS imaging data together with systematic and comprehensive removal of non-specific signals allows SERS imaging to be a quantitative tool in the detection of the cancer biomarker, PD-L1.
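
Illustrative sketch (not taken from the paper): the three per-cell estimates named in the abstract (mean peak height, maximum peak height, and mean peak height over PD-L1-positive pixels) can be computed from a list of single-peak heights for the pixels of one cell. The constant `background`, the positivity `threshold`, the clipping of negative values to zero, and the function name are all assumptions made for this example; the abstract does not say how background or positive pixels were defined.

```python
from typing import Dict, List

def pdl1_estimates(peak_heights: List[float],
                   background: float,
                   threshold: float) -> Dict[str, float]:
    """Summarise one cell's pixels after a constant background subtraction.

    peak_heights: single-peak SERS heights, one value per pixel in the cell.
    background:   assumed constant background level to subtract.
    threshold:    assumed cut-off above which a pixel counts as positive.
    """
    corrected = [max(h - background, 0.0) for h in peak_heights]
    positive = [h for h in corrected if h > threshold]
    return {
        "mean": sum(corrected) / len(corrected),
        "max": max(corrected),
        # Falls back to 0.0 when no pixel passes the threshold.
        "mean_positive": sum(positive) / len(positive) if positive else 0.0,
    }

# Hypothetical pixel values for a single cell.
cell = pdl1_estimates([1.0, 3.0, 5.0, 11.0], background=1.0, threshold=3.0)
```

The positive-pixel mean ignores dim pixels, so it is less diluted by cell area than the whole-cell mean.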

4.
Anal Chim Acta ; 1128: 221-230, 2020 Sep 01.
Article in English | MEDLINE | ID: mdl-32825906

ABSTRACT

Diesel exhaust particles (DEPs) are major constituents of air pollution and associated with numerous oxidative stress-induced human diseases. In vitro toxicity studies are useful for developing a better understanding of species-specific in vivo conditions. Conventional in vitro assessments based on oxidative biomarkers are destructive and inefficient. In this study, Raman spectroscopy, as a non-invasive imaging tool, was used to capture the molecular fingerprints of overall cellular component responses (nucleic acid, lipids, proteins, carbohydrates) to DEP damage and antioxidant protection. We apply a novel data visualization algorithm called PHATE, which preserves both global and local structure, to display the progression of cell damage over DEP exposure time. Meanwhile, a mutual information (MI) estimator was used to identify the most informative Raman peaks associated with cytotoxicity. A health index was defined to quantitatively assess the protective effects of two antioxidants (resveratrol and mesobiliverdin IXα) against DEP-induced cytotoxicity. In addition, a number of machine learning classifiers were applied to successfully discriminate different treatment groups with high accuracy. Correlations between Raman spectra and immunomodulatory cytokine and chemokine levels were evaluated. In conclusion, the combination of label-free, non-disruptive Raman micro-spectroscopy and machine learning analysis is demonstrated as a useful tool in quantitative analysis of oxidative stress-induced cytotoxicity and for effectively assessing various antioxidant treatments, suggesting that this framework can serve as a high-throughput platform for screening various potential antioxidants based on their effectiveness at battling the effects of air pollution on human health.


Subject(s)
Antioxidants; Particulate Matter; Antioxidants/pharmacology; Humans; Machine Learning; Oxidative Stress; Spectrum Analysis, Raman; Vehicle Emissions
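
Illustrative sketch (not taken from the paper): entry 4 mentions a mutual information (MI) estimator for finding the most informative Raman peaks but does not say which estimator was used. The example below assumes a simple histogram (plug-in) estimate between a binned peak intensity and the treatment label, then ranks peaks by that score. The peak names, intensities, labels, and bin count are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

def mutual_information(xs: Sequence[float], labels: Sequence[int],
                       n_bins: int = 2) -> float:
    """Plug-in MI estimate (in bits) between equal-width bins of xs and labels."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    if width == 0:
        return 0.0  # a constant feature carries no information
    bins = [min(int((x - lo) / width), n_bins - 1) for x in xs]
    n = len(xs)
    joint = Counter(zip(bins, labels))
    p_bin, p_lab = Counter(bins), Counter(labels)
    return sum((c / n) * math.log2(c * n / (p_bin[b] * p_lab[l]))
               for (b, l), c in joint.items())

def rank_peaks(peaks: Dict[str, List[float]],
               labels: Sequence[int]) -> List[Tuple[str, float]]:
    """Sort peaks from most to least informative about the labels."""
    scores = {name: mutual_information(vals, labels) for name, vals in peaks.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical data: 0 = control cells, 1 = DEP-exposed cells.
labels = [0, 0, 1, 1]
peaks = {
    "peak_A": [0.1, 0.2, 0.9, 1.0],  # separates the two groups
    "peak_B": [0.1, 1.0, 0.1, 1.0],  # unrelated to the groups
}
ranking = rank_peaks(peaks, labels)
```

With so few samples a plug-in estimate is biased, so on real spectra a nearest-neighbour estimator or a permutation baseline would be a more cautious choice.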