Search | Biblioteca Virtual em Saúde

The Unsupervised Feature Selection Algorithms Based on Standard Deviation and Cosine Similarity for Genomic Data Analysis.

Xie, Juanying; Wang, Mingzhao; Xu, Shengquan; Huang, Zhao; Grant, Philip W.

Front Genet ; 12: 684100, 2021.

Article in English | MEDLINE | ID: mdl-34054930

ABSTRACT

To tackle the challenges that genomic data pose for analysis, namely tens of thousands of dimensions combined with a small number of examples and class imbalance, this paper proposes an unsupervised feature selection technique based on standard deviation and cosine similarity. We refer to this idea as SCFS (Standard deviation and Cosine similarity based Feature Selection). It defines the discernibility and the independence of a feature to quantify, respectively, its capability to distinguish between classes and its redundancy with respect to other features. A 2-dimensional space is constructed with discernibility as the x-axis and independence as the y-axis to represent all features, so that features in the upper right corner have both comparatively high discernibility and independence. The importance of a feature is defined as the product of its discernibility and its independence (i.e., the area of the rectangle enclosed by the feature's coordinate lines and the axes). The upper right corner features are the most important and comprise the optimal feature subset. Three feature selection algorithms are derived from SCFS, based on different definitions of independence using cosine similarity: SCEFS (Standard deviation and Exponent Cosine similarity based Feature Selection), SCRFS (Standard deviation and Reciprocal Cosine similarity based Feature Selection) and SCAFS (Standard deviation and Anti-Cosine similarity based Feature Selection). KNN and SVM classifiers are built on the optimal feature subsets detected by each of these algorithms. Experimental results on 18 genomic cancer datasets demonstrate that the proposed unsupervised feature selection algorithms SCEFS, SCRFS and SCAFS can detect stable biomarkers with strong classification capability. This shows that the idea proposed in this paper is powerful.
Functional analysis of these biomarkers shows that the occurrence of cancer is closely related to the regulation level of the biomarker genes. This finding will benefit cancer pathology research, drug development, early diagnosis, treatment and prevention.
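
The scoring idea described in the abstract above (importance as the product of discernibility and independence) can be illustrated with a rough sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration and not the authors' definition: discernibility is taken as the plain per-feature standard deviation, and independence as the smallest angle (arccos of the absolute cosine similarity) between a feature and any feature with higher discernibility, with the top-ranked feature assigned its largest angle to any other feature. The function name `scfs_sketch` and the parameter `k` are hypothetical.

```python
import numpy as np

def scfs_sketch(X, k):
    """Rank features by discernibility * independence (illustrative only).

    X is an (n_samples, n_features) array; k is the number of features to keep.
    All definitions below are assumptions, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)

    # Assumed discernibility: standard deviation of each feature column.
    dis = X.std(axis=0)

    # Absolute cosine similarity between feature columns, turned into angles.
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0              # avoid division by zero for all-zero columns
    U = X / norms
    cos = np.clip(np.abs(U.T @ U), 0.0, 1.0)
    angle = np.arccos(cos)               # 0 = redundant, pi/2 = orthogonal

    # Assumed independence: smallest angle to any feature with higher discernibility.
    n_features = X.shape[1]
    ind = np.empty(n_features)
    order = np.argsort(-dis, kind="stable")
    for rank, j in enumerate(order):
        if rank == 0:
            # The most discernible feature has no "better" feature to compare with,
            # so it gets its largest angle to any other feature.
            others = np.delete(angle[j], j)
            ind[j] = others.max() if others.size else 0.0
        else:
            ind[j] = angle[j, order[:rank]].min()

    # Importance: area of the rectangle spanned by (discernibility, independence).
    score = dis * ind
    selected = np.argsort(-score, kind="stable")[:k]
    return selected, dis, ind, score
```

In this sketch a feature that is a scaled copy of a more discernible feature receives an independence close to zero and therefore a low score, which matches the abstract's intent of penalising redundancy. The exponent, reciprocal and anti-cosine variants named in the abstract would differ only in how the cosine similarity is mapped to independence.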

Visualizing natural image statistics.

Fang, Hui; Tam, Gary Kwok-Leung; Borgo, Rita; Aubrey, Andrew J; Grant, Philip W; Rosin, Paul L; Wallraven, Christian; Cunningham, Douglas; Marshall, David; Chen, Min.

IEEE Trans Vis Comput Graph ; 19(7): 1228-41, 2013 Jul.

Article in English | MEDLINE | ID: mdl-23661013

ABSTRACT

Natural image statistics is an important area of research in cognitive sciences and computer vision. Visualization of statistical results can help identify clusters and anomalies as well as analyze deviation, distribution, and correlation. Furthermore, it can provide visual abstractions and symbolism for categorized data. In this paper, we begin our study of the visualization of image statistics by considering visual representations of power spectra, which are commonly used to visualize different categories of images. We show that they convey a limited amount of statistical information about image categories and that their support for analytical tasks is ineffective. We then introduce several new visual representations, which convey different or more information about image statistics. We apply ANOVA to the image statistics to help select statistically more meaningful measurements in our design process. A task-based user evaluation was carried out to compare the new visual representations with the conventional power spectra plots. Based on the results of the evaluation, we further improved the visualizations by introducing composite visual representations of image statistics.

Subjects

Algorithms, Visual Perception, Adult, Analysis of Variance, Female, Humans, Male, Photic Stimulation, User-Computer Interface, Young Adult
