Search | BVS Bolivia

Chao, Guoqing; Sun, Shiliang; Bi, Jinbo.

IEEE Trans Artif Intell ; 2(2): 146-168, 2021 Apr.

Article in English | MEDLINE | ID: mdl-35308425

ABSTRACT

Clustering is a machine learning paradigm that divides sample subjects into groups such that subjects in the same group are more similar to each other than to those in other groups. With advances in information acquisition technologies, samples can frequently be viewed from different angles or in different modalities, generating multi-view data. Multi-view clustering (MVC), which clusters subjects into subgroups using multi-view data, has attracted increasing attention. Although MVC methods have developed rapidly, few surveys summarize and analyze the current progress. Therefore, we propose a novel taxonomy of MVC approaches. As with other machine learning methods, we categorize them into generative and discriminative classes. Within the discriminative class, based on the way views are integrated, we split the methods further into five groups: Common Eigenvector Matrix, Common Coefficient Matrix, Common Indicator Matrix, Direct Combination, and Combination After Projection. Furthermore, we relate MVC to other topics: multi-view representation, ensemble clustering, multi-task clustering, and multi-view supervised and semi-supervised learning. Several representative real-world applications are elaborated for practitioners. Some benchmark multi-view datasets are introduced, and representative MVC algorithms from each group are empirically evaluated to analyze how they perform on these datasets. To promote future development of MVC approaches, we point out several open problems that may require further investigation and thorough examination.

Multi-View Cluster Analysis with Incomplete Data to Understand Treatment Effects.

Chao, Guoqing; Sun, Jiangwen; Lu, Jin; Wang, An-Li; Langleben, Daniel D; Li, Chiang-Shan; Bi, Jinbo.

Inf Sci (N Y) ; 494: 278-293, 2019 Aug.

Article in English | MEDLINE | ID: mdl-32863420

ABSTRACT

Multi-view cluster analysis, a popular granular computing method, aims to partition sample subjects into clusters that are consistent across the different views in which the subjects are characterized. Frequently, data entries are missing from some of the views. The latest multi-view co-clustering methods cannot effectively deal with incomplete data, especially when there are mixed patterns of missing values. We propose an enhanced formulation for a family of multi-view co-clustering methods that copes with the missing data problem by introducing an indicator matrix whose elements indicate which data entries are observed, and by assessing cluster validity only on observed entries. In comparison with the simple strategy of removing subjects with missing values, our approach can use all available data in cluster analysis. In comparison with common methods that impute missing data in order to use regular multi-view analytics, our approach is less sensitive to imputation uncertainty. In comparison with other state-of-the-art multi-view incomplete clustering methods, our approach remains applicable both when individual values are missing from a view and when an entire view is missing, the most common scenario in practice. We first validated the proposed strategy in simulations, and then applied it to a treatment study of heroin dependence, which would have been impossible with previous methods due to the number of missing-data patterns. Patients in the treatment study were naturally assessed in different feature spaces, such as in the pre-, during-, and post-treatment time windows. Our algorithm was able to identify subgroups in which patients showed similarities in all three time windows, thus leading to the recognition of pre-treatment (baseline) features predictive of post-treatment outcomes.
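The idea of scoring a clustering only on observed entries can be illustrated with a minimal sketch. This is not the co-clustering formulation from the paper: it assumes a single view, a plain k-means-style squared-error cost, and missing values encoded as NaN. The function names `masked_centroids` and `masked_cost` are hypothetical.

```python
import numpy as np

def masked_centroids(X, M, labels, k):
    """Per-cluster feature means computed from observed entries only.

    X: (n, d) data, NaN where missing. M: (n, d) indicator, 1 = observed.
    """
    d = X.shape[1]
    X0 = np.where(M == 1, X, 0.0)  # zero out missing entries so NaN never propagates
    centroids = np.zeros((k, d))
    for c in range(k):
        rows = labels == c
        observed = M[rows].sum(axis=0)   # observed count per feature in cluster c
        totals = X0[rows].sum(axis=0)    # sum over observed entries only
        # Features with no observed entry in this cluster keep centroid 0.
        centroids[c] = np.divide(totals, observed,
                                 out=np.zeros(d), where=observed > 0)
    return centroids

def masked_cost(X, M, labels, centroids):
    """Sum of squared deviations, counted only where M == 1."""
    X0 = np.where(M == 1, X, 0.0)
    residual = (X0 - centroids[labels]) * M  # missing entries contribute zero
    return float((residual ** 2).sum())

# Toy data: two clusters, two missing entries.
X = np.array([[1.0, 2.0], [3.0, np.nan], [10.0, 20.0], [np.nan, 22.0]])
M = (~np.isnan(X)).astype(float)  # indicator matrix of observed entries
labels = np.array([0, 0, 1, 1])
C = masked_centroids(X, M, labels, k=2)
print(C, masked_cost(X, M, labels, C))
```

Because the mask multiplies the residual, no subject is discarded and no value is imputed. A row with a missing entry still contributes through the entries that were observed.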

Supervised Nonnegative Matrix Factorization to Predict ICU Mortality Risk.

Chao, Guoqing; Mao, Chengsheng; Wang, Fei; Zhao, Yuan; Luo, Yuan.

Proceedings (IEEE Int Conf Bioinformatics Biomed) ; 2018: 1189-1194, 2018 Dec.

Article in English | MEDLINE | ID: mdl-31360595

ABSTRACT

ICU mortality risk prediction is a difficult yet important task. On the one hand, because the collected temporal data are complex, it is difficult to identify effective features and interpret them easily; on the other hand, good prediction can help clinicians take timely action to prevent mortality. These correspond to the interpretability and accuracy problems, respectively. Most existing methods lack interpretability, but Subgraph Augmented Nonnegative Matrix Factorization (SANMF) has recently been applied successfully to time series data and provides a way to interpret the features well. Therefore, we adopted this approach as the backbone for analyzing the patient data. One limitation of the original SANMF method is its poor prediction ability, which stems from its unsupervised nature. To address this problem, we proposed a supervised SANMF algorithm that integrates the logistic regression loss function into the NMF framework, and solved it with an alternating optimization procedure. We used simulated data to verify the effectiveness of this method, then applied it to ICU mortality risk prediction and demonstrated its superiority over other conventional supervised NMF methods.
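The combination of an NMF reconstruction term with a logistic regression loss, solved by alternating updates, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' supervised SANMF: it omits the subgraph augmentation, uses projected gradient steps rather than whatever update rules the paper derives, and the objective ||X - WH||² + lam · logloss(y, Wβ + b), the function names, and the hyperparameters are all hypothetical choices.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def objective(X, y, W, H, beta, b, lam):
    """Reconstruction error plus lam times the logistic loss on W @ beta + b."""
    recon = np.sum((X - W @ H) ** 2)
    z = W @ beta + b
    logloss = np.sum(np.logaddexp(0.0, z) - y * z)
    return recon + lam * logloss

def supervised_nmf(X, y, rank, lam=1.0, lr=1e-3, iters=500, seed=0):
    """Alternate projected-gradient steps on W, H and gradient steps on (beta, b)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((n, rank))   # nonnegative sample factors (used as features)
    H = rng.random((rank, d))   # nonnegative basis
    beta, b = np.zeros(rank), 0.0
    for _ in range(iters):
        # Step 1: update W with H and the classifier fixed.
        err = sigmoid(W @ beta + b) - y
        grad_W = 2.0 * (W @ H - X) @ H.T + lam * np.outer(err, beta)
        W = np.maximum(W - lr * grad_W, 0.0)   # project onto W >= 0
        # Step 2: update H with W fixed.
        grad_H = 2.0 * W.T @ (W @ H - X)
        H = np.maximum(H - lr * grad_H, 0.0)   # project onto H >= 0
        # Step 3: update the logistic regression parameters with W fixed.
        err = sigmoid(W @ beta + b) - y
        beta = beta - lr * lam * (W.T @ err)
        b = b - lr * lam * err.sum()
    return W, H, beta, b

# Toy usage on random nonnegative data with binary labels.
rng = np.random.default_rng(1)
X = rng.random((20, 6))
y = (rng.random(20) > 0.5).astype(float)
W, H, beta, b = supervised_nmf(X, y, rank=2)
print(objective(X, y, W, H, beta, b, lam=1.0))
```

In this sketch the label information reaches the factorization only through the `lam * np.outer(err, beta)` term in the gradient for `W`. Setting `lam=0` removes that term and leaves ordinary unsupervised NMF.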

Alternative Multiview Maximum Entropy Discrimination.

Chao, Guoqing; Sun, Shiliang.

IEEE Trans Neural Netw Learn Syst ; 27(7): 1445-1456, 2016 Jul.

Article in English | MEDLINE | ID: mdl-26111403

ABSTRACT

Maximum entropy discrimination (MED) is a general framework for discriminative estimation based on the maximum entropy and maximum margin principles, and can produce hard-margin support vector machines under some assumptions. Recently, a multiview version of MED, multiview MED (MVMED), was proposed. In this paper, we explore a more natural MVMED framework by assuming two separate distributions: p1(Θ1) over the first-view classifier parameter Θ1 and p2(Θ2) over the second-view classifier parameter Θ2. We name the new framework alternative MVMED (AMVMED); it enforces the posteriors of the two view margins to be equal. The proposed AMVMED is more flexible than the existing MVMED: whereas MVMED optimizes a single relative entropy, AMVMED assigns one relative entropy term to each of the two views, thus incorporating a tradeoff between the views. We give the detailed solving procedure, which is divided into two steps. The first step solves the optimization problem without considering the equal margin posteriors from the two views; the second step then imposes the equal posteriors. Experimental results on multiple real-world data sets verify the effectiveness of AMVMED, and comparisons with MVMED are also reported.
