Search | VHL Regional Portal

1.

[A late-type star spectra outlier data mining system].

Cai, Jiang-Hui; Yang, Hai-Feng; Zhao, Xu-Jun; Zhang, Ji-Fu.

Guang Pu Xue Yu Guang Pu Fen Xi ; 34(5): 1421-4, 2014 May.

Article in Chinese | MEDLINE | ID: mdl-25095451

ABSTRACT

Within the M-star population, some special objects, which may be magnetically active, may be giant stars, or may have other rare properties, are very important for follow-up observation and for research on galactic structure and evolution. This paper presents an outlier data mining system for late-type star spectra that targets local deviations of M-type spectral characteristic lines within subspaces. First, for a sample of M-star spectral line indices, the distribution characteristics in attribute space are measured using the sparse factor and the sparsity coefficient, and the sample is then discretized and reduced in dimension to a spectral subspace. Second, local outlier subspaces are extracted with a particle swarm optimization (PSO) algorithm and identified. In addition, the effects of the sparsity coefficient and the sparse factor on the number of outliers are examined in experiments on an SDSS M-star spectral line index set, and the outliers are compared with the spectral types provided by SDSS. These experiments validate the feasibility and value of the system.
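The abstract does not define its sparse factor or sparsity coefficient. As a rough illustration only, the sketch below uses the sparsity coefficient from Aggarwal and Yu's subspace outlier detection, where each attribute is split into `phi` equi-depth intervals and a k-dimensional cube is scored by how far its point count falls below the count expected under independence. Whether the paper uses exactly this definition is an assumption, and the names `equi_depth_bins`, `sparsity_coefficient`, and `phi` are hypothetical.

```python
import math


def equi_depth_bins(values, phi):
    """Assign each value a bin index in 0..phi-1 so that bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * phi // len(values)
    return bins


def sparsity_coefficient(count, n_points, phi, k):
    """Score a k-dimensional cube that holds `count` of `n_points` points.

    Assumed definition (Aggarwal and Yu): with f = 1/phi, the expected count is
    n_points * f**k, and the score is the deviation from that expectation in
    units of its standard deviation. Strongly negative scores mark sparse cubes.
    """
    f_k = (1.0 / phi) ** k
    expected = n_points * f_k
    std = math.sqrt(n_points * f_k * (1.0 - f_k))
    return (count - expected) / std
```

Under this definition, points that fall in cubes with strongly negative scores would be the outlier candidates, and a search procedure such as PSO would explore which attribute combinations yield the sparsest cubes.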

2.

[A post-processing method of classification rule on stellar spectra].

Cai, Jiang-Hui; Yang, Hai-Feng; Zhao, Xu-Jun; Zhang, Ji-Fu.

Guang Pu Xue Yu Guang Pu Fen Xi ; 33(1): 237-40, 2013 Jan.

Article in Chinese | MEDLINE | ID: mdl-23586264

ABSTRACT

Automatic classification and analysis of observational data are of great significance as the LAMOST survey, which will obtain a large number of spectra, is gradually carried out. Extracted classification rules often contain a great deal of redundancy, which seriously reduces classification efficiency and quality. This paper presents a post-processing method for stellar spectra classification rules based on predicate logic: predicates describe the classification rules, and logical reasoning eliminates the redundant ones. Experimental results on LAMOST stellar spectra show that the efficiency of automatic classification improves significantly with no reduction in classification accuracy.
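The abstract gives no detail on the reasoning used to detect redundancy. One simple form of it, shown here as a sketch and not as the paper's method, is subsumption: if a rule with conditions A predicts a class, any rule for the same class whose conditions strictly contain A adds nothing when rules are read as logical implications. The rule representation, the function name `remove_subsumed_rules`, and the condition labels in the usage note are hypothetical.

```python
def remove_subsumed_rules(rules):
    """Drop duplicate rules and rules subsumed by a more general rule.

    Each rule is a pair (antecedent, label), where antecedent is a frozenset
    of condition names. A rule is subsumed when another rule with the same
    label has an antecedent that is a proper subset of its own.
    """
    kept = []
    for antecedent, label in rules:
        subsumed = any(
            other_label == label and other_antecedent < antecedent
            for other_antecedent, other_label in rules
        )
        if not subsumed and (antecedent, label) not in kept:
            kept.append((antecedent, label))
    return kept
```

For example, given the rules `({"Halpha_strong"}, "A")` and `({"Halpha_strong", "CaII_weak"}, "A")`, only the first survives, so fewer rules need to be matched against each spectrum.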

3.

[Automatic classification method of star spectrum data based on classification pattern tree].

Zhao, Xu-Jun; Cai, Jiang-Hui; Zhang, Ji-Fu; Yang, Hai-Feng; Ma, Yang.

Guang Pu Xue Yu Guang Pu Fen Xi ; 33(10): 2875-8, 2013 Oct.

Article in Chinese | MEDLINE | ID: mdl-24409754

ABSTRACT

Frequent patterns, that is, patterns that appear frequently in a data set, play an important role in data mining. For stellar spectrum classification, a classification rule mining method based on a classification pattern tree is presented, building on frequent patterns. The procedure is as follows. First, a new tree structure, the classification pattern tree, is introduced based on the different frequencies of stellar spectral attributes in the database and their different importance for classification. The related concepts and the construction method of the classification pattern tree are also described. Then, the characteristics of the stellar spectra are mapped to the classification pattern tree. Top-down and bottom-up traversals of the tree are used to extract the classification rules. Meanwhile, the concept of pattern capability is introduced to adjust the number of classification rules and improve the construction efficiency of the tree. Finally, SDSS (Sloan Digital Sky Survey) stellar spectral data provided by the National Astronomical Observatory are used to verify the accuracy of the method. The results show that the method achieves higher classification accuracy.

4.

[Rapid identification of variable star spectrum based on information entropy].

Cai, Jiang-hui; Meng, Wen-jun; Sun, Shi-wei; Zhao, Xu-jun; Zhang, Ji-fu.

Guang Pu Xue Yu Guang Pu Fen Xi ; 32(1): 255-8, 2012 Jan.

Article in Chinese | MEDLINE | ID: mdl-22497171

ABSTRACT

Variable stars are very important for studying the origin and evolution of the cosmos. The chief difficulty in studying them is filtering and identification, that is, how to reliably identify variable star spectra within large, high-dimensional stellar spectral data sets. Traditional outlier definitions try to find, in various ways, the difference between outlier data and a general model, and the result is then quantitatively analyzed and filtered. However, the time complexity of this approach is excessive, and its results are hard to interpret and explain. Information entropy is a measure of the uncertainty associated with a random variable. In this paper, information entropy is introduced as the criterion for the common mode of a data set, and a novel method is proposed to rapidly identify variable star spectra based on it. The time complexity is markedly reduced and the influence of human factors is effectively overcome. Preliminary experimental results on Sloan stellar spectral data show that the method is workable for rapid identification of variable star spectra.
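The entropy measure mentioned above is Shannon entropy, H = -sum(p_i * log2(p_i)) over the relative frequencies p_i of the observed values. The sketch below computes it for a list of discretized values and, as an illustrative assumption about how entropy can flag unusual records, also measures how much the entropy drops when one record is removed. The paper's actual scoring rule is not given in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_drop(symbols, index):
    """Entropy of the full list minus entropy after removing one record.

    A large positive drop means the record adds disorder to the data set,
    which is one possible (assumed) indicator that it is an outlier.
    """
    remaining = symbols[:index] + symbols[index + 1:]
    return shannon_entropy(symbols) - shannon_entropy(remaining)
```

With the values `["a", "a", "a", "b"]`, removing the lone `"b"` leaves a perfectly uniform set, so it has the largest drop. Each score needs only value counts, which is consistent with the abstract's emphasis on low time complexity.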

5.

[Research on two-stage fuzzy clustering method for spectrum data based on PSO].

Cai, Jiang-hui; Zhang, Ji-fu; Zhao, Xu-jun.

Guang Pu Xue Yu Guang Pu Fen Xi ; 29(4): 1137-41, 2009 Apr.

Article in Chinese | MEDLINE | ID: mdl-19626920

ABSTRACT

A novel high-dimensional clustering algorithm is proposed. On this basis, a two-stage fuzzy clustering approach named TSPFCM is presented. In the first stage, the data are clustered by the new clustering method. In the second stage, the result of the first stage is taken as the initial cluster centers, and a PSO mechanism is introduced into fuzzy clustering to address the tendency of fuzzy c-means clustering to converge to local optima and its sensitivity to initial conditions. The running results of the system show that applying this method to cluster mining in spectrum data is feasible and valuable.
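For orientation, the sketch below shows standard fuzzy c-means started from externally supplied centers, which is the role the first-stage result plays in the second stage. It is not TSPFCM: the first-stage clustering method and the PSO search are not described in the abstract and are omitted here. The function names and the fuzzifier default `m=2.0` are assumptions.

```python
import math


def memberships(points, centers, m=2.0, eps=1e-9):
    """Fuzzy membership of every point in every cluster; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances so a point sitting exactly on a center does not divide by zero.
        d = [max(math.dist(p, c), eps) for c in centers]
        rows.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return rows


def update_centers(points, u, m=2.0):
    """Recompute each center as the membership-weighted mean of the points."""
    dims = len(points[0])
    centers = []
    for i in range(len(u[0])):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[dim] for w, p in zip(weights, points)) / total
            for dim in range(dims)
        ))
    return centers


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates, starting from the given centers."""
    centers = list(initial_centers)
    for _ in range(iterations):
        centers = update_centers(points, memberships(points, centers, m), m)
    return centers, memberships(points, centers, m)
```

Because the result depends on `initial_centers`, a poor start can leave plain fuzzy c-means in a local optimum, which is the weakness the abstract says the PSO mechanism is meant to address.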

6.

[Research on the interrelation analysis system of celestial spectrum data based on constraint FP tree].

Zhao, Xu-jun; Zhang, Ji-fu; Cai, Jiang-hui.

Guang Pu Xue Yu Guang Pu Fen Xi ; 28(12): 2996-9, 2008 Dec.

Article in Chinese | MEDLINE | ID: mdl-19248531

ABSTRACT

Mining the inherent, unknown interrelationships between the characteristics of celestial spectrum data and their physical and chemical properties from massive celestial spectrum data is an effective way to search for celestial laws. In this paper, an interrelation analysis system for celestial spectrum data is designed and implemented, using association rules based on a constraint FP-tree as the means of analysis and VC++ and Oracle9i as the development tools. Its software architecture and function modules are outlined. Key techniques are discussed in detail, including preprocessing of celestial spectrum data, background knowledge representation, constraint FP-tree construction, and mining of constrained frequent patterns and association rules. The running results show that the system is feasible and valuable for describing these interrelationships with association rules. The system therefore provides an effective means of searching for inherent, unknown celestial laws.
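As background for the FP-tree step, the sketch below builds a plain FP-tree: items below a minimum support are dropped, the remaining items of each transaction are ordered by descending frequency, and transactions that share a prefix share a path in the tree. The constraints and background knowledge used by the paper's constraint FP-tree are not described in the abstract and are not modeled here. The names `FPNode` and `build_fp_tree` are hypothetical, and the system itself was written in VC++, not Python.

```python
from collections import Counter


class FPNode:
    """One node of the tree: an item, its count, and its children keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree and return (root, item_counts)."""
    # First pass: count in how many transactions each item occurs.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert each transaction along a frequency-ordered path.
    for t in transactions:
        items = sorted(
            (item for item in set(t) if item in frequent),
            key=lambda item: (-counts[item], item),  # ties broken by name
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, counts
```

Frequent patterns and association rules are then mined from the tree instead of from repeated scans of the raw data, which is the usual reason for choosing an FP-tree over candidate-generation approaches.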
