Incremental Interval Type-2 Fuzzy Clustering of Data Streams using Single Pass Method.

Qaiyum, Sana; Aziz, Izzatdin; Hasan, Mohd Hilmi; Khan, Asif Irshad; Almalawi, Abdulmohsen

Affiliation

Qaiyum S; Center for Research in Data Sciences, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia.
Aziz I; Center for Research in Data Sciences, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia.
Hasan MH; Center for Research in Data Sciences, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia.
Khan AI; Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia.
Almalawi A; Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia.

Sensors (Basel); 20(11), 2020 Jun 05.

Article in English | MEDLINE | ID: mdl-32517018

ABSTRACT

Data streams create new challenges for fuzzy clustering algorithms, specifically Interval Type-2 Fuzzy C-Means (IT2FCM). One problem with IT2FCM is that it is sensitive to initialization conditions and can therefore fail to return the global optimum. This problem has been addressed by optimizing IT2FCM with the Ant Colony Optimization (ACO) approach. However, IT2FCM-ACO obtains clusters for the whole dataset at once, which is not suitable for large streaming datasets that arrive continuously and evolve with time, so that the generated clusters also evolve with time. Additionally, the incoming data may not fit in memory all at once because of its size. Therefore, to address the challenges of a large data stream environment, we propose extending IT2FCM-ACO to generate clusters incrementally. The proposed algorithm first determines appropriate cluster centers on a certain percentage of the available data; the obtained cluster centroids are then combined with new incoming data points to generate another set of cluster centers. The process continues until all the data have been scanned. Previous data points are released from memory, which reduces time and space complexity. The proposed incremental method produces data partitions comparable to those of IT2FCM-ACO. Its performance is evaluated on large real-life datasets. Results from several fuzzy cluster validity indices show that the proposed method performs better than other clustering algorithms. The proposed algorithm also improves run time and produces excellent speed-ups for all datasets.
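
The single-pass scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' IT2FCM-ACO: as a stand-in it uses ordinary (type-1) fuzzy c-means, with no interval type-2 memberships and no ant colony optimization, and it naively initializes from the first points of the first chunk. Carrying each centroid forward with a weight equal to its accumulated membership mass is an assumption borrowed from weighted single-pass fuzzy c-means; the abstract only says that centroids are combined with new points. The names `memberships`, `weighted_fcm`, and `single_pass_clusters` are hypothetical.

```python
def memberships(x, centers, m, eps=1e-9):
    """Fuzzy memberships of point x in each cluster (they sum to 1)."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) + eps for v in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [i / total for i in inv]


def weighted_fcm(points, weights, centers, m=2.0, iters=50):
    """Weighted type-1 fuzzy c-means (stand-in for IT2FCM-ACO)."""
    dim, c = len(points[0]), len(centers)
    for _ in range(iters):
        num = [[0.0] * dim for _ in range(c)]
        den = [0.0] * c
        for x, w in zip(points, weights):
            for k, u in enumerate(memberships(x, centers, m)):
                um = w * u ** m
                den[k] += um
                for j in range(dim):
                    num[k][j] += um * x[j]
        centers = [[num[k][j] / den[k] for j in range(dim)] for k in range(c)]
    # Membership mass per cluster, used as the centroid's weight later on
    # (assumption: the abstract does not specify how centroids are weighted).
    mass = [0.0] * c
    for x, w in zip(points, weights):
        for k, u in enumerate(memberships(x, centers, m)):
            mass[k] += w * u
    return centers, mass


def single_pass_clusters(stream_chunks, c, m=2.0):
    """Cluster a stream chunk by chunk, keeping only centroids in memory."""
    centers, mass = None, []
    for chunk in stream_chunks:
        chunk = [list(p) for p in chunk]
        if centers is None:
            # Naive initialization: the first c points of the first chunk,
            # which must therefore hold at least c points.
            points, weights, init = chunk, [1.0] * len(chunk), chunk[:c]
        else:
            # Previous centroids join the new points; the old raw points
            # are no longer referenced and can be released.
            points = centers + chunk
            weights = mass + [1.0] * len(chunk)
            init = centers
        centers, mass = weighted_fcm(points, weights, init, m)
    return centers
```

Calling `single_pass_clusters(chunks, c=2)` on any iterable of point chunks returns the final list of centroids. Because only `centers` and `mass` survive from one chunk to the next, memory use is bounded by the chunk size rather than by the length of the stream.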

Keywords

ant colony optimization; data stream; incremental learning; interval type-2 fuzzy c-means

Full text: 1 Collections: 01-internacional Database: MEDLINE Language: En Publication year: 2020 Document type: Article
