eCite Digital Repository

Hierarchical clustering using homogeneity as similarity measure for big data analytics

Citation

Zhao, Y and Wong, R and Chi, C-H and Zhou, W and Ding, C and Wang, C, Hierarchical clustering using homogeneity as similarity measure for big data analytics, Proceedings of the 12th IEEE International Conference on Services Computing, 27 June-02 July 2015, New York City, NY, USA, pp. 348-354. ISBN 9781467372817 (2015) [Refereed Conference Paper]



Copyright Statement

Copyright 2015 IEEE

DOI: 10.1109/SCC.2015.55

Abstract

In big data analytics, clustering plays a fundamental and decisive role in supporting pattern mining and value creation. To improve the user experience of, and satisfaction with, clustering algorithms, it is important to let users specify the quality they prefer in the aggregated clusters (e.g. in terms of the homogeneity and the relative population of each resulting cluster) instead of fixing the number of clusters before the clustering process. In this paper, we first propose a new measure, called the Clustering Performance Index (CPI), that takes into consideration the homogeneity, relative population, and number of the aggregated clusters. We then propose a new hierarchical clustering algorithm that adopts homogeneity as its key similarity measure. Experimental results show that the proposed algorithm achieves a good balance among CPI, the number of clusters aggregated, and the time cost of the algorithm.

Item Details

Item Type: Refereed Conference Paper
Keywords: clustering, homogeneity, relative population, clustering performance index
Research Division: Information and Computing Sciences
Research Group: Computation Theory and Mathematics
Research Field: Analysis of Algorithms and Complexity
Objective Division: Expanding Knowledge
Objective Group: Expanding Knowledge
Objective Field: Expanding Knowledge in the Information and Computing Sciences
Author: Chi, C-H (Dr Chi-Hung Chi)
ID Code: 110679
Year Published: 2015
Deposited By: Computing and Information Systems
Deposited On: 2016-08-09
Last Modified: 2016-09-05
Downloads: 0
