File(s) under permanent embargo
Hierarchical clustering using homogeneity as similarity measure for big data analytics
conference contribution
posted on 2023-05-23, 11:17 authored by Zhao, Y, Wong, R, Chi, C-H, Zhou, W, Ding, C, Wang, C
In big data analytics, clustering plays a fundamental and decisive role in supporting pattern mining and value creation. To help improve user experience and satisfaction with clustering algorithms, one important key is to let users define the quality of the aggregated clusters they prefer (e.g. in terms of the homogeneity and the relative population of each resulting cluster) instead of fixing the number of clusters to be obtained before the clustering process. In this paper, we first propose a new measure, called the Clustering Performance Index (CPI), that takes into consideration homogeneity, relative population, and the number of clusters aggregated. Then we propose a new hierarchical clustering algorithm that adopts homogeneity as its key similarity measure. Experimental results show that our proposed clustering algorithm can achieve a good balance among CPI, the number of clusters aggregated, and the time cost of the algorithm.
Publication title
Proceedings of the 12th IEEE International Conference on Services Computing
Editors
PP Maglio, I Paik, W Chou
Pagination
348-354
ISBN
9781467372817
Department/School
School of Information and Communication Technology
Publisher
IEEE-Inst Electrical Electronics Engineers Inc
Place of publication
New York, USA
Event title
12th IEEE International Conference on Services Computing
Event Venue
New York City, NY, USA
Date of Event (Start Date)
2015-06-27
Date of Event (End Date)
2015-07-02
Rights statement
Copyright 2015 IEEE
Repository Status
- Restricted