eCite Digital Repository
A new multiple seeds based genetic algorithm for discovering a set of interesting Boolean association rules
Citation
Kabir, MMJ and Xu, S and Kang, BH and Zhao, Z, A new multiple seeds based genetic algorithm for discovering a set of interesting Boolean association rules, Expert Systems With Applications, 74 pp. 55-69. ISSN 0957-4174 (2017) [Contribution to Refereed Journal]
Copyright Statement
© 2017 Elsevier Ltd. All rights reserved.
DOI: 10.1016/j.eswa.2017.01.001
Abstract
Association rule mining algorithms mostly use a single randomly generated seed to initialize a population, without paying attention to the effectiveness of that population in evolutionary learning. Recent research has shown that the initial population has a significant impact on the production of good solutions over several generations of a genetic algorithm. Single seed based genetic algorithms suffer from the following major challenges: (1) the solutions of a genetic algorithm vary, since different seeds generate different initial populations, and (2) it is difficult to define a good seed for a specific application. To avoid these problems, in this paper we propose the MSGA, a new multiple seeds based genetic algorithm which generates multiple seeds from different domains of a solution space to discover high quality rules from a large data set. This scheme introduces an m-domain model and an m-seeds selection process through which the whole solution space is subdivided into m domains of the same size, with one seed selected from each domain. Use of these seeds enables this method to generate an effective initial population for evolutionary learning of the fitness value of each rule. As a result, strong searching efficiency is obtained at the beginning of the evolution, achieving fast convergence. The MSGA is tested with different mutation and crossover operators for mining interesting Boolean association rules from four real world data sets. The results are compared to different single seed based genetic algorithms under the same conditions.
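The abstract does not spell out how the domains and seeds are constructed, so the following is only an illustrative sketch of the general idea, not the authors' MSGA. It assumes that a candidate rule is encoded as a Boolean string of `n_bits` bits, that the solution space is the integer range of all such strings, that the m domains are contiguous index ranges of (near-)equal size, and that each seed is expanded into a few one-bit neighbours to form the initial population. All function names and parameters are hypothetical.

```python
import random


def split_into_domains(space_size, m):
    """Split the index range [0, space_size) into m contiguous domains.

    Assumption: "same size domains" is modelled as equal-width index ranges;
    widths differ by at most one when m does not divide space_size.
    """
    if not 1 <= m <= space_size:
        raise ValueError("m must be between 1 and space_size")
    bounds = [(i * space_size) // m for i in range(m + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(m)]


def select_seeds(n_bits, m, rng):
    """Pick one seed index uniformly at random from each of the m domains."""
    domains = split_into_domains(2 ** n_bits, m)
    return [rng.randrange(lo, hi) for lo, hi in domains]


def decode(index, n_bits):
    """Turn an integer index into a list of n_bits bits (most significant first)."""
    return [(index >> k) & 1 for k in reversed(range(n_bits))]


def initial_population(n_bits, m, per_seed, rng):
    """Build a population of m * per_seed chromosomes.

    Assumption: each seed contributes itself plus (per_seed - 1) copies
    with one randomly chosen bit flipped.
    """
    population = []
    for seed in select_seeds(n_bits, m, rng):
        chromosome = decode(seed, n_bits)
        population.append(chromosome)
        for _ in range(per_seed - 1):
            neighbour = chromosome[:]
            neighbour[rng.randrange(n_bits)] ^= 1  # flip a single bit
            population.append(neighbour)
    return population


if __name__ == "__main__":
    rng = random.Random(42)
    for chromosome in initial_population(n_bits=8, m=4, per_seed=3, rng=rng):
        print(chromosome)
```

Because one seed is drawn from every domain, the initial chromosomes are spread across the whole encoded space rather than depending on a single random draw, which is the property the abstract attributes to the multiple seeds scheme.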
Item Details
Item Type: | Contribution to Refereed Journal |
---|---|
Keywords: | multiple seeds based genetic algorithm, initial population, Boolean association rules, conditional probability, search efficiency |
Research Division: | Information and Computing Sciences |
Research Group: | Machine learning |
Research Field: | Neural networks |
Objective Division: | Information and Communication Services |
Objective Group: | Information systems, technologies and services |
Objective Field: | Information systems, technologies and services not elsewhere classified |
UTAS Author: | Kabir, MMJ (Mr Mir Kabir) |
UTAS Author: | Xu, S (Dr Shuxiang Xu) |
UTAS Author: | Kang, BH (Professor Byeong Kang) |
UTAS Author: | Zhao, Z (Dr Zongyuan Zhao) |
ID Code: | 113690 |
Year Published: | 2017 |
Web of Science® Times Cited: | 20 |
Deposited By: | Information and Communication Technology |
Deposited On: | 2017-01-14 |
Last Modified: | 2018-12-13 |
Downloads: | 0 |