eCite Digital Repository

An under-sampling method with support vectors in multi-class imbalanced data classification

Citation

Arafat, MY and Hoque, S and Xu, S and Farid, DM, An under-sampling method with support vectors in multi-class imbalanced data classification, Proceedings of the 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2019), 26-28 August 2019, Ukulhas, Maldives, pp. 1-6. ISBN 978-1-7281-2741-5 (2019) [Refereed Conference Paper]

Copyright Statement

Copyright 2019 IEEE

DOI: 10.1109/SKIMA47702.2019.8982391

Abstract

Multi-class imbalanced data classification in supervised learning is one of the most challenging research issues in machine learning for data mining applications. Although computational intelligence researchers have introduced several data sampling methods over the past decades for handling imbalanced data, learning from imbalanced data remains a challenging task and a significant focus of research interest. Traditional machine learning algorithms are usually biased towards the majority class instances and tend to ignore the minority class instances, which may affect the prediction accuracy of classifiers. Under-sampling and over-sampling methods are commonly used with single-model classifiers or ensemble learning for dealing with imbalanced data. In this paper, we introduce an under-sampling method with support vectors for classifying imbalanced data. The proposed approach selects the most informative majority class instances based on the support vectors that help to form the decision boundary. We tested the performance of the proposed method with single classifiers (C4.5 decision tree and naïve Bayes) and ensemble classifiers (Random Forest and AdaBoost) on 13 benchmark imbalanced datasets. The experimental results show that the proposed method produces high accuracy when classifying both minority and majority class instances compared to other existing methods.
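The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea (keep every minority instance, and keep only those majority instances that act as support vectors). Everything here is an assumption made for illustration, not the authors' algorithm: the function names `train_linear_svm` and `undersample_with_support_vectors` are hypothetical, the SVM is a simple Pegasos-style linear model trained one-vs-rest on the largest class, and "support vector" is approximated as any instance whose functional margin is at most 1.

```python
import random
from collections import Counter


def score(w, x):
    """Linear decision value; the last weight acts as the bias term."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style sub-gradient descent for a linear soft-margin SVM.

    X is a list of feature lists, y holds labels in {-1, +1}.
    Assumption: a linear kernel is adequate for the illustration.
    """
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)  # extra slot for the bias
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            violated = y[i] * score(w, X[i]) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrink
            if violated:  # hinge-loss sub-gradient step
                xi = list(X[i]) + [1.0]
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w


def undersample_with_support_vectors(X, labels, **svm_kwargs):
    """Return sorted indices of the instances to keep.

    All non-majority instances are kept. A majority instance is kept only
    if it lies on or inside the margin of a majority-vs-rest linear SVM,
    which is used here as an approximation of "is a support vector".
    """
    majority = Counter(labels).most_common(1)[0][0]
    y = [1 if lab == majority else -1 for lab in labels]
    w = train_linear_svm(X, y, **svm_kwargs)
    keep = []
    for i, (x, yi) in enumerate(zip(X, y)):
        if yi == -1 or yi * score(w, x) <= 1.0:
            keep.append(i)
    return keep


# Synthetic three-class data: one large class and two small ones.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(15)]
X += [[rng.gauss(-4, 1), rng.gauss(4, 1)] for _ in range(10)]
labels = ["a"] * 200 + ["b"] * 15 + ["c"] * 10

kept = undersample_with_support_vectors(X, labels)
print("class counts before:", Counter(labels))
print("class counts after: ", Counter(labels[i] for i in kept))
```

The returned indices can be used to subset the training data before fitting any downstream classifier. A kernel SVM from an established library would be a more realistic choice in practice; the hand-written linear solver is used here only to keep the sketch dependency-free.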

Item Details

Item Type: Refereed Conference Paper
Keywords: data sampling methods, ensemble learning, imbalanced data, over-sampling, under-sampling
Research Division: Information and Computing Sciences
Research Group: Machine learning
Research Field: Neural networks
Objective Division: Information and Communication Services
Objective Group: Information systems, technologies and services
Objective Field: Information systems, technologies and services not elsewhere classified
UTAS Author: Xu, S (Dr Shuxiang Xu)
ID Code: 141241
Year Published: 2019
Deposited By: Information and Communication Technology
Deposited On: 2020-10-07
Last Modified: 2020-12-18
Downloads: 0
