Induction of classification rules by Gini-index based rule generation

Han Liu, Mihaela Cocea

Research output: Contribution to journal › Article › peer-review


Abstract

Rule learning is one of the most popular areas in machine learning research, because its outcome is a set of rules that not only provide accurate predictions but also show a transparent process of mapping inputs to outputs. In general, rule learning approaches can be divided into two main types, namely 'divide and conquer' and 'separate and conquer'. The former is also known as Top-Down Induction of Decision Trees, which learns a set of rules represented in the form of a decision tree. This approach tends to produce a large number of complex rules (usually due to the replicated sub-tree problem), which lowers computational efficiency in both the training and testing stages and leads to overfitting of the training data. Due to this problem, researchers have gradually been motivated to develop 'separate and conquer' rule learning approaches, also known as covering approaches, which learn a set of rules sequentially. In particular, a rule is learned and the instances covered by this rule are deleted from the training set, so that the next rule is learned from a smaller training set. In this paper, we propose a new algorithm, GIBRG, which employs the Gini index to measure the quality of each rule being learned, in the context of 'separate and conquer' rule learning. Our experiments show that the proposed algorithm outperforms both decision tree learning algorithms (C4.5, CART) and 'separate and conquer' approaches (Prism). In addition, it produces fewer rules and rule terms, thus being more computationally efficient and less prone to overfitting.
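
The covering loop described in the abstract can be illustrated with a minimal Python sketch. This is not the GIBRG procedure from the paper, whose details are not given here; it only assumes that a rule is grown greedily by adding the attribute-value term whose covered instances have the lowest Gini impurity, and that covered instances are then removed before the next rule is learned. All function names (`gini`, `learn_rule`, `separate_and_conquer`) and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def learn_rule(rows, labels, attributes):
    """Grow one rule by greedily adding the attribute=value term with the
    lowest Gini impurity (ties broken by larger coverage, then first seen)."""
    rule = {}
    idx = list(range(len(rows)))  # indices of instances the rule currently covers
    while gini([labels[i] for i in idx]) > 0 and len(rule) < len(attributes):
        best = None
        for attr in attributes:
            if attr in rule:
                continue
            for value in sorted({rows[i][attr] for i in idx}):
                sub = [i for i in idx if rows[i][attr] == value]
                score = (gini([labels[i] for i in sub]), -len(sub))
                if best is None or score < best[0]:
                    best = (score, attr, value, sub)
        _, attr, value, idx = best
        rule[attr] = value
    # The rule predicts the majority class among the instances it covers.
    majority = Counter(labels[i] for i in idx).most_common(1)[0][0]
    return rule, majority, idx


def separate_and_conquer(rows, labels, attributes):
    """Learn rules one at a time, deleting covered instances after each rule."""
    rows, labels = list(rows), list(labels)
    rules = []
    while rows:
        rule, predicted, covered = learn_rule(rows, labels, attributes)
        rules.append((rule, predicted))
        covered = set(covered)
        rows = [r for i, r in enumerate(rows) if i not in covered]
        labels = [c for i, c in enumerate(labels) if i not in covered]
    return rules


# Toy data, invented purely for illustration.
toy_rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
toy_labels = ["play", "play", "play", "stay"]
toy_rules = separate_and_conquer(toy_rows, toy_labels, ["outlook", "windy"])
```

The result is an ordered rule list: a new instance is classified by the first rule whose terms it satisfies, and a rule with no terms acts as a default. Because each rule is learned from a shrinking training set, later rules are only meaningful when read in that order.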
Original language: English
Pages (from-to): 227-246
Number of pages: 20
Journal: Information Sciences
Volume: 436-437
Early online date: 17 Jan 2018
Publication status: Published - Apr 2018

Keywords

  • Data Mining
  • Machine Learning
  • Decision Tree Learning
  • Rule Learning
  • Classification
  • If-Then Rules
