Heuristic target class selection for advancing performance of coverage-based rule learning

Han Liu, Shyi-Ming Chen, Mihaela Cocea

Research output: Contribution to journal › Article › peer-review


Abstract

Rule learning is a popular branch of machine learning that can provide accurate and interpretable classification results. The two main rule learning strategies are ‘divide and conquer’ and ‘separate and conquer’. Decision tree generation, which follows the former strategy, has a serious drawback known as the replicated sub-tree problem, which results from the constraint that all branches of a decision tree must have one or more common attributes. This problem is likely to result in high computational complexity and a risk of overfitting, which motivates the development of rule learning algorithms (e.g., Prism) that follow the separate and conquer strategy. The Prism algorithm effectively solves the replicated sub-tree problem, but the trained models are still complex because an independent rule set must be trained for each selected target class. To reduce the risk of overfitting and the model complexity, we propose in this paper a variant of the Prism algorithm referred to as PrismCTC. The experimental results show that the PrismCTC algorithm improves classification performance and reduces model complexity in comparison with the C4.5 and Prism algorithms.
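
For readers unfamiliar with the separate and conquer strategy named in the abstract, the following is a minimal sketch of a basic Prism-style loop for a single target class, as the algorithm is commonly described. It is not the PrismCTC variant, whose target class selection heuristic is not detailed in the abstract. The function name, the `label` key, and the toy data are hypothetical and chosen only for illustration; attribute values are assumed to be discrete and sortable.

```python
def learn_rules_for_class(instances, target, attributes, label_key="label"):
    """Learn a list of rules (attribute -> value dicts) for one target class.

    Illustrative sketch of separate and conquer, not the authors' code.
    """
    rules = []
    remaining = list(instances)
    # Conquer: keep learning rules while target-class instances remain.
    while any(x[label_key] == target for x in remaining):
        covered = remaining
        conditions = {}
        # Specialise the rule until it covers only the target class
        # or every attribute has been used.
        while (any(x[label_key] != target for x in covered)
               and len(conditions) < len(attributes)):
            best = None  # ((precision, positives), attribute, value)
            for attr in attributes:
                if attr in conditions:
                    continue
                for value in sorted({x[attr] for x in covered}):
                    subset = [x for x in covered if x[attr] == value]
                    pos = sum(x[label_key] == target for x in subset)
                    score = (pos / len(subset), pos)
                    if best is None or score > best[0]:
                        best = (score, attr, value)
            _, attr, value = best
            conditions[attr] = value
            covered = [x for x in covered if x[attr] == value]
        rules.append(conditions)
        # Separate: drop every instance the new rule covers.
        remaining = [
            x for x in remaining
            if not all(x[a] == v for a, v in conditions.items())
        ]
    return rules


# Hypothetical toy data, for illustration only.
toy = [
    {"outlook": "sunny", "windy": "no", "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "stay"},
    {"outlook": "rainy", "windy": "no", "label": "stay"},
    {"outlook": "rainy", "windy": "yes", "label": "stay"},
    {"outlook": "overcast", "windy": "no", "label": "play"},
    {"outlook": "overcast", "windy": "yes", "label": "play"},
]
rules = learn_rules_for_class(toy, "play", ["outlook", "windy"])
```

Because each rule is grown independently from attribute-value pairs, rules need not share a common root attribute, which is how this strategy avoids the replicated sub-tree problem. Repeating the loop once per class yields the per-class rule sets that the abstract identifies as a source of model complexity.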
Original language: English
Pages (from-to): 164-179
Number of pages: 16
Journal: Information Sciences
Volume: 479
Early online date: 3 Dec 2018
Publication status: Published - 1 Apr 2019

Keywords

  • Machine learning
  • Rule based systems
  • Rule based classification
  • Decision tree learning
  • Rule learning
  • Prism
