Abstract
Due to the daily increase in the size of data, machine learning has become a popular approach for intelligent processing of data. In particular, machine learning algorithms are used to discover meaningful knowledge or build predictive models from data. For example, inductive learning algorithms generate rules in the form of either a decision tree or if-then rules. However, most learning algorithms suffer from overfitting of training data; in other words, they can build models that perform extremely well on the training data but poorly on other data. Overfitting originates from both the learning algorithm and the data. In this context, the nature of a machine learning problem can be characterised in terms of bias and variance: the former originates from the learning algorithm, whereas the latter originates from the data. Therefore, overfitting can be reduced by scaling up algorithms on one side or scaling down data on the other, and both bias and variance can be reduced through the use of ensemble learning approaches. This paper introduces particular ways to address overfitting of rule-based classifiers through both scaling up algorithms and scaling down data in the context of ensemble learning.
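The sketch below is not from the paper; it is a minimal illustration, assuming scikit-learn is available, of how one ensemble approach (bagging, which trains base learners on bootstrap samples, i.e. "scaling down data") can reduce the variance of an overfitting tree/rule learner. The dataset and parameters are illustrative only.

```python
# Minimal sketch (illustrative, not the paper's method): bagging a decision tree
# to reduce variance compared with a single overfitting tree.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

# Synthetic, noisy data standing in for a training set prone to overfitting.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=0)

# A single unpruned decision tree: low bias, high variance.
single_tree = DecisionTreeClassifier(random_state=0)

# Bagging: each tree sees a bootstrap sample of the data and the ensemble
# votes, which mainly reduces variance (boosting-style ensembles target bias).
bagged_trees = BaggingClassifier(DecisionTreeClassifier(random_state=0),
                                 n_estimators=50, random_state=0)

for name, model in [("single tree", single_tree), ("bagged trees", bagged_trees)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean cross-validated accuracy = {scores.mean():.3f}")
```

On data like this, the bagged ensemble typically generalises better than the single tree, which is the variance-reduction effect the abstract refers to.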
| Original language | English |
|---|---|
| Title of host publication | 2015 International Conference on Machine Learning and Cybernetics (ICMLC) |
| Place of Publication | Guangzhou |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 377-382 |
| Number of pages | 6 |
| Volume | 1 |
| ISBN (Print) | 978-1-4673-7220-6 |
| DOIs | |
| Publication status | Published - Jul 2015 |
Keywords
- data mining
- machine learning
- ensemble learning
- inductive learning
- if-then rules
- rule based classification