Abstract
The original random forests algorithm has been widely used and has achieved excellent performance on classification and regression tasks. However, research on the theory of random forests lags far behind its applications. In this paper, to narrow the gap between the applications and the theory of random forests, we propose a new random forests algorithm, called random Shapley forests (RSFs), based on the Shapley value. The Shapley value is one of the well-known solutions in cooperative game theory and can fairly assess the power of each player in a game. In their construction, RSFs use the Shapley value to evaluate the importance of each feature at each tree node by computing the dependency among the possible feature coalitions. In particular, inspired by the existing consistency theory, we prove the consistency of the proposed random forests algorithm. Moreover, to verify the effectiveness of the proposed algorithm, experiments were conducted on eight UCI benchmark datasets and four real-world datasets. The results show that RSFs perform better than, or at least comparably to, the existing consistent random forests, the original random forests, and a classic classifier, support vector machines.
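As a rough illustration of the Shapley value mentioned in the abstract, the sketch below computes exact Shapley values for a small cooperative game in which the players are features. The function name `shapley_values`, the feature names, and the worth table are hypothetical: the worth values are made up for the example and do not reproduce the dependency measure or the node-splitting procedure used by RSFs.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player.

    `value` maps a frozenset of players to a real number (the worth of that
    coalition); the empty coalition is expected to have worth 0.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        # Weighted sum of p's marginal contribution over every coalition
        # that does not already contain p.
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical toy game over three features. The numbers are invented and
# only stand in for some coalition-level dependency score.
TOY_WORTH = {
    frozenset(): 0.0,
    frozenset({"f1"}): 0.4,
    frozenset({"f2"}): 0.2,
    frozenset({"f3"}): 0.0,
    frozenset({"f1", "f2"}): 0.7,
    frozenset({"f1", "f3"}): 0.5,
    frozenset({"f2", "f3"}): 0.3,
    frozenset({"f1", "f2", "f3"}): 0.9,
}

phi = shapley_values(["f1", "f2", "f3"], TOY_WORTH.__getitem__)
best_feature = max(phi, key=phi.get)  # feature with the largest Shapley value
```

The exact computation enumerates every coalition, so its cost grows exponentially with the number of features; it is only practical here because the toy game has three players.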
Original language | English |
---|---|
Number of pages | 10 |
Journal | IEEE Transactions on Cybernetics |
Early online date | 23 Mar 2020 |
Publication status | Early online - 23 Mar 2020 |
Keywords
- Random forests
- Feature evaluation
- Shapley value
- Consistency