Imbalanced classification using genetically optimized random forests

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Class imbalance is a problem that commonly affects real-world classification datasets and has been shown to hinder the performance of classifiers. A dataset suffers from class imbalance when the instances belonging to one class outnumber the instances belonging to another class. Two ways of dealing with class imbalance are modifying the dataset to reduce the number of instances belonging to the majority class(es) (known as resampling) and allowing the classifier to penalize misclassification of the minority class(es) more heavily than misclassification of the majority class(es), which can be done by implementing a cost matrix. This paper attempts to improve the classification performance of the Random Forest classifier on imbalanced datasets by exploiting these two techniques; to do this, a genetic algorithm is employed to find optimal parameters. Results are compared to commonly used classification algorithms.
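
The idea in the abstract, a genetic algorithm searching jointly over a resampling setting and a misclassification cost, can be sketched roughly as follows. This is not the paper's implementation. To stay dependency-free, a one-dimensional decision stump stands in for the Random Forest. The toy data, the parameter ranges in `BOUNDS`, the G-mean fitness, and all function names are assumptions made for illustration only.

```python
import math
import random

def make_data(rng, n_major, n_minor):
    # Toy 1-D data (assumption): majority class 0 ~ N(0, 1), minority class 1 ~ N(1.5, 1).
    return ([(rng.gauss(0.0, 1.0), 0) for _ in range(n_major)]
            + [(rng.gauss(1.5, 1.0), 1) for _ in range(n_minor)])

def undersample(data, keep_fraction, rng):
    # Resampling: keep every minority instance and a random fraction of the majority.
    major = [d for d in data if d[1] == 0]
    minor = [d for d in data if d[1] == 1]
    k = max(1, round(keep_fraction * len(major)))
    return rng.sample(major, k) + minor

def fit_stump(train, minority_cost):
    # Cost-sensitive training: a missed minority instance costs `minority_cost`,
    # a false alarm on the majority costs 1. Predict class 1 when x > threshold.
    def total_cost(t):
        return sum(minority_cost if y == 1 and x <= t else
                   1.0 if y == 0 and x > t else 0.0
                   for x, y in train)
    return min((x for x, _ in train), key=total_cost)

def g_mean(threshold, data):
    # Geometric mean of the two per-class recalls.
    tp = sum(1 for x, y in data if y == 1 and x > threshold)
    tn = sum(1 for x, y in data if y == 0 and x <= threshold)
    n_pos = sum(1 for _, y in data if y == 1)
    n_neg = len(data) - n_pos
    return math.sqrt((tp / n_pos) * (tn / n_neg))

def fitness(individual, train, valid):
    keep_fraction, minority_cost = individual
    rng = random.Random(0)  # fixed seed so each individual gets a deterministic score
    sample = undersample(train, keep_fraction, rng)
    return g_mean(fit_stump(sample, minority_cost), valid)

# (majority keep fraction, minority cost); these ranges are assumptions.
BOUNDS = [(0.1, 1.0), (1.0, 20.0)]

def clip(value, low, high):
    return max(low, min(high, value))

def evolve(train, valid, pop_size=10, generations=8, seed=1):
    rng = random.Random(seed)
    pop = [tuple(rng.uniform(lo, hi) for lo, hi in BOUNDS) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        scored = sorted(pop, key=lambda ind: fitness(ind, train, valid), reverse=True)
        history.append(fitness(scored[0], train, valid))
        next_pop = [scored[0]]  # elitism: carry the best individual forward unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: `scored` is sorted, so the lowest index wins.
            p1 = scored[min(rng.sample(range(pop_size), 3))]
            p2 = scored[min(rng.sample(range(pop_size), 3))]
            child = []
            for (a, b), (lo, hi) in zip(zip(p1, p2), BOUNDS):
                w = rng.random()
                gene = w * a + (1 - w) * b               # blend crossover
                gene += rng.gauss(0.0, 0.1 * (hi - lo))  # Gaussian mutation
                child.append(clip(gene, lo, hi))
            next_pop.append(tuple(child))
        pop = next_pop
    best = max(pop, key=lambda ind: fitness(ind, train, valid))
    return best, fitness(best, train, valid), history

data_rng = random.Random(42)
train = make_data(data_rng, n_major=200, n_minor=20)
valid = make_data(data_rng, n_major=200, n_minor=20)
best, score, history = evolve(train, valid)
print("keep fraction, minority cost:", best, "G-mean:", round(score, 3))
```

In a fuller version, `fit_stump` would be replaced by training a random forest that accepts per-class costs, and `fitness` would typically use cross-validation rather than a single validation split.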
Original language: English
Title of host publication: GECCO Companion '15
Subtitle of host publication: proceedings of the companion publication of the 2015 on genetic and evolutionary computation conference
Place of Publication: New York
Publisher: ACM
Pages: 1453-1454
ISBN (Print): 978-1450334884
Publication status: Published - 2015
Event: Genetic and Evolutionary Computation Conference - Madrid, Spain
Duration: 11 Jul 2015 to 15 Jul 2015

Conference

Conference: Genetic and Evolutionary Computation Conference
Country/Territory: Spain
City: Madrid
Period: 11/07/15 to 15/07/15

Keywords

  • random forest
  • genetic algorithms
  • classification
  • cost-sensitive classification
  • cost matrix
