Improving imbalanced question classification using structured smote based approach

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Question classification (QC) is one of the most popular text classification applications and plays an important role in question-answering systems. However, as in many real-world classification problems, QC may suffer from class imbalance, and the classification of imbalanced data has been a key problem in machine learning and data mining. In this paper, we propose a framework that deals with class imbalance by using a hierarchical SMOTE algorithm to balance the different types of questions. The proposed framework is grammar-based: it uses the grammatical pattern of each question, together with machine learning algorithms, to classify the questions. Experimental results indicate that the proposed framework achieves a good level of accuracy in identifying different question types and handling class imbalance.
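As a rough illustration of the oversampling idea the abstract refers to, the sketch below implements plain SMOTE: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. It is a minimal sketch under stated assumptions, not the paper's method. The hierarchical (structured) variant and the grammatical-pattern features are not described in this record, and the function name `smote_oversample` and its parameters are hypothetical.

```python
import numpy as np


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from minority-class feature vectors.

    Plain SMOTE only (assumption): no hierarchy and no grammar features.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances between minority samples.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of every sample.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position on the segment base -> neighbour.
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[picked] - minority[base])


# Hypothetical usage: four minority feature vectors, ten synthetic samples.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_oversample(X_min, n_new=10, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region the minority class already occupies. A production setup would more likely rely on an established library implementation than on this hand-written version.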
Original language: English
Title of host publication: Proceedings of the 2018 International Conference on Machine Learning and Cybernetics (ICMLC)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 593-597
Number of pages: 6
Volume: 2
ISBN (Electronic): 978-1-5386-5214-5
ISBN (Print): 978-1-5386-5215-2
DOIs
Publication status: Published - 12 Nov 2018
Event: 2018 International Conference on Machine Learning and Cybernetics - http://www.icmlc.com/icmlc/welcome.html, Chengdu, China
Duration: 15 Jul 2018 to 18 Jul 2018

Publication series

Name: International Conference on Machine Learning and Cybernetics (ICMLC)
Publisher: IEEE
ISSN (Print): 2160-133X
ISSN (Electronic): 2160-1348

Conference

Conference: 2018 International Conference on Machine Learning and Cybernetics
Abbreviated title: ICMLC 2018
Country/Territory: China
City: Chengdu
Period: 15/07/18 to 18/07/18

Keywords

  • Information Retrieval
  • Text classification
  • Question classification
  • Machine Learning
  • Class Imbalance
