Abstract
Opinion mining from user reviews is an emerging field. Sentiment analysis of natural language helps in determining customers' opinions. Reviews can be written in any language, e.g. English, Chinese, Arabic, Japanese, Urdu, or Hindi. This research presents a model to classify the polarity of reviews written in Roman Urdu. For this purpose, raw data was scraped from the reviews of 20 songs from the Indo-Pak music industry, and a new dataset of 24,000 Roman Urdu reviews was created. Nine machine learning algorithms were evaluated: Naïve Bayes, Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Artificial Neural Network, Convolutional Neural Network, Recurrent Neural Network, ID3, and Gradient Boost Tree. Logistic Regression outperformed the rest, with testing and cross-validation accuracies of 92.25% and 91.47%, respectively.
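As a rough illustration of the kind of pipeline the abstract describes (polarity classification with Logistic Regression, scored by cross-validation), the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the toy reviews are invented for this example and are not drawn from the paper's dataset, and the bag-of-words features, the learning rate, the epoch count, and the helper names (`featurize`, `train`, `predict`, `accuracy`, `cross_val_accuracy`) are all assumptions.

```python
import math
import random
from collections import Counter

# Invented toy reviews (1 = positive, 0 = negative); NOT from the paper's dataset.
REVIEWS = [
    ("bohat acha gana", 1), ("zabardast awaz hai", 1), ("kamaal ka gana", 1),
    ("acha laga bohat", 1), ("zabardast gana hai", 1),
    ("bakwas gana hai", 0), ("bekar awaz hai", 0), ("bohat bura gana", 0),
    ("bekar laga", 0), ("bakwas aur bura", 0),
]

def featurize(text):
    """Bag-of-words counts over whitespace tokens (an assumed feature choice)."""
    return Counter(text.lower().split())

def train(docs, labels, epochs=200, lr=0.5):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            x = featurize(doc)
            z = bias + sum(weights.get(t, 0.0) * c for t, c in x.items())
            p = 1.0 / (1.0 + math.exp(-z))   # predicted P(positive)
            err = p - y                      # gradient of the log loss w.r.t. z
            bias -= lr * err
            for t, c in x.items():
                weights[t] = weights.get(t, 0.0) - lr * err * c
    return weights, bias

def predict(model, doc):
    """Return 1 (positive) or 0 (negative); unseen words contribute nothing."""
    weights, bias = model
    z = bias + sum(weights.get(t, 0.0) * c for t, c in featurize(doc).items())
    return 1 if z >= 0 else 0

def accuracy(model, docs, labels):
    hits = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)

def cross_val_accuracy(docs, labels, k=5, seed=0):
    """Mean held-out accuracy over k folds of a shuffled index list."""
    idx = list(range(len(docs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in idx if i not in held]
        model = train([docs[i] for i in train_idx], [labels[i] for i in train_idx])
        scores.append(accuracy(model,
                               [docs[i] for i in held_out],
                               [labels[i] for i in held_out]))
    return sum(scores) / k

docs = [text for text, _ in REVIEWS]
labels = [label for _, label in REVIEWS]
model = train(docs, labels)
cv_acc = cross_val_accuracy(docs, labels, k=5)
print("training accuracy:", accuracy(model, docs, labels))
print("5-fold cross-validation accuracy:", cv_acc)
```

On ten invented sentences the printed scores say nothing about the reported 92.25% and 91.47%; the sketch only shows how a testing-style accuracy and a cross-validation accuracy are computed for the same classifier.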
Original language | English |
---|---|
Pages (from-to) | 24945-24954 |
Number of pages | 10 |
Journal | IEEE Access |
Volume | 10 |
Publication status | Published - 15 Feb 2022 |
Keywords
- ANN
- classification
- CNN
- decision tree
- deep learning
- K-NN
- machine learning
- Naïve Bayes
- RNN
- Roman Urdu
- Roman Urdu corpus
- Sentiment analysis
- sentiment classification
- song reviews
- supervised learning