Sentiment analysis of reviews in natural language: Roman Urdu as a case study

Muhammad Aasim Qureshi*, Muhammad Asif*, Mohd Fadzil Hassan*, Adnan Abid, Asad Kamal, Sohail Safdar, Rehan Akber

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review



Opinion mining from user reviews is an emerging field. Sentiment analysis of natural language helps in finding the opinions of customers. These reviews can be in any language, e.g. English, Chinese, Arabic, Japanese, Urdu, and Hindi. This research presents a model to classify the polarity of reviews written in Roman Urdu. For this purpose, raw data was scraped from the reviews of 20 songs from the Indo-Pak music industry, and a new dataset of 24,000 Roman Urdu reviews was created. Nine machine learning algorithms (Naïve Bayes, Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Artificial Neural Network, Convolutional Neural Network, Recurrent Neural Network, ID3, and Gradient Boost Tree) were evaluated. Logistic Regression outperformed the rest, with testing and cross-validation accuracies of 92.25% and 91.47%, respectively.
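To make the classification setup concrete, the following is a minimal sketch of polarity classification with logistic regression over bag-of-words counts, plus a simple k-fold cross-validation loop. It is an illustration only: the abstract does not state the features, preprocessing, or hyperparameters the authors used, so the tokenizer, learning rate, epoch count, and the eight toy reviews below are all assumptions made for this example and are not drawn from the paper's dataset.

```python
import math

# Hypothetical toy reviews in Roman Urdu (NOT from the paper's dataset).
# Label 1 = positive, 0 = negative.
TRAIN = [
    ("bohat acha gana hai", 1),
    ("zabardast awaz hai", 1),
    ("kamal ka gana", 1),
    ("acha laga bohat", 1),
    ("bakwas gana hai", 0),
    ("bohat bura gana", 0),
    ("awaz bekar hai", 0),
    ("bura laga bakwas", 0),
]

def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace.
    return text.lower().split()

def build_vocab(samples):
    words = sorted({w for text, _ in samples for w in tokenize(text)})
    return {w: i for i, w in enumerate(words)}

def vectorize(text, vocab):
    # Bag-of-words counts; words outside the vocabulary are ignored.
    vec = [0.0] * len(vocab)
    for w in tokenize(text):
        if w in vocab:
            vec[vocab[w]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, vocab, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the logistic loss.
    weights, bias = [0.0] * len(vocab), 0.0
    data = [(vectorize(t, vocab), y) for t, y in samples]
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict(text, vocab, weights, bias):
    x = vectorize(text, vocab)
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0

def cross_val_accuracy(samples, k=4):
    # k-fold cross-validation: train on k-1 folds, test on the held-out fold.
    folds = [samples[i::k] for i in range(k)]
    correct = total = 0
    for i, held_out in enumerate(folds):
        rest = [s for j, f in enumerate(folds) if j != i for s in f]
        vocab = build_vocab(rest)
        weights, bias = train(rest, vocab)
        correct += sum(predict(t, vocab, weights, bias) == y for t, y in held_out)
        total += len(held_out)
    return correct / total

if __name__ == "__main__":
    vocab = build_vocab(TRAIN)
    weights, bias = train(TRAIN, vocab)
    print("prediction for 'bohat acha':", predict("bohat acha", vocab, weights, bias))
    print("4-fold cross-validation accuracy:", cross_val_accuracy(TRAIN, k=4))
```

On a corpus this small the cross-validation figure is not meaningful; the sketch only shows how a held-out test accuracy and a cross-validation accuracy, the two measures reported in the abstract, are computed separately.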

Original language: English
Pages (from-to): 24945-24954
Number of pages: 10
Journal: IEEE Access
Publication status: Published - 15 Feb 2022


  • ANN
  • classification
  • CNN
  • decision tree
  • deep learning
  • K-NN
  • machine learning
  • Naïve Bayes
  • RNN
  • Roman Urdu
  • Roman Urdu corpus
  • Sentiment analysis
  • sentiment classification
  • song reviews
  • supervised learning
