SMS spam identification and risk assessment evaluations

Alaa Mohasseb, Benjamin Aziz, Andreas Kanavos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Short Message Service (SMS) is one of the most widely used communication media. It has become an integral part of people's lives and, like other communication media, SMS texts have been used to propagate spam messages. Although a broad range of anti-spam techniques have been proposed to reduce the frequency of such incidents, many difficulties remain because of text ambiguity: the same words can be used in seemingly similar texts, which makes spam messages harder to identify. In this paper, we propose an approach for identifying and classifying spam SMS based on the syntactical features and patterns of the message. The proposed approach consists of three main parts, namely Data Pre-processing, Feature Extraction, and Classification. Experimental results show that the proposed approach achieves a good level of accuracy. In addition, to show the effect of handling class imbalance on classification performance, two additional experiments were conducted using an implementation of the SMOTE algorithm. The results showed that handling class imbalance helps improve identification and classification accuracy. Furthermore, based on the above, a risk model is proposed that addresses the risk probability and the impact of spam SMS.
Original language: English
Title of host publication: Proceedings of the 16th International Conference on Web Information Systems and Technologies (WEBIST)
Editors: Massimo Marchiori, Francisco Domínguez Mayo, Joaquim Filipe
Publisher: SciTePress
Pages: 417-424
Volume: 1
ISBN (Print): 978-989-758-478-7
DOIs
Publication status: Published - 3 Nov 2020
Event: 16th International Conference on Web Information Systems and Technologies - Online
Duration: 3 Nov 2020 → 5 Nov 2020
http://www.webist.org/

Conference

Conference: 16th International Conference on Web Information Systems and Technologies
Abbreviated title: WEBIST
Period: 3/11/20 → 5/11/20
Internet address

Keywords

  • Information Retrieval
  • Machine Learning
  • Spam SMS Detection
  • Risk Assessment
  • Class Imbalance
