Abstract
Short Message Service (SMS) is one of the most widely used communication media. It has become an integral part of people's lives and, like other communication media, SMS texts have been used to propagate spam. Although a broad range of spam-filtering techniques has been proposed to reduce the frequency of such incidents, many difficulties remain because of text ambiguity: the same words can appear in seemingly similar texts, which makes spam messages harder to identify. In this paper, we propose an approach for identifying and classifying spam SMS based on the syntactical features and patterns of the message. The proposed approach consists of three main parts: data pre-processing, feature extraction, and classification. Experimental results show that the proposed approach achieves a good level of accuracy. In addition, to show the effect of handling class imbalance on classification performance, two additional experiments were conducted using an implementation of the SMOTE algorithm. The results showed that handling class imbalance helps improve identification and classification accuracy. Furthermore, based on the above, a risk model is proposed that addresses the risk probability and the impact of spam SMS.
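
The class-imbalance step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four features in `extract_features` are hypothetical stand-ins for the paper's syntactical features, the sample messages are invented, and `smote` is a simplified SMOTE-style interpolation (random base sample, one of its `k` nearest minority neighbours, a point on the segment between them) rather than a library implementation.

```python
import math
import random
import re


def extract_features(message):
    """Map an SMS to a small numeric vector (hypothetical features)."""
    text = message.strip()
    n_chars = len(text)
    n_digits = sum(ch.isdigit() for ch in text)
    n_upper = sum(ch.isupper() for ch in text)
    has_url = 1.0 if re.search(r"(https?://|www\.)", text, re.IGNORECASE) else 0.0
    return [float(n_chars), float(n_digits), float(n_upper), has_url]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority vectors by SMOTE-style interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [v for j, v in enumerate(minority) if j != i]
        others.sort(key=lambda v: math.dist(base, v))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Invented minority-class (spam) examples, for illustration only.
spam = [
    "WIN a FREE prize now! Call 0800123456",
    "URGENT: claim your reward at www.example.com",
    "Free entry! Text WIN to 80085",
]
spam_vectors = [extract_features(m) for m in spam]
new_vectors = smote(spam_vectors, n_new=3)
```

Each synthetic vector lies on a line segment between two real spam vectors, so the minority class grows without exact duplicates. In practice the oversampling would be applied to the training split only, before fitting the classifier.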
Original language | English |
---|---|
Title of host publication | Proceedings of the 16th International Conference on Web Information Systems and Technologies (WEBIST) |
Editors | Massimo Marchiori, Francisco Domínguez Mayo, Joaquim Filipe |
Publisher | SciTePress |
Pages | 417-424 |
Volume | 1 |
ISBN (Print) | 978-989-758-478-7 |
Publication status | Published - 3 Nov 2020 |
Event | 16th International Conference on Web Information Systems and Technologies, Online. Duration: 3 Nov 2020 → 5 Nov 2020. http://www.webist.org/ |
Conference
Conference | 16th International Conference on Web Information Systems and Technologies |
---|---|
Abbreviated title | WEBIST |
Period | 3/11/20 → 5/11/20 |
Keywords
- Information Retrieval
- Machine Learning
- Spam SMS Detection
- Risk Assessment
- Class Imbalance