A comparative study of machine learning algorithms and text vectorization methods for fake news detection

Andreas Kanavos, Ioannis Karamitsos, Alaa Mohasseb, Vassilis Gerogiannis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The detection of fake news is a crucial task in today's society, given the widespread use of social media and online platforms. In this study, we investigate the application of Machine Learning (ML) algorithms to the detection of fake news. We consider two datasets of categorized news articles of different sizes and apply several ML algorithms, along with two methods of text vectorization. Specifically, we examine Bag of Words and Tf-Idf, with the use of stemming and with different n-gram values. The resulting vectors are processed by Naive Bayes algorithms, Linear algorithms, Support Vector Machines, and Random Forest Classifiers. The F1-Score and computational time of each algorithm-vectorization combination were recorded. Our results show that Linear algorithms and Support Vector Machines combined with Tf-Idf vectors and an n-gram range of (1,2) produced the best results, with an F1-Score of up to 96.8%.
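As a rough illustration of the two vectorization methods and the evaluation metric named in the abstract, the following minimal sketch builds Bag of Words and Tf-Idf representations over word n-grams in the range (1,2) and computes a binary F1-Score. It is not the authors' implementation: this page does not say which library or settings were used, so the function names (`ngrams`, `bag_of_words`, `tf_idf`, `f1_score`), the whitespace tokenization, and the textbook weighting idf = log(N / df) are all assumptions made for illustration. Stemming is omitted for brevity, and common libraries typically apply smoothed idf variants that give different numbers.

```python
import math
from collections import Counter


def ngrams(tokens, ngram_range=(1, 2)):
    """Return all word n-grams whose length lies in the inclusive range."""
    lo, hi = ngram_range
    return [
        " ".join(tokens[i:i + n])
        for n in range(lo, hi + 1)
        for i in range(len(tokens) - n + 1)
    ]


def bag_of_words(docs, ngram_range=(1, 2)):
    """Bag of Words: raw n-gram counts for each document."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    return [Counter(ngrams(doc.lower().split(), ngram_range)) for doc in docs]


def tf_idf(docs, ngram_range=(1, 2)):
    """Tf-Idf: relative term frequency times log(N / document frequency)."""
    counts = bag_of_words(docs, ngram_range)
    n_docs = len(counts)
    # Iterating a Counter yields each distinct term once per document.
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append(
            {t: (tf / total) * math.log(n_docs / doc_freq[t]) for t, tf in c.items()}
        )
    return vectors


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy documents, not taken from the paper's datasets.
    docs = ["breaking news shocking claim", "official news report"]
    for vec in tf_idf(docs):
        print(sorted(vec.items(), key=lambda kv: -kv[1])[:3])
```

In this sketch, a term that occurs in every document (such as "news" in the toy input) receives a weight of zero, which is the property that distinguishes Tf-Idf from raw Bag of Words counts. The resulting sparse dictionaries would still need to be mapped onto a shared vocabulary index before being passed to a classifier.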
Original language: English
Title of host publication: Proceedings of the 14th International Conference on Information, Intelligence, Systems and Applications (IISA2023)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 8
ISBN (Electronic): 9798350318067
ISBN (Print): 9798350318074
Publication status: Published - 15 Dec 2023
Event: 14th International Conference on Information, Intelligence, Systems and Applications - University of Thessaly, Volos, Greece
Duration: 10 Jul 2023 - 12 Jul 2023
Conference number: 14
https://easyconferences.eu/iisa2023/

Conference

Conference: 14th International Conference on Information, Intelligence, Systems and Applications
Abbreviated title: IISA 2023
Country/Territory: Greece
City: Volos
Period: 10/07/23 - 12/07/23

Keywords

  • Machine Learning
  • Text Mining
  • Information Retrieval
  • Fake News Detection
