Social media platforms continuously generate raw but valuable user data. This data becomes more valuable when it is mined with approaches such as machine learning techniques. User-generated data can also potentially help save lives, especially those of vulnerable social media users, since several studies have shown a correlation between social media and suicide. In this study, we aim to contribute to research on suicide communication on social media. We measured the performance of five machine learning algorithms (Prism, Decision Tree, Naive Bayes, Random Forest and Support Vector Machine) in classifying suicide-related text from Twitter. The results showed that the Prism algorithm outperformed the other machine learning algorithms, with an F-measure of 0.84 for the target classes (Suicide and Flippant). To the best of our knowledge, this is the highest performance achieved to date in classifying suicide-related social media text.
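For readers unfamiliar with the reported metric, the following is a minimal sketch of how a per-class F-measure can be computed. It assumes the F-measure is the standard F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used or how scores were averaged across classes. The function name `f_measure` and the toy labels are hypothetical and are not taken from the study's data.

```python
def f_measure(true_labels, predicted_labels, target):
    """Return (precision, recall, F1) for one target class.

    Assumption: F-measure here means the balanced F1 score.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives and false negatives for the class.
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy labels, for illustration only (not the study's data).
true_labels = ["suicide", "flippant", "suicide", "other", "flippant", "suicide"]
predicted_labels = ["suicide", "flippant", "flippant", "other", "flippant", "suicide"]

for cls in ("suicide", "flippant"):
    precision, recall, f1 = f_measure(true_labels, predicted_labels, cls)
    print(f"{cls}: precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Computing the score per class, as here, makes it possible to see whether a classifier trades precision for recall differently on each target class before any averaging is applied.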
|Name||International Conference on Machine Learning and Cybernetics (ICMLC)|
|Conference||2018 International Conference on Machine Learning and Cybernetics|
|Abbreviated title||ICMLC 2018|
|Period||15/07/18 → 18/07/18|
- Text classification
- Machine learning
- Social media