Predicting HIV status among men who have sex with men in Bulawayo & Harare, Zimbabwe using bio-behavioural data, recurrent neural networks, and machine learning techniques

Innocent Chingombe, Tafadzwa Dzinamarira*, Diego Cuadros, Munyaradzi Paul Mapingure, Elliot Mbunge, Simbarashe Chaputsira, Roda Madziva, Panashe Chiurunge, Chesterfield Samba, Helena Herrera, Grant Murewanhema, Owen Mugurungi, Godfrey Musuka

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

33 Downloads (Pure)


HIV and AIDS continue to be major public health concerns globally. Despite significant progress in addressing their impact on the general population and achieving epidemic control, there is a need to improve HIV testing, particularly among men who have sex with men (MSM). This study applied deep and machine learning algorithms such as recurrent neural networks (RNNs), the bagging classifier, gradient boosting classifier, support vector machines, and Naïve Bayes classifier to predict HIV status among MSM using the dataset from the Zimbabwe Ministry of Health and Child Care. RNNs performed better than the bagging classifier, gradient boosting classifier, support vector machines, and Gaussian Naïve Bayes classifier in predicting HIV status. RNNs recorded a high prediction accuracy of 0.98 as compared to the Gaussian Naïve Bayes classifier (0.84), bagging classifier (0.91), support vector machine (0.91), and gradient boosting classifier (0.91). In addition, RNNs achieved a high precision of 0.98 for predicting both HIV-positive and -negative cases, a recall of 1.00 for HIV-negative cases and 0.94 for HIV-positive cases, and an F1-score of 0.99 for HIV-negative cases and 0.96 for positive cases. HIV status prediction models can significantly improve early HIV screening and assist healthcare professionals in effectively providing healthcare services to the MSM community. The results show that integrating HIV status prediction models into clinical software systems can complement indicator condition-guided HIV testing strategies and identify individuals that may require healthcare services, particularly for hard-to-reach vulnerable populations like MSM. Future studies are necessary to optimize machine learning models further to integrate them into primary care. The significance of this manuscript is that it presents results from a study population where very little information is available in Zimbabwe due to the criminalization of MSM activities in the country. For this reason, MSM tends to be a hidden sector of the population, frequently harassed and arrested. In almost all communities in Zimbabwe, MSM issues have remained taboo, and stigma exists in all sectors of society.

Original languageEnglish
Article number231
Number of pages15
JournalTropical Medicine and Infectious Disease
Issue number9
Publication statusPublished - 5 Sept 2022


  • deep learning
  • machine learning
  • MSM
  • prediction models
  • status

Cite this