TY - GEN
T1 - Machine learning approach into bacterial relationship: exploring 16S rRNA metabarcoding with association rule mining
AU - Sari, Omer Faruk
AU - Bader-El-Den, Mohamed
AU - Ince, Volkan
AU - Arabikhan, Farzad
PY - 2024/10/9
Y1 - 2024/10/9
N2 - Metabarcoding is a technique for analysing DNA sequences that target specific gene regions and plays a crucial role in the identification and classification of different organisms. In particular, 16S rRNA metabarcoding enables the elucidation of complex bacterial and archaeal communities in food. This research presents a novel dataset obtained by 16S rRNA metabarcoding analysis, aimed at elucidating the microbial dynamics of cooked, ready-to-eat ham products over a defined storage period. At the centre of our investigation is the application of association rule mining, an unsupervised machine learning approach in data mining, to uncover latent patterns and relationships within the dataset. At the taxonomic “family” level, our analysis shows a strong association between the presence of Bacillaceae and Staphylococcaceae, with a support of 92%. These families co-occur consistently, with a confidence level of 96%, meaning that the presence of Bacillaceae strongly predicts the presence of Staphylococcaceae. Furthermore, at the genus level, a significant relationship is observed between Brochothrix and Arthrobacter, with both genera co-occurring in approximately 85% of samples in the dataset. Notably, the high confidence level of 98% indicates that the presence of Brochothrix reliably predicts the presence of Arthrobacter. These results provide valuable insights into microbial dynamics in food and demonstrate the effectiveness of advanced data mining techniques in deciphering complex interactions in food ecosystems.
AB - Metabarcoding is a technique for analysing DNA sequences that target specific gene regions and plays a crucial role in the identification and classification of different organisms. In particular, 16S rRNA metabarcoding enables the elucidation of complex bacterial and archaeal communities in food. This research presents a novel dataset obtained by 16S rRNA metabarcoding analysis, aimed at elucidating the microbial dynamics of cooked, ready-to-eat ham products over a defined storage period. At the centre of our investigation is the application of association rule mining, an unsupervised machine learning approach in data mining, to uncover latent patterns and relationships within the dataset. At the taxonomic “family” level, our analysis shows a strong association between the presence of Bacillaceae and Staphylococcaceae, with a support of 92%. These families co-occur consistently, with a confidence level of 96%, meaning that the presence of Bacillaceae strongly predicts the presence of Staphylococcaceae. Furthermore, at the genus level, a significant relationship is observed between Brochothrix and Arthrobacter, with both genera co-occurring in approximately 85% of samples in the dataset. Notably, the high confidence level of 98% indicates that the presence of Brochothrix reliably predicts the presence of Arthrobacter. These results provide valuable insights into microbial dynamics in food and demonstrate the effectiveness of advanced data mining techniques in deciphering complex interactions in food ecosystems.
KW - Association rule mining
KW - DNA sequencing
KW - pattern detection
KW - metabarcoding
KW - unsupervised learning
U2 - 10.1109/IS61756.2024.10705245
DO - 10.1109/IS61756.2024.10705245
M3 - Conference contribution
SN - 9798350350999
T3 - 2024 IEEE 12th International Conference on Intelligent Systems (IS)
BT - Proceedings of 2024 IEEE 12th International Conference on Intelligent Systems (IS)
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th IEEE International Conference on Intelligent Systems
Y2 - 29 August 2024 through 31 August 2024
ER -