Automating the harmonisation of heterogeneous data in digital forensics

Hussam J Mohammed, Nathyan L. Clark, Fudong Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Digital forensics has become an increasingly important tool in the fight against cyber and computer-assisted crime. However, with an increasing range of technologies at people’s disposal, investigators find themselves having to process and analyse many systems (e.g. PC, laptop, tablet, smartphone) in a single case. Unfortunately, current tools operate in an isolated manner, investigating systems and applications on an individual basis. The heterogeneity of the evidence places time constraints and additional cognitive load upon the investigator. Examples of heterogeneity include applications such as messaging (e.g. iMessenger, Viber, Snapchat and WhatsApp), web browsers (e.g. Firefox and Chrome) and file systems (e.g. NTFS, FAT and HFS). Being able to analyse and investigate evidence from across devices and applications based upon categories would enable investigators to query all data at once. This paper proposes a novel algorithm for merging datasets through a ‘characterisation and harmonisation’ process. The characterisation process analyses the nature of the metadata and the harmonisation process merges the data. A series of experiments using real-life forensic datasets is conducted to evaluate the algorithm across five different categories of datasets (i.e. messaging, graphical files, file system, Internet history, and emails), each containing data from different applications across different devices (a total of 22 disparate datasets). The results showed that the algorithm is able to merge all fields successfully, with the exception of some binary-based data found within the messaging datasets (contained within Viber and SMS). The error occurred due to a lack of information for the characterisation process to make a useful determination. However, upon further analysis it was found that the error had a minimal impact on subsequent merged data.
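The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names (`characterise`, `harmonise`), the category labels, the type-guessing rules and the sample datasets (`app_a`, `app_b`) are all assumptions made for illustration. The sketch guesses a coarse category for each field from its values, then merges records from every dataset into rows keyed by category rather than by the original field name.

```python
from collections import defaultdict
from datetime import datetime


def _is_timestamp(value):
    # Assumption: timestamps arrive as ISO 8601 style strings.
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def _is_number(value):
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, str):
        try:
            float(value)
            return True
        except ValueError:
            return False
    return False


def characterise(values):
    """Guess a coarse category for one field from its sample values."""
    non_empty = [v for v in values if v not in (None, "")]
    if not non_empty:
        return "unknown"  # too little information to decide
    if all(isinstance(v, (bytes, bytearray)) for v in non_empty):
        return "binary"
    if all(_is_timestamp(v) for v in non_empty):
        return "timestamp"
    if all(_is_number(v) for v in non_empty):
        return "number"
    return "text"


def harmonise(datasets):
    """Merge records from several datasets into rows keyed by category."""
    merged = []
    for source, records in datasets.items():
        if not records:
            continue
        # Characterise each field once per dataset, using all its values.
        categories = {
            field: characterise([r.get(field) for r in records])
            for field in records[0]
        }
        for record in records:
            row = defaultdict(list)
            for field, value in record.items():
                row[categories.get(field, "unknown")].append(value)
            merged.append({"source": source, **row})
    return merged


# Hypothetical datasets with different field names for the same concepts.
datasets = {
    "app_a": [{"sent_at": "2018-06-29T10:00:00", "body": "hello"}],
    "app_b": [{"time": "2018-06-29 11:30:00", "msg": "hi", "blob": b"\x00\x01"}],
}
merged = harmonise(datasets)
```

After harmonisation, a single query over the `timestamp` or `text` key covers both sources, even though the original field names differ. A field with no usable values is labelled `unknown`, which loosely mirrors the abstract's remark that characterisation can fail when too little information is available.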
Original language: English
Title of host publication: Proceedings of the 17th European Conference on Information Warfare and Security
Subtitle of host publication: ECCWS 2018
Editors: Audun Jøsang
Publisher: Academic Conferences and Publishing International Limited
ISBN (Print): 978-1-911218-85-2
Publication status: Published - 29 Jun 2018
Event: 17th European Conference on Cyber Warfare and Security - University of Oslo, Norway
Duration: 28 Jul 2018 → 29 Jul 2018


Conference: 17th European Conference on Cyber Warfare and Security

