Automating electronic health record data quality assessment

Obinwa Ozonze, Philip J. Scott, Adrian A. Hopgood*

*Corresponding author for this work

Research output: Contribution to journalLiterature reviewpeer-review

49 Downloads (Pure)


Information systems such as Electronic Health Record (EHR) systems are susceptible to data quality (DQ) issues. Given the growing importance of EHR data, there is an increasing demand for strategies and tools to help ensure that available data are fit for use. However, developing reliable data quality assessment (DQA) tools necessary for guiding and evaluating improvement efforts has remained a fundamental challenge. This review examines the state of research on operationalising EHR DQA, mainly automated tooling, and highlights necessary considerations for future implementations. We reviewed 1841 articles from PubMed, Web of Science, and Scopus published between 2011 and 2021. 23 DQA programs deployed in real-world settings to assess EHR data quality (n = 14), and a few experimental prototypes (n = 9), were identified. Many of these programs investigate completeness (n = 15) and value conformance (n = 12) quality dimensions and are backed by knowledge items gathered from domain experts (n = 9), literature reviews and existing DQ measurements (n = 3). A few DQA programs also explore the feasibility of using data-driven techniques to assess EHR data quality automatically. Overall, the automation of EHR DQA is gaining traction, but current efforts are fragmented and not backed by relevant theory. Existing programs also vary in scope, type of data supported, and how measurements are sourced. There is a need to standardise programs for assessing EHR data quality, as current evidence suggests their quality may be unknown.

Original languageEnglish
Article number23 (2023)
Number of pages16
JournalJournal of Medical Systems
Issue number1
Publication statusPublished - 13 Feb 2023


  • Electronic health record (EHR)
  • Data quality
  • Data quality assessment
  • Automation

Cite this