An exploration of factors potentially affecting the perception and interpretation of medical images used in higher education
Student thesis: Doctoral Thesis
Much work is currently being undertaken to explore the impact of varying factors, such as compression and image display parameters, upon both measurable and perceived image quality in the clinical setting. However, little specific work was found that related to the effect of these factors within Higher Education, where high numbers of students, non-dedicated lecture theatres and a large number and variety of display devices result in many conditions that could impact upon the quality of digital radiographic images. Additionally, the College of Radiographers identified (2006) that a radiographer comment accompanying radiographs may become a core competency. The aim of this thesis is to present and reflect upon a programme of research undertaken to explore which factors impacted upon students’ summary measures of performance, and to begin to establish guidelines to ensure that images are presented optimally to the students without creating unnecessary work for the academic staff. The effect of differing summary measures was also explored. A series of experiments was undertaken utilising volunteers from an undergraduate radiography programme.

Research question
The research question was: “What factors might potentially affect the perception and interpretation of medical images used in Higher Education?”

Methods
A series of six experiments was designed to evaluate the following factors:
1. The effect of compression upon diagnostic accuracy and perceived image quality;
2. The students’ perception of brightness and contrast changes of digital projectional radiographs and the effect of education upon this;
3. The ability of a detailed digital test image to discern limitations of a system;
4. The effect of image size, display device standardisation and image optimisation on summary measures of performance;
5. The ability of students to report consistently from digital test images;
6. The effect of differing marking criteria, confidence scales and summary measures of performance.

Results
This programme of research demonstrated that for digital projectional appendicular radiographs there was a significant difference between the levels of compression that observers preferred (p<0.05). However, there was no significant difference in accuracy for images reported uncompressed or at lossy levels of 40:1 (JPEG). Higher levels of compression were easily perceived, but low levels were not. It also confirmed other work, which established that low levels of compression were preferred by the human visual system due to the slight softening effect of the JPEG algorithm.

Whilst individuals’ perception of brightness and contrast changes differed, the mean for groups of students was not significantly different and education did not have a significant effect. However, there was a significant difference (p<0.05) between those aged 30 and under and those over 30 in the level of perceived change, but not in the selection of the last acceptable image. A mid-level grey background was shown to reduce perceived error of change compared to black or white backgrounds. Radar plots within this context are proposed as a way of identifying ideal images from students’ responses. Images corrected for the gamma of the system were identified as optimal by the cohorts.

Images at 50% resolution stretched to 100%, the standardisation of display devices and image optimisation did not significantly affect student summary measures of performance. However, this part of the study lacked power because there were fewer participants than initially anticipated.

The summary measure of performance identified as optimal was the area under an AFROC curve, created from a five-point category scale. This scale should be used both by the observers, to categorise their own confidence, and by the marker, to rate the confidence conveyed by the observers’ comments.
This will allow a kappa value to be calculated, giving feedback on the level of conveyed confidence.

Conclusions
This programme of research has identified a number of factors that warrant more detailed research within the field of Higher Education. One is a re-evaluation of the effect of the year group on the proposed quality factor, as this research seems to indicate that education does have a positive effect on the reporting scores from a digital test image. In addition, there seems to be scope for considering the radar plot as a method of identifying where the ideal image lies. A range of minimum standards, as proven by these experiments and taken from the literature, are proposed as best practice for lecture presentation and assessment. Recommendations are made for further research into the effect of several parameters where power was low. This research has established some of the ground rules for improving the display and assessment of medical images in Higher Education.
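
As an illustration of the kappa calculation mentioned in the Results, the following is a minimal sketch of how agreement between the observers' self-rated confidence and the marker's rating of conveyed confidence might be computed on a five-point scale. The abstract does not state which form of kappa was used, so the unweighted Cohen's kappa here is an assumption, and the function name `cohen_kappa` and the example ratings are hypothetical and are not data from the thesis.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Assumption: the thesis does not specify the kappa variant; this is the
    plain (unweighted) form, which treats every disagreement as equal.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both rating lists must be non-empty and the same length")

    n = len(ratings_a)

    # Observed agreement: proportion of cases where the two ratings match.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected chance agreement, from each rater's marginal category counts.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical five-point confidence ratings for ten reported images
# (1 = least confident, 5 = most confident); not data from the thesis.
observer_confidence = [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]
marker_confidence = [1, 2, 3, 4, 5, 4, 4, 3, 3, 1]

kappa = cohen_kappa(observer_confidence, marker_confidence)
print(f"kappa = {kappa:.2f}")
```

Because the confidence scale is ordinal, a weighted kappa, which penalises a one-point disagreement less than a four-point one, may be a better fit in practice; the unweighted form is used here only to keep the sketch short.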