Development and testing of a text-mining approach to analyse patients' comments on their experiences of colorectal cancer care

Richard Wagland, Alejandra Recio-Saucedo, Michael Simon, Michael Bracher, Katherine Hunt, Claire Foster, Amy Downing, Adam Glaser, Jessica Corner

Research output: Contribution to journal › Article › peer-review


Background - Quality of cancer care may greatly impact on patients' health-related quality of life (HRQoL). Free-text responses to patient-reported outcome measures (PROMs) provide rich data, but their analysis is time- and resource-intensive. This study developed and tested a learning-based text-mining approach to facilitate analysis of patients' experiences of care and to develop an explanatory model illustrating the impact of care on HRQoL.

Methods - Respondents to a population-based survey of colorectal cancer survivors provided free-text comments regarding their experience of living with and beyond cancer. An existing coding framework was tested and adapted, and the adapted framework informed learning-based text mining of the data. Machine-learning algorithms were trained to identify comments relating to patients' specific experiences of service quality, and the retrieved comments were verified by manual qualitative analysis. Comparisons between coded retrieved comments and an HRQoL measure (EQ5D) were explored.
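
The retrieval step described in the Methods can be illustrated with a minimal sketch. The abstract does not name the algorithms used, so the multinomial naive Bayes classifier below is only a stand-in, and the training comments, the labels "care" and "other", and the names NaiveBayes and retrieve are hypothetical, invented purely for illustration. It shows the general pattern: train on manually coded comments, then retrieve unseen comments predicted to concern care experiences for manual verification.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)  # log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def retrieve(model, comments, target="care"):
    """Return the comments the model assigns to the target category."""
    return [c for c in comments if model.predict(c) == target]


# Hypothetical, hand-written training comments (not taken from the study).
TRAIN = [
    ("The nurses explained my treatment clearly and were very kind", "care"),
    ("I was given no information about side effects after surgery", "care"),
    ("No follow-up appointment from the hospital after treatment ended", "care"),
    ("The hospital staff and my consultant were excellent", "care"),
    ("I enjoy gardening and walking my dog every day", "other"),
    ("My family took a holiday by the sea this summer", "other"),
    ("I retired last year and spend time with my grandchildren", "other"),
    ("The weather has been cold so I stay at home", "other"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
unseen = [
    "The hospital nurses gave me no information about treatment",
    "I spend time walking by the sea",
]
retrieved = retrieve(model, unseen)
```

In a workflow like the one described, the retrieved comments would then go to manual qualitative coding rather than being treated as final classifications.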

Results - The survey response rate was 63.3% (21 802/34 467), and 25.8% of respondents (n=5634) provided free-text comments. Of the retrieved comments on experiences of care (n=1688), over half (n=1045, 62%) described positive care experiences. Most negative experiences concerned a lack of post-treatment care (n=191, 11% of retrieved comments) and insufficient information concerning self-management strategies (n=135, 8%) or treatment side effects (n=160, 9%). Associations existed between HRQoL scores and coded algorithm-retrieved comments. Analysis indicated that the mechanism by which service quality impacted on HRQoL was the extent to which services prevented or alleviated challenges associated with disease and treatment burdens.
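
As a quick consistency check, the percentages reported in the Results can be recomputed from the stated counts. The sketch below uses only figures given in the abstract; the helper name pct and the dictionary keys are illustrative choices, not terms from the study.

```python
def pct(part, whole, digits=0):
    """Return part as a percentage of whole, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


SURVEYED = 34467      # people sent the survey
RESPONDED = 21802     # survey respondents
COMMENTED = 5634      # respondents who provided free-text comments
RETRIEVED = 1688      # retrieved comments on experiences of care

# Counts of retrieved comments by category, as reported in the abstract.
CATEGORIES = {
    "positive care experiences": 1045,
    "lack of post-treatment care": 191,
    "insufficient self-management information": 135,
    "insufficient side-effect information": 160,
}

response_rate = pct(RESPONDED, SURVEYED, 1)
comment_rate = pct(COMMENTED, RESPONDED, 1)
shares = {name: pct(n, RETRIEVED) for name, n in CATEGORIES.items()}

print(f"Response rate: {response_rate}%")
print(f"Respondents commenting: {comment_rate}%")
for name, share in shares.items():
    print(f"{name}: {share:.0f}% of retrieved comments")
```

The recomputed values agree with the reported figures when each is rounded to the precision used in the abstract.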

Conclusions - Learning-based text-mining techniques were found to be useful and practical tools for identifying specific free-text comments within a large dataset, facilitating resource-efficient qualitative analysis. This method should be considered for future PROM analysis to inform policy and practice. Study findings indicated that perceived care quality directly impacts on HRQoL.

Original language: English
Pages (from-to): 604-614
Number of pages: 11
Journal: BMJ Quality and Safety
Issue number: 8
Publication status: Published - 28 Oct 2015


  • Aged
  • Aged, 80 and over
  • Colorectal Neoplasms/psychology
  • Data Mining/methods
  • Female
  • Humans
  • Male
  • Middle Aged
  • Patient Satisfaction/statistics & numerical data
  • Quality of Health Care/statistics & numerical data
  • Quality of Life/psychology
  • Surveys and Questionnaires
