Constructing theoretically informed measures of pause duration in experimentally manipulated writing

Sophie Marie Hall, Veerle M. Baaijen, David Galbraith

Research output: Contribution to journal; Article; peer-review


This paper argues that traditional threshold-based approaches to the analysis of pauses in writing fail to capture the complexity of the cognitive processes involved in text production. It proposes that, to capture these processes, pause analysis should focus on the transition times between linearly produced units of text. Following a review of some of the problematic features of traditional pause analysis, the paper is divided into two sections. These are designed to demonstrate: (i) how to isolate relevant transitions within a text and calculate their durations; and (ii) the use of mixture modelling to identify structure within the distributions of pauses at different locations. The paper uses a set of keystroke logs collected from 32 university students writing argumentative texts about current affairs topics to demonstrate these methods. In the first section, it defines how pauses are calculated using a reproducible framework, explains the distinction between linear and non-linear text transitions, and explains how relevant sections of text are identified. It provides Excel scripts for automatically identifying relevant pauses and calculating their duration. The second section applies mixture modelling to linear transitions at sentence, sub-sentence, between-word and within-word boundaries for each participant. It concludes that these transitions cannot be characterised by a single distribution of “cognitive” pauses. It proposes, further, that transitions between words should be characterised by a three-component distribution reflecting lexical, supra-lexical and reflective processes, while transitions at other text locations can be modelled by two-component distributions distinguishing between fluent and less fluent or more reflective processing.
The paper concludes by recommending that, rather than imposing fixed thresholds to distinguish processes, researchers should instead impose a common set of theoretically informed distributions on the data and estimate how the parameters of these distributions vary for different individuals and under different conditions.
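
The idea of replacing a fixed pause threshold with a fitted mixture can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the component family or the fitting procedure, so the sketch assumes Gaussian components on log-transformed transition times, fitted by a plain EM loop. The function names (fit_mixture, bic) and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture(log_pauses, k, iters=200, min_sd=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM (assumed model, see lead-in).

    Returns (weights, means, sds, log_likelihood) with components sorted by mean.
    The log-likelihood is the value computed in the final E-step.
    """
    xs = sorted(log_pauses)
    n = len(xs)
    grand_mean = sum(xs) / n
    grand_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n)
    # Deterministic start: means at evenly spaced quantiles, common spread.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    sds = [max(grand_sd, min_sd)] * k
    weights = [1.0 / k] * k
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        loglik = 0.0
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
    order = sorted(range(k), key=lambda j: means[j])
    return ([weights[j] for j in order], [means[j] for j in order],
            [sds[j] for j in order], loglik)


def bic(loglik, k, n):
    """Bayesian information criterion; a k-component mixture has 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


# Synthetic log-durations (NOT data from the study): a "fluent" component
# centred near exp(5.0) ms and a slower component centred near exp(7.3) ms.
rng = random.Random(1)
synthetic = ([rng.gauss(5.0, 0.3) for _ in range(300)]
             + [rng.gauss(7.3, 0.5) for _ in range(100)])
for k in (1, 2, 3):
    w, m, s, ll = fit_mixture(synthetic, k)
    print(k, [round(v, 2) for v in m], round(bic(ll, k, len(synthetic)), 1))
```

Under these assumptions, each participant's transitions at a given text location would be fitted separately, and the estimated weights, means and spreads (rather than counts above a threshold) would be the quantities compared across individuals and conditions. Comparing BIC across values of k is one common way to choose the number of components; whether the paper uses it is not stated in the abstract.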
Original language: English
Number of pages: 29
Journal: Reading and Writing
Early online date: 18 Apr 2022
Publication status: Early online - 18 Apr 2022
Externally published: Yes


  • pause analysis
  • writing processes
  • keystroke analysis
  • UKRI
  • ESRC

