The linguistic construction of online communities in citizen science: a corpus-driven approach

  • Claudia Viggiano

Student thesis: Doctoral Thesis


Citizen science (CS) is a rising form of crowdsourced research that harvests the contributions of volunteers for the advancement of science, especially with regard to collecting or classifying large amounts of data.
Online forms of CS have grown exponentially since the introduction of platforms which host a diverse range of projects; research on CS has grown accordingly, yet it has mainly focused on assessing quantitative metrics such as success and motivations. This thesis presents the first comprehensive study of the language and interactions of a CS community (Zooniverse), aiming to provide a taxonomy of its linguistic environment and, ultimately, to inform our understanding of the nature of such spaces.
Specifically, this work focuses on Zooniverse’s project-adjacent discussion boards, where volunteers ask questions, report issues, learn, and chat with others. The thesis uses a 6-million-word corpus spanning nearly six years and 43 project boards. Combining corpus linguistics methods, discourse analysis, sociolinguistics and the community of inquiry framework (Garrison et al., 2000), it applies keyword analysis alongside other tools to (a) explore the ‘aboutness’ of the Zooniverse corpus by focusing on community-specific lexicon, (b) analyse the diachronic evolution of the platform through its keywords, and (c) explore the roles and contributions of central users through user corpora.
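As a rough illustration of what keyword analysis involves, the sketch below ranks words that are unusually frequent in a target corpus relative to a reference corpus. The abstract does not state which keyness statistic or software the thesis used; log-likelihood is assumed here only because it is a common choice in corpus linguistics. The function names (`log_likelihood`, `keywords`) and the toy token lists are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.

    a: frequency in the target corpus, b: frequency in the reference corpus,
    c: total tokens in the target corpus, d: total tokens in the reference corpus.
    """
    if c <= 0 or d <= 0:
        raise ValueError("both corpora must contain at least one token")
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / e1)
    if b > 0:
        score += b * math.log(b / e2)
    return 2 * score


def keywords(target_tokens, reference_tokens, top_n=10):
    """Return up to top_n (word, score) pairs that are over-represented in the target."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c = sum(target.values())
    d = sum(reference.values())
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        # Keep only words relatively more frequent in the target (positive keywords).
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy data for illustration only; not drawn from the Zooniverse corpus.
board_tokens = "the galaxy is a spiral galaxy the galaxy".split()
general_tokens = "the cat sat on the mat the dog".split()
top_keywords = keywords(board_tokens, general_tokens, top_n=3)
```

In this toy case "galaxy" ranks first, since it occurs three times in the target and never in the reference. A diachronic analysis could, under the same assumptions, apply the same comparison to sub-corpora sliced by year.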
Findings show that Zooniverse is a supportive, goal-oriented community based not only on knowledge exchange and task completion, but also on a strong sense of community, which is built through continued interactions and through the creation of a strong in-group identity, often realised through expressions unique to the community. Allegiance is thus formed and fostered through creative expressions of social presence (Lander, 2015), leading to continued engagement and task completion. However, the data also point to an inherent tension between experienced users, who are proficient in scientific terminology, and newcomers, who may feel alienated by it. These findings provide insight into the nature of CS and other goal-oriented communities; specifically, the findings highlight that encouraging meaningful social interactions while fostering an inclusive and accessible environment can ultimately have an impact on the design, development and retention of CS platforms.
Date of Award: Mar 2021
Original language: English
Awarding Institution
  • University of Portsmouth
Supervisor: Glenn Hadikin (Supervisor), Catherine Jane Carroll-Meehan (Supervisor) & Mario Saraceni (Supervisor)
