Galaxy Zoo: Clump Scout - design and first application of a two-dimensional aggregation tool for citizen science

Hugh Dickinson*, Dominic Adams, Vihang Mehta, Claudia Scarlata, Lucy Fortson, Stephen Serjeant, Coleman Krawczyk, Sandor Kruk, Chris Lintott, Kameswara Bharadwaj Mantha, Brooke D. Simmons, Mike Walmsley

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review


Galaxy Zoo: Clump Scout is a web-based citizen science project designed to identify and spatially locate giant star-forming clumps in galaxies that were imaged by the Sloan Digital Sky Survey Legacy Survey. We present a statistically driven software framework that is designed to aggregate two-dimensional annotations of clump locations provided by multiple independent Galaxy Zoo: Clump Scout volunteers and generate a consensus label that identifies the locations of probable clumps within each galaxy. The statistical model on which our framework is based allows us to assign false-positive probabilities to each of the clumps we identify, to estimate the skill level of each volunteer who contributes to Galaxy Zoo: Clump Scout, and to quantitatively assess the reliability of the consensus labels that are derived for each subject. We apply our framework to a data set containing 3 561 454 two-dimensional points, which constitute 1 739 259 annotations of 85 286 distinct subjects provided by 20 999 volunteers. Using this data set, we identify 128 100 potential clumps distributed among 44 126 galaxies. This data set can be used to study the prevalence and demographics of giant star-forming clumps in low-redshift galaxies. The code for our aggregation software framework is publicly available at:

Original language: English
Pages (from-to): 5882-5911
Number of pages: 30
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 4
Early online date: 12 Oct 2022
Publication status: Published - 1 Dec 2022


  • galaxies: structure
  • methods: data analysis
  • methods: statistical
  • software: data analysis
  • software: public release
  • UKRI
  • STFC
  • ST/P000584/1
  • EP/V030302/1
