Abstract

Citizen science provides extensive litter data, but inconsistent recording limits its use in environmental modelling and decision-making. We present a scalable AI-assisted framework that harmonises two major UK datasets, Marine Debris Tracker and Litterati, into a unified, spatially detailed resource. Over 460,000 records (2015–2024) were standardised through a rules-to-embeddings-to-LLM cascade (schema-constrained Llama 3.1) for material classification. Items were clustered by material using K-means at a validated 200 m scale and linked to OpenStreetMap amenities within 500 m to identify accumulation hotspots and contextual features such as parks or transport hubs. Plastic dominated nationally, accounting for 71 percent of entries. Integration with UK Census 2021 data enabled demographic and health analyses, in which plastic remained the most common material (68.9 percent). This reproducible framework demonstrates how artificial intelligence can harmonise citizen-science data and enhance spatial modelling to inform targeted pollution prevention and sustainable waste-management strategies.
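
As an illustration of the amenity-linking step described above, the following is a minimal sketch of how a cluster centroid might be matched to amenities within a 500 m radius. The abstract does not describe the implementation, so everything here is an assumption: the function names `haversine_m` and `amenities_within`, the use of great-circle (haversine) distance, and the sample coordinates and amenity labels are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def amenities_within(point, amenities, radius_m=500.0):
    """Return names of amenities no farther than radius_m from point.

    point is (lat, lon); amenities is a list of (name, lat, lon) tuples.
    The 500 m default mirrors the radius quoted in the abstract.
    """
    lat, lon = point
    return [name for name, alat, alon in amenities
            if haversine_m(lat, lon, alat, alon) <= radius_m]


# Hypothetical cluster centroid and amenities (not data from the study).
centroid = (51.5074, -0.1278)
amenities = [
    ("park", 51.5100, -0.1278),     # roughly 290 m north of the centroid
    ("station", 51.5200, -0.1278),  # roughly 1.4 km north of the centroid
]
nearby = amenities_within(centroid, amenities)
```

A brute-force scan like this is adequate for small inputs; at the scale of several hundred thousand records, a spatial index would likely be needed instead.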
Original language: English
Article number: 106823
Number of pages: 18
Journal: Environmental Modelling & Software
Volume: 197
Early online date: 12 Dec 2025
Publication status: Early online - 12 Dec 2025

Keywords

  • Cluster analysis
  • Text mining
  • Natural Language Processing (NLP)
  • Environmental monitoring
  • Data integration
  • Data enrichment
  • Pollution prevention

Title: Artificial intelligence enhanced litter pollution mapping: integrating citizen science with geospatial and social data
