Capturing Quality: Retaining provenance for curated volunteer monitoring data

Authors S. Andrew Sheppard, Loren Terveen
Conference WikiSym 2011
Summary This paper describes a preliminary ethnographic study of the quality assurance practices of River Watch, a community-based monitoring program in the basin of the Red River of the North. Contrary to expectations, the rigorous QA practices appear to enhance, rather than hinder, the educational goals of the program.
Clarification To be clear, "quality" is not really a verb - it's a noun or an adjective depending on context. The title of this paper is a play on words, intended to reflect the idea that quality requires action and is not a static attribute.
Fulltext Download PDF, Official ACM Version


Citizen science is becoming more valuable as a potential source of environmental data. Involving citizens in data collection has the added educational benefits of increased scientific awareness and local ownership of environmental concerns. However, a common concern among domain experts is the presumed lower quality of data submitted by volunteers. In this paper, we explore data quality assurance practices in River Watch, a community-based monitoring program in the Red River basin. We investigate how the participants in River Watch understand and prioritize data quality concerns. We found that data quality in River Watch is primarily maintained through universal adherence to standard operating procedures, but there remain areas where technological intervention may help. We also found that rigorous data quality assurance practices appear to enhance rather than hinder the educational goals of the program. We draw implications for the design of quality assurance mechanisms for River Watch and other citizen science projects.

Dimensions of Data Quality


Authors S. Andrew Sheppard, Andrea Wiggins, Loren Terveen
Conference CSCW 2014
Summary Retaining provenance metadata for volunteer monitoring is a challenge due to the offline nature of participation and review. In this paper, we discuss the citizen science workflow and propose the ERAV model to facilitate data exchange and integration.
Fulltext Download PDF, Official ACM Version


The real-world nature of field-based citizen science involves unique data management challenges that distinguish it from projects that involve only Internet-mediated activities. In particular, many data contribution and review practices are accomplished offline via paper or general-purpose software like Excel. This can lead to integration challenges when attempting to implement project-specific ICT with full revision and provenance tracking. In this work, we explore some of the current challenges and opportunities in implementing ICT for managing volunteer monitoring data. Our two main contributions are a general outline of the workflow tasks common to field-based data collection, and a novel data model for preserving provenance metadata that allows for ongoing data exchange across disparate technical systems and participant skill levels. We conclude with applications for other domains, such as hydrologic forecasting and crisis informatics, as well as directions for future research.

Volunteer Monitoring Workflow