Understanding the reliability of citizen science observational data using item response models

Bibliographic Details
Main Authors: Edgar Santos‐Fernandez, Kerrie Mengersen
Format: Article
Language: English
Published: Wiley 2021-08-01
Series: Methods in Ecology and Evolution
Online Access: https://doi.org/10.1111/2041-210X.13623
Description
Summary: Citizen science projects have become increasingly popular in many fields, including ecology. However, the quality of the data they produce is frequently debated within the scientific community. Modern citizen science implementations therefore require measures of the users' proficiency. We introduce a new item response framework that quantifies a citizen scientist's ability while taking into account the difficulty of the task. We focus on citizen science programs involving the classification of images. Our approach accommodates spatial autocorrelation within the item difficulties, and provides deeper insights and relevant ecological measures of species and site‐related difficulties, discriminatory power and guessing behaviour. Distinguishing highly capable from less skilled participants can facilitate selective use of data in analyses and more targeted training programs for citizen scientists. This paper also addresses challenges in fitting such models to very large datasets. We found that the suggested methods outperform the traditional item response models in terms of RMSE, accuracy and WAIC, based on leave‐one‐out cross‐validation on simulated and empirical data. We present a comprehensive implementation using a case study of species identification in the Serengeti, Tanzania. R and Stan code is provided for full reproducibility. Multiple statistical illustrations and visualizations are given, which allow extrapolation to a wide range of citizen science ecological problems.
ISSN: 2041-210X
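
Illustration: the summary refers to a participant's ability, task difficulty, discriminatory power and guessing behaviour. These are the ingredients of the textbook three-parameter logistic (3PL) item response function, sketched below in Python. This is only a minimal sketch of the standard form and is not taken from the article: the function name `p_correct` and its parameter names are hypothetical, and the authors' actual model (which also includes spatial autocorrelation in the difficulties and is implemented in R and Stan) may differ.

```python
import math


def p_correct(ability, difficulty, discrimination=1.0, guessing=0.0):
    """Textbook 3PL probability of a correct classification.

    Hypothetical helper for illustration only; not the article's code.
    ability        -- latent skill of the participant
    difficulty     -- latent difficulty of the item (e.g. an image)
    discrimination -- how sharply the item separates skill levels
    guessing       -- probability of a correct answer by chance alone
    """
    # Logistic curve in the gap between ability and difficulty.
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))
    # The guessing parameter sets a floor under the curve.
    return guessing + (1.0 - guessing) * logistic


if __name__ == "__main__":
    # A participant whose ability equals the item difficulty,
    # on an item with a 25% chance of guessing correctly.
    print(p_correct(ability=0.0, difficulty=0.0, guessing=0.25))
```

In this form, the probability rises with ability, falls with difficulty, and never drops below the guessing floor, which is why ability estimates from such a model account for how hard each task was.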