Using photographic records to quantify accuracy of bird identifications in citizen science data

Authors
Gorleri, Fabricio Carlos; Jordan, Emilio Ariel; Roesler, Carlos Ignacio; Monteleone, Diego; Areta, Juan Ignacio
Publication year
2023
Language
English
Resource type
article
Status
published version
Description
Citizen science data are increasingly used for biodiversity monitoring. However, concerns are often raised over the accuracy of species identifications in citizen science databases, as data are collected mostly by non-professionals. Misidentifications can simultaneously generate two error types: false positives (erroneous reports of a species) and false negatives (lack of reports of the misidentified species). Large-scale assessments of identification errors should provide insights into the strengths and weaknesses of citizen science data. Here we show that citizen science photographic data for birds are trustworthy overall, although problems arise in hard-to-identify bird groups. We reviewed over 104 000 images of 377 passerine species from the southern Neotropics (Argentina) stored in eBird – a large citizen science platform – and quantified erroneous reports to calculate precision and recall metrics as measures for data accuracy. Precision increases with fewer false positives and recall increases with fewer false negatives; hence, high values of precision and recall will mirror a higher data accuracy. We found that 97% of the photos of all species were correctly identified. Most species (77%; n = 291) showed high accuracy in their identifications (precision and recall > 95%), with 122 species showing no errors. A few hard-to-identify species (10%; n = 40) showed low levels of data quality (63–90% precision or recall). Similarly, few species (12%; n = 46) exhibited intermediate precision or recall scores (90–95%). Further, we uncovered the existence of a complex network of cross-identifications composed of 272 species, with a predominance of tyrant flycatchers and ovenbirds, reflecting the strong traffic of errors that occurs within these families. To our knowledge, our study provides the first large-scale quantification of identification errors in photos submitted by citizen science contributors. We underscore the relevance of performing such assessments to understand how identification errors are distributed across a database before analysing data, and provide tools for citizen science stakeholders to direct more specific efforts toward species that need an improvement in data quality.
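The precision and recall measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)), which match the abstract's wording but are not spelled out in this record. The `SpeciesCounts` structure, the function names, and the exact handling of the 90% and 95% boundaries are hypothetical choices made for illustration, not the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesCounts:
    """Photo counts for one species (hypothetical structure, not from the paper)."""
    true_positives: int   # photos labelled as the species and confirmed correct
    false_positives: int  # photos labelled as the species but showing another species
    false_negatives: int  # photos of the species that were filed under another name


def precision(c: SpeciesCounts) -> float:
    """TP / (TP + FP); rises as false positives fall. Assumes TP + FP > 0."""
    return c.true_positives / (c.true_positives + c.false_positives)


def recall(c: SpeciesCounts) -> float:
    """TP / (TP + FN); rises as false negatives fall. Assumes TP + FN > 0."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def accuracy_tier(c: SpeciesCounts) -> str:
    """Bucket a species by its weaker metric, following the abstract's ranges.

    Assumption: "high" needs both metrics above 95%, "intermediate" covers
    90-95% inclusive, and anything below 90% is "low".
    """
    worst = min(precision(c), recall(c))
    if worst > 0.95:
        return "high"
    if worst >= 0.90:
        return "intermediate"
    return "low"


if __name__ == "__main__":
    # Made-up counts for illustration only; these are not data from the study.
    example = SpeciesCounts(true_positives=90, false_positives=10, false_negatives=0)
    print(precision(example), recall(example), accuracy_tier(example))
```

Taking the weaker of the two metrics reflects the abstract's phrasing, where the high tier requires precision and recall above 95% while the lower tiers are defined by precision or recall falling in the stated range.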
Affiliation: Gorleri, Fabricio Carlos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Salta. Instituto de Bio y Geociencias del NOA. Universidad Nacional de Salta. Facultad de Ciencias Naturales. Museo de Ciencias Naturales. Instituto de Bio y Geociencias del NOA; Argentina
Affiliation: Jordan, Emilio Ariel. Provincia de Entre Ríos. Centro de Investigaciones Científicas y Transferencia de Tecnología a la Producción. Universidad Autónoma de Entre Ríos. Centro de Investigaciones Científicas y Transferencia de Tecnología a la Producción. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Centro de Investigaciones Científicas y Transferencia de Tecnología a la Producción; Argentina
Affiliation: Roesler, Carlos Ignacio. Asociación Ornitológica del Plata; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Monteleone, Diego. Asociación Ornitológica del Plata; Argentina
Affiliation: Areta, Juan Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Salta. Instituto de Bio y Geociencias del NOA. Universidad Nacional de Salta. Facultad de Ciencias Naturales. Museo de Ciencias Naturales. Instituto de Bio y Geociencias del NOA; Argentina
Subject
ARGENTINA
EBIRD
FALSE NEGATIVES
FALSE POSITIVES
MISIDENTIFICATIONS
NEOTROPICS
NETWORK ANALYSIS
PASSERINES
PRECISION
RECALL
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/205123

id CONICETDig_a0846c8661d1f6274f542326f6e85fda
oai_identifier_str oai:ri.conicet.gov.ar:11336/205123
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.6
https://purl.org/becyt/ford/1
publishDate 2023
dc.date.none.fl_str_mv 2023-04
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/205123
Gorleri, Fabricio Carlos; Jordan, Emilio Ariel; Roesler, Carlos Ignacio; Monteleone, Diego; Areta, Juan Ignacio; Using photographic records to quantify accuracy of bird identifications in citizen science data; Wiley Blackwell Publishing, Inc; Ibis; 165; 2; 4-2023; 458-471
0019-1019
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://onlinelibrary.wiley.com/doi/10.1111/ibi.13137
info:eu-repo/semantics/altIdentifier/doi/10.1111/ibi.13137
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Wiley Blackwell Publishing, Inc
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar