A globally synthesised and flagged bee occurrence dataset and cleaning workflow

Authors
Dorey, James B.; Fischer, Erica E.; Chesshire, Paige R.; Nava Bolaños, Angela; O'Reilly, Robert L.; Bossert, Silas; Collins, Shannon M.; Lichtenberg, Elinor M.; Tucker, Erika M.; Smith Pardo, Allan; Falcon Brindis, Armando; Guevara, Diego A.; Ribeiro, Bruno; de Pedro, Diego; Pickering, John; Hung, Keng Lou James; Parys, Katherine A.; McCabe, Lindsie M.; Rogan, Matthew S.; Minckley, Robert L.; Velazco, Santiago José Elías; Griswold, Terry; Zarrillo, Tracy A.; Jetz, Walter; Sica, Yanina V.; Orr, Michael C.; Guzman, Laura Melissa; Ascher, John S.; Hughes, Alice C.; Cobb, Neil S.
Publication year
2023
Language
English
Resource type
article
Status
published version
Description
Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represent a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, “cleaned” and “flagged-but-uncleaned”. The BeeBDC package, with its online documentation, lets end users modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.
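The difference between the “flagged-but-uncleaned” and “cleaned” formats described above can be illustrated with a minimal sketch. This is not the BeeBDC package or its API (BeeBDC is an R package and its functions are not reproduced here); it is a stdlib-only Python illustration, and the Record fields, the flag names ("missing_year", "missing_country", "duplicate"), and the sample rows are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    # Hypothetical, simplified occurrence record (not the real data schema).
    species: str
    country: str
    year: Optional[int]
    source: str
    flags: set = field(default_factory=set)


def flag_records(records):
    """Attach quality flags in place and return the same list; nothing is dropped."""
    seen = set()
    for rec in records:
        if rec.year is None:
            rec.flags.add("missing_year")
        if not rec.country.strip():
            rec.flags.add("missing_country")
        # Crude harmonisation for duplicate detection: trim and lower-case.
        key = (rec.species.strip().lower(), rec.country.strip().lower(), rec.year)
        if key in seen:
            rec.flags.add("duplicate")
        seen.add(key)
    return records


def clean(records, ignore=frozenset()):
    """Keep records that carry no flags other than those listed in `ignore`."""
    return [rec for rec in records if not (rec.flags - set(ignore))]


records = [
    Record("Apis mellifera", "Australia", 2001, "repo_a"),
    Record("apis mellifera ", "australia", 2001, "repo_b"),  # same record, second source
    Record("Bombus terrestris", "", 1998, "repo_a"),         # country missing
    Record("Megachile rotundata", "Mexico", None, "repo_c"), # year missing
]

flagged = flag_records(records)                     # flagged-but-uncleaned: all rows kept
cleaned = clean(flagged)                            # strict filter: unflagged rows only
relaxed = clean(flagged, ignore={"missing_year"})   # user-adjusted filter
```

Keeping the flags on every record, rather than deleting rows during flagging, is what would let a user relax or tighten the filter afterwards, as in the `relaxed` example.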
Affiliation: Dorey, James B.. Flinders University; Australia
Affiliation: Fischer, Erica E.. King’s College London; United Kingdom
Affiliation: Chesshire, Paige R.. Northern Arizona University; United States
Affiliation: Nava Bolaños, Angela. UNAM Campus Juriquilla; Mexico
Affiliation: O'Reilly, Robert L.. Flinders University; Australia
Affiliation: Bossert, Silas. National Museum of Natural History; United States. Washington State University; United States
Affiliation: Collins, Shannon M.. University of North Texas; United States
Affiliation: Lichtenberg, Elinor M.. University of North Texas; United States
Affiliation: Tucker, Erika M.. Biodiversity Outreach Network; United States
Affiliation: Smith Pardo, Allan. United States Department of Agriculture; United States
Affiliation: Falcon Brindis, Armando. University of Kentucky; United States
Affiliation: Guevara, Diego A.. Universidad Nacional de Colombia; Colombia
Affiliation: Ribeiro, Bruno. Universidade Federal de Goiás; Brazil
Affiliation: de Pedro, Diego. Centro de Investigación Científica y de Educación Superior de Ensenada; Mexico
Affiliation: Pickering, John. Discover Life; United States
Affiliation: Hung, Keng Lou James. Oklahoma State University; United States
Affiliation: Parys, Katherine A.. United States Department of Agriculture; United States
Affiliation: McCabe, Lindsie M.. United States Department of Agriculture; United States
Affiliation: Rogan, Matthew S.. University of Yale; United States
Affiliation: Minckley, Robert L.. University of Rochester; United States
Affiliation: Velazco, Santiago José Elías. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Biología Subtropical. Universidad Nacional de Misiones. Instituto de Biología Subtropical; Argentina
Affiliation: Griswold, Terry. United States Department of Agriculture; United States
Affiliation: Zarrillo, Tracy A.. The Connecticut Agricultural Experiment Station; United States
Affiliation: Jetz, Walter. University of Yale; United States
Affiliation: Sica, Yanina V.. University of Yale; United States
Affiliation: Orr, Michael C.. Staatliches Museum Für Naturkunde Stuttgart; Germany. Institute Of Zoology Chinese Academy Of Sciences; China
Affiliation: Guzman, Laura Melissa. University of Southern California; United States
Affiliation: Ascher, John S.. National University Of Singapore; Singapore
Affiliation: Hughes, Alice C.. The University Of Hong Kong; Hong Kong
Affiliation: Cobb, Neil S.. Biodiversity Outreach Network; United States
Subject
R package
Bee data
Ecoinformatic
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/218897

Subject classification
https://purl.org/becyt/ford/1.6
https://purl.org/becyt/ford/1
Publication date
2023-11
Handle
http://hdl.handle.net/11336/218897
Citation
Dorey, James B.; Fischer, Erica E.; Chesshire, Paige R.; Nava Bolaños, Angela; O'Reilly, Robert L.; et al.; A globally synthesised and flagged bee occurrence dataset and cleaning workflow; Nature Research; Scientific Data; 10; 1; 11-2023; 1-17
ISSN
2052-4463
Publisher
Nature Research
Publisher's version
https://www.nature.com/articles/s41597-023-02626-w
DOI
10.1038/s41597-023-02626-w
Format
application/pdf