A globally synthesised and flagged bee occurrence dataset and cleaning workflow
- Authors
- Dorey, James B.; Fischer, Erica E.; Chesshire, Paige R.; Nava Bolaños, Angela; O´Reilly, Robert L.; Bossert, Silas; Collins, Shannon M.; Lichtenberg, Elinor M.; Tucker, Erika M.; Smith Pardo, Allan; Falcon Brindis, Armando; Guevara, Diego A.; Ribeiro, Bruno; de Pedro, Diego; Pickering, John; Hung, Keng Lou James; Parys, Katherine A.; McCabe, Lindsie M.; Rogan, Matthew S.; Minckley, Robert L.; Velazco, Santiago José Elías; Griswold, Terry; Zarrillo, Tracy A.; Jetz, Walter; Sica, Yanina V.; Orr, Michael C.; Guzman, Laura Melissa; Ascher, John S.; Hughes, Alice C.; Cobb, Neil S.
- Publication year
- 2023
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, “cleaned” and “flagged-but-uncleaned”. The BeeBDC package with online documentation provides end users the ability to modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.
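The flag-then-filter idea described in the abstract can be sketched in a few lines. The following is a minimal Python illustration and not the BeeBDC R API: the toy records, the four flag names, and the helper functions `flag_records` and `clean_records` are all hypothetical, chosen only to show how record-level flags can yield both a "flagged-but-uncleaned" table and a "cleaned" table whose filtering rules the end user can relax.

```python
from datetime import date

# Hypothetical toy occurrence records (field names loosely follow Darwin Core).
records = [
    {"id": "a1", "scientificName": "Apis mellifera",
     "decimalLatitude": -34.9, "decimalLongitude": 138.6, "eventDate": "2019-03-02"},
    {"id": "a2", "scientificName": "Apis mellifera",
     "decimalLatitude": -34.9, "decimalLongitude": 138.6, "eventDate": "2019-03-02"},
    {"id": "b1", "scientificName": "Bombus terrestris",
     "decimalLatitude": 0.0, "decimalLongitude": 0.0, "eventDate": "2020-07-15"},
    {"id": "c1", "scientificName": "",
     "decimalLatitude": 51.5, "decimalLongitude": -0.1, "eventDate": "2021-05-01"},
    {"id": "d1", "scientificName": "Osmia bicornis",
     "decimalLatitude": 48.8, "decimalLongitude": 9.2, "eventDate": "not recorded"},
]

ALL_FLAGS = ("has_name", "coords_not_zero", "date_valid", "not_duplicate")


def valid_date(text):
    """Return True if text parses as an ISO date (YYYY-MM-DD)."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def flag_records(rows):
    """Add one boolean column per quality check; no record is dropped here."""
    seen = set()
    flagged = []
    for row in rows:
        # Assumed duplicate rule: same name, coordinates, and date.
        key = (row["scientificName"], row["decimalLatitude"],
               row["decimalLongitude"], row["eventDate"])
        flags = {
            "has_name": bool(row["scientificName"].strip()),
            "coords_not_zero": not (row["decimalLatitude"] == 0
                                    and row["decimalLongitude"] == 0),
            "date_valid": valid_date(row["eventDate"]),
            "not_duplicate": key not in seen,
        }
        seen.add(key)
        flagged.append({**row, **flags})
    return flagged


def clean_records(flagged, required=ALL_FLAGS):
    """Keep only records that pass every flag listed in `required`."""
    return [r for r in flagged if all(r[name] for name in required)]


flagged = flag_records(records)          # "flagged-but-uncleaned"
cleaned = clean_records(flagged)         # strict: every flag must pass
relaxed = clean_records(flagged, required=("has_name", "coords_not_zero"))
```

Keeping the flags as columns, rather than deleting records as each check runs, is what lets a user re-filter the same table under different rules, as `relaxed` does here.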
Affiliation: Dorey, James B. Flinders University; Australia
Affiliation: Fischer, Erica E. King’s College London; United Kingdom
Affiliation: Chesshire, Paige R. Northern Arizona University; United States
Affiliation: Nava Bolaños, Angela. UNAM Campus Juriquilla; Mexico
Affiliation: O´Reilly, Robert L. Flinders University; Australia
Affiliation: Bossert, Silas. National Museum of Natural History; United States. Washington State University; United States
Affiliation: Collins, Shannon M. University of North Texas; United States
Affiliation: Lichtenberg, Elinor M. University of North Texas; United States
Affiliation: Tucker, Erika M. Biodiversity Outreach Network; United States
Affiliation: Smith Pardo, Allan. United States Department of Agriculture; United States
Affiliation: Falcon Brindis, Armando. University of Kentucky; United States
Affiliation: Guevara, Diego A. Universidad Nacional de Colombia; Colombia
Affiliation: Ribeiro, Bruno. Universidade Federal de Goiás; Brazil
Affiliation: de Pedro, Diego. Centro de Investigacion Cientifica y de Educacion Superior de Ensenada; Mexico
Affiliation: Pickering, John. Discover Life; United States
Affiliation: Hung, Keng Lou James. Oklahoma State University; United States
Affiliation: Parys, Katherine A. United States Department of Agriculture; United States
Affiliation: McCabe, Lindsie M. United States Department of Agriculture; United States
Affiliation: Rogan, Matthew S. Yale University; United States
Affiliation: Minckley, Robert L. University of Rochester; United States
Affiliation: Velazco, Santiago José Elías. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Biología Subtropical. Universidad Nacional de Misiones. Instituto de Biología Subtropical; Argentina
Affiliation: Griswold, Terry. United States Department of Agriculture; United States
Affiliation: Zarrillo, Tracy A. The Connecticut Agricultural Experiment Station; United States
Affiliation: Jetz, Walter. Yale University; United States
Affiliation: Sica, Yanina V. Yale University; United States
Affiliation: Orr, Michael C. Staatliches Museum für Naturkunde Stuttgart; Germany. Institute of Zoology, Chinese Academy of Sciences; China
Affiliation: Guzman, Laura Melissa. University of Southern California; United States
Affiliation: Ascher, John S. National University of Singapore; Singapore
Affiliation: Hughes, Alice C. The University of Hong Kong; Hong Kong
Affiliation: Cobb, Neil S. Biodiversity Outreach Network; United States
- Subject
- R package
- Bee data
- Ecoinformatic
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/218897
id |
CONICETDig_0024118d7bbe9c4100360b9c2c146021 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/218897 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
A globally synthesised and flagged bee occurrence dataset and cleaning workflow |
title |
A globally synthesised and flagged bee occurrence dataset and cleaning workflow |
dc.creator.none.fl_str_mv |
Dorey, James B. Fischer, Erica E. Chesshire, Paige R. Nava Bolaños, Angela O´Reilly, Robert L. Bossert, Silas Collins, Shannon M. Lichtenberg, Elinor M. Tucker, Erika M. Smith Pardo, Allan Falcon Brindis, Armando Guevara, Diego A. Ribeiro, Bruno de Pedro, Diego Pickering, John Hung, Keng Lou James Parys, Katherine A. McCabe, Lindsie M. Rogan, Matthew S. Minckley, Robert L. Velazco, Santiago José Elías Griswold, Terry Zarrillo, Tracy A. Jetz, Walter Sica, Yanina V. Orr, Michael C. Guzman, Laura Melissa Ascher, John S. Hughes, Alice C. Cobb, Neil S. |
author |
Dorey, James B. |
dc.subject.none.fl_str_mv |
R package Bee data Ecoinformatic |
topic |
R package Bee data Ecoinformatic |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.6 https://purl.org/becyt/ford/1 |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-11 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/218897 Dorey, James B.; Fischer, Erica E.; Chesshire, Paige R.; Nava Bolaños, Angela; O´Reilly, Robert L.; et al.; A globally synthesised and flagged bee occurrence dataset and cleaning workflow; Nature Research; Scientific Data; 10; 1; 11-2023; 1-17 2052-4463 CONICET Digital CONICET |
url |
http://hdl.handle.net/11336/218897 |
identifier_str_mv |
Dorey, James B.; Fischer, Erica E.; Chesshire, Paige R.; Nava Bolaños, Angela; O´Reilly, Robert L.; et al.; A globally synthesised and flagged bee occurrence dataset and cleaning workflow; Nature Research; Scientific Data; 10; 1; 11-2023; 1-17 2052-4463 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/https://www.nature.com/articles/s41597-023-02626-w info:eu-repo/semantics/altIdentifier/doi/10.1038/s41597-023-02626-w |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Nature Research |
publisher.none.fl_str_mv |
Nature Research |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |