An image dataset of cleared, x-rayed, and fossil leaves vetted to plant family for human and machine learning

Authors
Wilf, Peter; Wing, Scott L.; Meyer, Herbert W.; Rose, Jacob A.; Saha, Rohit; Serre, Thomas; Cúneo, Néstor Rubén; Donovan, Michael P.; Erwin, Diane M.; Gandolfo, María A.; González Akre, Erika; Herrera, Fabiany; Hu, Shusheng; Iglesias, Ari; Johnson, Kirk R.; Karim, Talia S.; Zou, Xiaoyu
Publication year
2021
Language
English
Resource type
article
Status
published version
Description
Leaves are the most abundant and visible plant organ, both in the modern world and the fossil record. Identifying foliage to the correct plant family based on leaf architecture is a fundamental botanical skill that is also critical for isolated fossil leaves, which often, especially in the Cenozoic, represent extinct genera and species from extant families. Resources focused on leaf identification are remarkably scarce; however, the situation has improved due to the recent proliferation of digitized herbarium material, live-plant identification applications, and online collections of cleared and fossil leaf images. Nevertheless, the need remains for a specialized image dataset for comparative leaf architecture. We address this gap by assembling an open-access database of 30,252 images of vouchered leaf specimens vetted to family level, primarily of angiosperms, including 26,176 images of cleared and x-rayed leaves representing 354 families and 4,076 of fossil leaves from 48 families. The images maintain original resolution, have user-friendly filenames, and are vetted using APG and modern paleobotanical standards. The cleared and x-rayed leaves include the Jack A. Wolfe and Leo J. Hickey contributions to the National Cleared Leaf Collection and a collection of high-resolution scanned x-ray negatives, housed in the Division of Paleobotany, Department of Paleobiology, Smithsonian National Museum of Natural History, Washington D.C.; and the Daniel I. Axelrod Cleared Leaf Collection, housed at the University of California Museum of Paleontology, Berkeley. The fossil images include a sampling of Late Cretaceous to Eocene paleobotanical sites from the Western Hemisphere held at numerous institutions, especially from Florissant Fossil Beds National Monument (late Eocene, Colorado), as well as several other localities from the Late Cretaceous to Eocene of the Western USA and the early Paleogene of Colombia and southern Argentina. 
The dataset facilitates new research and education opportunities in paleobotany, comparative leaf architecture, systematics, and machine learning.
Affiliation: Wilf, Peter. State University of Pennsylvania; United States
Affiliation: Wing, Scott L. National Museum of Natural History; United States
Affiliation: Meyer, Herbert W. State University of Pennsylvania; United States
Affiliation: Rose, Jacob A. State University of Pennsylvania; United States
Affiliation: Saha, Rohit. State University of Pennsylvania; United States
Affiliation: Serre, Thomas. State University of Pennsylvania; United States
Affiliation: Cúneo, Néstor Rubén. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Museo Paleontológico Egidio Feruglio; Argentina
Affiliation: Donovan, Michael P. State University of Pennsylvania; United States
Affiliation: Erwin, Diane M. State University of Pennsylvania; United States
Affiliation: Gandolfo, María A. Cornell University; United States
Affiliation: González Akre, Erika. State University of Pennsylvania; United States
Affiliation: Herrera, Fabiany. National Museum of Natural History; United States
Affiliation: Hu, Shusheng. State University of Pennsylvania; United States
Affiliation: Iglesias, Ari. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte. Instituto de Investigaciones en Biodiversidad y Medioambiente. Universidad Nacional del Comahue. Centro Regional Universidad Bariloche. Instituto de Investigaciones en Biodiversidad y Medioambiente; Argentina
Affiliation: Johnson, Kirk R. Smithsonian Tropical Research Institute; Panama
Affiliation: Karim, Talia S. University of Colorado; United States
Affiliation: Zou, Xiaoyu. State University of Pennsylvania; United States
Subjects
ANGIOSPERMS
CLEARED LEAVES
DATA SCIENCE
FOSSIL LEAVES
LEAF ARCHITECTURE
PALEOBOTANY
Access level
open access
Terms of use
https://creativecommons.org/licenses/by/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/184093

Subject classification
https://purl.org/becyt/ford/1.5
https://purl.org/becyt/ford/1
Publication date
2021-12
Publisher
Pensoft Publishers
Citation
Wilf, Peter; Wing, Scott L.; Meyer, Herbert W.; Rose, Jacob A.; Saha, Rohit; et al.; An image dataset of cleared, x-rayed, and fossil leaves vetted to plant family for human and machine learning; Pensoft Publishers; PhytoKeys; 187; 12-2021; 93-128
ISSN
1314-2003
1314-2011
Handle
http://hdl.handle.net/11336/184093
DOI
10.3897/phytokeys.187.72350
Publisher URL
https://phytokeys.pensoft.net/article/72350/
Format
application/pdf
Repository contact
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar