Generating implicit object fragment datasets for machine learning

Authors
López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M.
Publication year
2024
Language
English
Resource type
article
Status
published version
Description
One of the main challenges in using deep learning models is the difficulty of acquiring datasets large enough to train these networks effectively. This is particularly significant in object detection, shape completion, and fracture assembly. Instead of scanning a large number of real-world fragments, it is possible to generate massive datasets with synthetic pieces. However, realistic fragmentation is computationally intensive in both preparation (e.g., pre-fractured models) and generation. By contrast, simpler algorithms such as Voronoi diagrams are faster but less realistic. Computational efficiency and realism must therefore be balanced. This paper introduces a GPU-based framework for the massive generation of voxelized fragments derived from high-resolution 3D models, prepared specifically for use as training sets for machine learning models. This rapid pipeline allows control over how many pieces are produced, their dispersion, and the appearance of subtle effects such as erosion. We have tested our pipeline with an archaeological dataset, producing more than 1M fragmented pieces from 1,052 Iberian vessels. Although this work primarily intends to provide pieces as implicit data represented by voxels, triangle meshes and point clouds can also be inferred from the initial implicit representation. To underscore the benefits of CPU and GPU acceleration in generating vast datasets, we compared our approach against a realistic fragment generator, highlighting its potential in terms of both applicability and processing time. We also demonstrate the synergies between our pipeline and realistic simulators, which frequently cannot select the number and size of resulting pieces. To this end, a deep learning model was trained over realistic fragments and our dataset, showcasing similar results.
Affiliation: López, Alfonso. Universidad de Jaén; Spain
Affiliation: Rueda, José Antonio. Universidad de Jaén; Spain
Affiliation: Segura, Rafael J. Universidad de Jaén; Spain
Affiliation: Ogayar, Carlos J. Universidad de Jaén; Spain
Affiliation: Navarro, Jose Pablo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico de Ciencias Sociales y Humanas; Argentina
Affiliation: Fuertes, José M. Universidad de Jaén; Spain
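Illustrative note: the description contrasts realistic fracture simulation with faster Voronoi-based partitioning of voxelized models. The following is a minimal CPU sketch of that general idea only, written with NumPy; it is not the authors' GPU framework. The function names (`make_shell`, `voronoi_fragment`), the hollow-sphere test shape, the grid size, and the seeding strategy are all assumptions made for illustration, and effects such as erosion or piece dispersion are not modeled.

```python
import numpy as np


def make_shell(size=32, outer=14.0, inner=11.0):
    """Build a hollow sphere as a boolean occupancy grid (hypothetical stand-in for a vessel)."""
    grid = np.indices((size, size, size)).astype(np.float64)
    center = (size - 1) / 2.0
    radius = np.sqrt(((grid - center) ** 2).sum(axis=0))
    return (radius <= outer) & (radius >= inner)


def voronoi_fragment(occupancy, n_fragments, seed=0):
    """Label each occupied voxel with the index (1..n_fragments) of its nearest seed.

    Empty voxels keep the label 0. Seeds are sampled among occupied voxels
    (an assumption of this sketch), so every fragment contains at least one voxel.
    """
    rng = np.random.default_rng(seed)
    occupied = np.argwhere(occupancy)  # (N, 3) integer voxel coordinates
    if n_fragments > len(occupied):
        raise ValueError("more fragments requested than occupied voxels")

    # Pick distinct occupied voxels as Voronoi seeds.
    picks = rng.choice(len(occupied), size=n_fragments, replace=False)
    seeds = occupied[picks]

    # Squared Euclidean distance from every occupied voxel to every seed.
    diff = occupied[:, None, :] - seeds[None, :, :]
    dist2 = np.einsum("nkd,nkd->nk", diff, diff)
    nearest = dist2.argmin(axis=1)

    labels = np.zeros(occupancy.shape, dtype=np.int32)
    labels[tuple(occupied.T)] = nearest + 1
    return labels


shell = make_shell()
labels = voronoi_fragment(shell, n_fragments=8)

# Number of voxels assigned to each of the 8 fragments.
sizes = np.bincount(labels.ravel(), minlength=9)[1:]
print("voxels per fragment:", sizes.tolist())
```

In this sketch the number of pieces is controlled directly by `n_fragments`, which mirrors the kind of control the description attributes to the pipeline. The brute-force distance matrix is only practical for small grids; a high-resolution model would need a spatially accelerated or GPU-parallel assignment.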
Subject
VOXEL
FRAGMENTATION
FRACTURE DATASET
VORONOI
GPU PROGRAMMING
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/261276

id CONICETDig_24a2df9b806171d7edc3fe72bdc05108
oai_identifier_str oai:ri.conicet.gov.ar:11336/261276
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Generating implicit object fragment datasets for machine learning
title Generating implicit object fragment datasets for machine learning
spellingShingle Generating implicit object fragment datasets for machine learning
López, Alfonso
VOXEL
FRAGMENTATION
FRACTURE DATASET
VORONOI
GPU PROGRAMMING
dc.creator.none.fl_str_mv López, Alfonso
Rueda, José Antonio
Segura, Rafael J.
Ogayar, Carlos J.
Navarro, Jose Pablo
Fuertes, José M.
author López, Alfonso
author_facet López, Alfonso
Rueda, José Antonio
Segura, Rafael J.
Ogayar, Carlos J.
Navarro, Jose Pablo
Fuertes, José M.
author_role author
author2 Rueda, José Antonio
Segura, Rafael J.
Ogayar, Carlos J.
Navarro, Jose Pablo
Fuertes, José M.
author2_role author
author
author
author
author
dc.subject.none.fl_str_mv VOXEL
FRAGMENTATION
FRACTURE DATASET
VORONOI
GPU PROGRAMMING
topic VOXEL
FRAGMENTATION
FRACTURE DATASET
VORONOI
GPU PROGRAMMING
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
publishDate 2024
dc.date.none.fl_str_mv 2024-12
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/261276
López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; et al.; Generating implicit object fragment datasets for machine learning; Pergamon-Elsevier Science Ltd; Computers & Graphics; 125; 104104; 12-2024; 1-12
0097-8493
CONICET Digital
CONICET
url http://hdl.handle.net/11336/261276
identifier_str_mv López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; et al.; Generating implicit object fragment datasets for machine learning; Pergamon-Elsevier Science Ltd; Computers & Graphics; 125; 104104; 12-2024; 1-12
0097-8493
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0097849324002395
info:eu-repo/semantics/altIdentifier/doi/10.1016/j.cag.2024.104104
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
application/pdf
dc.publisher.none.fl_str_mv Pergamon-Elsevier Science Ltd
publisher.none.fl_str_mv Pergamon-Elsevier Science Ltd
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str CONICET Digital (CONICET)
collection CONICET Digital (CONICET)
instname_str Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
_version_ 1844613291447418880
score 13.070432