Generating implicit object fragment datasets for machine learning

Authors: López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M.
Publication year: 2024
Language: English
Resource type: article
Status: published version
Description: One of the primary challenges inherent in utilizing deep learning models is the scarcity and accessibility hurdles associated with acquiring datasets of sufficient size to facilitate effective training of these networks. This is particularly significant in object detection, shape completion, and fracture assembly. Instead of scanning a large number of real-world fragments, it is possible to generate massive datasets with synthetic pieces. However, realistic fragmentation is computationally intensive in the preparation (e.g., pre-fractured models) and generation. Otherwise, simpler algorithms such as Voronoi diagrams provide faster processing speeds at the expense of compromising realism. In this context, it is required to balance computational efficiency and realism. This paper introduces a GPU-based framework for the massive generation of voxelized fragments derived from high-resolution 3D models, specifically prepared for their utilization as training sets for machine learning models. This rapid pipeline enables controlling how many pieces are produced, their dispersion and the appearance of subtle effects such as erosion. We have tested our pipeline with an archaeological dataset, producing more than 1M fragmented pieces from 1,052 Iberian vessels. Although this work primarily intends to provide pieces as implicit data represented by voxels, triangle meshes and point clouds can also be inferred from the initial implicit representation. To underscore the unparalleled benefits of CPU and GPU acceleration in generating vast datasets, we compared against a realistic fragment generator that highlights the potential of our approach, both in terms of applicability and processing time. We also demonstrate the synergies between our pipeline and realistic simulators, which frequently cannot select the number and size of resulting pieces. To this end, a deep learning model was trained over realistic fragments and our dataset, showcasing similar results.
Affiliation: López, Alfonso. Universidad de Jaén; Spain
Affiliation: Rueda, José Antonio. Universidad de Jaén; Spain
Affiliation: Segura, Rafael J. Universidad de Jaén; Spain
Affiliation: Ogayar, Carlos J. Universidad de Jaén; Spain
Affiliation: Navarro, Jose Pablo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico de Ciencias Sociales y Humanas; Argentina
Affiliation: Fuertes, José M. Universidad de Jaén; Spain
Subject: VOXEL
FRAGMENTATION
FRACTURE DATASET
VORONOI
GPU PROGRAMMING
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/261276

id	CONICETDig_24a2df9b806171d7edc3fe72bdc05108
oai_identifier_str	oai:ri.conicet.gov.ar:11336/261276
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.title.none.fl_str_mv	Generating implicit object fragment datasets for machine learning
dc.creator.none.fl_str_mv	López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M.
author	López, Alfonso
author_role	author
author2	Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M.
author2_role	author author author author author
dc.subject.none.fl_str_mv	VOXEL; FRAGMENTATION; FRACTURE DATASET; VORONOI; GPU PROGRAMMING
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1
publishDate	2024
dc.date.none.fl_str_mv	2024-12
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/261276 López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; et al.; Generating implicit object fragment datasets for machine learning; Pergamon-Elsevier Science Ltd; Computers & Graphics; 125; 104104; 12-2024; 1-12 0097-8493 CONICET Digital CONICET
url	http://hdl.handle.net/11336/261276
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0097849324002395 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.cag.2024.104104
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf
dc.publisher.none.fl_str_mv	Pergamon-Elsevier Science Ltd
publisher.none.fl_str_mv	Pergamon-Elsevier Science Ltd
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
