Generating implicit object fragment datasets for machine learning
- Authors
- López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M.
- Year of publication
- 2024
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- One of the primary challenges in using deep learning models is the scarcity of datasets large enough to train these networks effectively, and the difficulty of acquiring them. This is particularly significant in object detection, shape completion, and fracture assembly. Instead of scanning a large number of real-world fragments, it is possible to generate massive datasets with synthetic pieces. However, realistic fragmentation is computationally intensive in both preparation (e.g., pre-fractured models) and generation. Simpler algorithms, such as Voronoi diagrams, run faster at the expense of realism. In this context, computational efficiency must be balanced against realism. This paper introduces a GPU-based framework for the massive generation of voxelized fragments derived from high-resolution 3D models, specifically prepared for use as training sets for machine learning models. This rapid pipeline enables control over how many pieces are produced, their dispersion, and the appearance of subtle effects such as erosion. We have tested our pipeline with an archaeological dataset, producing more than 1M fragmented pieces from 1,052 Iberian vessels. Although this work primarily intends to provide pieces as implicit data represented by voxels, triangle meshes and point clouds can also be inferred from the initial implicit representation. To underscore the benefits of CPU and GPU acceleration in generating vast datasets, we compared our approach against a realistic fragment generator; the comparison highlights its potential in terms of both applicability and processing time. We also demonstrate the synergies between our pipeline and realistic simulators, which frequently cannot select the number and size of resulting pieces. To this end, a deep learning model was trained on realistic fragments and on our dataset, showing similar results.
Affiliation: López, Alfonso. Universidad de Jaén; Spain
Affiliation: Rueda, José Antonio. Universidad de Jaén; Spain
Affiliation: Segura, Rafael J. Universidad de Jaén; Spain
Affiliation: Ogayar, Carlos J. Universidad de Jaén; Spain
Affiliation: Navarro, Jose Pablo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico de Ciencias Sociales y Humanas; Argentina
Affiliation: Fuertes, José M. Universidad de Jaén; Spain
- Subject
- VOXEL; FRAGMENTATION; FRACTURE DATASET; VORONOI; GPU PROGRAMMING
- Accessibility level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/261276
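
The description above names Voronoi diagrams as the simple, fast alternative to realistic fracture simulation. As a rough illustration only, the following pure-Python sketch partitions a small voxel set into a chosen number of pieces by nearest-seed (discrete Voronoi) assignment, then applies a crude erosion of the fracture surfaces. It is not the paper's GPU pipeline: the function names (`solid_box`, `fragment_voxels`, `erode_fracture_surfaces`), the random seed placement, and the erosion rule are all assumptions made for this example.

```python
import random

# 6-connected neighbourhood used by the (hypothetical) erosion step.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def solid_box(nx, ny, nz):
    """Return the set of occupied voxel coordinates of a solid nx*ny*nz box."""
    return {(x, y, z) for x in range(nx) for y in range(ny) for z in range(nz)}


def fragment_voxels(occupied, num_pieces, seed=0):
    """Label every occupied voxel with the index of its nearest seed voxel.

    Seeds are drawn at random from the occupied voxels, so the result is a
    discrete Voronoi partition with exactly `num_pieces` non-empty fragments.
    """
    rng = random.Random(seed)
    voxels = sorted(occupied)                # deterministic order for sampling
    seeds = rng.sample(voxels, num_pieces)   # distinct seed voxels

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return {
        v: min(range(num_pieces), key=lambda i: squared_distance(v, seeds[i]))
        for v in voxels
    }


def erode_fracture_surfaces(labels):
    """Drop voxels that touch a different fragment (assumed, very crude erosion)."""
    kept = {}
    for (x, y, z), label in labels.items():
        touches_other = any(
            labels.get((x + dx, y + dy, z + dz), label) != label
            for dx, dy, dz in NEIGHBORS
        )
        if not touches_other:
            kept[(x, y, z)] = label
    return kept


if __name__ == "__main__":
    labels = fragment_voxels(solid_box(8, 8, 8), num_pieces=5, seed=1)
    eroded = erode_fracture_surfaces(labels)
    print(len(labels), "voxels before erosion,", len(eroded), "after")
```

The number of pieces is an explicit parameter here, which mirrors the control over piece count that the description attributes to the pipeline. A brute-force nearest-seed search like this one is independent per voxel, which is the property that makes this kind of partition a natural fit for GPU parallelism.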
id | CONICETDig_24a2df9b806171d7edc3fe72bdc05108 |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/261276 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Generating implicit object fragment datasets for machine learning |
title | Generating implicit object fragment datasets for machine learning |
spellingShingle | Generating implicit object fragment datasets for machine learning López, Alfonso VOXEL FRAGMENTATION FRACTURE DATASET VORONOI GPU PROGRAMMING |
title_short | Generating implicit object fragment datasets for machine learning |
title_full | Generating implicit object fragment datasets for machine learning |
title_fullStr | Generating implicit object fragment datasets for machine learning |
title_full_unstemmed | Generating implicit object fragment datasets for machine learning |
title_sort | Generating implicit object fragment datasets for machine learning |
dc.creator.none.fl_str_mv | López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M. |
author | López, Alfonso |
author_facet | López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M. |
author_role | author |
author2 | Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; Fuertes, José M. |
author2_role | author; author; author; author; author |
dc.subject.none.fl_str_mv | VOXEL; FRAGMENTATION; FRACTURE DATASET; VORONOI; GPU PROGRAMMING |
topic | VOXEL; FRAGMENTATION; FRACTURE DATASET; VORONOI; GPU PROGRAMMING |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2; https://purl.org/becyt/ford/1 |
publishDate | 2024 |
dc.date.none.fl_str_mv | 2024-12 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/261276 López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; et al.; Generating implicit object fragment datasets for machine learning; Pergamon-Elsevier Science Ltd; Computers & Graphics; 125; 104104; 12-2024; 1-12 0097-8493 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/261276 |
identifier_str_mv | López, Alfonso; Rueda, José Antonio; Segura, Rafael J.; Ogayar, Carlos J.; Navarro, Jose Pablo; et al.; Generating implicit object fragment datasets for machine learning; Pergamon-Elsevier Science Ltd; Computers & Graphics; 125; 104104; 12-2024; 1-12 0097-8493 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0097849324002395; info:eu-repo/semantics/altIdentifier/doi/10.1016/j.cag.2024.104104 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-nd/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-nd/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf; application/pdf; application/pdf |
dc.publisher.none.fl_str_mv | Pergamon-Elsevier Science Ltd |
publisher.none.fl_str_mv | Pergamon-Elsevier Science Ltd |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ | 1844613291447418880 |
score | 13.070432 |