Towards Sustainable Material Design: A Comparative Analysis of Latent Space Representations in AI Models

Authors
Casado, Ulises Martín; Altuna, Facundo Ignacio; Miccio, Luis Alejandro
Publication year
2024
Language
English
Resource type
article
Status
published version
Description
In this study, we employed machine learning techniques to improve sustainable materials design by examining how different latent space representations affect AI performance in property prediction. We compared three fingerprinting methodologies: (a) neural networks trained on specific properties, (b) encoder–decoder architectures, and (c) traditional Morgan fingerprints. Their encoding quality was compared quantitatively by using each fingerprint as the input to a simple regression model (Random Forest) that predicts the glass transition temperature (Tg), a critical parameter in determining material performance. The task-specific neural networks achieved the highest accuracy, with a mean absolute percentage error (MAPE) of 10% and an R² of 0.9, significantly outperforming the encoder–decoder models (MAPE: 19%, R²: 0.76) and Morgan fingerprints (MAPE: 24%, R²: 0.6). In addition, we used dimensionality reduction techniques, namely principal component analysis (PCA) and t-distributed stochastic neighbour embedding (t-SNE), to gain insight into each model's ability to learn the molecular features relevant to Tg. By providing a deeper understanding of how chemical structures influence AI-based property predictions, this approach enables the efficient identification of high-performing materials, in applications ranging from water decontamination to polymer recyclability, with minimal experimental effort, promoting a circular economy in materials science.
Affiliation: Casado, Ulises Martín. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mar del Plata. Instituto de Investigaciones en Ciencia y Tecnología de Materiales. Universidad Nacional de Mar del Plata. Facultad de Ingeniería. Instituto de Investigaciones en Ciencia y Tecnología de Materiales; Argentina
Affiliation: Altuna, Facundo Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mar del Plata. Instituto de Investigaciones en Ciencia y Tecnología de Materiales. Universidad Nacional de Mar del Plata. Facultad de Ingeniería. Instituto de Investigaciones en Ciencia y Tecnología de Materiales; Argentina
Affiliation: Miccio, Luis Alejandro. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mar del Plata. Instituto de Investigaciones en Ciencia y Tecnología de Materiales. Universidad Nacional de Mar del Plata. Facultad de Ingeniería. Instituto de Investigaciones en Ciencia y Tecnología de Materiales; Argentina. Universidad del País Vasco; Spain
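The two scores quoted in the description, MAPE and R², can be illustrated with a minimal sketch. The Tg values below are hypothetical numbers chosen for the example, not data from the article, and the use of kelvin as the unit is an assumption (MAPE depends on the temperature scale, and the record does not state which one the authors used).

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed in percent."""
    relative_errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical glass transition temperatures in kelvin (illustration only).
tg_measured = [300.0, 350.0, 400.0, 450.0]
tg_predicted = [330.0, 315.0, 440.0, 405.0]

print(f"MAPE = {mape(tg_measured, tg_predicted):.1f}%")
print(f"R2   = {r2(tg_measured, tg_predicted):.2f}")
```

Every prediction in this toy set is off by exactly 10% of the measured value, so the MAPE is 10%, yet R² is only 0.54 because the absolute errors are large relative to the spread of the measured values. The two metrics therefore carry complementary information, which is presumably why the description reports both.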
Subject
AI-assisted design
glass formers
latent space
data scarcity conditions
sustainable materials design
Access level
open access
Terms of use
https://creativecommons.org/licenses/by/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/264165

id CONICETDig_00231faa90dcaaf92851e2f6a2e17bae
oai_identifier_str oai:ri.conicet.gov.ar:11336/264165
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Towards Sustainable Material Design: A Comparative Analysis of Latent Space Representations in AI Models
dc.creator.none.fl_str_mv Casado, Ulises Martín
Altuna, Facundo Ignacio
Miccio, Luis Alejandro
dc.subject.none.fl_str_mv AI-assisted design
glass formers
latent space
data scarcity conditions
sustainable materials design
purl_subject.fl_str_mv https://purl.org/becyt/ford/2.5
https://purl.org/becyt/ford/2
publishDate 2024
dc.date.none.fl_str_mv 2024-12
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/264165
Casado, Ulises Martín; Altuna, Facundo Ignacio; Miccio, Luis Alejandro; Towards Sustainable Material Design: A Comparative Analysis of Latent Space Representations in AI Models; MDPI; Sustainability; 16; 23; 12-2024; 1-15
2071-1050
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://www.mdpi.com/2071-1050/16/23/10681
info:eu-repo/semantics/altIdentifier/doi/10.3390/su162310681
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv MDPI
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar