Can generative AI solve Geometry problems? Strengths and weaknesses of LLMs for geometric reasoning in Spanish

Authors
Parra, Verónica Ester; Sureda Figueroa, Diana Patricia; Corica, Ana Rosa; Schiaffino, Silvia Noemi; Godoy, Daniela Lis
Year of publication
2024
Language
English
Resource type
article
Status
published version
Description
Generative Artificial Intelligence (AI) has emerged as a disruptive technology that is challenging traditional teaching and learning practices. Question answering in natural language fosters the use of chatbots, such as ChatGPT, Bard and others, that generate text based on pre-trained Large Language Models (LLMs). The performance of these models in certain areas, such as Math problem solving, is receiving growing attention, as it directly affects their potential use in educational settings. Most of these evaluations, however, concentrate on the construction and use of benchmarks comprising diverse Math problems in English. In this work, we discuss the capabilities of the most widely used LLMs within the subfield of Geometry, in view of the relevance of this subject in high-school curricula and the difficulties that even the most advanced multimodal LLMs exhibit when dealing with geometric notions. This work focuses on Spanish, which is additionally a less-resourced language. The answers of three major chatbots, based on different LLMs, were analyzed not only to determine their capacity to provide correct solutions, but also to categorize the errors found in the reasoning processes described. Understanding LLMs' strengths and weaknesses in a field like Geometry can be a first step towards the design of better-informed methodological proposals for including these technologies in classrooms, as well as the development of more powerful automatic assistance tools based on generative AI.
Affiliation: Parra, Verónica Ester. Universidad Nacional del Centro de la Provincia de Buenos Aires. Facultad de Ciencias Exactas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil; Argentina
Affiliation: Sureda Figueroa, Diana Patricia. Universidad Nacional del Centro de la Provincia de Buenos Aires. Facultad de Ciencias Exactas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil; Argentina
Affiliation: Corica, Ana Rosa. Universidad Nacional del Centro de la Provincia de Buenos Aires. Facultad de Ciencias Exactas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil; Argentina
Affiliation: Schiaffino, Silvia Noemi. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Instituto Superior de Ingeniería del Software. Universidad Nacional del Centro de la Provincia de Buenos Aires. Instituto Superior de Ingeniería del Software; Argentina
Affiliation: Godoy, Daniela Lis. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Instituto Superior de Ingeniería del Software. Universidad Nacional del Centro de la Provincia de Buenos Aires. Instituto Superior de Ingeniería del Software; Argentina
Subject
GENERATIVE AI
GEOMETRY
LLMS
MATH
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/240281

id CONICETDig_262fb2c79ddafc347880e596e7601540
oai_identifier_str oai:ri.conicet.gov.ar:11336/240281
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
publishDate 2024
dc.date.none.fl_str_mv 2024-02
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/240281
Parra, Verónica Ester; Sureda Figueroa, Diana Patricia; Corica, Ana Rosa; Schiaffino, Silvia Noemi; Godoy, Daniela Lis; Can generative AI solve Geometry problems? Strengths and weaknesses of LLMs for geometric reasoning in Spanish; Universidad Internacional de La Rioja; International Journal of Interactive Multimedia and Artificial Intelligence; 8; 1; 2-2024; 65-74
1989-1660
CONICET Digital
CONICET
url http://hdl.handle.net/11336/240281
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://www.ijimai.org/journal/bibcite/reference/3432
info:eu-repo/semantics/altIdentifier/url/https://www.ijimai.org/journal/sites/default/files/2024-02/ijimai8_5_7.pdf
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidad Internacional de La Rioja
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar