Can generative AI solve Geometry problems? Strengths and weaknesses of LLMs for geometric reasoning in Spanish

Authors: Parra, Verónica Ester; Sureda Figueroa, Diana Patricia; Corica, Ana Rosa; Schiaffino, Silvia Noemi; Godoy, Daniela Lis
Publication year: 2024
Language: English
Resource type: article
Status: published version
Description: Generative Artificial Intelligence (AI) has emerged as a disruptive technology that is challenging traditional teaching and learning practices. Question-answering in natural language fosters the use of chatbots, such as ChatGPT, Bard and others, that generate text based on pre-trained Large Language Models (LLMs). The performance of these models in certain areas, like Math problem solving, is receiving growing attention, as it directly affects their potential use in educational settings. Most of these evaluations, however, concentrate on the construction and use of benchmarks comprising diverse Math problems in English. In this work, we discuss the capabilities of the most widely used LLMs within the subfield of Geometry, in view of the relevance of this subject in high-school curricula and the difficulties that even the most advanced multimodal LLMs exhibit in dealing with geometric notions. This work focuses on Spanish, which is additionally a less-resourced language. The answers of three major chatbots, based on different LLMs, were analyzed not only to determine their capacity to provide correct solutions, but also to categorize the errors found in the reasoning processes described. Understanding LLMs' strengths and weaknesses in a field like Geometry can be a first step towards the design of more informed methodological proposals to include these technologies in classrooms, as well as the development of more powerful automatic assistance tools based on generative AI.
Affiliation: Parra, Verónica Ester. Universidad Nacional del Centro de la Provincia de Buenos Aires. Facultad de Ciencias Exactas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil; Argentina
Affiliation: Sureda Figueroa, Diana Patricia. Universidad Nacional del Centro de la Provincia de Buenos Aires. Facultad de Ciencias Exactas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil; Argentina
Affiliation: Corica, Ana Rosa. Universidad Nacional del Centro de la Provincia de Buenos Aires. Facultad de Ciencias Exactas; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil; Argentina
Affiliation: Schiaffino, Silvia Noemi. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Instituto Superior de Ingeniería del Software. Universidad Nacional del Centro de la Provincia de Buenos Aires. Instituto Superior de Ingeniería del Software; Argentina
Affiliation: Godoy, Daniela Lis. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Instituto Superior de Ingeniería del Software. Universidad Nacional del Centro de la Provincia de Buenos Aires. Instituto Superior de Ingeniería del Software; Argentina
Subject: GENERATIVE AI
GEOMETRY
LLMS
MATH
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/240281

id	CONICETDig_262fb2c79ddafc347880e596e7601540
oai_identifier_str	oai:ri.conicet.gov.ar:11336/240281
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.title.none.fl_str_mv	Can generative AI solve Geometry problems? Strengths and weaknesses of LLMs for geometric reasoning in Spanish
dc.creator.none.fl_str_mv	Parra, Verónica Ester; Sureda Figueroa, Diana Patricia; Corica, Ana Rosa; Schiaffino, Silvia Noemi; Godoy, Daniela Lis
dc.subject.none.fl_str_mv	GENERATIVE AI; GEOMETRY; LLMS; MATH
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1
publishDate	2024
dc.date.none.fl_str_mv	2024-02
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/240281 Parra, Verónica Ester; Sureda Figueroa, Diana Patricia; Corica, Ana Rosa; Schiaffino, Silvia Noemi; Godoy, Daniela Lis; Can generative AI solve Geometry problems? Strengths and weaknesses of LLMs for geometric reasoning in Spanish; Universidad Internacional de La Rioja; International Journal of Interactive Multimedia and Artificial Intelligence; 8; 1; 2-2024; 65-74 1989-1660 CONICET Digital CONICET
url	http://hdl.handle.net/11336/240281
identifier_str_mv	Parra, Verónica Ester; Sureda Figueroa, Diana Patricia; Corica, Ana Rosa; Schiaffino, Silvia Noemi; Godoy, Daniela Lis; Can generative AI solve Geometry problems? Strengths and weaknesses of LLMs for geometric reasoning in Spanish; Universidad Internacional de La Rioja; International Journal of Interactive Multimedia and Artificial Intelligence; 8; 1; 2-2024; 65-74 1989-1660 CONICET Digital CONICET
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/https://www.ijimai.org/journal/bibcite/reference/3432 info:eu-repo/semantics/altIdentifier/url/https://www.ijimai.org/journal/sites/default/files/2024-02/ijimai8_5_7.pdf
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf application/pdf application/pdf
dc.publisher.none.fl_str_mv	Universidad Internacional de La Rioja
publisher.none.fl_str_mv	Universidad Internacional de La Rioja
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
