Evaluation of natural language processing models to measure similarity between scenarios written in Spanish

Authors
Pérez, Gabriela Alejandra; Mostaccio, Catalina Alba; Antonelli, Leandro; Maltempo, Giuliana
Publication year
2024
Language
English
Resource type
article
Status
published version
Description
Requirements engineering is a critical phase in software development; it seeks to understand and document system requirements from the early stages. Typically, requirements specification involves close collaboration between customers and development teams. Customers contribute their expertise in the domain language, while developers use more technical, computational terms. Despite these differences, achieving mutual understanding is crucial. Scenarios are among the most widely used artifacts for this purpose. In environments where multiple actors write scenarios, duplication is common, so mechanisms are needed to detect similar scenarios and prevent redundancy. In this paper we empirically evaluate several pre-trained Natural Language Processing models for analyzing the semantic similarity between scenarios in Spanish, identifying words or phrases with equivalent meanings. The analysis is performed in Spanish in order to contribute to the region. Finally, we present a tool that facilitates the creation of new scenarios by identifying potential similarities with existing ones. The tool supports multiple models, allowing users to select the most appropriate one to detect similar scenarios accurately during the definition process.
Subject
Computer and Information Sciences
Natural Language Processing models
scenarios in Spanish
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
CIC Digital (CICBA)
Institution
Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
OAI identifier
oai:digital.cic.gba.gob.ar:11746/12431

URL
https://digital.cic.gba.gob.ar/handle/11746/12431
DOI
10.12957/cadinf.2024.87935
ISSN
2317-2193
Format
application/pdf