Evaluating Large Language Models for the Generation of Unit Tests with Equivalence Partitions and Boundary Values
- Authors
- Rodríguez, Martín; Rossi, Gustavo Héctor; Fernández, Alejandro
- Year of publication
- 2025
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- The design and implementation of unit tests is a complex task that many programmers neglect. This research evaluates the potential of Large Language Models (LLMs) to generate test cases automatically, comparing them with manually written tests. An optimized prompt was developed that integrates code and requirements, covering critical cases such as equivalence partitions and boundary values. The strengths and weaknesses of LLMs versus trained programmers were compared through quantitative metrics and manual qualitative analysis. The results show that the effectiveness of LLMs depends on well-designed prompts, robust implementation, and precise requirements. Although flexible and promising, LLMs still require human supervision. This work highlights the importance of manual qualitative analysis as an essential complement to automation in unit test evaluation.
- Subject
-
Ciencias de la Computación e Información
Evaluation
Unit Testing
LLM
- Accessibility level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-nd/4.0/
- Repository
- CIC Digital (CICBA)
- Institution
- Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
- OAI identifier
- oai:digital.cic.gba.gob.ar:11746/12673
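
The abstract names two classic black-box test-design techniques: equivalence partitioning and boundary-value analysis. As a minimal sketch of what test cases covering both look like (this example is not from the paper; the `validate_age` function and its 18-to-65 requirement are hypothetical), a generated or hand-written suite might read:

```python
import unittest


def validate_age(age: int) -> bool:
    """Hypothetical unit under test: the requirement says ages 18..65 inclusive are valid."""
    return 18 <= age <= 65


class TestValidateAge(unittest.TestCase):
    # Equivalence partitions: one representative value per partition
    # (invalid-low, valid, invalid-high).
    def test_partition_below_range(self):
        self.assertFalse(validate_age(10))

    def test_partition_in_range(self):
        self.assertTrue(validate_age(40))

    def test_partition_above_range(self):
        self.assertFalse(validate_age(90))

    # Boundary values: the edges of the valid partition, where
    # off-by-one mistakes typically hide.
    def test_boundary_values(self):
        self.assertFalse(validate_age(17))  # just below lower bound
        self.assertTrue(validate_age(18))   # lower bound
        self.assertTrue(validate_age(65))   # upper bound
        self.assertFalse(validate_age(66))  # just above upper bound


if __name__ == "__main__":
    unittest.main()
```

A prompt of the kind the abstract describes would hand the model both the implementation and the stated requirement; the resulting suite can then be scored on whether every partition and boundary above is exercised, alongside the manual qualitative review the authors emphasize.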
View the full record metadata
| Field | Value |
|---|---|
| id | CICBA_f0ffe3e09fe08940eae6cc8467676b0f |
| oai_identifier_str | oai:digital.cic.gba.gob.ar:11746/12673 |
| network_acronym_str | CICBA |
| repository_id_str | 9441 |
| network_name_str | CIC Digital (CICBA) |
| title | Evaluating Large Language Models for the Generation of Unit Tests with Equivalence Partitions and Boundary Values |
| author | Rodríguez, Martín; Rossi, Gustavo Héctor; Fernández, Alejandro |
| topic | Ciencias de la Computación e Información; Evaluation; Unit Testing; LLM |
| description | The design and implementation of unit tests is a complex task that many programmers neglect. This research evaluates the potential of Large Language Models (LLMs) to generate test cases automatically, comparing them with manually written tests. An optimized prompt was developed that integrates code and requirements, covering critical cases such as equivalence partitions and boundary values. The strengths and weaknesses of LLMs versus trained programmers were compared through quantitative metrics and manual qualitative analysis. The results show that the effectiveness of LLMs depends on well-designed prompts, robust implementation, and precise requirements. Although flexible and promising, LLMs still require human supervision. This work highlights the importance of manual qualitative analysis as an essential complement to automation in unit test evaluation. |
| publishDate | 2025 |
| type | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
| url | https://digital.cic.gba.gob.ar/handle/11746/12673 |
| language | eng |
| rights | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-nd/4.0/ |
| format | application/pdf |
| source | reponame:CIC Digital (CICBA); instname:Comisión de Investigaciones Científicas de la Provincia de Buenos Aires; instacron:CICBA |
| repository.name | CIC Digital (CICBA) - Comisión de Investigaciones Científicas de la Provincia de Buenos Aires |
| repository.mail | marisa.degiusti@sedici.unlp.edu.ar |
| _version_ | 1860736755337003008 |
| score | 13.332987 |