Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts

Authors: Herrera, Myriam; Romagnano, María Rosalía Gema; Ruiz, Susana
Publication year: 2025
Language: Spanish (Castilian)
Resource type: conference paper
Status: published version
Description: In Argentine universities, the management of student data is a critical issue that needs to be addressed immediately. These institutions collect a wide variety and large quantity of data, such as the total number of students enrolled, the most frequently chosen degree programs, and the dropout rate, among others. However, the retrieval, recording, and analysis of these data are often inefficient and disorganized, because much of the data is in free-text format and comes from diverse information sources. This abundance of data, while valuable, presents a significant challenge due to its unstructured and heterogeneous nature. The question, then, is how to process textual Big Data to obtain information and then acquire knowledge that can support sound decisions. In the educational domain, Text Analytics provides valuable information. This paper presents a Textual Data Analysis of responses collected from student surveys in two degree programs of the Faculty of Exact, Physical and Natural Sciences of the National University of San Juan. For this purpose, the ALCESTE method (Lexical Analysis of Co-occurrences in Simple Sentences of a Text) has been combined with other textual-analysis methods, such as word glossaries, concordances, and the selection of the most specific vocabulary of each text, in order to provide a comparative tool. The results show how studying the distribution of the lexicon used in a text makes it possible to detect how the meanings present in it are structured.
Sociedad Argentina de Informática e Investigación Operativa
Subject: Computer Science
data management
student dropout
text analysis
educational surveys
Access level: open access
Terms of use: http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
Institution: Universidad Nacional de La Plata
OAI identifier: oai:sedici.unlp.edu.ar:10915/190347

id	SEDICI_7f0149a338ba626c649cfb4dbe0ef484
oai_identifier_str	oai:sedici.unlp.edu.ar:10915/190347
network_acronym_str	SEDICI
repository_id_str	1329
network_name_str	SEDICI (UNLP)
dc.title.none.fl_str_mv	Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts; Análisis de datos textuales en encuestas universitarias para comprender y disminuir la deserción estudiantil
title	Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts
title_short	Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts
title_full	Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts
title_fullStr	Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts
title_full_unstemmed	Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts
title_sort	Textual Data Analysis in University Surveys to Understand and Reduce Student Dropouts
dc.creator.none.fl_str_mv	Herrera, Myriam; Romagnano, María Rosalía Gema; Ruiz, Susana
author	Herrera, Myriam
author_facet	Herrera, Myriam; Romagnano, María Rosalía Gema; Ruiz, Susana
author_role	author
author2	Romagnano, María Rosalía Gema; Ruiz, Susana
author2_role	author author
dc.subject.none.fl_str_mv	Ciencias Informáticas; gestión de datos; deserción estudiantil; análisis de texto; encuestas educativas; data management; student dropout; text analysis; educational surveys
topic	Ciencias Informáticas; gestión de datos; deserción estudiantil; análisis de texto; encuestas educativas; data management; student dropout; text analysis; educational surveys
publishDate	2025
dc.date.none.fl_str_mv	2025-08
dc.type.none.fl_str_mv	info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion Objeto de conferencia http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia
format	conferenceObject
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://sedici.unlp.edu.ar/handle/10915/190347
url	http://sedici.unlp.edu.ar/handle/10915/190347
dc.language.none.fl_str_mv	spa
language	spa
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/https://revistas.unlp.edu.ar/JAIIO/article/view/19932 info:eu-repo/semantics/altIdentifier/issn/2451-7496
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-nc-sa/4.0/ Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
eu_rights_str_mv	openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-sa/4.0/ Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.format.none.fl_str_mv	application/pdf 44-59
dc.source.none.fl_str_mv	reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP
reponame_str	SEDICI (UNLP)
collection	SEDICI (UNLP)
instname_str	Universidad Nacional de La Plata
instacron_str	UNLP
institution	UNLP
repository.name.fl_str_mv	SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv	alira@sedici.unlp.edu.ar
_version_	1866372190393335808
score	13.040872
