Cross domain author profiling task in Spanish language: an experimental study

Authors
Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia; Errecalde, Marcelo Luis
Publication year
2015
Language
English
Resource type
article
Status
published version
Description
Author profiling is the task of predicting characteristics of the author of a text, such as age, gender, personality, or native language. The task is of growing importance because of its potential applications in security, crime detection, and marketing, among others. An interesting question is how robust a classifier is when it is trained on one data set and tested on others with different characteristics, a setup commonly called cross-domain experimentation. Although various cross-domain studies have been carried out on English data sets, such work on Spanish has only recently begun. In this context, this work presents a study of cross-domain classification for the author profiling task in Spanish. The experimental results show that robust classifiers for author profiling in Spanish can be obtained using corpora with different levels of formality.
Facultad de Informática
Subject
Computer Science
Natural Language Processing
Data mining
text mining
cross domain classification
Access level
open access
Terms of use
http://creativecommons.org/licenses/by/3.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/50187

Publication date
2015-11
Handle
http://sedici.unlp.edu.ar/handle/10915/50187
Alternate identifiers
http://journal.info.unlp.edu.ar/wp-content/uploads/JCST41-Paper-12.pdf
ISSN 1666-6038
License
Creative Commons Attribution 3.0 Unported (CC BY 3.0)
Format
application/pdf
122-128