Cross Domain Author Profiling Task in Spanish Language: An Experimental Study

Authors
Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis
Publication year
2015
Language
English
Resource type
article
Status
published version
Description
Author profiling is the task of predicting characteristics of the author of a text, such as age, gender, personality, and native language. The task is of growing importance because of its potential applications in security, crime detection, and marketing, among others. An interesting question is how robust a classifier remains when it is trained on one data set and tested on others with different characteristics; this is commonly called cross-domain experimentation. Although several cross-domain studies have been carried out on English data sets, such studies for Spanish have only recently begun. In this context, this work presents a study of cross-domain classification for the author profiling task in Spanish. The experimental results show that robust classifiers for author profiling in Spanish can be obtained using corpora with different levels of formality.
Affiliation: Garciarena Ucelay, María José. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Affiliation: Villegas, María Paula. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina
Affiliation: Cagnina, Leticia Cecilia. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina
Affiliation: Errecalde, Marcelo Luis. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Subject
AUTHOR PROFILING
NATURAL LANGUAGE PROCESSING
CROSS DOMAIN CLASSIFICATION
SPANISH LANGUAGE
TEXT MINING
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/154303

dc.title.none.fl_str_mv Cross Domain Author Profiling Task in Spanish Language: An Experimental Study
dc.creator.none.fl_str_mv Garciarena Ucelay, María José
Villegas, María Paula
Cagnina, Leticia Cecilia
Errecalde, Marcelo Luis
dc.subject.none.fl_str_mv AUTHOR PROFILING
NATURAL LANGUAGE PROCESSING
CROSS DOMAIN CLASSIFICATION
SPANISH LANGUAGE
TEXT MINING
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
dc.date.none.fl_str_mv 2015-11
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/154303
Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis; Cross Domain Author Profiling Task in Spanish Language: An Experimental Study; Universidad Nacional de La Plata. Facultad de Informática; Journal of Computer Science and Technology; 15; 2; 11-2015; 122-128
1666-6046
1666-6038
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://dialnet.unirioja.es/servlet/articulo?codigo=5388233
info:eu-repo/semantics/altIdentifier/url/https://journal.info.unlp.edu.ar/JCST/article/view/544
info:eu-repo/semantics/altIdentifier/url/https://www.redalyc.org/articulo.oa?id=638067264012
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidad Nacional de La Plata. Facultad de Informática
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar