Cross Domain Author Profiling Task in Spanish Language: An Experimental Study
- Authors
- Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis
- Publication year
- 2015
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Author profiling is the task of predicting characteristics of the author of a text, such as age, gender, personality, or native language. The task is growing in importance because of its potential applications in security, crime detection, and marketing, among others. An interesting question is how robust a classifier is when it is trained on one data set and tested on others with different characteristics; this is commonly called cross-domain experimentation. Although several cross-domain studies have been carried out on English-language data sets, such work has only recently begun for Spanish. In this context, this work presents a study of cross-domain classification for the author profiling task in Spanish. The experimental results showed that robust classifiers for author profiling in Spanish can be obtained using corpora with different levels of formality.
Affiliation: Garciarena Ucelay, María José. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
Affiliation: Villegas, María Paula. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina
Affiliation: Cagnina, Leticia Cecilia. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina
Affiliation: Errecalde, Marcelo Luis. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina
- Subject
- AUTHOR PROFILING
NATURAL PROCESSING LANGUAGE
CROSS DOMAIN CLASSIFICATION
SPANISH LANGUAGE
TEXT MINING
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/154303
id | CONICETDig_8c9bca49e1b7e6ece75ba0cee1c1f115 |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/154303 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Cross Domain Author Profiling Task in Spanish Language: An Experimental Study |
dc.creator.none.fl_str_mv | Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis |
author | Garciarena Ucelay, María José |
author_facet | Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis |
author_role | author |
author2 | Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis |
author2_role | author; author; author |
dc.subject.none.fl_str_mv | AUTHOR PROFILING; NATURAL PROCESSING LANGUAGE; CROSS DOMAIN CLASSIFICATION; SPANISH LANGUAGE; TEXT MINING |
topic | AUTHOR PROFILING; NATURAL PROCESSING LANGUAGE; CROSS DOMAIN CLASSIFICATION; SPANISH LANGUAGE; TEXT MINING |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
publishDate | 2015 |
dc.date.none.fl_str_mv | 2015-11 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/154303 Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis; Cross Domain Author Profiling Task in Spanish Language: An Experimental Study; Universidad Nacional de La Plata. Facultad de Informática; Journal of Computer Science and Technology; 15; 2; 11-2015; 122-128 1666-6046 1666-6038 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/154303 |
identifier_str_mv | Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia Cecilia; Errecalde, Marcelo Luis; Cross Domain Author Profiling Task in Spanish Language: An Experimental Study; Universidad Nacional de La Plata. Facultad de Informática; Journal of Computer Science and Technology; 15; 2; 11-2015; 122-128 1666-6046 1666-6038 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://dialnet.unirioja.es/servlet/articulo?codigo=5388233 info:eu-repo/semantics/altIdentifier/url/https://journal.info.unlp.edu.ar/JCST/article/view/544 info:eu-repo/semantics/altIdentifier/url/https://www.redalyc.org/articulo.oa?id=638067264012 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv | Universidad Nacional de La Plata. Facultad de Informática |
publisher.none.fl_str_mv | Universidad Nacional de La Plata. Facultad de Informática |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |