An experimental study for the Cross Domain Author Profiling classification

Authors
Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia; Errecalde, Marcelo Luis
Publication year
2015
Language
English
Resource type
conference document
Status
published version
Description
Author profiling is the task of predicting characteristics of the author of a text, such as age, gender, personality, or native language. The task is of growing importance because of its potential applications in security, crime detection, and marketing, among others. An interesting question is how robust a classifier is when it is trained on one dataset and tested on others with different characteristics; this is commonly called cross-domain experimentation. Although several cross-domain studies have been carried out on English-language datasets, such work for Spanish has only recently begun. In this context, this work presents a study of cross-domain classification for the author profiling task in Spanish. The experimental results showed that, using corpora with different levels of formality, robust classifiers can be obtained for author profiling in Spanish.
XII Workshop Bases de Datos y Minería de Datos (WBDDM)
Red de Universidades con Carreras en Informática (RedUNCI)
Subject
Computer Science
Natural Language Processing
cross domain classification
author profiling
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/50445

id SEDICI_0afad1262d639be162e2004756fb8167
oai_identifier_str oai:sedici.unlp.edu.ar:10915/50445
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
dc.title.none.fl_str_mv An experimental study for the Cross Domain Author Profiling classification
dc.creator.none.fl_str_mv Garciarena Ucelay, María José
Villegas, María Paula
Cagnina, Leticia
Errecalde, Marcelo Luis
dc.subject.none.fl_str_mv Computer Science
Natural Language Processing
cross domain classification
author profiling
publishDate 2015
dc.date.none.fl_str_mv 2015-10
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
Conference object
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/50445
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/isbn/978-987-3806-05-6
info:eu-repo/semantics/reference/hdl/10915/50028
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5)
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar