An experimental study for the Cross Domain Author Profiling classification
- Authors
- Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia; Errecalde, Marcelo Luis
- Publication year
- 2015
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Author profiling is the task of predicting characteristics of the author of a text, such as age, gender, personality, or native language. The task is of growing importance because of its potential applications in security, crime detection, and marketing, among others. An interesting question is how robust a classifier is when it is trained on one dataset and tested on others with different characteristics; this is commonly called cross-domain experimentation. Although several cross-domain studies have been carried out on English datasets, such work has only recently begun for Spanish. In this context, this work presents a study of cross-domain classification for the author profiling task in Spanish. The experimental results showed that robust classifiers for author profiling in Spanish can be obtained by using corpora with different levels of formality.
XII Workshop Bases de Datos y Minería de Datos (WBDDM)
Red de Universidades con Carreras en Informática (RedUNCI)
- Subject
- Computer Science; Natural Language Processing; cross domain classification; author profiling
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/50445
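The cross-domain setup described in the abstract (train a classifier on one dataset, then test it on another with different characteristics) can be illustrated with a minimal sketch. This is not the authors' method: the Naive Bayes classifier, the whitespace tokenizer, the `teen`/`adult` labels, and the toy Spanish sentences are all hypothetical choices made only to show the evaluation protocol.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real feature extraction."""
    return text.lower().split()


def train_nb(docs):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> token frequencies
    label_counts = Counter()             # label -> number of documents
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs):
    """Fraction of (text, label) pairs that the model labels correctly."""
    hits = sum(predict(model, text) == label for text, label in docs)
    return hits / len(docs)


# Hypothetical toy data: two "domains" with different levels of formality.
formal_domain = [
    ("estimado señor le escribo por el informe", "adult"),
    ("adjunto el informe solicitado saludos cordiales", "adult"),
    ("profe le mando la tarea del cole", "teen"),
    ("hola profe no entiendo la tarea", "teen"),
]
informal_domain = [
    ("jaja la tarea del cole es re larga", "teen"),
    ("saludos a todos ya mande el informe", "adult"),
]

model = train_nb(formal_domain)                       # train on the source domain
in_domain_acc = accuracy(model, formal_domain)        # optimistic reference score
cross_domain_acc = accuracy(model, informal_domain)   # score on the unseen domain
print(f"in-domain: {in_domain_acc:.2f}  cross-domain: {cross_domain_acc:.2f}")
```

The gap between the in-domain and cross-domain scores is a rough indicator of robustness; with real corpora the in-domain figure would be computed on a held-out split rather than on the training texts themselves.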
Full record metadata:
Field | Value |
---|---|
id | SEDICI_0afad1262d639be162e2004756fb8167 |
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/50445 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | An experimental study for the Cross Domain Author Profiling classification |
dc.creator.none.fl_str_mv | Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia; Errecalde, Marcelo Luis |
author | Garciarena Ucelay, María José |
author_facet | Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia; Errecalde, Marcelo Luis |
author_role | author |
author2 | Villegas, María Paula; Cagnina, Leticia; Errecalde, Marcelo Luis |
author2_role | author; author; author |
dc.subject.none.fl_str_mv | Computer Science; Natural Language Processing; cross domain classification; author profiling |
topic | Computer Science; Natural Language Processing; cross domain classification; author profiling |
publishDate | 2015 |
dc.date.none.fl_str_mv | 2015-10 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format | conferenceObject |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/50445 |
url | http://sedici.unlp.edu.ar/handle/10915/50445 |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/isbn/978-987-3806-05-6; info:eu-repo/semantics/reference/hdl/10915/50028 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/2.5/ar/; Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | http://creativecommons.org/licenses/by-nc-sa/2.5/ar/; Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
reponame_str | SEDICI (UNLP) |
collection | SEDICI (UNLP) |
instname_str | Universidad Nacional de La Plata |
instacron_str | UNLP |
institution | UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |