A text classification framework for simple and effective early depression detection over social media streams

Authors
Burdisso, Sergio Gastón; Errecalde, Marcelo Luis; Montes y Gómez, Manuel
Publication year
2019
Language
English
Resource type
article
Status
published version
Description
With the rise of the Internet, there is a growing need for intelligent systems that can efficiently handle early risk detection (ERD) problems on social media, such as early depression detection, early rumor detection, or identification of sexual predators. These systems, nowadays mostly based on machine learning techniques, must be able to process data streams, since users provide their data over time. They must also be able to decide when the data processed so far is sufficient to classify a user. Moreover, since ERD tasks involve risky decisions that could affect people's lives, such systems must be able to justify their decisions. However, most standard and state-of-the-art supervised machine learning models (such as SVM, MNB, and neural networks) are not well suited to this scenario, because they either act as black boxes or do not support incremental classification and learning. In this paper we introduce SS3, a novel supervised learning model for text classification that naturally supports these requirements. SS3 was designed to be used as a general framework for ERD problems. We evaluated our model on CLEF's eRisk2017 pilot task on early depression detection. Most of the 30 contributions submitted to this competition used state-of-the-art methods. Experimental results show that our classifier outperformed these models and standard classifiers while being less computationally expensive and able to explain its rationale.
Affiliation: Burdisso, Sergio Gastón. Universidad Nacional de San Luis; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina
Affiliation: Errecalde, Marcelo Luis. Universidad Nacional de San Luis; Argentina
Affiliation: Montes y Gómez, Manuel. Instituto Nacional de Astrofísica, Óptica y Electrónica; Mexico
Subject
EARLY DEPRESSION DETECTION
EARLY TEXT CLASSIFICATION
EXPLAINABILITY
INCREMENTAL CLASSIFICATION
INTERPRETABILITY
SS3
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/140606

id CONICETDig_f509f977aa7cf8770bae68b0904f3dc3
oai_identifier_str oai:ri.conicet.gov.ar:11336/140606
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv A text classification framework for simple and effective early depression detection over social media streams
dc.creator.none.fl_str_mv Burdisso, Sergio Gastón
Errecalde, Marcelo Luis
Montes y Gómez, Manuel
dc.subject.none.fl_str_mv EARLY DEPRESSION DETECTION
EARLY TEXT CLASSIFICATION
EXPLAINABILITY
INCREMENTAL CLASSIFICATION
INTERPRETABILITY
SS3
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
publishDate 2019
dc.date.none.fl_str_mv 2019-11-01
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/140606
Burdisso, Sergio Gastón; Errecalde, Marcelo Luis; Montes y Gómez, Manuel; A text classification framework for simple and effective early depression detection over social media streams; Pergamon-Elsevier Science Ltd; Expert Systems with Applications; 133; 1-11-2019; 182-197
0957-4174
1873-6793
CONICET Digital
CONICET
url http://hdl.handle.net/11336/140606
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/abs/pii/S0957417419303525
info:eu-repo/semantics/altIdentifier/doi/10.1016/j.eswa.2019.05.023
info:eu-repo/semantics/altIdentifier/url/https://arxiv.org/abs/1905.08772
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
application/pdf
dc.publisher.none.fl_str_mv Pergamon-Elsevier Science Ltd
publisher.none.fl_str_mv Pergamon-Elsevier Science Ltd
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str CONICET Digital (CONICET)
collection CONICET Digital (CONICET)
instname_str Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar