Learning when to classify for early text classification

Authors
Loyola, Juan Martin; Errecalde, Marcelo Luis; Escalante, Hugo Jair; Montes y Gómez, Manuel
Year of publication
2017
Language
English
Resource type
conference paper
Status
published version
Description
Classification is a widely studied problem in supervised learning. Nonetheless, some scenarios have received little attention despite their applicability. One such scenario is early text classification, where the category of a document must be determined as soon as possible. The importance of this variant of the classification problem is evident in tasks like sexual predator detection, where one wants to identify an offender as early as possible. This paper presents a framework for early text classification that highlights the two main components of the problem: classification with partial information and deciding the moment of classification. In this context, the paper introduces a novel approach that learns the second component (when to classify) and an adaptation of a temporal measure to multi-class problems. Results on a classical text classification corpus, compared against a model that reads entire documents, confirm the feasibility of the approach.
Affiliation: Loyola, Juan Martin. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi". Universidad Nacional de San Luis. Facultad de Ciencias Físico, Matemáticas y Naturales. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi"; Argentina
Affiliation: Errecalde, Marcelo Luis. Universidad Nacional de San Luis. Facultad de Ciencias Fisico Matematicas y Naturales. Departamento de Informatica; Argentina
Affiliation: Escalante, Hugo Jair. Instituto Nacional de Astrofísica, Óptica y Electrónica; Mexico
Affiliation: Montes y Gómez, Manuel. Instituto Nacional de Astrofísica, Óptica y Electrónica; Mexico
XXIII Congreso Argentino de Ciencias de la Computación
La Plata
Argentina
Universidad Nacional de La Plata. Facultad de Informática
Red de Universidades con Carreras en Informática
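
The two components named in the description (classification with partial information, and deciding the moment of classification) can be illustrated with a minimal Python sketch. It is not the paper's method: the partial-information classifier here is a small multinomial naive Bayes model, and the decision component is a fixed confidence threshold standing in for the learned "when to classify" model the paper proposes. All function names, the toy training data, and the 0.8 threshold are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Fit per-class word counts for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def posterior(tokens, model):
    """Class probabilities given only the tokens read so far."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for tok in tokens:
            # Laplace smoothing so unseen words do not zero out a class.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {c: v / norm for c, v in exp_scores.items()}


def should_stop(probs, threshold):
    """Stand-in decision rule (assumption): stop when the top class is confident."""
    return max(probs.values()) >= threshold


def classify_early(text, model, threshold=0.8):
    """Read the document term by term; return (label, terms_read)."""
    tokens = text.lower().split()
    probs = posterior([], model)
    for n in range(1, len(tokens) + 1):
        probs = posterior(tokens[:n], model)
        if should_stop(probs, threshold):
            return max(probs, key=probs.get), n
    # No early decision was made: fall back to the full-document prediction.
    return max(probs, key=probs.get), len(tokens)


# Toy data, invented for this example.
train_docs = [
    "goal match team score",
    "team win match league",
    "stock market price fall",
    "market bank price rise",
]
train_labels = ["sport", "sport", "finance", "finance"]
model = train_naive_bayes(train_docs, train_labels)

label, terms_read = classify_early("market price rally after bank report", model)
print(label, terms_read)  # prints "finance 2"
```

In this sketch the trade-off between earliness and accuracy is controlled by `threshold` alone; replacing `should_stop` with a trained model is where a learned decision component would plug in.
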
Subject
MACHINE LEARNING
EARLY TEXT CLASSIFICATION
CLASSIFICATION WITH PARTIAL INFORMATION
DECISION OF THE MOMENT OF CLASSIFICATION
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/262506
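
An OAI identifier like the one above is normally resolved with an OAI-PMH `GetRecord` request. The following Python sketch only builds the request URL and performs no network access. The base URL is an assumption about the repository's OAI-PMH endpoint and should be verified before use; the helper name `get_record_url` is hypothetical. `oai_dc` is the metadata format every OAI-PMH repository is required to support.

```python
from urllib.parse import urlencode

# Assumption: the repository's OAI-PMH endpoint; verify before relying on it.
BASE_URL = "http://ri.conicet.gov.ar/oai/request"


def get_record_url(identifier, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


url = get_record_url("oai:ri.conicet.gov.ar:11336/262506")
print(url)
```

The colons and slash in the identifier are percent-encoded by `urlencode`, which is what the protocol expects for query parameters.
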

id CONICETDig_29d87a0404ee1f3a8267fc4366400961
oai_identifier_str oai:ri.conicet.gov.ar:11336/262506
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Learning when to classify for early text classification
dc.creator.none.fl_str_mv Loyola, Juan Martin
Errecalde, Marcelo Luis
Escalante, Hugo Jair
Montes y Gómez, Manuel
author Loyola, Juan Martin
author_facet Loyola, Juan Martin
Errecalde, Marcelo Luis
Escalante, Hugo Jair
Montes y Gómez, Manuel
author_role author
author2 Errecalde, Marcelo Luis
Escalante, Hugo Jair
Montes y Gómez, Manuel
author2_role author
author
author
dc.subject.none.fl_str_mv MACHINE LEARNING
EARLY TEXT CLASSIFICATION
CLASSIFICATION WITH PARTIAL INFORMATION
DECISION OF THE MOMENT OF CLASSIFICATION
topic MACHINE LEARNING
EARLY TEXT CLASSIFICATION
CLASSIFICATION WITH PARTIAL INFORMATION
DECISION OF THE MOMENT OF CLASSIFICATION
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
publishDate 2017
dc.date.none.fl_str_mv 2017
dc.type.none.fl_str_mv info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/conferenceObject
Congreso
Book
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
status_str publishedVersion
format conferenceObject
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/262506
Learning when to classify for early text classification; XXIII Congreso Argentino de Ciencias de la Computación; La Plata; Argentina; 2017; 103-112
978-950-34-1539-9
CONICET Digital
CONICET
url http://hdl.handle.net/11336/262506
identifier_str_mv Learning when to classify for early text classification; XXIII Congreso Argentino de Ciencias de la Computación; La Plata; Argentina; 2017; 103-112
978-950-34-1539-9
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://sedici.unlp.edu.ar/handle/10915/63498
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.coverage.none.fl_str_mv Nacional
dc.publisher.none.fl_str_mv Universidad Nacional de La Plata. Facultad de Informática
publisher.none.fl_str_mv Universidad Nacional de La Plata. Facultad de Informática
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str CONICET Digital (CONICET)
collection CONICET Digital (CONICET)
instname_str Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar