Compression-based regularization with an application to multitask learning

Authors
Vera, Matías Alejandro; Rey Vega, Leonardo Javier; Piantanida, Pablo
Year of publication
2018
Language
English
Resource type
article
Status
published version
Description
This paper investigates, on information-theoretic grounds, a learning problem based on the principle that any regularity in a given dataset can be exploited to extract compact features from the data, i.e., using fewer bits than needed to fully describe the data itself, in order to build meaningful representations of the relevant content (multiple labels). We begin by studying a multitask learning (MTL) problem from the point of view of the average (over the tasks) misclassification probability, and we link it with the popular cross-entropy criterion. Our approach leads to an information-theoretic formulation of the MTL problem as a supervised learning framework in which the prediction models for several related tasks are learned jointly from common representations to achieve better generalization performance. More precisely, our formulation of the MTL problem can be interpreted as an information bottleneck problem with side information at the decoder. Based on this, we present an iterative algorithm for computing the optimal tradeoffs and study some of its convergence properties. An important feature of this algorithm is that it provides a natural safeguard against overfitting, because it minimizes the average risk while taking into account a penalty induced by the model complexity. Remarkably, empirical results illustrate that there exists an optimal information rate minimizing the excess risk, which depends on the nature and the amount of available training data. Applications to hierarchical text categorization and distributional word clusters are also investigated, extending previous works.
Affiliation: Vera, Matías Alejandro. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Electronica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Rey Vega, Leonardo Javier. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Electronica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Centro de Simulación Computacional para Aplicaciones Tecnológicas; Argentina
Affiliation: Piantanida, Pablo. Université Paris Sud; France. Centre National de la Recherche Scientifique; France
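
The abstract above refers to an iterative, Arimoto-Blahut-style algorithm for computing the optimal compression-relevance tradeoffs of an information bottleneck problem. The paper's actual algorithm, which additionally handles side information at the decoder and multiple tasks, is not reproduced in this record; purely for orientation, the following is a minimal NumPy sketch of the classic single-task information bottleneck fixed-point iterations. The function name, signature, and every implementation detail are illustrative assumptions, not the authors' method.

```python
import numpy as np

def information_bottleneck(p_xy, n_clusters, beta, n_iter=200, seed=0, eps=1e-12):
    """Classic IB via Blahut-Arimoto-style fixed-point iterations (illustrative sketch).

    p_xy : joint distribution over (x, y), shape (|X|, |Y|), entries summing to 1.
    Returns the soft encoder q(t|x), shape (|X|, n_clusters).
    """
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)                       # marginal p(x)
    p_y_given_x = p_xy / (p_x[:, None] + eps)    # conditional p(y|x)

    # Random soft initialization of the encoder q(t|x).
    q_t_given_x = rng.random((p_xy.shape[0], n_clusters))
    q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        q_t = q_t_given_x.T @ p_x                                # marginal q(t)
        # Decoder update: q(y|t) = sum_x q(t|x) p(x) p(y|x) / q(t).
        q_y_given_t = (q_t_given_x * p_x[:, None]).T @ p_y_given_x
        q_y_given_t /= (q_t[:, None] + eps)
        # KL(p(y|x) || q(y|t)) for every pair (x, t).
        kl = (p_y_given_x[:, None, :] *
              (np.log(p_y_given_x[:, None, :] + eps) -
               np.log(q_y_given_t[None, :, :] + eps))).sum(axis=2)
        # Encoder update: q(t|x) proportional to q(t) * exp(-beta * KL).
        log_q = np.log(q_t + eps)[None, :] - beta * kl
        log_q -= log_q.max(axis=1, keepdims=True)                # numerical stability
        q_t_given_x = np.exp(log_q)
        q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)

    return q_t_given_x
```

In this sketch the parameter beta plays the role of the information-rate penalty discussed in the abstract: a small beta forces heavier compression of the input (stronger regularization), while a large beta yields representations closer to the raw data.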
Subject
ARIMOTO-BLAHUT ALGORITHM
INFORMATION BOTTLENECK
MULTITASK LEARNING
REGULARIZATION
SIDE INFORMATION
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI Identifier
oai:ri.conicet.gov.ar:11336/88736

Identifier
http://hdl.handle.net/11336/88736
Citation
Vera, Matías Alejandro; Rey Vega, Leonardo Javier; Piantanida, Pablo; Compression-based regularization with an application to multitask learning; IEEE Journal of Selected Topics in Signal Processing; vol. 12, no. 5, October 2018; pp. 1063-1076
Publisher
Institute of Electrical and Electronics Engineers
ISSN
1932-4553
DOI
10.1109/JSTSP.2018.2846218
Article URL
https://ieeexplore.ieee.org/document/8379424