On semi-supervised learning

Authors
Cholaquidis, A.; Fraiman, R.; Sued, Raquel Mariela
Year of publication
2020
Language
English
Resource type
article
Status
published version
Description
Major efforts have been made, mostly in the machine learning literature, to construct good predictors that combine unlabelled and labelled data. These methods are known as semi-supervised. They address the problem of how to take advantage, when possible, of a large amount of unlabelled data to perform classification in situations where labelled data are scarce. This is not always feasible: it depends on whether the labels can be inferred from the distribution of the unlabelled data. Nevertheless, several algorithms have been proposed recently. In this work, we present a new method that, under almost necessary conditions, asymptotically attains the performance of the best theoretical rule as the size of the unlabelled sample goes to infinity, even if the size of the labelled sample remains fixed. Its performance and computational time are assessed through simulations and on the well-known “Isolet” real phoneme data, where a strong dependence on the choice of the initial training sample is shown. The main focus of this work is to elucidate when and why semi-supervised learning works in the asymptotic regime described above. The set of necessary assumptions, although reasonable, shows that semi-supervised methods only attain consistency for very well-conditioned problems.
Affiliation: Cholaquidis, A. Universidad de la República; Uruguay
Affiliation: Fraiman, R. Universidad de la República; Uruguay
Affiliation: Sued, Raquel Mariela. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
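The abstract describes propagating label information from a small, fixed labelled sample through a large unlabelled sample. As a rough illustration of this general idea only (a generic nearest-neighbour self-training sketch, not the authors' algorithm; the function name and point format are hypothetical), consider:

```python
def dist(a, b):
    """Euclidean distance between two points given as tuples of coordinates."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def propagate_labels(labelled, unlabelled):
    """Generic self-training by nearest-neighbour label propagation.

    labelled:   list of (point, label) pairs, where point is a coordinate tuple.
    unlabelled: list of points.

    Repeatedly finds the unlabelled point closest to the current labelled set,
    gives it the label of its nearest labelled neighbour, and moves it into the
    labelled set, until no unlabelled points remain.
    """
    labelled = list(labelled)
    pool = list(unlabelled)
    while pool:
        # Pick the (unlabelled, labelled) pair at minimum distance.
        u, (_, label) = min(
            ((u, l) for u in pool for l in labelled),
            key=lambda pair: dist(pair[0], pair[1][0]),
        )
        pool.remove(u)
        labelled.append((u, label))
    return labelled
```

With two seed points of opposite labels and unlabelled points clustered around them, each cluster inherits the label of its nearby seed; the sketch ignores the conditions on the data distribution under which such propagation is actually consistent, which is precisely what the paper studies.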
Subject
CONSISTENCY
SEMI-SUPERVISED LEARNING
SMALL TRAINING SAMPLE
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/147485

Publisher
Springer
Citation
Cholaquidis, A.; Fraiman, R.; Sued, Raquel Mariela; On semi-supervised learning; Springer; Test; 29; 4; 12-2020; 914-937
ISSN
1133-0686
DOI
10.1007/s11749-019-00690-2
Links
http://hdl.handle.net/11336/147485
https://link.springer.com/article/10.1007%2Fs11749-019-00690-2
https://arxiv.org/abs/1805.09180
Subject classification (FORD)
https://purl.org/becyt/ford/1.1
https://purl.org/becyt/ford/1