On semi-supervised learning
- Authors
- Cholaquidis, A.; Fraiman, R.; Sued, Raquel Mariela
- Year of publication
- 2020
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Major efforts have been made, mostly in the machine learning literature, to construct good predictors by combining unlabelled and labelled data. These methods are known as semi-supervised. They address the problem of how to take advantage, when possible, of a large amount of unlabelled data to perform classification when labelled data are scarce. This is not always feasible: it depends on whether the labels can be inferred from the distribution of the unlabelled data. Nevertheless, several algorithms have been proposed recently. In this work, we present a new method that, under almost necessary conditions, asymptotically attains the performance of the best theoretical rule as the size of the unlabelled sample goes to infinity, even if the size of the labelled sample remains fixed. Its performance and computational time are assessed through simulations and on the well-known “Isolet” phoneme dataset, where a strong dependence on the choice of the initial training sample is shown. The main focus of this work is to elucidate when and why semi-supervised learning works in the asymptotic regime described above. The set of necessary assumptions, although reasonable, shows that semi-supervised methods only attain consistency for very well-conditioned problems.
Affiliation: Cholaquidis, A. Universidad de la República; Uruguay
Affiliation: Fraiman, R. Universidad de la República; Uruguay
Affiliation: Sued, Raquel Mariela. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
- Subject
-
CONSISTENCY
SEMI-SUPERVISED LEARNING
SMALL TRAINING SAMPLE
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI Identifier
- oai:ri.conicet.gov.ar:11336/147485
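The Description field above refers to methods that combine a fixed, small labelled sample with a large unlabelled one. As a purely illustrative sketch, here is one common such strategy, self-training with a 1-nearest-neighbour base classifier: the most "confident" unlabelled point (here, crudely, the one closest to the labelled set) is pseudo-labelled and absorbed into the training data. This is not the authors' method; their algorithm and the conditions under which it attains the best theoretical rule are defined in the article itself.

```python
# Illustrative only: a generic self-training loop on 1-D features,
# NOT the algorithm studied in the paper.

def nn1_predict(xs, ys, x):
    """1-nearest-neighbour prediction for a scalar feature x."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
    return ys[i]

def self_train(xs, ys, pool, rounds):
    """Repeatedly pseudo-label the unlabelled point closest to the
    current labelled set (a crude confidence proxy) and absorb it."""
    xs, ys, pool = list(xs), list(ys), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        x = min(pool, key=lambda u: min(abs(u - v) for v in xs))
        pool.remove(x)
        yhat = nn1_predict(xs, ys, x)  # label before absorbing x
        xs.append(x)
        ys.append(yhat)
    return xs, ys

# Two well-separated classes, one labelled point each; the four
# unlabelled points are absorbed and labelled by proximity.
xs, ys = self_train([0.0, 10.0], [0, 1], [1.0, 2.0, 9.0, 8.5], rounds=4)
print(dict(zip(xs, ys)))
# → {0.0: 0, 10.0: 1, 1.0: 0, 2.0: 0, 9.0: 1, 8.5: 1}
```

As the abstract notes, whether such propagation recovers the true labels depends on the unlabelled distribution: it succeeds here only because the two classes form well-separated clusters.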
- Citation
- Cholaquidis, A.; Fraiman, R.; Sued, Raquel Mariela; On semi-supervised learning; Springer; Test; 29; 4; 12-2020; 914-937
- ISSN
- 1133-0686
- DOI
- 10.1007/s11749-019-00690-2
- Handle
- http://hdl.handle.net/11336/147485
- arXiv
- https://arxiv.org/abs/1805.09180