Decomposition methods for machine learning with small, incomplete or noisy datasets

Authors
Caiafa, César Federico; Sole Casals, Jordi; Marti Puig, Pere; Sun, Zhe; Tanaka, Toshihisa
Publication year
2020
Language
English
Resource type
article
Status
published version
Description
In many machine learning applications, measurements are sometimes incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples, including brain-computer interface, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water networks data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low-quality datasets.
Affiliation: Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina
Affiliation: Sole Casals, Jordi. Center for Advanced Intelligence; Japan
Affiliation: Marti Puig, Pere. University of Catalonia; Spain
Affiliation: Sun, Zhe. RIKEN; Japan
Affiliation: Tanaka, Toshihisa. Tokyo University of Agriculture and Technology; Japan
Subject
empirical mode decomposition
machine learning
sparse representation
tensor decomposition
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/127445

Citation
Caiafa, César Federico; Sole Casals, Jordi; Marti Puig, Pere; Sun, Zhe; Tanaka, Toshihisa; Decomposition methods for machine learning with small, incomplete or noisy datasets; MDPI; Applied Sciences; 10; 23; 11-2020; 1-21
ISSN
2076-3417
Publisher
MDPI
Handle
http://hdl.handle.net/11336/127445
Publisher URL
https://www.mdpi.com/2076-3417/10/23/8481
DOI
10.3390/app10238481
Format
application/pdf