Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets

Authors
Caiafa, César Federico; Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa
Publication year
2021
Language
English
Resource type
book part
Status
published version
Description
In many machine learning applications, measurements are incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are small to begin with, so more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets is a fundamental and classic challenge in machine learning. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in selected practical machine learning examples, including brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low-quality datasets.
Affiliation: Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina
Affiliation: Solé Casals, Jordi. University of Vic; Spain
Affiliation: Marti Puig, Pere. University of Vic; Spain
Affiliation: Zhe, Sun. Head Office for Information Systems and Cybersecurity. Computational Engineering Applications Unit. RIKEN; Japan
Affiliation: Tanaka, Toshihisa. Tokyo University of Agriculture and Technology; Japan
Subject
machine learning
incomplete data
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/137669

Subject classification
https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
Identifier
http://hdl.handle.net/11336/137669
Citation
Caiafa, César Federico; Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa; Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets; MDPI; 2021; 5-24
ISBN
978-3-0365-1288-4
Related URL
https://www.mdpi.com/books/pdfview/book/3727
Publisher
MDPI