Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets
- Authors
- Caiafa, César Federico; Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa
- Year of publication
- 2021
- Language
- English
- Resource type
- book part
- Status
- published version
- Description
- In many machine learning applications, measurements are sometimes incomplete or noisy resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore, more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing features imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples including: brain computer interface, epileptic intracranial electroencephalogram signals classification, face recognition/verification and water networks data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low quality datasets.
Affiliation: Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina
Affiliation: Solé Casals, Jordi. University of Vic; Spain
Affiliation: Marti Puig, Pere. University of Vic; Spain
Affiliation: Zhe, Sun. Head Office for Information Systems and Cybersecurity. Computational Engineering Applications Unit. RIKEN; Japan
Affiliation: Tanaka, Toshihisa. Tokyo University of Agriculture and Technology; Japan
- Subject
- machine learning
- incomplete data
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI Identifier
- oai:ri.conicet.gov.ar:11336/137669
Full record metadata
id | CONICETDig_a4f0d0b7d23f5cf42ca78512380bc0e8 |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/137669 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
title | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
spellingShingle | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets; Caiafa, César Federico; machine learning; incomplete data |
title_short | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
title_full | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
title_fullStr | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
title_full_unstemmed | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
title_sort | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
dc.creator.none.fl_str_mv | Caiafa, César Federico; Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa |
author | Caiafa, César Federico |
author_facet | Caiafa, César Federico; Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa |
author_role | author |
author2 | Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa |
author2_role | author; author; author; author |
dc.contributor.none.fl_str_mv | Caiafa, César Federico; Solé Casals, Jordi |
dc.subject.none.fl_str_mv | machine learning; incomplete data |
topic | machine learning; incomplete data |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2; https://purl.org/becyt/ford/1 |
publishDate | 2021 |
dc.date.none.fl_str_mv | 2021 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/bookPart; http://purl.org/coar/resource_type/c_3248; info:ar-repo/semantics/parteDeLibro |
status_str | publishedVersion |
format | bookPart |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/137669 Caiafa, César Federico; Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa; Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets; MDPI; 2021; 5-24 978-3-0365-1288-4 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/137669 |
identifier_str_mv | Caiafa, César Federico; Solé Casals, Jordi; Marti Puig, Pere; Zhe, Sun; Tanaka, Toshihisa; Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets; MDPI; 2021; 5-24 978-3-0365-1288-4 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://www.mdpi.com/books/pdfview/book/3727 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf; application/pdf |
dc.publisher.none.fl_str_mv | MDPI |
publisher.none.fl_str_mv | MDPI |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |