Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets

Authors
Caiafa, Cesar Federico
Publication year
2020
Language
English
Resource type
article
Status
published version
Description
In many machine learning applications, measurements are sometimes incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples, including brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low-quality datasets.
Instituto Argentino de Radioastronomía
Subject
Electronic Engineering
Computer Science
Empirical mode decomposition
Machine learning
Sparse representations
Tensor decomposition
Tensor completion
Access level
open access
Terms of use
http://creativecommons.org/licenses/by/4.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/119641

Handle
http://sedici.unlp.edu.ar/handle/10915/119641
ISSN
2076-3417
DOI
10.3390/app10238481