Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets
- Authors
- Caiafa, Cesar Federico
- Publication year
- 2020
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- In many machine learning applications, measurements are incomplete or noisy, resulting in missing features. In other cases, the datasets are small to begin with, so more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets is a fundamental and classic challenge in machine learning. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate these methods in selected practical machine learning examples: brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low-quality datasets.
Instituto Argentino de Radioastronomía
- Subject
-
Electronic Engineering
Computer Science
Empirical mode decomposition
Machine learning
Sparse representations
Tensor decomposition
Tensor completion
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by/4.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/119641
Full record metadata
id | SEDICI_0080c7491955ac0f5af9a35501b4bbc8 |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/119641 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
dc.creator.none.fl_str_mv | Caiafa, Cesar Federico |
author_role | author |
dc.subject.none.fl_str_mv | Electronic Engineering; Computer Science; Empirical mode decomposition; Machine learning; Sparse representations; Tensor decomposition; Tensor completion |
publishDate | 2020 |
dc.date.none.fl_str_mv | 2020 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; Articulo; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/119641 |
url | http://sedici.unlp.edu.ar/handle/10915/119641 |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/issn/2076-3417; info:eu-repo/semantics/altIdentifier/doi/10.3390/app10238481 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0/; Creative Commons Attribution 4.0 International (CC BY 4.0) |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | http://creativecommons.org/licenses/by/4.0/; Creative Commons Attribution 4.0 International (CC BY 4.0) |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
reponame_str | SEDICI (UNLP) |
collection | SEDICI (UNLP) |
instname_str | Universidad Nacional de La Plata |
instacron_str | UNLP |
institution | UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |