Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices
- Authors
- Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge
- Publication year
- 2015
- Language
- English
- Resource type
- article
- Status
- published version
- Description
The problem of incomplete data matrices is repeatedly found in large databases, posing a significant obstacle to the effective treatment of data. This paper examines a self-organizing-map (SOM) based method of data imputation, under the concept of distance object per one weight, to predict physicochemical parameters of water samples in a data set where concentrations of different analytes were missing. The method was evaluated in two configurations: (a) including vectors of samples with and without missing data in the training data set and (b) pre-training a SOM on a data set with no missing values and then making imputations for a second data set (prediction set) of samples with missing values. Evaluations were made using a surface water data set of 270 samples from the Reconquista River, in Buenos Aires Province, Argentina, by artificially setting 17% to 39% of the data to missing. Results were compared to imputations made through professional criteria. SOMs gave reasonable estimates, with no statistically significant differences from estimates made through professional criteria, thus proving to be a suitable time-saving imputation method.
Affiliation: Laura Folguera. Universidad Nacional de San Martín. Instituto de Investigación e Ingeniería Ambiental, Buenos Aires, Argentina.
Affiliation: Jure Zupan. National Institute of Chemistry, Ljubljana, Slovenia.
Affiliation: Daniel Cicerone. Universidad Nacional de San Martín. Instituto de Investigación e Ingeniería Ambiental, Buenos Aires, Argentina.
Affiliation: Jorge Magallanes. Universidad Nacional de San Martín. Instituto de Investigación e Ingeniería Ambiental, Buenos Aires, Argentina.
- Source
- Chemometrics and Intelligent Laboratory Systems. 143: 146-151 (2015) Elsevier B.V.
http://dx.doi.org/10.1016/j.chemolab.2015.03.002
- Subject
- CHEMOMETRICS
ARTIFICIAL NEURAL NETWORK
SELF-ORGANIZING MAPS
MISSING DATA IMPUTATION
ENVIRONMENTAL DATA SET
CHEMICAL SCIENCES
EXACT AND NATURAL SCIENCES
- Access level
- restricted access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Universidad Nacional de General San Martín
- OAI identifier
- oai:ri.unsam.edu.ar:123456789/1009
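As an illustration of the approach summarized in the description above, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes that missing values are encoded as `None`, that distances are computed only over the observed components and normalised by their count (one possible reading of "distance per weight", which the abstract does not define), and that imputed values are read from the weights of the best-matching unit. The function names (`masked_distance`, `best_matching_unit`, `train_som`, `impute`), the 3x3 map size, the decay schedules, and the toy data are all hypothetical.

```python
import math
import random


def masked_distance(x, w):
    """RMS distance between sample x and weight vector w over observed entries.

    Missing entries of x are None and are skipped; dividing by the number of
    observed entries is an assumed reading of "distance per weight".
    """
    diffs = [(xi - wi) ** 2 for xi, wi in zip(x, w) if xi is not None]
    if not diffs:
        raise ValueError("sample has no observed components")
    return math.sqrt(sum(diffs) / len(diffs))


def best_matching_unit(x, weights):
    """Index of the map unit whose weights are closest to x."""
    return min(range(len(weights)), key=lambda j: masked_distance(x, weights[j]))


def train_som(data, rows=3, cols=3, epochs=200, seed=0):
    """Train a small rectangular SOM; None entries never update a weight."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each weight inside the observed range of its variable.
    observed = [[s[k] for s in data if s[k] is not None] for k in range(dim)]
    weights = [[rng.uniform(min(vals), max(vals)) for vals in observed]
               for _ in range(rows * cols)]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac) + 0.01                     # decaying learning rate
        radius = 0.5 * max(rows, cols) * (1.0 - frac) + 0.5  # shrinking neighbourhood
        for x in rng.sample(data, len(data)):                # shuffled pass over samples
            b_row, b_col = divmod(best_matching_unit(x, weights), cols)
            for j, w in enumerate(weights):
                row, col = divmod(j, cols)
                grid_d2 = (row - b_row) ** 2 + (col - b_col) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for k, xk in enumerate(x):
                    if xk is not None:                       # skip missing components
                        w[k] += rate * h * (xk - w[k])
    return weights


def impute(x, weights):
    """Fill the None entries of x with the best-matching unit's weights."""
    w = weights[best_matching_unit(x, weights)]
    return [wk if xk is None else xk for xk, wk in zip(x, w)]


# Toy data (made up): two groups of "samples" with three variables each.
complete = [
    [1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [0.8, 1.0, 1.1], [1.1, 1.0, 1.0],
    [9.0, 9.1, 8.9], [9.2, 8.9, 9.0], [8.8, 9.0, 9.1], [9.1, 9.0, 9.0],
]
som = train_som(complete)                # option (b): pre-train on complete data
filled = impute([1.0, None, 1.0], som)   # then impute a sample from a prediction set
```

In this sketch, passing a training list that also contains incomplete vectors would correspond to option (a), since `train_som` skips `None` entries when updating weights.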
id | RIUNSAM_38bbb9fe520b192a07130ae0912e128a |
---|---|
oai_identifier_str | oai:ri.unsam.edu.ar:123456789/1009 |
network_acronym_str | RIUNSAM |
repository_id_str | s |
network_name_str | Repositorio Institucional (UNSAM) |
dc.title.none.fl_str_mv | Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
dc.creator.none.fl_str_mv | Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
author | Folguera, Laura |
author_facet | Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
author_role | author |
author2 | Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
author2_role | author; author; author |
dc.subject.none.fl_str_mv | CHEMOMETRICS; ARTIFICIAL NEURAL NETWORK; SELF-ORGANIZING MAPS; MISSING DATA IMPUTATION; ENVIRONMENTAL DATA SET; CHEMICAL SCIENCES; EXACT AND NATURAL SCIENCES |
topic | CHEMOMETRICS; ARTIFICIAL NEURAL NETWORK; SELF-ORGANIZING MAPS; MISSING DATA IMPUTATION; ENVIRONMENTAL DATA SET; CHEMICAL SCIENCES; EXACT AND NATURAL SCIENCES |
publishDate | 2015 |
dc.date.none.fl_str_mv | 2015-03 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
status_str | publishedVersion |
format | article |
dc.identifier.none.fl_str_mv | Folguera, L. et al. (2015). Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices. In: Chemometrics and Intelligent Laboratory Systems. Elsevier Science 143, 146-151; 0169-7439; https://ri.unsam.edu.ar/handle/123456789/1009 |
identifier_str_mv | Folguera, L. et al. (2015). Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices. In: Chemometrics and Intelligent Laboratory Systems. Elsevier Science 143, 146-151; 0169-7439 |
url | https://ri.unsam.edu.ar/handle/123456789/1009 |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/restrictedAccess; http://creativecommons.org/licenses/by-nc-sa/2.5/ar/; Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
eu_rights_str_mv | restrictedAccess |
rights_invalid_str_mv | http://creativecommons.org/licenses/by-nc-sa/2.5/ar/; Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
dc.format.none.fl_str_mv | application/pdf; pp. 146-151; application/pdf |
dc.publisher.none.fl_str_mv | Elsevier Science Bv |
publisher.none.fl_str_mv | Elsevier Science Bv |
dc.source.none.fl_str_mv | Chemometrics and Intelligent Laboratory Systems. 143: 146-151 (2015) Elsevier B.V.; http://dx.doi.org/10.1016/j.chemolab.2015.03.002; reponame:Repositorio Institucional (UNSAM); instname:Universidad Nacional de General San Martín |
reponame_str | Repositorio Institucional (UNSAM) |
collection | Repositorio Institucional (UNSAM) |
instname_str | Universidad Nacional de General San Martín |
repository.name.fl_str_mv | Repositorio Institucional (UNSAM) - Universidad Nacional de General San Martín |
repository.mail.fl_str_mv | lpastran@unsam.edu.ar |