Robustness of Predictive Data Mining Methods under the Presence of Measurement Errors in the Context of Production Processes

Authors
Dianda, Daniela Fernanda
Publication year
2017
Language
English
Resource type
article
Status
published version
Description
One of the main objectives of data analysis in industrial contexts is prediction: identifying a function that predicts the value of a response from the values of other variables considered potential predictors of that outcome. The large volumes of data that current technology can generate and store have made it necessary to develop alternatives to the traditional methods of analysis, chiefly ones that can process these large amounts of information and predict the response in real time. Grouped under the name Data Mining, many of these newer methods rely on automatic algorithms that mostly originated in computer science. However, the quality of the information that feeds these procedures remains a key factor in ensuring the reliability of the results. On this premise, this work studies the effect that faults in the measurement devices generating the data can have on the predictive ability of one predictive data mining method, decision trees. The results are compared with those obtained using a traditional statistical technique, multiple linear regression. They indicate that the effect of measurement-related errors on the predictive ability of decision trees, relative to traditional regression models, depends on the nature of the measurement error.
Affiliation: Dianda, Daniela Fernanda. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario; Argentina. Universidad Nacional de Rosario. Facultad de Ciencias Económicas y Estadística. Escuela de Estadística. Instituto de Investigaciones Teóricas y Aplicadas; Argentina
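The kind of comparison the description outlines can be illustrated with a small simulation. The sketch below is not the article's experimental design: the data-generating model, the two error types (random noise and a constant bias added to the predictor), the single predictor, the choice to contaminate only the test measurements, and all function names are assumptions made for illustration. It uses only the Python standard library, with a simple CART-style regression tree standing in for the decision-tree method and a one-predictor least-squares line standing in for multiple linear regression.

```python
import random
import statistics


def fit_ols(xs, ys):
    """Least-squares line y = a + b*x (one predictor, for brevity)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x


def sse(values):
    """Sum of squared deviations from the mean."""
    m = statistics.fmean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(xs, ys, depth=4, min_leaf=10):
    """Small CART-style regression tree: greedy squared-error splits."""
    pairs = sorted(zip(xs, ys))
    targets = [y for _, y in pairs]
    mean = statistics.fmean(targets)
    best = None  # (cost, split index)
    if depth > 0:
        for i in range(min_leaf, len(pairs) - min_leaf + 1):
            if pairs[i - 1][0] == pairs[i][0]:
                continue  # cannot split between equal predictor values
            cost = sse(targets[:i]) + sse(targets[i:])
            if best is None or cost < best[0]:
                best = (cost, i)
    if best is None:
        return lambda x: mean  # leaf: predict the mean response
    i = best[1]
    cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    lx, ly = zip(*pairs[:i])
    rx, ry = zip(*pairs[i:])
    left = fit_tree(lx, ly, depth - 1, min_leaf)
    right = fit_tree(rx, ry, depth - 1, min_leaf)
    return lambda x: left(x) if x <= cut else right(x)


def mse(model, xs, ys):
    """Mean squared prediction error."""
    return statistics.fmean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def simulate(seed=0, n=400):
    """Train on clean data, then predict from clean and faulty measurements."""
    rng = random.Random(seed)

    def sample(size):
        xs = [rng.uniform(0, 10) for _ in range(size)]
        ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]  # assumed true process
        return xs, ys

    x_train, y_train = sample(n)
    x_test, y_test = sample(n)
    scenarios = {  # assumed error types, applied to the test predictor only
        "no error": x_test,
        "random error": [x + rng.gauss(0, 1) for x in x_test],
        "systematic bias": [x + 1 for x in x_test],
    }
    models = {
        "linear regression": fit_ols(x_train, y_train),
        "regression tree": fit_tree(x_train, y_train),
    }
    return {(m, s): mse(f, xs, y_test)
            for m, f in models.items() for s, xs in scenarios.items()}


if __name__ == "__main__":
    for (model, scenario), error in simulate().items():
        print(f"{model:18s} {scenario:16s} MSE = {error:.2f}")
```

Running the script prints one mean squared error per model and scenario. Because the true relationship in this toy setup is linear, it favors the regression line by construction, so the numbers say nothing about which method is more robust in general; they only show how such a comparison can be organized.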
Subject
CART DECISION TREES
LINEAR REGRESSION
MEASUREMENT ERROR
PREDICTION ERROR
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/67129

Handle
http://hdl.handle.net/11336/67129
Citation
Dianda, Daniela Fernanda; Robustness of Predictive Data Mining Methods under the Presence of Measurement Errors in the Context of Production Processes; IOSR Journals; IOSR Journal of Computer Engineering; 19; 01; 2-2017; 90-98
ISSN
2278-0661
DOI
10.9790/0661-1901049098
Full text
http://www.iosrjournals.org/iosr-jce/papers/Vol19-issue1/Version-4/R1901049098.pdf