Exploratory analysis of SCADA data from wind turbines using the K-means clustering algorithm for predictive maintenance purposes

Authors: Cosa Rodriguez, Pablo; Marti Puig, Pere; Caiafa, Cesar Federico; Serra Serra, Moisès; Cusidó, Jordi; Solé Casals, Jordi
Publication year: 2023
Language: English
Resource type: article
Status: published version
Description: Product maintenance costs throughout the product’s lifetime can account for between 30% and 60% of total operating costs, making it necessary to implement maintenance strategies. This problem is not only economic but also environmental, since breakdowns are also responsible for the emission of greenhouse gases. Industrial maintenance is a set of technical and organizational measures whose purpose is to sustain the functionality of the equipment and guarantee an optimal state of the machines over time, with the aim of saving costs, extending the useful life of the machines, saving energy, maximising production and availability, ensuring the quality of the product obtained, providing job security for technicians, preserving the environment, and reducing emissions as much as possible. Machine learning techniques can be used to detect or predict faults in wind turbines. However, labelled data suffers from many problems in this application: alarms are usually not clearly associated with a specific fault, some labels are wrongly associated with a problem, and the labels are clearly imbalanced. To avoid using labelled data, we investigate here the use of clustering, more specifically K-means, together with boxplot representations of the variables for a set of six different tests. Experimental results show that in some cases, the clustering and boxplot techniques allow us to detect outliers or identify erroneous behaviours of the wind turbines. These cases can then be investigated in detail by a specialist so that more efficient predictive maintenance can be carried out.
Instituto Argentino de Radioastronomía
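The combination of K-means and boxplots mentioned in the description can be illustrated with a minimal sketch. It is not the authors' code: the readings are synthetic, the two features (wind speed and gearbox temperature) are hypothetical stand-ins for SCADA variables, the features are left unscaled, and the helper names `kmeans` and `boxplot_outliers` are invented for this example. K-means is written as plain Lloyd iterations using only the Python standard library, and the boxplot rule is the usual 1.5 × IQR whisker criterion.

```python
import random
import statistics


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, labels


def boxplot_outliers(values, whisker=1.5):
    """Indices of values lying beyond the boxplot whiskers (Tukey fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Synthetic (wind speed in m/s, gearbox temperature in deg C) readings.
# Both variables and all values are assumptions made for illustration.
readings = [
    (4.0, 50.0), (4.2, 50.5), (4.5, 51.0), (4.8, 51.5), (5.0, 52.0),
    (4.1, 50.2), (4.7, 51.2),
    (4.6, 58.0),  # low wind but unusually hot
    (12.0, 70.0), (12.2, 70.5), (12.5, 71.0), (12.8, 71.5), (13.0, 72.0),
]

centroids, labels = kmeans(readings, k=2)

# Inspect the temperature distribution inside each cluster separately.
flagged = []
for cluster in range(2):
    idx = [i for i, lab in enumerate(labels) if lab == cluster]
    temps = [readings[i][1] for i in idx]
    flagged += [idx[j] for j in boxplot_outliers(temps)]

print(sorted(flagged))
```

Clustering first and applying the boxplot rule within each cluster means a reading is judged against others from the same operating regime rather than against the whole data set. Any flagged index is only a candidate for review by a specialist, as the description notes.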
Subject: Engineering
Computer science
Predictive maintenance
Prognosis
Machine learning
K-means
Clustering
SCADA data
Renewable energies
Wind turbine
Access level: open access
Terms of use: http://creativecommons.org/licenses/by/4.0/
Repository
Institution: Universidad Nacional de La Plata
OAI identifier: oai:sedici.unlp.edu.ar:10915/152530

id	SEDICI_303c9077530f2e973e46ede0e5549178
oai_identifier_str	oai:sedici.unlp.edu.ar:10915/152530
network_acronym_str	SEDICI
repository_id_str	1329
network_name_str	SEDICI (UNLP)
title	Exploratory analysis of SCADA data from wind turbines using the K-means clustering algorithm for predictive maintenance purposes
author	Cosa Rodriguez, Pablo; Marti Puig, Pere; Caiafa, Cesar Federico; Serra Serra, Moisès; Cusidó, Jordi; Solé Casals, Jordi
topic	Engineering; Computer science; Predictive maintenance; Prognosis; Machine learning; K-means; Clustering; SCADA data; Renewable energies; Wind turbine
publishDate	2023
dc.date.none.fl_str_mv	2023
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion Articulo http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://sedici.unlp.edu.ar/handle/10915/152530
url	http://sedici.unlp.edu.ar/handle/10915/152530
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/issn/2075-1702 info:eu-repo/semantics/altIdentifier/doi/10.3390/machines11020270
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/4.0/ Creative Commons Attribution 4.0 International (CC BY 4.0)
eu_rights_str_mv	openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by/4.0/ Creative Commons Attribution 4.0 International (CC BY 4.0)
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP
reponame_str	SEDICI (UNLP)
collection	SEDICI (UNLP)
instname_str	Universidad Nacional de La Plata
instacron_str	UNLP
institution	UNLP
repository.name.fl_str_mv	SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv	alira@sedici.unlp.edu.ar
