Checkpoint and Restart: An Energy Consumption Characterization in Clusters
- Authors
- Morán, Marina; Balladini, Javier; Rexachs, Dolores; Luque, Emilio
- Publication year
- 2024
- Language
- English
- Resource type
- article
- Status
- accepted version
- Description
- The fault tolerance method currently used in High Performance Computing (HPC) is rollback-recovery by means of checkpoints. Like any other fault tolerance method, it adds energy consumption on top of that of the application's execution. The objective of this work is to determine the factors that affect the energy consumption of the compute nodes of a homogeneous cluster when performing checkpoint and restart operations on SPMD (Single Program Multiple Data) applications. We focus on the energy behavior of the compute nodes under different configurations of hardware and software parameters. We studied the effect of processor performance states (P-states) and power states (C-states), application problem size, checkpoint software (DMTCP), and distributed file system (NFS) configuration. Analysis of the results identified opportunities to reduce the energy consumption of checkpoint and restart operations.
Affiliation: Morán, Marina. Universidad Nacional del Comahue. Facultad de Informática; Argentina.
Affiliation: Balladini, Javier. Universidad Nacional del Comahue. Facultad de Informática; Argentina.
Affiliation: Rexachs, Dolores. Universitat Autónoma de Barcelona. Departamento de Arquitectura de Computadores y Sistemas Operativos; Spain.
Affiliation: Luque, Emilio. Universitat Autónoma de Barcelona. Departamento de Arquitectura de Computadores y Sistemas Operativos; Spain.
- Subject
-
Checkpoint
Restart
Energy consumption
Power
Fault tolerance methods
Computer and Information Sciences
Articles
- Accessibility level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- Repositorio Digital Institucional (UNCo)
- Institution
- Universidad Nacional del Comahue
- OAI identifier
- oai:rdi.uncoma.edu.ar:uncomaid/19173
| Field | Value |
|---|---|
| id | RDIUNCO_2451a6909bec0f8230d12e317929c3dc |
| oai_identifier_str | oai:rdi.uncoma.edu.ar:uncomaid/19173 |
| network_acronym_str | RDIUNCO |
| repository_id_str | 7108 |
| network_name_str | Repositorio Digital Institucional (UNCo) |
| dc.title.none.fl_str_mv | Checkpoint and Restart: An Energy Consumption Characterization in Clusters |
| dc.creator.none.fl_str_mv | Morán, Marina; Balladini, Javier; Rexachs, Dolores; Luque, Emilio |
| dc.subject.none.fl_str_mv | Checkpoint; Restart; Energy consumption; Power; Fault tolerance methods; Computer and Information Sciences; Articles |
| publishDate | 2024 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/acceptedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | acceptedVersion |
| dc.identifier.none.fl_str_mv | https://rdi.uncoma.edu.ar/handle/uncomaid/19173 |
| dc.language.none.fl_str_mv | eng |
| dc.relation.none.fl_str_mv | https://arxiv.org/abs/2409.02214 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| dc.format.none.fl_str_mv | application/pdf; 15 p. |
| dc.publisher.none.fl_str_mv | arXiv |
| dc.source.none.fl_str_mv | reponame:Repositorio Digital Institucional (UNCo); instname:Universidad Nacional del Comahue |
| repository.name.fl_str_mv | Repositorio Digital Institucional (UNCo) - Universidad Nacional del Comahue |
| repository.mail.fl_str_mv | mirtha.mateo@biblioteca.uncoma.edu.ar; adriana.acuna@biblioteca.uncoma.edu.ar |