Some Issues to Consider in the Management of Energy Consumption in HPC Systems with Fault Tolerance
- Authors
- Morán, Marina; Balladini, Javier; Rexachs del Rosario, Dolores; Rucci, Enzo
- Publication year
- 2022
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Finding ways to reduce energy consumption during the execution of large-scale applications is essential to sustaining and increasing the enormous computing power achieved by HPC systems. Fault tolerance methods can affect power consumption. In particular, rollback-recovery methods based on uncoordinated checkpoints avoid re-executing every process when a failure occurs. In this context, energy-saving actions can be taken on the nodes whose processes do not re-execute. In this work, we describe some issues to consider when extending these energy-saving strategies beyond the nodes that communicate directly with the failed one.
Instituto de Investigación en Informática
- Subject
- Computer Science
- Energy consumption
- Fault tolerance
- Uncoordinated checkpoints
- HPC
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/140642
id | SEDICI_d4788ae6a0fc0a7e988fd09272fd0a6a |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/140642 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | Some Issues to Consider in the Management of Energy Consumption in HPC Systems with Fault Tolerance |
dc.creator.none.fl_str_mv | Morán, Marina; Balladini, Javier; Rexachs del Rosario, Dolores; Rucci, Enzo |
dc.subject.none.fl_str_mv | Computer Science; Energy consumption; Fault tolerance; Uncoordinated checkpoints; HPC |
dc.date.none.fl_str_mv | 2022-07 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/140642 |
dc.language.none.fl_str_mv | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/isbn/978-950-34-2126-0; info:eu-repo/semantics/reference/hdl/10915/139373 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.format.none.fl_str_mv | application/pdf; 17-22 |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |