H-RADIC: The Fault Tolerance Framework for Virtual Clusters on Multi-Cloud Environments
- Authors
- Royo, Ambrosio; Villamayor, Jorge; Castro-León, Marcela; Rexachs del Rosario, Dolores; Luque Fadón, Emilio
- Publication year
- 2018
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Even though cloud platforms promise to be reliable, several availability incidents show that they are not. How can we be sure that a parallel application finishes its execution even if a site is affected by a failure? This paper presents H-RADIC, an approach based on the RADIC architecture that executes a parallel application in at least 3 different virtual clusters or sites. The execution state of each site is saved periodically in another site and is recovered in case of failure. The paper details the configuration of the architecture and the experimental results using 3 virtual clusters running NAS parallel applications protected with DMTCP, a well-known distributed multi-threaded checkpointing tool. Our experiments show that the execution time increased by 5% to 36% without failures and by 27% to 66% in case of failures.
Facultad de Informática
- Subject
- Computer Science
cloud computing
cloud, fault-tolerance, high-performance computing, RADIC
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/69674
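
The description above says that the execution state of each site is saved periodically in another site and recovered after a failure. The following is a minimal sketch of that idea, not the H-RADIC implementation: the ring placement (each site checkpoints to the next one), the `Site` class, and every function name are assumptions made for illustration, and an integer counter stands in for the real application state that a tool such as DMTCP would capture.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One virtual cluster (site). All fields are illustrative."""
    name: str
    alive: bool = True
    progress: int = 0  # stand-in for the application's execution state
    stored: dict = field(default_factory=dict)  # checkpoints held for other sites


def protector_of(index: int, n_sites: int) -> int:
    """Assumed placement: each site checkpoints to the next site in a ring."""
    return (index + 1) % n_sites


def checkpoint_all(sites: list) -> None:
    """Save every live site's state at its (live) protector site."""
    for i, site in enumerate(sites):
        protector = sites[protector_of(i, len(sites))]
        if site.alive and protector.alive:
            protector.stored[site.name] = site.progress


def recover(sites: list, failed_index: int) -> int:
    """Restore a failed site's state from the checkpoint held by its protector."""
    failed = sites[failed_index]
    protector = sites[protector_of(failed_index, len(sites))]
    failed.progress = protector.stored[failed.name]
    failed.alive = True
    return failed.progress


def demo() -> int:
    sites = [Site("site-A"), Site("site-B"), Site("site-C")]
    for s in sites:
        s.progress = 10
    checkpoint_all(sites)      # state at progress 10 is now protected
    for s in sites:
        s.progress = 14        # further work, not yet checkpointed
    sites[0].alive = False     # site-A fails
    return recover(sites, 0)   # rolls back to the last saved state
```

In this sketch the work done after the last checkpoint is lost on recovery, which is one reason a run with failures would take longer than a run without them.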
id | SEDICI_0e1edc4ae701acd7bad983378c3875e4 |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/69674 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | H-RADIC: The Fault Tolerance Framework for Virtual Clusters on Multi-Cloud Environments |
dc.creator.none.fl_str_mv | Royo, Ambrosio; Villamayor, Jorge; Castro-León, Marcela; Rexachs del Rosario, Dolores; Luque Fadón, Emilio |
dc.subject.none.fl_str_mv | Computer Science; cloud computing; cloud, fault-tolerance, high-performance computing, RADIC |
dc.date.none.fl_str_mv | 2018-06 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/69674 |
dc.language.none.fl_str_mv | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/isbn/978-950-34-1659-4; info:eu-repo/semantics/reference/hdl/10915/69464; info:eu-repo/semantics/reference/hdl/10915/71655 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.format.none.fl_str_mv | application/pdf; 7-13 |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |