A multipath routing method for tolerating permanent and non-permanent faults

Authors
Zarza, Gonzalo; Lugones, Diego; Franco, Daniel; Luque Fadón, Emilio
Year of publication
2009
Language
English
Resource type
conference paper
Status
published version
Description
The intensive and continuous use of high-performance computers for executing computationally intensive applications, coupled with the large number of elements that make them up, dramatically increases the likelihood of failures during their operation. The interconnection network is a critical part of such systems; network faults therefore have an extremely high impact, because most routing algorithms are not designed to tolerate faults. In such algorithms, a single fault may stall messages in the network, preventing applications from finishing, or may lead to deadlocked configurations. This work addresses fault tolerance in high-speed interconnection networks by designing a fault-tolerant routing method that handles an unbounded number of dynamic faults (permanent and non-permanent). To accomplish this, we exploit communication path redundancy by means of a multipath routing approach. Experiments show that our method allows applications to complete their execution in the presence of several faults, with an average performance of 97% relative to the fault-free scenarios.
Presented at the IX Workshop Procesamiento Distribuido y Paralelo (WPDP)
Red de Universidades con Carreras en Informática (RedUNCI)
Subject
Computer Science
high-performance computers
high-speed interconnection
fault-tolerant routing method
Communications Applications
Reliability, Testing, and Fault-Tolerance
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/20912

Publication date
2009-10
Resource type URI
http://purl.org/coar/resource_type/c_5794
URL
http://sedici.unlp.edu.ar/handle/10915/20912
License
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5)
Format
application/pdf
251-261
Repository contact
alira@sedici.unlp.edu.ar