Analysis and tools for performance prediction

Authors
González, J.A.; León, C.; Piccoli, María Fabiana; Printista, Alicia Marcela; Roda García, José Luis; Rodriguez, C.; Rodríguez, J.M.; Sande Gonzalez, Francisco de
Publication year
2001
Language
English
Resource type
conference paper
Status
published version
Description
We present an analytical model that extends the BSP (Bulk Synchronous Parallel) model to cover both oblivious synchronization and group partitioning. BSP makes a few oversimplifications that make accurate predictions difficult. Even if two stages perform the same number of individual communication or computation operations, their actual times may differ. These differences stem from the distinct nature of the operations or from the particular pattern followed by the messages. Worse still, the assumption that a constant number of machine instructions takes constant time is far from the truth: with current memory hierarchies, a memory access can take anywhere from a few cycles to several thousand. A natural proposal is to associate a different proportionality constant with each basic block and, analogously, different latencies and bandwidths with each “communication block”. Unfortunately, under this approach the evaluation parameters depend not only on the given architecture but also on the characteristics of the algorithm, so they must be evaluated for every algorithm. This is a heavy task that involves experiment design, timing, statistics, pattern recognition and multi-parameter fitting algorithms, so software support is required. We have developed a compiler that takes as input a C program annotated with complexity formulas and produces instrumented code. The trace files obtained by executing the resulting code are analyzed with an interactive interpreter, which yields, among other information, the values of those parameters.
Track: Concurrent programming
Red de Universidades con Carreras en Informática (RedUNCI)
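A minimal sketch of the per-block fitting idea in the description, not the authors' compiler or interpreter. It assumes the time of one communication block is modelled as latency + gap * h, where h is the message size, and estimates the two constants by ordinary least squares. The function names, the unit of h (words) and the sample timings are hypothetical.

```python
def fit_block(samples):
    """Least-squares fit of t = latency + gap * h for one communication block.

    samples: list of (h, t) pairs, where h is the message size (assumed unit:
    words) and t is the measured time in seconds.
    Returns (latency, gap).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean_h = sum(h for h, _ in samples) / n
    mean_t = sum(t for _, t in samples) / n
    # Sum of squared deviations of h, and co-deviation of h and t.
    sxx = sum((h - mean_h) ** 2 for h, _ in samples)
    sxy = sum((h - mean_h) * (t - mean_t) for h, t in samples)
    if sxx == 0:
        raise ValueError("samples must use at least two distinct sizes")
    gap = sxy / sxx
    latency = mean_t - gap * mean_h
    return latency, gap


def predict(latency, gap, h):
    """Predicted time of the block for a message of size h."""
    return latency + gap * h


# Made-up timings for a single communication block (illustration only).
samples = [(1000, 0.0012), (2000, 0.0021), (4000, 0.0043), (8000, 0.0082)]
latency, gap = fit_block(samples)
print(f"latency = {latency:.6f} s, gap = {gap:.3e} s/word")
```

Fitting each block separately is what makes the parameters algorithm-dependent: a second block with a different message pattern would get its own (latency, gap) pair from its own trace samples.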
Subject
Computer Science
Complexity model
Performance analysis
Performance prediction
Oblivious synchronization
Performance
Concurrent Programming
Tools
Performance profiling
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/23310

id SEDICI_5c2dcd86c212273c4ab81c75178e8209
oai_identifier_str oai:sedici.unlp.edu.ar:10915/23310
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
dc.title.none.fl_str_mv Analysis and tools for performance prediction
dc.creator.none.fl_str_mv González, J.A.
León, C.
Piccoli, María Fabiana
Printista, Alicia Marcela
Roda García, José Luis
Rodriguez, C.
Rodríguez, J.M.
Sande Gonzalez, Francisco de
dc.subject.none.fl_str_mv Computer Science
Complexity model
Performance analysis
Performance prediction
Oblivious synchronization
Performance
Concurrent Programming
Tools
Performance profiling
publishDate 2001
dc.date.none.fl_str_mv 2001-10
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
Conference object
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
format conferenceObject
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/23310
url http://sedici.unlp.edu.ar/handle/10915/23310
dc.language.none.fl_str_mv eng
language eng
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5)
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar