Many parallel applications do not fit completely into the data parallel model. Although these applications contain data parallelism, task parallelism is needed to represent the natural structure of the computation or to enhance performance. Combining the ease of programming of the data parallel model with the efficiency of the task parallel model allows the two forms of parallelism to be nested, giving nested parallelism. In this work, we examine the solutions provided for nested parallelism in two standard parallel programming platforms, HPF and MPI. Both their expressive capacity and their efficiency are compared on a Cray T3E, which is a distributed memory machine. Finally, the use of the methodology proposed for MPI is discussed further on two different architectures.
I Workshop de Procesamiento Distribuido y Paralelo (WPDP)
Red de Universidades con Carreras en Informática (RedUNCI)
The parallel computing model used in this paper, the Collective Computing Model (CCM), is a variant of the well-known Bulk Synchronous Parallel (BSP) model. The synchronicity imposed by the BSP model restricts the set of available algorithms and prevents the overlapping of computation and communication. Other models, like the LogP model, allow asynchronous computing and overlapping but depend on the use of specific libraries. The CCM describes a system exploited through a standard software platform providing facilities for group creation, collective operations and remote memory operations. Based on the BSP model, two kinds of supersteps are considered: Division supersteps and Normal supersteps. The structure of divisions produced by the Division Functions and the partnership relation among processors give rise to communication patterns among processors that are topologically similar to a hypercube. We have named the resulting structures Dynamic Polytopes. To illustrate these concepts, the Fast Fourier Transform algorithm is used. Computational results confirm the accuracy of the model on four different parallel computers: a Parsytec Power PC, a Cray T3E, a Silicon Graphics Origin 2000 and a Digital Alpha Server.
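The hypercube-like communication pattern mentioned above can be illustrated with a minimal sketch. It is not the CCM definition of a Division Function: it assumes, purely for illustration, that the number of processors is a power of two, that each division splits every group into two equal halves, and that the partner of a processor is the one at the same relative position in the other half. The names `division_partners` and `is_hypercube_pattern` are hypothetical.

```python
def division_partners(num_procs):
    """Return, for each division step, the partner of every processor rank.

    Assumption (illustrative only): every group is halved at each step and
    a processor's partner holds the same relative position in the other half.
    """
    if num_procs < 1 or num_procs & (num_procs - 1):
        raise ValueError("num_procs must be a power of two")
    steps = []
    group_size = num_procs
    while group_size > 1:
        half = group_size // 2
        partners = []
        for rank in range(num_procs):
            # First rank of the group that currently contains this processor.
            group_start = (rank // group_size) * group_size
            offset = rank - group_start
            # Same relative position, but in the other half of the group.
            partner_offset = offset + half if offset < half else offset - half
            partners.append(group_start + partner_offset)
        steps.append(partners)
        group_size = half
    return steps


def is_hypercube_pattern(steps):
    """Check that each rank and its partner differ in exactly one bit."""
    return all(
        bin(rank ^ partner).count("1") == 1
        for partners in steps
        for rank, partner in enumerate(partners)
    )


if __name__ == "__main__":
    for step, partners in enumerate(division_partners(8)):
        print(step, partners)
```

Under these assumptions, every partner pair differs in a single bit of the rank, which is the edge relation of a hypercube; the same pairing appears in the butterfly stages of a radix-2 Fast Fourier Transform.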