Brief performance portability analysis of a matrix multiplication kernel on multiple vendor GPUs
- Authors
- Costanzo, Manuel; Rucci, Enzo; García-Sánchez, Carlos; Naiouf, Marcelo
- Publication year
- 2023
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the performance and portability of the SYCL and CUDA languages for a matrix multiplication (MM) application across different GPU architectures. The experimental work showed that, while the CUDA implementation outperforms the SYCL implementation on NVIDIA devices due to optimizations provided by the nvcc compiler, the latter implementation demonstrated remarkable code portability to other GPU architectures, such as AMD and Intel. Furthermore, the architectural efficiency percentages obtained on AMD and Intel GPUs showed consistency with the results observed on NVIDIA devices.
  Facultad de Informática
- Subject
- Computer Science; oneAPI; SYCL; GPU; CUDA; Performance portability
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/155420
Full record metadata
id | SEDICI_bf452f6a5ff30203bb3f01d89ac48f73 |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/155420 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | Brief performance portability analysis of a matrix multiplication kernel on multiple vendor GPUs |
dc.creator.none.fl_str_mv | Costanzo, Manuel; Rucci, Enzo; García-Sánchez, Carlos; Naiouf, Marcelo |
dc.subject.none.fl_str_mv | Ciencias Informáticas; oneAPI; SYCL; GPU; CUDA; Performance portability |
dc.description.none.fl_txt_mv | The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the performance and portability of the SYCL and CUDA languages for a matrix multiplication (MM) application across different GPU architectures. The experimental work showed that, while the CUDA implementation outperforms the SYCL implementation on NVIDIA devices due to optimizations provided by the nvcc compiler, the latter implementation demonstrated remarkable code portability to other GPU architectures, such as AMD and Intel. Furthermore, the architectural efficiency percentages obtained on AMD and Intel GPUs showed consistency with the results observed on NVIDIA devices. Facultad de Informática |
publishDate | 2023 |
dc.date.none.fl_str_mv | 2023-06 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Objeto de conferencia; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format | conferenceObject |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/155420 |
dc.language.none.fl_str_mv | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/isbn/978-950-34-2271-7; info:eu-repo/semantics/reference/hdl/10915/155281 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.format.none.fl_str_mv | application/pdf; 13-18 |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |