OpenCL overview, implementation, and performance comparison

Authors
Fraire, Juan Andres; Ferreyra, Pablo Alejandro; Marques, Carlos Alberto
Publication year
2013
Language
English
Resource type
article
Status
published version
Description
Until a few years ago, high-performance parallel computing was exclusive to expensive, specialized hardware. Today, powerful parallel processors can be found in many consumer graphics cards, whose interfaces many manufacturers have recently opened for general-purpose computing. OpenCL, created by the world's leading processor manufacturers, goes a step further by aiming for a platform- and manufacturer-independent parallel language. However, understanding this new processing paradigm is challenging, and it is critical for future computation-intensive applications. First, this document provides an in-depth technical background on the OpenCL architecture. Second, we propose an implementation of a matrix product OpenCL kernel, written directly in C++ without wrappers, in order to describe the OpenCL programming flow in detail. Third, we run this program on different platforms and algebraic scenarios and conclude that calculation performance can improve by up to 3 orders of magnitude over the same algorithm in plain C++.
Affiliation: Fraire, Juan Andres. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Ferreyra, Pablo Alejandro. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina
Affiliation: Marques, Carlos Alberto. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
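The matrix product mentioned in the description can be illustrated with a minimal sketch. The code below is not taken from the article: the kernel name `matmul`, the constant `kMatMulKernelSource`, and the helper `matmul_cpu` are hypothetical names chosen for illustration, and square row-major matrices of `float` are assumed. It shows a plain C++ baseline next to an OpenCL C kernel source string that computes the same product with one work-item per output element; the host-side OpenCL setup (platform, context, queue, buffers) is intentionally left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical OpenCL C kernel source (an assumption, not the authors' kernel).
// Each work-item computes one element c[row][col] of the n x n product.
static const std::string kMatMulKernelSource = R"CLC(
__kernel void matmul(__global const float* a,
                     __global const float* b,
                     __global float* c,
                     const int n)
{
    const int row = get_global_id(0);
    const int col = get_global_id(1);
    if (row >= n || col >= n) return;   // guard against padded global sizes
    float acc = 0.0f;
    for (int k = 0; k < n; ++k)
        acc += a[row * n + k] * b[k * n + col];
    c[row * n + col] = acc;
}
)CLC";

// Plain C++ baseline: the same triple loop, executed sequentially on the CPU.
// Matrices are stored row-major in flat vectors of size n * n.
std::vector<float> matmul_cpu(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t n)
{
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t row = 0; row < n; ++row) {
        for (std::size_t col = 0; col < n; ++col) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k)
                acc += a[row * n + k] * b[k * n + col];
            c[row * n + col] = acc;
        }
    }
    return c;
}
```

In this sketch the two outer loops of the CPU version correspond to the two-dimensional index space of the kernel, which is where the parallel speedup described above would come from; the actual figures depend on the platform and matrix sizes used in the article.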
Subject
Heterogeneous
Systems
Parallelism
Computing
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/25002

purl_subject.fl_str_mv https://purl.org/becyt/ford/2.2
https://purl.org/becyt/ford/2
dc.date.none.fl_str_mv 2013-04
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/25002
Fraire, Juan Andres; Ferreyra, Pablo Alejandro; Marques, Carlos Alberto; OpenCL overview, implementation, and performance comparison; Institute of Electrical and Electronics Engineers; IEEE Latin America Transactions; 11; 1; 4-2013; 274-280
1548-0992
CONICET Digital
CONICET
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.1109/TLA.2013.6502816
info:eu-repo/semantics/altIdentifier/url/http://ieeexplore.ieee.org/document/6502816/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Institute of Electrical and Electronics Engineers
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar