Energy-Efficient Algebra Kernels in FPGA for High Performance Computing

Authors: Favaro, Federico; Dufrechou, Ernesto; Ezzatti, Pablo; Oliver, Juan P.
Publication year: 2021
Language: English
Resource type: article
Status: published version
Description: The dissemination of multi-core architectures and the later irruption of massively parallel devices led to a revolution in High-Performance Computing (HPC) platforms in the last decades. As a result, Field-Programmable Gate Arrays (FPGAs) are re-emerging as a versatile and more energy-efficient alternative to other platforms. Traditional FPGA design implies using low-level Hardware Description Languages (HDL) such as VHDL or Verilog, which follow an entirely different programming model than standard software languages, and their use requires specialized knowledge of the underlying hardware. In recent years, manufacturers have made significant efforts to provide High-Level Synthesis (HLS) tools in order to allow greater adoption of FPGAs in the HPC community. Our work studies the use of multi-core hardware and different FPGAs to address Numerical Linear Algebra (NLA) kernels such as the general matrix multiplication (GEMM) and the sparse matrix-vector multiplication (SpMV). Specifically, we compare the behavior of fine-tuned kernels on a multi-core CPU and HLS implementations on FPGAs. We perform the experimental evaluation of our implementations on a low-end and a cutting-edge FPGA platform, in terms of runtime and energy consumption, and compare the results against the Intel MKL library on CPU.
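For readers unfamiliar with the two kernels named in the description, the following is a minimal reference sketch in plain C of what GEMM and SpMV compute. It is not the authors' HLS or CPU implementation. The function names (gemm_ref, spmv_csr_ref), the row-major layout, and the choice of the CSR sparse format are assumptions made only for illustration; the record does not say which storage format the paper uses.

```c
#include <stddef.h>

/* Reference GEMM: C = alpha * A * B + beta * C.
 * A is m x k, B is k x n, C is m x n, all stored row-major (assumed layout). */
void gemm_ref(size_t m, size_t n, size_t k, double alpha,
              const double *A, const double *B, double beta, double *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}

/* Reference SpMV: y = A * x, with A in CSR format (assumed format).
 * row_ptr has rows + 1 entries; col_idx and val hold the nonzeros row by row. */
void spmv_csr_ref(size_t rows, const size_t *row_ptr, const size_t *col_idx,
                  const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < rows; ++i) {
        double acc = 0.0;
        for (size_t e = row_ptr[i]; e < row_ptr[i + 1]; ++e)
            acc += val[e] * x[col_idx[e]];
        y[i] = acc;
    }
}
```

The contrast between the two loops hints at why both kernels are studied: GEMM has regular, reusable memory accesses, while SpMV reads x through an index array, so its access pattern depends on the sparsity structure.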
Facultad de Informática
Subject: Computer Science
Dense and sparse NLA
FPGA
HLS
Energy consumption
Algebra densa y dispersa
Consumo de energía
Access level: open access
Terms of use: http://creativecommons.org/licenses/by-nc/4.0/
Repository
Institution: Universidad Nacional de La Plata
OAI identifier: oai:sedici.unlp.edu.ar:10915/128258


id	SEDICI_a14389a556d8d1e1acf8d39806fae938
oai_identifier_str	oai:sedici.unlp.edu.ar:10915/128258
network_acronym_str	SEDICI
repository_id_str	1329
network_name_str	SEDICI (UNLP)
dc.title.none.fl_str_mv	Energy-Efficient Algebra Kernels in FPGA for High Performance Computing; Núcleos de álgebra energéticamente eficientes en FPGA para computación de altas prestaciones
title	Energy-Efficient Algebra Kernels in FPGA for High Performance Computing
dc.creator.none.fl_str_mv	Favaro, Federico; Dufrechou, Ernesto; Ezzatti, Pablo; Oliver, Juan P.
author	Favaro, Federico
author_facet	Favaro, Federico; Dufrechou, Ernesto; Ezzatti, Pablo; Oliver, Juan P.
author_role	author
author2	Dufrechou, Ernesto; Ezzatti, Pablo; Oliver, Juan P.
author2_role	author; author; author
dc.subject.none.fl_str_mv	Ciencias Informáticas; Dense and sparse NLA; FPGA; HLS; Energy consumption; Algebra densa y dispersa; Consumo de energía
topic	Ciencias Informáticas; Dense and sparse NLA; FPGA; HLS; Energy consumption; Algebra densa y dispersa; Consumo de energía
publishDate	2021
dc.date.none.fl_str_mv	2021-10
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion Articulo http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://sedici.unlp.edu.ar/handle/10915/128258
url	http://sedici.unlp.edu.ar/handle/10915/128258
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/issn/1666-6038; info:eu-repo/semantics/altIdentifier/doi/10.24215/16666038.21.e09
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc/4.0/; Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
eu_rights_str_mv	openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc/4.0/; Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
dc.format.none.fl_str_mv	application/pdf; 80-92
dc.source.none.fl_str_mv	reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP
reponame_str	SEDICI (UNLP)
collection	SEDICI (UNLP)
instname_str	Universidad Nacional de La Plata
instacron_str	UNLP
institution	UNLP
repository.name.fl_str_mv	SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv	alira@sedici.unlp.edu.ar
_version_	1866371858132107264
score	13.468372
