Optimization of the N-body Simulation on Intel’s Architectures Based on AVX-512 Instruction Set
- Authors
- Rucci, Enzo; Moreno, Ezequiel Tomás; Pousa, Adrián; Chichizola, Franco
- Publication year
- 2020
- Language
- Spanish
- Resource type
- conference paper
- Status
- published version
- Description
- N-body simulations have become a powerful tool to test the gravitational interaction among particles, ranging from a few bodies to complete galaxies. Even though N-body has already been optimized on many parallel platforms, there are hardly any studies that take advantage of the latest Intel architectures based on the AVX-512 instruction set. This SIMD set was initially supported by Intel’s Xeon Phi Knights Landing (KNL) manycore processors, launched in 2016. More recently, it has also been included in Intel’s general-purpose processors, starting with the Skylake (SKL) server microarchitecture and continuing in its successor, Cascade Lake (CKL). This paper optimizes the all-pairs N-body simulation on both current Intel platforms supporting AVX-512 extensions: a Xeon Phi KNL node and a server equipped with dual CKL processors. Starting from a naive implementation, it shows how the parallel implementation can reach, through different optimization techniques, 2355 and 2449 GFLOPS on the Xeon Phi KNL and the Xeon CKL platforms, respectively.
Published in the Communications in Computer and Information Science book series (vol. 1184).
Red de Universidades con Carreras en Informática
- Subject
- Computer Science; N-body; AVX-512; Xeon Phi; Knights Landing; Xeon Platinum; Skylake; Cascade Lake
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-nd/4.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/95855
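As a reading aid for the description above: the "all-pairs" formulation named in the abstract computes, for every body, the gravitational pull of every other body, so one time step costs O(N²) interactions. The following is a minimal pure-Python sketch of that direct summation, not the authors' optimized implementation. The function name `all_pairs_step`, the softening constant, and the semi-implicit Euler update are assumptions made for illustration only.

```python
import math

G = 6.674e-11      # gravitational constant in SI units
SOFTENING = 1e-9   # assumed softening term; keeps the distance away from zero


def all_pairs_step(pos, vel, mass, dt):
    """Advance all bodies by one step with direct O(N^2) summation.

    pos and vel are lists of [x, y, z] lists and are updated in place;
    mass is a list of floats. Returns the accelerations that were applied.
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]

    # Accumulate the acceleration on body i caused by every other body j.
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dz = pos[j][2] - pos[i][2]
            dist_sq = dx * dx + dy * dy + dz * dz + SOFTENING
            # G * m_j / r^3, so multiplying by (dx, dy, dz) gives G * m_j / r^2
            # directed from body i towards body j.
            scale = G * mass[j] / (dist_sq * math.sqrt(dist_sq))
            acc[i][0] += scale * dx
            acc[i][1] += scale * dy
            acc[i][2] += scale * dz

    # Semi-implicit Euler: update velocities first, then positions.
    for i in range(n):
        for k in range(3):
            vel[i][k] += acc[i][k] * dt
            pos[i][k] += vel[i][k] * dt
    return acc
```

Each (i, j) interaction in the inner loop is independent of the others, which is what makes this kernel a natural candidate for SIMD vectorization and multithreading.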
Full record metadata
id |
SEDICI_592d8e1f3a83c063a4bad1cf1a820572 |
---|---|
oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/95855 |
network_acronym_str |
SEDICI |
repository_id_str |
1329 |
network_name_str |
SEDICI (UNLP) |
dc.title.none.fl_str_mv |
Optimization of the N-body Simulation on Intel’s Architectures Based on AVX-512 Instruction Set |
dc.creator.none.fl_str_mv |
Rucci, Enzo; Moreno, Ezequiel Tomás; Pousa, Adrián; Chichizola, Franco |
dc.subject.none.fl_str_mv |
Computer Science; N-body; AVX-512; Xeon Phi; Knights Landing; Xeon Platinum; Skylake; Cascade Lake |
dc.date.none.fl_str_mv |
2020 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/95855 |
dc.language.none.fl_str_mv |
spa |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/isbn/978-3-030-48325-8; info:eu-repo/semantics/altIdentifier/issn/1865-0937; info:eu-repo/semantics/altIdentifier/doi/10.1007/978-3-030-48325-8_3 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-nd/4.0/; Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |