Fetch unit design for scalable simultaneous multithreading (ScSMT)

Autores
Moure, Juan Carlos; Rexachs del Rosario, Dolores; Luque Fadón, Emilio
Año de publicación
2001
Idioma
inglés
Tipo de recurso
artículo
Estado
versión publicada
Descripción
Continuous IC process enhancements make possible to integrate on a single chip the re-sources required for simultaneously executing multiple control flows or threads, exploiting different levels of thread-level parallelism: application-, function-, and loop-level. Scalable simultaneous multi-threading combines static and dynamic mechanisms to assemble a complexity-effective design that provides high instruction per cycle rates without sacrificing cycle time nor single-thread performance. This paper addresses the design of the fetch unit for a high-performance, scalable, simultaneous multithreaded processor. We present the detailed microarchitecture of a clustered and reconfigurable fetch unit based on an existing single-thread fetch unit. In order to minimize the occurrence of fetch hazards, the fetch unit dynamically adapts to the available thread-level parallelism and to the fetch characteristics of the active threads, working as a single shared unit or as two separate clusters. It combines static and dynamic methods in a complexity-efficient way. The design is supported by a simulation- based analysis of different instruction cache and branch target buffer configurations on the context of a multithreaded execution workload. Average reductions on the miss rates between 30% and 60% and peak reductions greater than 200% are obtained.
Facultad de Informática
Materia
Ciencias Informáticas
Procesador paralelo
Threads
Arquitectura del procesador
Informática
Nivel de accesibilidad
acceso abierto
Condiciones de uso
http://creativecommons.org/licenses/by-nc/3.0/
Repositorio
SEDICI (UNLP)
Institución
Universidad Nacional de La Plata
OAI Identificador
oai:sedici.unlp.edu.ar:10915/9404

id SEDICI_6d137cf5ecce1d65cfef22894ee9d403
oai_identifier_str oai:sedici.unlp.edu.ar:10915/9404
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
spelling Fetch unit design for scalable simultaneous multithreading (ScSMT)Moure, Juan CarlosRexachs del Rosario, DoloresLuque Fadón, EmilioCiencias InformáticasProcesador paraleloThreadsArquitectura del procesadorInformáticaContinuous IC process enhancements make possible to integrate on a single chip the re-sources required for simultaneously executing multiple control flows or threads, exploiting different levels of thread-level parallelism: application-, function-, and loop-level. Scalable simultaneous multi-threading combines static and dynamic mechanisms to assemble a complexity-effective design that provides high instruction per cycle rates without sacrificing cycle time nor single-thread performance. This paper addresses the design of the fetch unit for a high-performance, scalable, simultaneous multithreaded processor. We present the detailed microarchitecture of a clustered and reconfigurable fetch unit based on an existing single-thread fetch unit. In order to minimize the occurrence of fetch hazards, the fetch unit dynamically adapts to the available thread-level parallelism and to the fetch characteristics of the active threads, working as a single shared unit or as two separate clusters. It combines static and dynamic methods in a complexity-efficient way. The design is supported by a simulation- based analysis of different instruction cache and branch target buffer configurations on the context of a multithreaded execution workload. Average reductions on the miss rates between 30% and 60% and peak reductions greater than 200% are obtained.Facultad de Informática2001info:eu-repo/semantics/articleinfo:eu-repo/semantics/publishedVersionArticulohttp://purl.org/coar/resource_type/c_6501info:ar-repo/semantics/articuloapplication/pdfhttp://sedici.unlp.edu.ar/handle/10915/9404enginfo:eu-repo/semantics/altIdentifier/url/http://journal.info.unlp.edu.ar/wp-content/uploads/ipaper1.pdfinfo:eu-repo/semantics/openAccesshttp://creativecommons.org/licenses/by-nc/3.0/Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0)reponame:SEDICI (UNLP)instname:Universidad Nacional de La Platainstacron:UNLP2025-11-12T10:15:27Zoai:sedici.unlp.edu.ar:10915/9404Institucionalhttp://sedici.unlp.edu.ar/Universidad públicaNo correspondehttp://sedici.unlp.edu.ar/oai/snrdalira@sedici.unlp.edu.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:13292025-11-12 10:15:28.2SEDICI (UNLP) - Universidad Nacional de La Platafalse
dc.title.none.fl_str_mv Fetch unit design for scalable simultaneous multithreading (ScSMT)
title Fetch unit design for scalable simultaneous multithreading (ScSMT)
spellingShingle Fetch unit design for scalable simultaneous multithreading (ScSMT)
Moure, Juan Carlos
Ciencias Informáticas
Procesador paralelo
Threads
Arquitectura del procesador
Informática
title_short Fetch unit design for scalable simultaneous multithreading (ScSMT)
title_full Fetch unit design for scalable simultaneous multithreading (ScSMT)
title_fullStr Fetch unit design for scalable simultaneous multithreading (ScSMT)
title_full_unstemmed Fetch unit design for scalable simultaneous multithreading (ScSMT)
title_sort Fetch unit design for scalable simultaneous multithreading (ScSMT)
dc.creator.none.fl_str_mv Moure, Juan Carlos
Rexachs del Rosario, Dolores
Luque Fadón, Emilio
author Moure, Juan Carlos
author_facet Moure, Juan Carlos
Rexachs del Rosario, Dolores
Luque Fadón, Emilio
author_role author
author2 Rexachs del Rosario, Dolores
Luque Fadón, Emilio
author2_role author
author
dc.subject.none.fl_str_mv Ciencias Informáticas
Procesador paralelo
Threads
Arquitectura del procesador
Informática
topic Ciencias Informáticas
Procesador paralelo
Threads
Arquitectura del procesador
Informática
dc.description.none.fl_txt_mv Continuous IC process enhancements make possible to integrate on a single chip the re-sources required for simultaneously executing multiple control flows or threads, exploiting different levels of thread-level parallelism: application-, function-, and loop-level. Scalable simultaneous multi-threading combines static and dynamic mechanisms to assemble a complexity-effective design that provides high instruction per cycle rates without sacrificing cycle time nor single-thread performance. This paper addresses the design of the fetch unit for a high-performance, scalable, simultaneous multithreaded processor. We present the detailed microarchitecture of a clustered and reconfigurable fetch unit based on an existing single-thread fetch unit. In order to minimize the occurrence of fetch hazards, the fetch unit dynamically adapts to the available thread-level parallelism and to the fetch characteristics of the active threads, working as a single shared unit or as two separate clusters. It combines static and dynamic methods in a complexity-efficient way. The design is supported by a simulation- based analysis of different instruction cache and branch target buffer configurations on the context of a multithreaded execution workload. Average reductions on the miss rates between 30% and 60% and peak reductions greater than 200% are obtained.
Facultad de Informática
description Continuous IC process enhancements make possible to integrate on a single chip the re-sources required for simultaneously executing multiple control flows or threads, exploiting different levels of thread-level parallelism: application-, function-, and loop-level. Scalable simultaneous multi-threading combines static and dynamic mechanisms to assemble a complexity-effective design that provides high instruction per cycle rates without sacrificing cycle time nor single-thread performance. This paper addresses the design of the fetch unit for a high-performance, scalable, simultaneous multithreaded processor. We present the detailed microarchitecture of a clustered and reconfigurable fetch unit based on an existing single-thread fetch unit. In order to minimize the occurrence of fetch hazards, the fetch unit dynamically adapts to the available thread-level parallelism and to the fetch characteristics of the active threads, working as a single shared unit or as two separate clusters. It combines static and dynamic methods in a complexity-efficient way. The design is supported by a simulation- based analysis of different instruction cache and branch target buffer configurations on the context of a multithreaded execution workload. Average reductions on the miss rates between 30% and 60% and peak reductions greater than 200% are obtained.
publishDate 2001
dc.date.none.fl_str_mv 2001
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Articulo
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/9404
url http://sedici.unlp.edu.ar/handle/10915/9404
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://journal.info.unlp.edu.ar/wp-content/uploads/ipaper1.pdf
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc/3.0/
Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0)
eu_rights_str_mv openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc/3.0/
Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0)
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
reponame_str SEDICI (UNLP)
collection SEDICI (UNLP)
instname_str Universidad Nacional de La Plata
instacron_str UNLP
institution UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar
_version_ 1848605215810387968
score 13.24909