Flexible Quantization for Efficient Convolutional Neural Networks
- Authors
- Zacchigna, Federico Giordano; Lew, Sergio Eduardo; Lutenberg, Ariel
- Publication year
- 2024
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- This work focuses on the efficient quantization of convolutional neural networks (CNNs). Specifically, we introduce a method called non-uniform uniform quantization (NUUQ), a novel quantization methodology that combines the benefits of non-uniform quantization, such as high compression levels, with the advantages of uniform quantization, which enables an efficient implementation in fixed-point hardware. NUUQ is based on decoupling the quantization levels from the number of bits. This decoupling allows for a trade-off between the spatial and temporal complexity of the implementation, which can be leveraged to further reduce the spatial complexity of the CNN, without a significant performance loss. Additionally, we explore different quantization configurations and address typical use cases. The NUUQ algorithm demonstrates the capability to achieve compression levels equivalent to 2 bits without an accuracy loss and even levels equivalent to ∼1.58 bits, but with a loss in performance of only ∼0.6%.
Affiliation: Zacchigna, Federico Giordano. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Electronica; Argentina
Affiliation: Lew, Sergio Eduardo. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Electronica; Argentina
Affiliation: Lutenberg, Ariel. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Electronica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
- Subject
-
CNN
quantization
uniform
non-uniform
mixed-precision
FPGA
ASIC
edge devices
embedded systems
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/236859
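The description above quotes compression levels "equivalent to 2 bits" and "∼1.58 bits". One plausible reading, assumed here rather than taken from the record, is that the equivalent bit count is the base-2 logarithm of the number of quantization levels, so 4 levels give 2 bits and 3 levels give log2(3) ≈ 1.58 bits. The Python sketch below illustrates that arithmetic together with a nearest-level quantizer whose levels are drawn from a uniform integer grid. The function names, the level set, and the 4-bit grid are hypothetical, and this is not the authors' NUUQ algorithm.

```python
import math


def equivalent_bits(num_levels: int) -> float:
    """Bits needed to index `num_levels` distinct values: log2(num_levels)."""
    if num_levels < 1:
        raise ValueError("num_levels must be positive")
    return math.log2(num_levels)


def quantize_to_levels(values, levels):
    """Map each value to the nearest entry of `levels`.

    Ties resolve to the entry that appears first in `levels`.
    """
    return [min(levels, key=lambda q: abs(q - v)) for v in values]


# Hypothetical example: three levels chosen from a signed 4-bit integer grid,
# so the level count (3) is decoupled from the grid's bit width (4).
grid_bits = 4
levels = [-3, 0, 3]
lo, hi = -(2 ** (grid_bits - 1)), 2 ** (grid_bits - 1) - 1
assert all(lo <= q <= hi for q in levels)  # every level lies on the grid

weights = [-2.6, -0.4, 0.9, 1.6, 5.2]
print(quantize_to_levels(weights, levels))      # [-3, 0, 0, 3, 3]
print(round(equivalent_bits(len(levels)), 2))   # 1.58
print(equivalent_bits(4))                       # 2.0
```

Under this assumption, the storage cost per weight follows the number of levels, while the arithmetic can still run on the fixed-point grid the levels are taken from.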
View the full record metadata
| id |
CONICETDig_9a64063c49a0356a03f7554b2550ffba |
|---|---|
| oai_identifier_str |
oai:ri.conicet.gov.ar:11336/236859 |
| network_acronym_str |
CONICETDig |
| repository_id_str |
3498 |
| network_name_str |
CONICET Digital (CONICET) |
| dc.title.none.fl_str_mv |
Flexible Quantization for Efficient Convolutional Neural Networks |
| dc.creator.none.fl_str_mv |
Zacchigna, Federico Giordano; Lew, Sergio Eduardo; Lutenberg, Ariel |
| author |
Zacchigna, Federico Giordano |
| author_facet |
Zacchigna, Federico Giordano; Lew, Sergio Eduardo; Lutenberg, Ariel |
| author_role |
author |
| author2 |
Lew, Sergio Eduardo; Lutenberg, Ariel |
| author2_role |
author; author |
| dc.subject.none.fl_str_mv |
CNN; quantization; uniform; non-uniform; mixed-precision; FPGA; ASIC; edge devices; embedded systems |
| topic |
CNN; quantization; uniform; non-uniform; mixed-precision; FPGA; ASIC; edge devices; embedded systems |
| purl_subject.fl_str_mv |
https://purl.org/becyt/ford/2.2; https://purl.org/becyt/ford/2 |
| publishDate |
2024 |
| dc.date.none.fl_str_mv |
2024-05 |
| dc.type.none.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format |
article |
| status_str |
publishedVersion |
| dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/236859 Zacchigna, Federico Giordano; Lew, Sergio Eduardo; Lutenberg, Ariel; Flexible Quantization for Efficient Convolutional Neural Networks; MDPI; Electronics; 13; 10; 5-2024; 1-16 2079-9292 CONICET Digital CONICET |
| url |
http://hdl.handle.net/11336/236859 |
| identifier_str_mv |
Zacchigna, Federico Giordano; Lew, Sergio Eduardo; Lutenberg, Ariel; Flexible Quantization for Efficient Convolutional Neural Networks; MDPI; Electronics; 13; 10; 5-2024; 1-16 2079-9292 CONICET Digital CONICET |
| dc.language.none.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/https://www.mdpi.com/2079-9292/13/10/1923 info:eu-repo/semantics/altIdentifier/doi/10.3390/electronics13101923 |
| dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/2.5/ar/ |
| eu_rights_str_mv |
openAccess |
| rights_invalid_str_mv |
https://creativecommons.org/licenses/by/2.5/ar/ |
| dc.format.none.fl_str_mv |
application/pdf application/pdf |
| dc.publisher.none.fl_str_mv |
MDPI |
| publisher.none.fl_str_mv |
MDPI |
| dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
| reponame_str |
CONICET Digital (CONICET) |
| collection |
CONICET Digital (CONICET) |
| instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |