Bioinspired sparse spectro-temporal representation of speech for robust classification
- Authors
- Martínez, César Ernesto; Goddard, J.; Milone, Diego Humberto; Rufiner, Hugo Leonardo
- Publication year
- 2012
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- In this work, a first approach to a robust phoneme recognition task by means of a biologically inspired feature extraction method is presented. The proposed technique provides an approximation to the speech signal representation at the auditory cortical level. It is based on an optimal dictionary of atoms, estimated from auditory spectrograms, and the Matching Pursuit algorithm to approximate the cortical activations. This provides a sparse coding with intrinsic noise robustness, which can therefore be exploited when using the system in adverse environments. The recognition task consisted of the classification of a set of five easily confused English phonemes, in both clean and noisy conditions. Multilayer perceptrons were trained as classifiers and the performance was compared to other classic and robust parameterizations: the auditory spectrogram, a probabilistic optimum filtering on Mel frequency cepstral coefficients, and the perceptual linear prediction coefficients. Results showed a significant improvement in the recognition rate of clean and noisy phonemes by the cortical representation over these other parameterizations.
Affiliation: Martínez, César Ernesto. Centro de I+d En Señales; Argentina. Universidad Nacional de Entre Ríos; Argentina
Affiliation: Goddard, J. Universidad Autónoma Metropolitana - Iztapalapa; Mexico
Affiliation: Milone, Diego Humberto. Centro de I+d En Señales; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Rufiner, Hugo Leonardo. Universidad Nacional de Entre Ríos; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Centro de I+d En Señales; Argentina
- Subjects
- APPROXIMATED AUDITORY CORTICAL REPRESENTATION
- ROBUST PHONEME RECOGNITION
- SPARSE CODING
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/96495
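
The description above mentions Matching Pursuit over a dictionary of atoms as the sparse coding step. As a rough illustration only, the sketch below shows the generic greedy algorithm in NumPy. It is not the authors' implementation: the random dictionary, the flattened 64-dimensional "patch", the sizes, and the names `matching_pursuit`, `dictionary`, and `patch` are all assumptions made for this example, and the record does not say how the actual dictionary was estimated.

```python
import numpy as np


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse approximation: signal ~ dictionary @ coeffs + residual.

    dictionary has shape (dim, n_dict); its columns (atoms) are assumed
    to have unit Euclidean norm.
    """
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Correlate every atom with what is still unexplained.
        correlations = dictionary.T @ residual
        best = int(np.argmax(np.abs(correlations)))
        # Keep the best-matching atom and subtract its contribution.
        coeffs[best] += correlations[best]
        residual -= correlations[best] * dictionary[:, best]
    return coeffs, residual


# Toy setup; every size here is an illustrative assumption.
rng = np.random.default_rng(0)
dim, n_dict = 64, 256  # flattened patch size, number of atoms
dictionary = rng.standard_normal((dim, n_dict))
dictionary /= np.linalg.norm(dictionary, axis=0)  # unit-norm atoms

# Synthetic "patch": a mix of three atoms plus a little noise.
patch = dictionary[:, [3, 40, 200]] @ np.array([2.0, -1.5, 1.0])
patch += 0.05 * rng.standard_normal(dim)

coeffs, residual = matching_pursuit(patch, dictionary, n_atoms=10)
print("non-zero coefficients:", np.count_nonzero(coeffs))
print("relative residual energy:",
      np.linalg.norm(residual) / np.linalg.norm(patch))
```

Because each iteration selects a single atom, at most `n_atoms` coefficients are non-zero, which is what makes the resulting code sparse. With unit-norm atoms the residual norm never increases from one iteration to the next.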
id |
CONICETDig_5d31a75007b487fba0442057e897f2c6 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/96495 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
Bioinspired sparse spectro-temporal representation of speech for robust classification |
dc.creator.none.fl_str_mv |
Martínez, César Ernesto Goddard, J. Milone, Diego Humberto Rufiner, Hugo Leonardo |
dc.subject.none.fl_str_mv |
APPROXIMATED AUDITORY CORTICAL REPRESENTATION ROBUST PHONEME RECOGNITION SPARSE CODING |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/2.2 https://purl.org/becyt/ford/2 |
publishDate |
2012 |
dc.date.none.fl_str_mv |
2012-10 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/96495 Martínez, César Ernesto; Goddard, J.; Milone, Diego Humberto; Rufiner, Hugo Leonardo; Bioinspired sparse spectro-temporal representation of speech for robust classification; Academic Press Ltd - Elsevier Science Ltd; Computer Speech And Language; 26; 5; 10-2012; 336-348 0885-2308 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/http://www.sciencedirect.com/science/article/pii/S0885230812000125 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.csl.2012.02.002 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
Academic Press Ltd - Elsevier Science Ltd |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |