Bioinspired sparse spectro-temporal representation of speech for robust classification

Authors: Martínez, César Ernesto; Goddard, J.; Milone, Diego Humberto; Rufiner, Hugo Leonardo
Publication year: 2012
Language: English
Resource type: article
Status: published version
Description: This work presents a first approach to robust phoneme recognition by means of a biologically inspired feature extraction method. The proposed technique provides an approximation to the speech signal representation at the auditory cortical level. It is based on an optimal dictionary of atoms, estimated from auditory spectrograms, and on the Matching Pursuit algorithm to approximate the cortical activations. This provides a sparse coding with intrinsic noise robustness, which can therefore be exploited when using the system in adverse environments. The recognition task consisted of classifying a set of five easily confused English phonemes, in both clean and noisy conditions. Multilayer perceptrons were trained as classifiers, and their performance was compared to that obtained with other classic and robust parameterizations: the auditory spectrogram, probabilistic optimum filtering on Mel frequency cepstral coefficients, and perceptual linear prediction coefficients. Results showed a significant improvement in the recognition rate of clean and noisy phonemes by the cortical representation over these other parameterizations.
Affiliation: Martínez, César Ernesto. Centro de I+d En Señales; Argentina. Universidad Nacional de Entre Ríos; Argentina
Affiliation: Goddard, J. Universidad Autónoma Metropolitana - Iztapalapa; Mexico
Affiliation: Milone, Diego Humberto. Centro de I+d En Señales; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Rufiner, Hugo Leonardo. Universidad Nacional de Entre Ríos; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Centro de I+d En Señales; Argentina
Subject: APPROXIMATED AUDITORY CORTICAL REPRESENTATION
ROBUST PHONEME RECOGNITION
SPARSE CODING
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/96495
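
The description above mentions Matching Pursuit as the step that approximates cortical activations over a dictionary of atoms. As a rough illustration only, the following pure-Python sketch shows the generic greedy Matching Pursuit loop on a toy vector. It is not the authors' implementation: in the article the dictionary is estimated from auditory spectrograms, whereas here the atoms are hand-picked, and the names `matching_pursuit`, `atoms`, and `patch` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, dictionary, max_atoms, tol=1e-9):
    """Greedy sparse approximation of `signal` over unit-norm atoms.

    Returns one coefficient per atom and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(max_atoms):
        # Correlate the current residual with every atom.
        scores = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < tol:
            break  # nothing left that the dictionary can explain
        coeffs[best] += scores[best]
        # Subtract the selected atom's contribution from the residual.
        residual = [r - scores[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return coeffs, residual


# Toy data (assumed for illustration): three hand-picked atoms and a
# 4-sample "patch" that is an exact multiple of the second atom.
atoms = [normalize(a) for a in ([1.0, 0.0, 0.0, 0.0],
                                [0.0, 1.0, 1.0, 0.0],
                                [1.0, 1.0, 1.0, 1.0])]
patch = [0.0, 2.0, 2.0, 0.0]
coeffs, residual = matching_pursuit(patch, atoms, max_atoms=2)
```

In this toy case a single atom explains the whole patch, so only one coefficient is nonzero. That is the sense in which the code is sparse: most activations stay at zero, and a classifier would receive the coefficient vector rather than the raw input.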


id	CONICETDig_5d31a75007b487fba0442057e897f2c6
oai_identifier_str	oai:ri.conicet.gov.ar:11336/96495
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
purl_subject.fl_str_mv	https://purl.org/becyt/ford/2.2 https://purl.org/becyt/ford/2
publishDate	2012
dc.date.none.fl_str_mv	2012-10
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/96495 Martínez, César Ernesto; Goddard, J.; Milone, Diego Humberto; Rufiner, Hugo Leonardo; Bioinspired sparse spectro-temporal representation of speech for robust classification; Academic Press Ltd - Elsevier Science Ltd; Computer Speech And Language; 26; 5; 10-2012; 336-348 0885-2308 CONICET Digital CONICET
url	http://hdl.handle.net/11336/96495
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/http://www.sciencedirect.com/science/article/pii/S0885230812000125 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.csl.2012.02.002
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf
dc.publisher.none.fl_str_mv	Academic Press Ltd - Elsevier Science Ltd
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
