Isolated Spanish digit recognition based on audio-visual features
- Authors
- Sad, Gonzalo D.; Terissi, Lucas D.; Gómez, Juan Carlos
- Publication year
- 2013
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- The performance of classical speech recognition techniques based on audio features degrades in noisy environments. Including visual features related to mouth movements in the recognition process improves the performance of the system. This paper proposes an isolated-word speech recognition system based on audio-visual features. The proposed system combines three classifiers based on audio, visual, and audio-visual information, respectively. An audio-visual database composed of utterances of the digits (in Spanish) is used to test the proposed system. The experimental results show a significant improvement in recognition rates across a wide range of signal-to-noise ratios.
IV Workshop procesamiento de señales y sistemas de tiempo real.
Red de Universidades con Carreras en Informática (RedUNCI)
- Subject
- Computer Science; Informatics; speech recognition; audio-visual speech feature; Speech recognition and synthesis; Object recognition; Hidden Markov Models; Markov processes
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- SEDICI (UNLP)
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/31841
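The description above mentions a system that combines three classifiers (audio, visual, and audio-visual) but does not say how their outputs are merged. As a minimal illustration only, assuming each classifier yields one log-likelihood per digit and that the scores are merged by a weighted sum (a common late-fusion choice, not confirmed as the authors' method), the decision step could be sketched in Python as follows. The function names, weights, and scores are hypothetical.

```python
def fuse_scores(audio, visual, audio_visual, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of per-class log-likelihoods from three classifiers.

    Each argument maps a class label (e.g. a Spanish digit) to a score.
    The weighting scheme is an assumption made for illustration.
    """
    w_audio, w_visual, w_av = weights
    return {
        label: w_audio * audio[label]
        + w_visual * visual[label]
        + w_av * audio_visual[label]
        for label in audio
    }


def classify(audio, visual, audio_visual, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return the label with the highest fused score."""
    fused = fuse_scores(audio, visual, audio_visual, weights)
    return max(fused, key=fused.get)


# Hypothetical scores for two digits: the audio stream (e.g. under noise)
# prefers "uno", while the visual and audio-visual streams prefer "dos".
audio_scores = {"uno": -10.0, "dos": -12.0}
visual_scores = {"uno": -9.0, "dos": -5.0}
av_scores = {"uno": -8.0, "dos": -7.0}

print(classify(audio_scores, visual_scores, av_scores))
```

In a sketch like this, the weights could be shifted toward the visual stream as the signal-to-noise ratio drops; whether the paper does anything similar cannot be told from this record.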
Full record metadata
id | SEDICI_5bbd9a4d54705099107c6da36135ce07 |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/31841 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | Isolated Spanish digit recognition based on audio-visual features |
dc.creator.none.fl_str_mv | Sad, Gonzalo D.; Terissi, Lucas D.; Gómez, Juan Carlos |
author | Sad, Gonzalo D. |
author_role | author |
author2 | Terissi, Lucas D.; Gómez, Juan Carlos |
author2_role | author; author |
dc.subject.none.fl_str_mv | Computer Science; Informatics; speech recognition; audio-visual speech feature; Speech recognition and synthesis; Object recognition; Hidden Markov Models; Markov processes |
publishDate | 2013 |
dc.date.none.fl_str_mv | 2013-10 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format | conferenceObject |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/31841 |
dc.language.none.fl_str_mv | eng |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/2.5/ar/; Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |
_version_ | 1843532109857161216 |
score | 13.001348 |
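The OAI identifier in this record can, in principle, be used to request the record over OAI-PMH. The sketch below only builds a standard `GetRecord` request URL; it assumes the repository's OAI-PMH endpoint is `http://sedici.unlp.edu.ar/oai/snrd` and that the `oai_dc` metadata format is the one wanted. The helper name `get_record_url` is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed OAI-PMH base URL for the repository (not verified here).
BASE_URL = "http://sedici.unlp.edu.ar/oai/snrd"


def get_record_url(identifier, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record identifier."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


url = get_record_url("oai:sedici.unlp.edu.ar:10915/31841")
print(url)
```

The resulting URL could then be fetched with any HTTP client; the response, if the endpoint assumption holds, would be an XML document carrying the Dublin Core fields listed above.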