Combination of Standard and Complementary Models for Audio-Visual Speech Recognition
- Authors
- Sad, Gonzalo D.; Terissi, Lucas D.; Gómez, Juan Carlos
- Publication year
- 2015
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- In this work, new multi-classifier schemes for isolated word speech recognition are proposed, based on the combination of standard Hidden Markov Models (HMMs) and Complementary Gaussian Mixture Models (CGMMs). Typically, in speech recognition systems, each word or phoneme in the vocabulary is represented by a model trained with samples of that particular class. Recognition is then performed by determining which model best represents the input word or phoneme to be classified. In this paper, a novel classification strategy based on complementary class models is presented. A complementary model for a particular class j is a model trained with instances of all the considered classes except those associated with class j. The proposed classification schemes are evaluated on two audio-visual speech databases under noisy acoustic conditions. Experimental results show that the proposed classification methodologies improve recognition rates over a wide range of signal-to-noise ratios (SNRs).
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
- Subject
- Computer Science (Ciencias Informáticas)
Speech recognition and synthesis
audio-visual information fusion
decision level fusion
- Accessibility level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-sa/3.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/52105
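
The Description above defines the complementary model of class j as a model trained on the instances of every class except j. The sketch below illustrates only that training-set construction. It is not the authors' method: a single 1-D Gaussian stands in for the paper's Gaussian mixture, the toy data are invented, and the decision rule (pick the class whose complementary model explains the input worst) is an assumption, since the abstract does not state how the scores are combined. All function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_gaussian(samples):
    """Fit a 1-D Gaussian (mean, variance) to a list of floats."""
    mean = fmean(samples)
    var = max(pvariance(samples, mu=mean), 1e-6)  # floor avoids zero variance
    return mean, var


def log_likelihood(x, model):
    """Log-density of x under a (mean, variance) Gaussian."""
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def train_complementary_models(data):
    """data maps class label -> list of training samples.

    The complementary model of class j is fitted on every sample
    that does NOT belong to class j.
    """
    models = {}
    for j in data:
        others = [x for label, xs in data.items() if label != j for x in xs]
        models[j] = fit_gaussian(others)
    return models


def classify(x, comp_models):
    # Assumed decision rule (not taken from the source): choose the class
    # whose complementary model assigns x the lowest log-likelihood.
    return min(comp_models, key=lambda j: log_likelihood(x, comp_models[j]))


# Invented toy data: three well-separated classes on the real line.
toy_data = {
    "a": [0.0, 0.1, -0.1],
    "b": [5.0, 5.1, 4.9],
    "c": [10.0, 10.1, 9.9],
}
comp = train_complementary_models(toy_data)
label_low = classify(0.0, comp)
label_high = classify(10.0, comp)
```

With this toy data, an input near 0 is poorly explained by the model trained on classes "b" and "c", so it is assigned to "a"; the same reasoning assigns an input near 10 to "c".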
Full record metadata
id | SEDICI_2b17753401d413545a2dc680ce979d7b |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/52105 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | Combination of Standard and Complementary Models for Audio-Visual Speech Recognition |
dc.creator.none.fl_str_mv | Sad, Gonzalo D.; Terissi, Lucas D.; Gómez, Juan Carlos |
dc.subject.none.fl_str_mv | Ciencias Informáticas; Speech recognition and synthesis; audio-visual information fusion; decision level fusion |
dc.date.none.fl_str_mv | 2015 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Objeto de conferencia; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/52105 |
dc.language.none.fl_str_mv | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/http://44jaiio.sadio.org.ar/sites/default/files/asai113-120.pdf; info:eu-repo/semantics/altIdentifier/issn/2451-7585 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-sa/3.0/; Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) |
dc.format.none.fl_str_mv | application/pdf; 113-120 |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |