Combination of Standard and Complementary Models for Audio-Visual Speech Recognition

Authors
Sad, Gonzalo D.; Terissi, Lucas D.; Gómez, Juan Carlos
Publication year
2015
Language
English
Resource type
conference paper
Status
published version
Description
This work proposes new multi-classifier schemes for isolated-word speech recognition that combine standard Hidden Markov Models (HMMs) with Complementary Gaussian Mixture Models (CGMMs). In typical speech recognition systems, each word or phoneme in the vocabulary is represented by a model trained on samples of that class, and recognition is performed by determining which model best represents the input word or phoneme. This paper presents a novel classification strategy based on complementary class models. The complementary model of a class j is a model trained on instances of all the considered classes except those of class j. The proposed classification schemes are evaluated on two audio-visual speech databases under noisy acoustic conditions. Experimental results show that the proposed methods improve recognition rates over a wide range of signal-to-noise ratios (SNRs).
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
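The idea of a complementary class model described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional features and fits a single Gaussian per model in place of the HMMs and GMMs named in the description, and the decision rule for the complementary models (choose the class whose complementary model explains the input worst) is an assumption, since the description does not state it. All function names and the toy data are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_gaussian(samples):
    """Fit a 1-D Gaussian (mean, variance) to a list of floats."""
    mu = fmean(samples)
    var = max(pvariance(samples, mu), 1e-6)  # floor avoids a zero variance
    return mu, var


def log_likelihood(x, model):
    """Log-density of x under a (mean, variance) Gaussian."""
    mu, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def train(data):
    """data maps each class label to a list of 1-D feature values."""
    # Standard model of class c: trained only on samples of c.
    standard = {c: fit_gaussian(xs) for c, xs in data.items()}
    # Complementary model of class c: trained on every class except c.
    complementary = {
        c: fit_gaussian(
            [x for other, xs in data.items() if other != c for x in xs]
        )
        for c in data
    }
    return standard, complementary


def classify_standard(x, standard):
    """Pick the class whose own model fits x best."""
    return max(standard, key=lambda c: log_likelihood(x, standard[c]))


def classify_complementary(x, complementary):
    """Assumed rule: pick the class whose complementary model fits x worst."""
    return min(complementary, key=lambda c: log_likelihood(x, complementary[c]))


# Toy vocabulary of three words with made-up 1-D features.
data = {
    "yes": [0.9, 1.0, 1.1],
    "no": [4.9, 5.0, 5.1],
    "stop": [8.9, 9.0, 9.1],
}
standard, complementary = train(data)
print(classify_standard(1.05, standard))
print(classify_complementary(1.05, complementary))
```

In this toy setting both rules agree on the label; the schemes in the paper combine the two kinds of models, which this sketch does not attempt to reproduce.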
Subject
Computer Science
Speech recognition and synthesis
audio-visual information fusion
decision level fusion
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-sa/3.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/52105

Handle
http://sedici.unlp.edu.ar/handle/10915/52105
Alternative identifier (URL)
http://44jaiio.sadio.org.ar/sites/default/files/asai113-120.pdf
ISSN
2451-7585
Format
application/pdf
Pages
113-120