Combination of Standard and Complementary Models for Audio-Visual Speech Recognition

Authors: Sad, Gonzalo D.; Terissi, Lucas D.; Gómez, Juan Carlos
Publication year: 2015
Language: English
Resource type: conference paper
Status: published version
Description: This work proposes new multi-classifier schemes for isolated-word speech recognition that combine standard Hidden Markov Models (HMMs) with Complementary Gaussian Mixture Models (CGMMs). Typically, in speech recognition systems, each word or phoneme in the vocabulary is represented by a model trained with samples of that particular class. Recognition is then performed by determining which model best represents the input word or phoneme to be classified. This paper presents a novel classification strategy based on complementary class models. A complementary model for a particular class j is a model trained with instances of all the considered classes except those associated with class j. The proposed classification schemes are evaluated on two audio-visual speech databases under noisy acoustic conditions. Experimental results show that the proposed classification methodologies improve recognition rates over a wide range of signal-to-noise ratios (SNRs).
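The complementary-model idea in the description can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the paper: a single diagonal-covariance Gaussian stands in for each CGMM, the decision rule picks the class whose complementary model gives the lowest likelihood (the input looks least like "everything except class j"), and the toy 2-D feature vectors, labels, and function names are hypothetical. The HMM branch and the audio-visual fusion stage are not modelled.

```python
import math

def fit_gaussian(samples):
    """Fit a diagonal-covariance Gaussian to equal-length feature vectors."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    # A small variance floor avoids division by zero on degenerate data.
    var = [max(sum((s[d] - mean[d]) ** 2 for s in samples) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(model, x):
    """Log-density of x under a diagonal Gaussian (mean, var)."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def train_complementary_models(data):
    """For each class j, train a model on the samples of all OTHER classes."""
    models = {}
    for j in data:
        others = [x for label, xs in data.items() if label != j for x in xs]
        models[j] = fit_gaussian(others)
    return models

def classify(models, x):
    """Assumed rule: choose the class whose complementary model fits x worst."""
    return min(models, key=lambda j: log_likelihood(models[j], x))

# Hypothetical toy data: three well-separated "word" classes in 2-D.
data = {
    "zero": [[0.0, 0.0], [0.2, -0.1], [-0.1, 0.1]],
    "one":  [[5.0, 5.0], [5.1, 4.9], [4.9, 5.2]],
    "two":  [[10.0, 0.0], [9.8, 0.1], [10.1, -0.2]],
}
models = train_complementary_models(data)
predicted = classify(models, [0.1, 0.0])
```

In a standard scheme the rule would instead be a maximum over per-class models trained only on that class; the complementary scheme inverts both the training set and the direction of the comparison.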
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
Subject: Computer Science
Speech recognition and synthesis
audio-visual information fusion
decision level fusion
Access level: open access
Terms of use: http://creativecommons.org/licenses/by-sa/3.0/
Repository
Institution: Universidad Nacional de La Plata
OAI identifier: oai:sedici.unlp.edu.ar:10915/52105

id	SEDICI_2b17753401d413545a2dc680ce979d7b
oai_identifier_str	oai:sedici.unlp.edu.ar:10915/52105
network_acronym_str	SEDICI
repository_id_str	1329
network_name_str	SEDICI (UNLP)
dc.title.none.fl_str_mv	Combination of Standard and Complementary Models for Audio-Visual Speech Recognition
dc.creator.none.fl_str_mv	Sad, Gonzalo D. Terissi, Lucas D. Gómez, Juan Carlos
dc.subject.none.fl_str_mv	Ciencias Informáticas Speech recognition and synthesis audio-visual information fusion decision level fusion
publishDate	2015
dc.date.none.fl_str_mv	2015
dc.type.none.fl_str_mv	info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion Objeto de conferencia http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia
format	conferenceObject
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://sedici.unlp.edu.ar/handle/10915/52105
url	http://sedici.unlp.edu.ar/handle/10915/52105
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/http://44jaiio.sadio.org.ar/sites/default/files/asai113-120.pdf info:eu-repo/semantics/altIdentifier/issn/2451-7585
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
eu_rights_str_mv	openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
dc.format.none.fl_str_mv	application/pdf 113-120
dc.source.none.fl_str_mv	reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP
reponame_str	SEDICI (UNLP)
collection	SEDICI (UNLP)
instname_str	Universidad Nacional de La Plata
instacron_str	UNLP
institution	UNLP
repository.name.fl_str_mv	SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv	alira@sedici.unlp.edu.ar
