Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources

Authors
Di Persia, Leandro Ezequiel; Milone, Diego Humberto; Yanagida, M.
Publication year
2009
Language
English
Resource type
article
Status
published version
Description
Blind separation of convolutive mixtures is a very complicated task that has applications in many fields of speech and audio processing, such as hearing aids and man-machine interfaces. One of the proposed solutions is frequency-domain independent component analysis. The main disadvantage of this method is the presence of permutation ambiguities among consecutive frequency bins. Moreover, this problem worsens as the reverberation time increases. This paper presents a new frequency-domain method that uses a simplified mixing model, in which the impulse responses from one source to each microphone are expressed as scaled and delayed versions of one of these impulse responses. This assumption, based on the similarity among the waveforms of the impulse responses, is valid for a small microphone spacing. Under this model, separation is performed without any permutation or amplitude ambiguity among consecutive frequency bins. The new method is aimed mainly at obtaining separation, with only a small reduction of reverberation. Nevertheless, as the reverberation is included in the model, the new method is capable of performing separation for a wide range of reverberant conditions, with very high speed. The separation quality is evaluated using a perceptually designed objective measure. Also, an automatic speech recognition system is used to test the advantages of the algorithm in a real application. Very good results are obtained for both artificial and real mixtures. The results are significantly better than those of other standard blind source separation algorithms.
Affiliation: Di Persia, Leandro Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Yanagida, M. Doshisha University; Japan
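The simplified mixing model mentioned in the description can be illustrated with a small numerical sketch. This is not the authors' implementation; the impulse response, gain, and delay below are made-up values chosen only to show that, when one impulse response is a scaled and delayed copy of another, their frequency-domain ratio is the same simple gain-and-phase term in every bin.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (adequate for short illustrative signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def scaled_delayed(h_ref, gain, delay):
    """Return gain * h_ref, circularly delayed by `delay` samples."""
    n = len(h_ref)
    return [gain * h_ref[(t - delay) % n] for t in range(n)]


# Hypothetical reference impulse response (source to microphone 1):
# a direct path followed by a few decaying reflections, then zero padding
# so that the circular delay below behaves like a plain delay.
h1 = [1.0, 0.0, 0.6, 0.0, -0.3, 0.2, 0.0, 0.1] + [0.0] * 8

# Assumed relative gain and delay of the same source at microphone 2.
GAIN, DELAY = 0.8, 3
h2 = scaled_delayed(h1, GAIN, DELAY)

H1, H2 = dft(h1), dft(h2)
N = len(h1)

# Under the model, H2[k] = GAIN * exp(-j*2*pi*k*DELAY/N) * H1[k] for every bin k,
# so the relation between microphones depends only on (GAIN, DELAY).
max_error = max(
    abs(H2[k] - GAIN * cmath.exp(-2j * math.pi * k * DELAY / N) * H1[k])
    for k in range(N)
)
print(f"largest deviation from the model: {max_error:.2e}")
```

Because the relation between microphones is governed by one gain and one delay per source rather than by an independent unknown in each bin, a model of this kind leaves no room for a bin-by-bin reordering of the sources, which is the intuition behind the claim of separation without permutation ambiguity.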
Subject
Blind Source Separation
Reverberation
Independent Component Analysis
Speech Enhancement
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/102035

id CONICETDig_ff58205a4583287e6714959fab64aa45
oai_identifier_str oai:ri.conicet.gov.ar:11336/102035
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
purl_subject.fl_str_mv https://purl.org/becyt/ford/2.2
https://purl.org/becyt/ford/2
publishDate 2009
dc.date.none.fl_str_mv 2009-02
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/102035
Di Persia, Leandro Ezequiel; Milone, Diego Humberto; Yanagida, M.; Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources; Institute of Electrical and Electronics Engineers; IEEE Transactions on Audio, Speech, and Language Processing; 17; 2; 2-2009; 299-311
1558-7916
CONICET Digital
CONICET
url http://hdl.handle.net/11336/102035
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.1109/TASL.2008.2009568
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Institute of Electrical and Electronics Engineers
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar