Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources
- Authors
- Di Persia, Leandro Ezequiel; Milone, Diego Humberto; Yanagida, M.
- Publication year
- 2009
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Blind separation of convolutive mixtures is a very complicated task that has applications in many fields of speech and audio processing, such as hearing aids and man-machine interfaces. One of the proposed solutions is frequency-domain independent component analysis. The main disadvantage of this method is the presence of permutation ambiguities among consecutive frequency bins. Moreover, this problem worsens as reverberation time increases. This paper presents a new frequency-domain method that uses a simplified mixing model, in which the impulse responses from one source to each microphone are expressed as scaled and delayed versions of one of these impulse responses. This assumption, based on the similarity among the waveforms of the impulse responses, is valid for small microphone spacing. Under this model, separation is performed without any permutation or amplitude ambiguity among consecutive frequency bins. The new method is aimed mainly at obtaining separation, with only a small reduction of reverberation. Nevertheless, as the reverberation is included in the model, the new method is capable of performing separation for a wide range of reverberant conditions, with very high speed. The separation quality is evaluated using a perceptually designed objective measure. Also, an automatic speech recognition system is used to test the advantages of the algorithm in a real application. Very good results are obtained for both artificial and real mixtures. The results are significantly better than those of other standard blind source separation algorithms.
Affiliation: Di Persia, Leandro Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Yanagida, M. Doshisha University; Japan
- Subject
- Blind Source Separation; Reverberation; Independent Component Analysis; Speech Enhancement
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/102035
id |
CONICETDig_ff58205a4583287e6714959fab64aa45 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/102035 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources |
title |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources |
spellingShingle |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources Di Persia, Leandro Ezequiel Blind Source Separation Reverberation Independent Component Analysis Speech Enhancement |
title_short |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources |
title_full |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources |
title_fullStr |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources |
title_full_unstemmed |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources |
title_sort |
Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources |
dc.creator.none.fl_str_mv |
Di Persia, Leandro Ezequiel Milone, Diego Humberto Yanagida, M. |
author |
Di Persia, Leandro Ezequiel |
author_facet |
Di Persia, Leandro Ezequiel Milone, Diego Humberto Yanagida, M. |
author_role |
author |
author2 |
Milone, Diego Humberto Yanagida, M. |
author2_role |
author author |
dc.subject.none.fl_str_mv |
Blind Source Separation Reverberation Independent Component Analysis Speech Enhancement |
topic |
Blind Source Separation Reverberation Independent Component Analysis Speech Enhancement |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/2.2 https://purl.org/becyt/ford/2 |
dc.description.none.fl_txt_mv |
Blind separation of convolutive mixtures is a very complicated task that has applications in many fields of speech and audio processing, such as hearing aids and man-machine interfaces. One of the proposed solutions is frequency-domain independent component analysis. The main disadvantage of this method is the presence of permutation ambiguities among consecutive frequency bins. Moreover, this problem worsens as reverberation time increases. This paper presents a new frequency-domain method that uses a simplified mixing model, in which the impulse responses from one source to each microphone are expressed as scaled and delayed versions of one of these impulse responses. This assumption, based on the similarity among the waveforms of the impulse responses, is valid for small microphone spacing. Under this model, separation is performed without any permutation or amplitude ambiguity among consecutive frequency bins. The new method is aimed mainly at obtaining separation, with only a small reduction of reverberation. Nevertheless, as the reverberation is included in the model, the new method is capable of performing separation for a wide range of reverberant conditions, with very high speed. The separation quality is evaluated using a perceptually designed objective measure. Also, an automatic speech recognition system is used to test the advantages of the algorithm in a real application. Very good results are obtained for both artificial and real mixtures. The results are significantly better than those of other standard blind source separation algorithms. Affiliation: Di Persia, Leandro Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina Affiliation: Yanagida, M. Doshisha University; Japan |
description |
Blind separation of convolutive mixtures is a very complicated task that has applications in many fields of speech and audio processing, such as hearing aids and man-machine interfaces. One of the proposed solutions is frequency-domain independent component analysis. The main disadvantage of this method is the presence of permutation ambiguities among consecutive frequency bins. Moreover, this problem worsens as reverberation time increases. This paper presents a new frequency-domain method that uses a simplified mixing model, in which the impulse responses from one source to each microphone are expressed as scaled and delayed versions of one of these impulse responses. This assumption, based on the similarity among the waveforms of the impulse responses, is valid for small microphone spacing. Under this model, separation is performed without any permutation or amplitude ambiguity among consecutive frequency bins. The new method is aimed mainly at obtaining separation, with only a small reduction of reverberation. Nevertheless, as the reverberation is included in the model, the new method is capable of performing separation for a wide range of reverberant conditions, with very high speed. The separation quality is evaluated using a perceptually designed objective measure. Also, an automatic speech recognition system is used to test the advantages of the algorithm in a real application. Very good results are obtained for both artificial and real mixtures. The results are significantly better than those of other standard blind source separation algorithms. |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009-02 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/102035 Di Persia, Leandro Ezequiel; Milone, Diego Humberto; Yanagida, M.; Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources; Institute of Electrical and Electronics Engineers; IEEE Transactions on Audio, Speech, and Language Processing; 17; 2; 2-2009; 299-311 1558-7916 CONICET Digital CONICET |
url |
http://hdl.handle.net/11336/102035 |
identifier_str_mv |
Di Persia, Leandro Ezequiel; Milone, Diego Humberto; Yanagida, M.; Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources; Institute of Electrical and Electronics Engineers; IEEE Transactions on Audio, Speech, and Language Processing; 17; 2; 2-2009; 299-311 1558-7916 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/doi/10.1109/TASL.2008.2009568 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
Institute of Electrical and Electronics Engineers |
publisher.none.fl_str_mv |
Institute of Electrical and Electronics Engineers |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ |
1842269347997810688 |
score |
13.13397 |