Turn-taking cues in task-oriented dialogue

Autores
Gravano, Agustin; Hirschberg, Julia
Año de publicación
2011
Idioma
inglés
Tipo de recurso
artículo
Estado
versión publicada
Descripción
As interactive voice response systems become more prevalent and provide increasingly more complex functionality, it becomes clear that the challenges facing such systems are not solely in their synthesis and recognition capabilities. Issues such as the coordination of turn exchanges between system and user also play an important role in system usability. In particular, both systems and users have difficulty determining when the other is taking or relinquishing the turn. In this paper, we seek to identify turn-taking cues correlated with human-human turn exchanges which are automatically computable. We compare the presence of potential prosodic, acoustic, and lexico-syntactic turn-yielding cues in prosodic phrases preceding turn changes (smooth switches) vs. turn retentions (holds) vs. backchannels in the Columbia Games Corpus, a large corpus of task-oriented dialogues, to determine which features reliably distinguish between these three. We identify seven turn-yielding cues, all of which can be extracted automatically, for future use in turn generation and recognition in interactive voice response (IVR) systems. Testing Duncan's (1972) hypothesis that these turn-yielding cues are linearly correlated with the occurrence of turn-taking attempts, we further demonstrate that, the greater the number of turn-yielding cues that are present, the greater the likelihood that a turn change will occur. We also identify six cues that precede backchannels, which will also be useful for IVR backchannel generation and recognition; these cues correlate with backchannel occurrence in a quadratic manner. We find similar results for overlapping and for non-overlapping speech.
Fil: Gravano, Agustin. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Universidad de Buenos Aires. Facultad de Medicina. Hospital de Clínicas General San Martín; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Fil: Hirschberg, Julia. Columbia University; Estados Unidos
Materia
Dialogue
Ivr Systems
Prosody
Turn-Taking
Nivel de accesibilidad
acceso abierto
Condiciones de uso
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repositorio
CONICET Digital (CONICET)
Institución
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI Identificador
oai:ri.conicet.gov.ar:11336/68351

id CONICETDig_91da0b235e5e107025a437b75455cb8a
oai_identifier_str oai:ri.conicet.gov.ar:11336/68351
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
spelling Turn-taking cues in task-oriented dialogueGravano, AgustinHirschberg, JuliaDialogueIvr SystemsProsodyTurn-Takinghttps://purl.org/becyt/ford/1.2https://purl.org/becyt/ford/1As interactive voice response systems become more prevalent and provide increasingly more complex functionality, it becomes clear that the challenges facing such systems are not solely in their synthesis and recognition capabilities. Issues such as the coordination of turn exchanges between system and user also play an important role in system usability. In particular, both systems and users have difficulty determining when the other is taking or relinquishing the turn. In this paper, we seek to identify turn-taking cues correlated with human-human turn exchanges which are automatically computable. We compare the presence of potential prosodic, acoustic, and lexico-syntactic turn-yielding cues in prosodic phrases preceding turn changes (smooth switches) vs. turn retentions (holds) vs. backchannels in the Columbia Games Corpus, a large corpus of task-oriented dialogues, to determine which features reliably distinguish between these three. We identify seven turn-yielding cues, all of which can be extracted automatically, for future use in turn generation and recognition in interactive voice response (IVR) systems. Testing Duncan's (1972) hypothesis that these turn-yielding cues are linearly correlated with the occurrence of turn-taking attempts, we further demonstrate that, the greater the number of turn-yielding cues that are present, the greater the likelihood that a turn change will occur. We also identify six cues that precede backchannels, which will also be useful for IVR backchannel generation and recognition; these cues correlate with backchannel occurrence in a quadratic manner. We find similar results for overlapping and for non-overlapping speech.Fil: Gravano, Agustin. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Universidad de Buenos Aires. Facultad de Medicina. Hospital de Clínicas General San Martín; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; ArgentinaFil: Hirschberg, Julia. Columbia University; Estados UnidosAcademic Press Ltd - Elsevier Science Ltd2011-07info:eu-repo/semantics/articleinfo:eu-repo/semantics/publishedVersionhttp://purl.org/coar/resource_type/c_6501info:ar-repo/semantics/articuloapplication/pdfapplication/pdfhttp://hdl.handle.net/11336/68351Gravano, Agustin; Hirschberg, Julia; Turn-taking cues in task-oriented dialogue; Academic Press Ltd - Elsevier Science Ltd; Computer Speech And Language; 25; 3; 7-2011; 601-6340885-2308CONICET DigitalCONICETenginfo:eu-repo/semantics/altIdentifier/url/http://www.sciencedirect.com/science/article/pii/S0885230810000690info:eu-repo/semantics/altIdentifier/doi/10.1016/j.csl.2010.10.003info:eu-repo/semantics/openAccesshttps://creativecommons.org/licenses/by-nc-nd/2.5/ar/reponame:CONICET Digital (CONICET)instname:Consejo Nacional de Investigaciones Científicas y Técnicas2025-10-15T15:35:24Zoai:ri.conicet.gov.ar:11336/68351instacron:CONICETInstitucionalhttp://ri.conicet.gov.ar/Organismo científico-tecnológicoNo correspondehttp://ri.conicet.gov.ar/oai/requestdasensio@conicet.gov.ar; lcarlino@conicet.gov.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:34982025-10-15 15:35:25.023CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicasfalse
dc.title.none.fl_str_mv Turn-taking cues in task-oriented dialogue
title Turn-taking cues in task-oriented dialogue
spellingShingle Turn-taking cues in task-oriented dialogue
Gravano, Agustin
Dialogue
Ivr Systems
Prosody
Turn-Taking
title_short Turn-taking cues in task-oriented dialogue
title_full Turn-taking cues in task-oriented dialogue
title_fullStr Turn-taking cues in task-oriented dialogue
title_full_unstemmed Turn-taking cues in task-oriented dialogue
title_sort Turn-taking cues in task-oriented dialogue
dc.creator.none.fl_str_mv Gravano, Agustin
Hirschberg, Julia
author Gravano, Agustin
author_facet Gravano, Agustin
Hirschberg, Julia
author_role author
author2 Hirschberg, Julia
author2_role author
dc.subject.none.fl_str_mv Dialogue
Ivr Systems
Prosody
Turn-Taking
topic Dialogue
Ivr Systems
Prosody
Turn-Taking
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
dc.description.none.fl_txt_mv As interactive voice response systems become more prevalent and provide increasingly more complex functionality, it becomes clear that the challenges facing such systems are not solely in their synthesis and recognition capabilities. Issues such as the coordination of turn exchanges between system and user also play an important role in system usability. In particular, both systems and users have difficulty determining when the other is taking or relinquishing the turn. In this paper, we seek to identify turn-taking cues correlated with human-human turn exchanges which are automatically computable. We compare the presence of potential prosodic, acoustic, and lexico-syntactic turn-yielding cues in prosodic phrases preceding turn changes (smooth switches) vs. turn retentions (holds) vs. backchannels in the Columbia Games Corpus, a large corpus of task-oriented dialogues, to determine which features reliably distinguish between these three. We identify seven turn-yielding cues, all of which can be extracted automatically, for future use in turn generation and recognition in interactive voice response (IVR) systems. Testing Duncan's (1972) hypothesis that these turn-yielding cues are linearly correlated with the occurrence of turn-taking attempts, we further demonstrate that, the greater the number of turn-yielding cues that are present, the greater the likelihood that a turn change will occur. We also identify six cues that precede backchannels, which will also be useful for IVR backchannel generation and recognition; these cues correlate with backchannel occurrence in a quadratic manner. We find similar results for overlapping and for non-overlapping speech.
Fil: Gravano, Agustin. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Universidad de Buenos Aires. Facultad de Medicina. Hospital de Clínicas General San Martín; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Fil: Hirschberg, Julia. Columbia University; Estados Unidos
description As interactive voice response systems become more prevalent and provide increasingly more complex functionality, it becomes clear that the challenges facing such systems are not solely in their synthesis and recognition capabilities. Issues such as the coordination of turn exchanges between system and user also play an important role in system usability. In particular, both systems and users have difficulty determining when the other is taking or relinquishing the turn. In this paper, we seek to identify turn-taking cues correlated with human-human turn exchanges which are automatically computable. We compare the presence of potential prosodic, acoustic, and lexico-syntactic turn-yielding cues in prosodic phrases preceding turn changes (smooth switches) vs. turn retentions (holds) vs. backchannels in the Columbia Games Corpus, a large corpus of task-oriented dialogues, to determine which features reliably distinguish between these three. We identify seven turn-yielding cues, all of which can be extracted automatically, for future use in turn generation and recognition in interactive voice response (IVR) systems. Testing Duncan's (1972) hypothesis that these turn-yielding cues are linearly correlated with the occurrence of turn-taking attempts, we further demonstrate that, the greater the number of turn-yielding cues that are present, the greater the likelihood that a turn change will occur. We also identify six cues that precede backchannels, which will also be useful for IVR backchannel generation and recognition; these cues correlate with backchannel occurrence in a quadratic manner. We find similar results for overlapping and for non-overlapping speech.
publishDate 2011
dc.date.none.fl_str_mv 2011-07
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/68351
Gravano, Agustin; Hirschberg, Julia; Turn-taking cues in task-oriented dialogue; Academic Press Ltd - Elsevier Science Ltd; Computer Speech And Language; 25; 3; 7-2011; 601-634
0885-2308
CONICET Digital
CONICET
url http://hdl.handle.net/11336/68351
identifier_str_mv Gravano, Agustin; Hirschberg, Julia; Turn-taking cues in task-oriented dialogue; Academic Press Ltd - Elsevier Science Ltd; Computer Speech And Language; 25; 3; 7-2011; 601-634
0885-2308
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://www.sciencedirect.com/science/article/pii/S0885230810000690
info:eu-repo/semantics/altIdentifier/doi/10.1016/j.csl.2010.10.003
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv Academic Press Ltd - Elsevier Science Ltd
publisher.none.fl_str_mv Academic Press Ltd - Elsevier Science Ltd
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str CONICET Digital (CONICET)
collection CONICET Digital (CONICET)
instname_str Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
_version_ 1846083480331485184
score 13.22299