ABSTRACT
In spontaneous conversation, speakers seldom have a full plan of what they are going to say in advance: they need to conceptualise and plan incrementally as they articulate each word in turn. This often leads to long pauses mid-utterance. Listeners either wait out the pause, offer a possible completion, or respond with an incremental clarification request (iCR), intended to recover the rest of the truncated turn. The ability to generate iCRs in response to pauses is therefore important in building natural and robust everyday voice assistants (EVAs) such as Amazon Alexa. This becomes crucial when people with dementia (PwDs) are the target user group, since they are known to pause longer and more frequently, and current state-of-the-art EVAs interrupt them prematurely, leading to frustration and breakdown of the interaction. In this article, we first use two existing corpora of truncated utterances to establish the generation of clarification requests as an effective strategy for recovering from interruptions. We then report on, analyse, and release SLUICE-CR: a new corpus of 3,000 crowdsourced, human-produced iCRs, the first of its kind. We use this corpus to probe the incremental processing capability of a number of state-of-the-art large language models (LLMs) by evaluating (1) the quality of the iCRs the models generate in response to incomplete questions and (2) the ability of the same LLMs to respond correctly after the user's response to the generated iCR. For (1), our experiments show that the ability to generate contextually appropriate iCRs only emerges at larger LLM sizes, and only when the models are prompted with example iCRs from our corpus. For (2), our results are in line with (1): larger LLMs interpret incremental clarificational exchanges more effectively. Overall, our results indicate that autoregressive language models (LMs) are, in principle, able to both understand and generate language incrementally, and that LLMs can be configured to handle speech phenomena more commonly produced by PwDs, mitigating frustration with today's EVAs by improving their accessibility.
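To make the probing setup concrete, a minimal sketch of the few-shot prompting described above follows. The example iCR pairs are invented for illustration (they are not drawn from SLUICE-CR), and complete_fn is a hypothetical stand-in for whichever LLM completion call is being probed.

# Minimal sketch of few-shot iCR prompting. The example pairs below are
# hypothetical, NOT taken from the SLUICE-CR corpus, and `complete_fn`
# abstracts over the specific LLM API being evaluated.
FEW_SHOT_ICRS = [
    ("What is the capital of", "The capital of where?"),
    ("How long should I boil", "Boil what?"),
]

def build_icr_prompt(incomplete_question: str) -> str:
    """Assemble a few-shot prompt asking the model to produce an
    incremental clarification request (iCR) for a truncated question."""
    lines = ["A speaker pauses mid-question. Reply with a short "
             "clarification question that recovers the missing part."]
    for question, icr in FEW_SHOT_ICRS:
        lines.append(f"Speaker: {question} ...")
        lines.append(f"System: {icr}")
    lines.append(f"Speaker: {incomplete_question} ...")
    lines.append("System:")
    return "\n".join(lines)

def generate_icr(incomplete_question: str, complete_fn) -> str:
    """Return the model's iCR for an incomplete question."""
    return complete_fn(build_icr_prompt(incomplete_question)).strip()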
ABSTRACT
Social robots have limited social competences. This leads us to view them as depictions of social agents rather than actual social agents. However, people also have limited social competences. We argue that all social interaction involves the depiction of social roles, and that such depictions originate in, and are defined by, their function in accounting for failures of social competence.
ABSTRACT
People give feedback in conversation: both positive signals of understanding, such as nods, and negative signals of misunderstanding, such as frowns. How do signals of understanding and misunderstanding affect the coordination of language use in conversation? Using a chat tool and a maze-based reference task, we test two experimental manipulations that selectively interfere with feedback in live conversation: (1) "Attenuation", which replaces positive signals of understanding such as "right" or "okay" with weaker, more provisional signals such as "errr" or "umm", and (2) "Amplification", which replaces relatively specific signals of misunderstanding from clarification requests such as "on the left?" with generic signals of trouble such as "huh?" or "eh?". The results show that Amplification promotes rapid convergence on more systematic, abstract ways of describing maze locations, while Attenuation has no significant effect. We interpret this as evidence that "running repairs", the processes of dealing with misunderstandings on the fly, are key drivers of semantic coordination in dialogue. This suggests a new direction for experimental work on conversation and a productive way to connect the empirical accounts of Conversation Analysis with the representational and processing concerns of Formal Semantics and Psycholinguistics.
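As a rough illustration of the two manipulations, the sketch below substitutes feedback tokens in a chat turn. The token inventories and the heuristic for spotting clarification requests are assumptions made for illustration, not the published experiment's actual substitution rules.

# Illustrative sketch of the Attenuation and Amplification manipulations.
# Token lists and the clarification-request heuristic are hypothetical.
POSITIVE_TO_WEAK = {"right": "errr", "okay": "umm", "ok": "umm", "yeah": "err"}
GENERIC_TROUBLE = "huh?"

def attenuate(turn: str) -> str:
    """Attenuation: swap positive acknowledgements for weaker,
    more provisional signals (trailing punctuation is dropped)."""
    return " ".join(POSITIVE_TO_WEAK.get(word.lower().strip(".,!?"), word)
                    for word in turn.split())

def amplify(turn: str) -> str:
    """Amplification: replace a short, specific clarification request
    (e.g. 'on the left?') with a generic signal of trouble."""
    stripped = turn.strip()
    is_specific_cr = stripped.endswith("?") and len(stripped.split()) <= 4
    return GENERIC_TROUBLE if is_specific_cr else turn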
Subjects
Comprehension/physiology, Cooperative Behavior, Interpersonal Relations, Psycholinguistics, Psychomotor Performance/physiology, Verbal Behavior/physiology, Adult, Humans, Young Adult
ABSTRACT
Anecdotal evidence suggests that participants in conversation can sometimes act as a coalition. This implies a level of conversational organization in which groups of individuals form a coherent unit. This paper investigates the implications of this phenomenon for psycholinguistic and semantic models of shared context in dialog. We present a corpus study of multiparty dialog which shows that, in certain circumstances, people with different levels of overt involvement in a conversation, that is, one responding and one not, can nonetheless access the same shared context. We argue that contemporary models of shared context need to be adapted to capture this situation. To address this, we propose "grounding by proxy," in which one person can respond on behalf of another, as a simple mechanism by which shared context can accumulate for a coalition as a whole. We explore this hypothesis experimentally by investigating how people in a task-oriented coalition respond when their shared context appears to be weakened. The results provide evidence that, by default, coalition members act on each other's behalf, and when this fails they work to compensate. We conclude that this points to the need for a new concept of collective grounding acts and a corresponding concept of collective contexts in psycholinguistic and semantic models of dialog.
Subjects
Communication, Cooperative Behavior, Verbal Behavior/physiology, Adult, Female, Humans, Male, Theoretical Models, Psycholinguistics
ABSTRACT
We present empirical evidence from dialogue that challenges some of the key assumptions of the Pickering & Garrod (P&G) model of speaker-hearer coordination. The P&G model also invokes an unnecessarily complex set of mechanisms. We show that a computational implementation, currently in development and based on a simpler model, can account for more of this type of dialogue data.