::
TUTORIALS :: |
2 |
TITLE |
Cognitive and Computational Foundations for Embodied,
Situated Dialogue Processing in Human-Robot Interaction
|
SPEAKER |
Dr.
Geert-Jan M. Kruijff (DFKI GmbH) |
|
BRIEF
INFORMATION OF THE SPEAKERS |
||
Geert-Jan
Kruijff has a background in human-robot interaction, artificial intelligence,
computational and mathematical linguistics, and philosophy. He obtained
an MSc in Philosophy/Computer Science from the University of Twente
(1995), and a PhD in Informatics from Charles University (2001; UFAL,
Faculty of Mathematics & Physics). He is a Senior Researcher working
in the Language Technology group at the German Research Center for Artificial
Intelligence (DFKI GmbH), where he leads DFKI’s efforts on human-robot
interaction in the EU FP6 CoSy project (http://www.dfki.de/cosy).
His research interests are in the intersection between natural language
processing and robotics, particularly in the development of a cognitively
motivated computational framework for modeling how a mobile robot could
carry out situated, collaborative dialogue with human agents. |
||
MOTIVATION
AND OBJECTIVES |
||
Spoken
dialogue is considered one of the principal modalities for human-robot
interaction (HRI), primarily because it is a natural means of communication
between humans. A fundamental characteristic of spoken dialogue for
HRI is that it is situated, and that the dialogue is between embodied
dialogue partners which perceive the environment. The question that
arises here is how this embodied situatedness affects comprehension
and production of spoken dialogue.

The goal of the tutorial is twofold: (1) to provide an overview of relevant insights into human situated language processing, and (2) to present a computational framework in which these insights can be translated into practical systems for situated, embodied dialogue processing in human-robot interaction. In addressing these goals, the tutorial interleaves theoretical discussions about cognitive insights with concrete illustrations from HRI, in the form of videos of existing HRI systems and Wizard-of-Oz studies. The point here is to motivate why basing HRI systems on (specific) cognitive insights can be beneficial to HRI.

On the cognitive side, the tutorial focuses on insights which reveal how meaning, communicated in a dialogue, is grounded in a deeper categorical and spatiotemporal-causal understanding of reality (i.e. "situational awareness" in a broad sense). For these insights, the tutorial appeals to empirical observations and theories from psycholinguistics, neuroscience, and developmental psychology. The tutorial relates these cognitive insights to several key issues in situated dialogue processing for HRI, e.g. resolving and producing situated referents for objects and locations being talked about, situated possibilities of (planned) actions, clarification, and more non-verbal aspects such as robot anticipatory gaze movement and pointing gestures.

The fundamental point the tutorial tries to make is that insights into human sentence processing can not only inspire HRI systems; they can also make them more efficient (by making language comprehension prefer analyses that are situationally supported) and more effective (by appealing to functions which affect how humans understand situated dialogue). Simply put, we should not see a dialogue system for HRI as an add-on to an intelligent system; we should tightly couple it to how that system understands its environment.
The tutorial supports this point through illustrative videos, and data from empirical evaluations. |
||
(TENTATIVE)
SCHEDULE OF THE TUTORIALS |
||
|
The proposed length of the tutorial is 4 hours (one afternoon). The
tutorial length could be shortened to 3 hours, with some revisions to
the structure.

(1) Incremental nature of dialogue processing (45 min.)
    How is spoken dialogue processed? (incremental processing models)
    How can cross-modal information guide incremental processing? (preferencing; salience; cross-modal interconnectivity)

(2) Dimensions of cross-modal interpretation in situated dialogue (90 min.)
    How can we measure what information humans use in situated language processing? (relevant experimental paradigms in psycholinguistics, etc.)
    What role does "world knowledge" play in comprehending visually situated language? (category systems; mediation; affordances)
    How do we comprehend spatial language about the spatial organization of visual objects? (topological and projective spatial relations; cognitive load in spatial referencing; affordances and spatial relations)
    How do we ground actions in an understanding of what we can do in a space? (topokinetic spatial organization)
    How are spatial and temporal situational perspectives formed? (ego- vs. allocentric perspectivization; common ground; dialogue about plans)

(3) Non-verbal communication in situated dialogue processing (30 min.)
    How can gaze movements be triggered during incremental language comprehension? (anticipatory gaze movement)
    How can deictic gestures be generated during language production? (pointing gestures; iconic gestures)

(4) Closing discussion (15 min.) |
||