Vision and Research Strategy
Language is essential for communication. However, communication is multimodal and not based on language alone: we look at objects when talking about them or when hearing someone else talk about them; we point at things to disambiguate utterances such as "That one there, please."; and we use numerous gestures and other non-verbal cues to express uncertainty (a shrug), emotions (a frown), or an attitude (a raised eyebrow). Thus, non-verbal cues are an essential part of how we communicate, even when interacting with artificial agents. However, many such cues, and eye-gaze in particular, are extremely dynamic and strongly affected by external factors such as the visual environment and the gaze and gestures of the partner. For this reason, they are a valuable source of information, but one that is very difficult to parse and interpret.
We therefore aim to investigate when and how precisely humans typically produce and integrate these different information channels during interaction, in order to augment and improve human-agent interaction. Since people's non-verbal behaviour is difficult to control and manipulate systematically, we propose to address this problem mainly by developing and employing artificial agents and dialogue systems as controlled communication partners in different roles: as gaze-producing speakers, as speakers reacting to listener gaze, as gaze-following listeners, and so on. This method enables us to explore production patterns of eye-gaze and gestures and their role in grounding language and action. Thus, our approach serves both the understanding of human communication mechanisms and the direct application of these findings to improve human-agent interaction.





