Collaborations within the Cluster of Excellence
In the area of markerless motion and performance capture, we most closely collaborate with Bodo Rosenhahn's JRG, and this collaboration has led to a multitude of joint publications that have appeared internationally in some of the best venues. Although B. Rosenhahn has since left the Cluster and accepted a faculty position at Univ. Hannover, the collaboration with him continues.
Our research in this area also benefits significantly from the work on realtime visual computing algorithms and optical flow computations done in Joachim Weickert's group (RA2), and two joint publications have emerged from this. The work on markerless tracking of athletes interacting with sports gear is of particular relevance here, since it takes into account the motion restrictions that arise from such interactions (e.g. bicycle, snowboard) as soft constraints during pose estimation. Another result of the cooperation with Joachim Weickert (RA2) is joint work with C. Theobalt on the simultaneous computation of scene flow, stereo geometry (fundamental matrix), and the depth map from stereo image sequences, which is discussed in more detail in RA2.
In joint work with the IRG on Multimedia Information Retrieval and Music Processing chaired by M. Müller, we made use of their results on semantic classification of motion patterns (motion templates) to stabilize tracking (with motion templates used as a prior).
Our work on high-quality performance capture from sparse multi-view video makes use of a sophisticated representation of the underlying geometry (a coupled hierarchy of a highly detailed surface mesh and an underlying coarser tetrahedral mesh). Both the underlying representations and some of the algorithms for point location build on earlier work with researchers from RA3 (computational geometry and geometric computing), and although A. Belyaev has since left the Cluster, we continue to interact with him on these matters. Similarly, the work on markerless motion capture with unsynchronized moving cameras benefits from algorithms and code, developed in RA3 and in particular in M. Wand's IRG, that allow for the reconstruction of surfaces from unstructured point clouds.
Together with researchers from RA7, we are working on integrating advanced characters and scanned point models into virtual environments, and on finding scalable and efficient ways to handle such large-scale models. This integration takes place inside the VE demonstrator, and much of it will be based on the use of XFlow and XML3D.
Together with the group of Michael Kipp, we are also integrating gesture generation of virtual humans into the VE demonstrator. This work is based on the XFlow system for animating the virtual characters, includes elements from the EMBR Character Animation Engine, and drives the animations through EMBRScript. The recent work on realtime prosody-driven synthesis of body language for a virtual agent given a realtime audio track is also relevant here.
Our joint publication on gesture modeling and animation based on a probabilistic re-creation of speaker style grew out of a collaboration between Michael Kipp's IRG and researchers from Hans-Peter Seidel's graphics group that started when Michael Neff (formerly Univ. Toronto, now UC Davis) joined us as a postdoc. Our approach combines Michael Kipp's Anvil annotation tool for multimodal dialog with previous work on the generation of non-verbal expressions from speech done in the graphics group.
An important aspect of embodied communication is gaze. Based on research in Matt Crocker's group (RA1) on how users integrate spoken and visual information from the environment, including robots and virtual agents, we have started to investigate the impact of gaze in embodied communication and have developed an embodied agent testbed for this task.
Much of the work on embodied agents and multimodal behavior is done in close collaboration between Michael Kipp's IRG and Wolfgang Wahlster's group at DFKI, and there is also a strong link to multimodal dialog (RA9). One example is the work on extending the Scenemaker visual tool for interaction modeling to allow parallel processes and integration of a rule-based reasoner. This has become a major ingredient in the INTAKT system, where multiple supermarket advisor agents, including a shopping cart personal assistant, communicate both with the user and among themselves.
A second example is our collaboration in the context of the In-Car Dialog Demonstrator, where we are studying the use of animated characters in the car for persuasive systems. While carefully controlling the driving conditions (e.g. animated characters are only allowed when the wheels are not turning), we are particularly interested in the question of whether animated characters are a suitable means for persuading the driver to drive in a more energy-efficient manner.
A very recent third example is our work on multitouch puppetry, which presents a novel multitouch interface for simultaneously controlling the many degrees of freedom of a human arm.
External Collaborations
External collaborations exist on a personal, project and institutional level, but they are far too numerous to list in full. We therefore limit ourselves here to those collaborations that are of direct relevance to the results described above.
Collaboration with Stanford within the framework of the Max Planck Center for Visual Computing and Communication led to several joint publications with researchers from Stanford, including the work on performance capture from sparse multi-view video, our recent work on video-based animatable human characters (both with Sebastian Thrun), and the work on realtime prosody-driven synthesis of body language (with Vladlen Koltun).
We also continued our successful collaboration with Michael Neff (UC Davis) on gesture modeling and animation based on a probabilistic re-creation of speaker style, and published another joint paper on augmenting gesture animation with motion capture data at IVA'09.