Gesture synthesis for conversational agents
Synthesis of lifelike gesture is finding growing attention in human-computer interaction. In particular, synchronization of synthetic gestures with speech output is one of the goals for
embodied conversational agents which have become a new paradigm for the study of gesture and for human-computer interface. In this context, this contribution presents an operational model that enables lifelike gesture animations of an articulated figure to be rendered in real time from representations of spatiotemporal gesture knowledge. Based on various findings on the production of human gesture, the model provides means for motion representation, planning, and control to drive the kinematic skeleton of a figure which comprises 43 degrees of freedom in 29 joints for the main body and 20 DOF for each hand. The model is conceived to enable cross-modal synchrony with respect to the coordination of gestures with the signal generated by a text-to-speech system.