h1. SAIBA - Multimodal Behavior Generation Framework

Hannes Vilhjalmsson, Norman Badler, Lewis Johnson, Stefan Kopp, Brigitte Krenn, Stacy Marsella, Andrew N. Marshall, Catherine Pelachaud, Hannes Pirker, Kristinn R. Thorisson

The generation of natural multimodal output for embodied conversational agents requires a time-critical production process with high flexibility. To scaffold this production process and encourage sharing and collaboration, a working group of ECA researchers has introduced the SAIBA framework (Situation, Agent, Intention, Behavior, Animation). The framework specifies multimodal generation at a macro-scale, consisting of processing stages on three different levels: (1) planning of a communicative intent, (2) planning of a multimodal realization of this intent, and (3) realization of the planned behaviors.

h2. Introduction

The overall goal of this international effort is to unify a multimodal behavior generation framework for Embodied Conversational Agents (ECAs) so that people in the field can more easily work together and share resources.

So far, the following research centers and institutions actively participate in the effort (in alphabetical order):

* Articulab, Northwestern University, USA
* Artificial Intelligence Group, University of Bielefeld, Germany
* Austrian Research Institute for AI (OFAI), Vienna, Austria
* Center for Analysis and Design of Intelligent Agents (CADIA), Reykjavik University, Iceland
* Center for Human Modeling and Simulation, University of Pennsylvania, USA
* Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany
* Human Media Interaction, University of Twente, The Netherlands
* Human-Oriented Technology Lab, University of Zagreb, Croatia
* Information Sciences Institute (ISI), University of Southern California, USA
* Institute for Creative Technologies (ICT), University of Southern California, USA
* Intelligent Agents and Synthetic Characters Group at INESC, Lisbon, Portugal
* IUT de Montreuil, Université de Paris 8, France

h2. Overview

The first step towards a unifying representational framework for multimodal generation has been to lay down the general planning stages and knowledge structures that are involved in the creation of multimodal communicative behavior. We do not want to impose a particular micro-architecture. Yet, as our goal is to define representation languages that can serve as clear interfaces at separate levels of abstraction—building upon our experiences from previous ECA systems—we need to modularize the problem.

We aim for the representation languages to be:

* Independent of a particular application or domain
* Independent of the particular graphics and sound player model employed
* Built on a clear-cut separation between information types (function-related versus process-related specification of behavior); a sketch of this distinction follows the list
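
To make this separation concrete, here is a minimal sketch in plain Python (all class and field names are invented for illustration; they are not part of SAIBA or its interface languages), contrasting a function-related description of what the agent intends to communicate with a process-related description of which behaviors realize it:

<pre><code class="python">
from dataclasses import dataclass, field
from typing import List

# Function-related specification: what the agent intends to communicate,
# with no commitment to specific modalities or animation parameters.
@dataclass
class CommunicativeIntent:
    performative: str                                  # e.g. "inform", "request"
    content: str                                       # propositional content
    emphasis: List[str] = field(default_factory=list)  # items to highlight

# Process-related specification: which concrete behaviors carry out the
# intent, still independent of any particular graphics or sound player.
@dataclass
class BehaviorSpec:
    speech_text: str
    gestures: List[str] = field(default_factory=list)  # e.g. ["beat", "point"]
    gaze_target: str = "listener"
</code></pre>
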
The generation of natural multimodal output requires a time-critical production process with high flexibility. To scaffold this production process we introduced the SAIBA framework (Situation, Agent, Intention, Behavior, Animation), which specifies multimodal generation at a macro scale, consisting of processing stages on three different levels:

# Planning of a communicative intent
# Planning of multimodal behaviors that carry out this intent
# Realization of the planned behaviors

These processing stages are depicted below:
(Figure: the three processing stages of the SAIBA framework)
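
As a rough, hypothetical illustration of how the three levels could be chained in code (the stage names follow the framework; the function bodies and data fields are purely illustrative placeholders, not a prescribed micro-architecture):

<pre><code class="python">
def plan_intent(message: str) -> dict:
    # Level 1: planning of a communicative intent -- decide *what* to convey,
    # without committing to any modality.
    return {"performative": "inform", "content": message, "emphasis": []}

def plan_behavior(intent: dict) -> dict:
    # Level 2: planning of multimodal behaviors that carry out the intent --
    # choose speech, gesture, gaze, etc., still player-independent.
    return {
        "speech": intent["content"],
        "gestures": ["beat"] if intent["emphasis"] else [],
        "gaze": "listener",
    }

def realize(behaviors: dict) -> None:
    # Level 3: realization of the planned behaviors by a concrete
    # animation and sound player (stubbed here with a print statement).
    print("say %r with gestures %s" % (behaviors["speech"], behaviors["gestures"]))

# One pass through the three processing stages:
realize(plan_behavior(plan_intent("The meeting starts at noon.")))
</code></pre>
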