Anonymous, 11/20/2011 11:24 pm


BML 1.0 Standard

This document introduces and describes version 1.0 of the Behavior Markup Language standard. This document contains background information, descriptions of typical use contexts, and, most importantly, the syntactic and semantic details of the XML format of the Behavior Markup Language.


The Behavior Markup Language, or BML, is an XML description language for controlling the verbal and nonverbal behavior of (humanoid) embodied conversational agents (ECAs). A BML block (see example in figure below) describes the physical realization of behaviors (such as speech and gesture) and the synchronization constraints between these behaviors. BML is not concerned with the communicative intent underlying the requested behaviors. The module that executes behaviors specified in BML on the embodiment of the ECA is called a BML Realizer.

Figure 1: Example of a BML Request
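For readers without access to the figure, a minimal request might look as follows (a sketch only; the behavior elements and synch attributes used here are defined later in this document, and the ids and target name are illustrative):

```xml
<bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" id="bml1" characterId="Alice">
  <!-- speak, and start the pointing gesture together with the speech -->
  <speech id="speech1"><text>Look at that!</text></speech>
  <gesturePointing id="point1" target="blueBox" hand="RIGHT_HAND" start="speech1:start"/>
</bml>
```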

Core Standard and Extensions

The BML Standard consists of a small and lean core, plus a few clearly defined mechanisms for extending the language.

Lean Core Standard

The Core of the BML Standard defines the form and use of BML blocks, mechanisms for synchronisation, the basic rules for feedback about the processing of BML messages (see later in this document), plus a number of generic basic behaviors. BML compliant realizers implement the complete BML Core Standard and provide a meaningful execution for all its behavior elements. Some realizers might offer only partial compliance, for example because they only steer a head (and therefore do not need to interpret bodily behaviors). In that case, a realizer should at least provide an exception/warning feedback when being requested to execute unsupported Core Standard behaviors (see Feedback).

{{div_start_tag(core_summary, inset)}}
  |*What:*          |BML Core Standard. |
  |*Status:*        |Mandatory. |
  |*XML namespace:* |http://www.bml-initiative.org/bml/bml-1.0 |
  |*Examples:*      |basic speech, point (INSERT REFS) |
{{div_end_tag}}


BML provides several standardized mechanisms for extension. One can define new behaviors (in a custom namespace), or extend upon Core behaviors by adding custom attributes. /Description extensions/ provide a standardized manner for a user to give more detail about how the BML Realizer should realize a given instance of a core behavior, while allowing a fallback to the Core specification when the BML Realizer does not support the extension.

The BML standard defines a number of Core Extensions, both in the form of additional behaviors and in the form of description extensions. The Core Extensions provide behaviors and description levels that we do not want to make mandatory, but that we do want to be implemented in a standardized way whenever a BML Realizer implements them. We encourage authors of realizers to collaborate and define shared behavior types and description extensions.

{{div_start_tag(coreext_summary, inset)}}
  |*What:*          |BML Core Extensions. |
  |*Status:*        |Optional, but if a realizer implements the functionality of a Core Extension, it should exactly follow the standard specification. |
  |*XML namespace:* |http://www.bml-initiative.org/bml/... (last part is specified in the definition of the Core Extension) |
  |*Examples:*      |FACS face expressions, SSML description extension for speech (INSERT REFS) |
{{div_end_tag}}

Global Context


The Behavior Markup Language is part of the SAIBA Multimodal Behavior Generation Framework (see Figure 2 below). In this framework, the intention for the ECA to express something arises in the Intent Planner. The Behavior Planner is responsible for deciding which multimodal behaviors to choose for expressing the communicative intent (through speech, face expressions, gestures, etcetera) and for specifying proper synchronisation between the various modalities. This multimodal behavior is specified in the form of BML messages. A BML Realizer is responsible for physically realizing the specified BML message through sound and motion (animation, robot movement, ...), in such a way that the time constraints specified in the BML block are satisfied. At runtime, the BML realizer sends back feedback messages to keep the planning modules updated about the progress and result of the realization of previously sent BML messages, allowing, e.g., for monitoring and possible error recovery.

Figure 2: SAIBA Framework

{{div_start_tag(intentplanning, inset)}}
The exact nature of the intent and behavior planning processes is left unspecified here. As far as the BML Realizer is concerned, it makes no difference whether BML messages are the result of a complicated multimodal affective dialog system, or are simply predefined BML messages pulled from a library of pre-authored materials. {{div_end_tag}}

BML Messaging Architecture

BML does not prescribe a specific message transport. Different architectures have drastically different notions of a message. A message may come in the form of a string, an XML document or DOM, a message object, or just a function call. However, no matter what message transport is used, the transport and routing layer should adhere to the following requirements:

  • Messages must be received in sent order.
  • Messages must contain specific contents that can be fully expressed as XML expressions in the format detailed in this document.

Currently, there are two types of messages:

  • BML Requests.
    • Sent by the Behavior Planner to the Behavior Realizer.
    • BML requests are sent as <bml> blocks containing a number of behavior elements with synchronisation.
  • Feedback Messages.
    • Sent by the Behavior Realizer.
    • Used to inform the planner (and possibly other processes) of the progress of the realization process.

The BML Realizer

Conceptually, BML Realizers execute a multimodal plan that is incrementally constructed (scheduled) on the basis of a stream of incoming BML Requests (see Figure 3). A BML Realizer is responsible for executing the behaviors specified in each BML request sent to it, in such a way that the time constraints specified in the BML request are satisfied. If a new request is sent before the realization of previous requests has been completed, a composition attribute determines how to combine the behaviors in the new request with the behaviors from earlier requests (see the documentation of the composition attribute).

Each BML Request represents a scheduling boundary. That is: if behaviors are in the same BML request, this means that the constraints between them are resolved before any of the behaviors in the request is executed.

Figure 3: Dealing with an incoming stream of BML Requests

XML Format: Values and Types

Before describing the various XML elements in the BML Standard, we describe here the available attribute types.

We use camelCase throughout for element names and attribute names. Values of type openSetItem and closedSetItem defined in this document are generally all uppercase. The names of default synchpoints for various behavior types are written as lowercase with underscores to separate words (e.g., stroke_start).

Attribute Value Types

Values for various types of behavior attributes can be one of the following:

  • ID: An identifier that is unique within a specified context (see <bml> and "behavior element"). Adheres to the standard XML type ID.
  • synchref: Describes the relative timing of synch points (see the section on synchronisation).
  • worldObjectID: A unique ID of an object in the character’s world. Adheres to the standard XML type ID.
  • closedSetItem: A string from a closed set of strings, where the standard will provide the exhaustive list of strings in the set.
  • openSetItem: A string from an open set of strings, where the standard may provide a few common strings in the set.
  • bool: A truth value, either "true" or "false".
  • int: A whole number.
  • float: A number with decimals.
  • angle: A float specifying an angle in degrees counterclockwise, from (-180, 180].
  • string: An arbitrary string.
  • direction: A particular closedSetItem type from the closed set [LEFT, RIGHT, UP, DOWN, FRONT, BACK, UPRIGHT, UPLEFT, DOWNLEFT, DOWNRIGHT].
  • vector: A string of the format "float; float; float" indicating the x, y, and z coordinates of a vector.

Coordinate System and Units

While we prefer specifying behavior by common verbs and nouns, for some attributes or applications it is unavoidable to use precise vectors.

All units are in the MKS system (meters, kilograms, seconds).

BML assumes a global coordinate system in which the positive Y-axis is up. The local (character-based) coordinate system [1] adheres to the guidelines of the H-Anim standard (v1.1): "The humanoid shall be modelled in a standing position, facing in the +Z direction with +Y up and +X to the humanoid’s left. The local character-based origin (0, 0, 0) shall be located at ground level, between the humanoid’s feet."

[1] Currently, there are no expressions in BML 1.0 that actually use the local character-based coordinate system. However, future versions may introduce references such as "2 meters to the left of the character".

<bml> request

All BML behaviors must belong to a <bml> behavior block. A <bml> block is formed by placing one or more BML behavior elements inside a top-level <bml> element. Unless synchronization is specified (see the section on synchronisation), it is assumed that all behaviors in a <bml> block start at the same time after arriving at the BML realizer.


NAMESPACE: http://www.bml-initiative.org/bml/bml-1.0
ELEMENT: <bml>
ATTRIBUTES: characterId, id, composition
CONTENTS: behaviors of various types, <required> blocks, <constraint> blocks
  <bml xmlns="http://www.bml-initiative.org/bml/bml-1.0"
       id="bml1" characterId="Alice" composition="merge">
  </bml>

Example: An empty <bml> request

Attribute Details

  |*Attribute*  |*Type*        |*Use*    |*Default* |*Description* |
  |characterId  |worldObjectID |optional |""        |A reference to the controlled character. |
  |id           |ID            |required |          |Unique ID that allows referencing a particular <bml> block. The id "bml" is reserved. |
  |composition  |openSetItem   |optional |"merge"   |One of [merge, append, replace]; defines the composition policy to apply if the current <bml> block overlaps with previous <bml> blocks (see below). |


No Communicative Meaning

The BML specification does not prescribe a communicative meaning for the BML Request. This allows users of BML to specify short spurts of behavior (for example: speech clauses or individual gaze shifts) and generate performances incrementally, or, if they prefer, to construct elaborate performances as a whole and send them in a single request (for example: entire monologues).

Ordering is not meaningful

The order of elements inside the <bml> block does not have any semantic meaning. Authors writing BML expressions should not rely on a BML Realizer realizing behaviors in a certain order merely because they appear in that order in the <bml> block.

Start time, end time, delays

Each <bml> request represents a scheduling boundary. That is: if behaviors are in the same <bml> request, this means that the constraints between them are resolved before any of the behaviors in the request is executed.

start time – the start time of a block b is the global timestamp when it actually starts being executed. The start time may be influenced by various delays, as well as by the composition attribute (both explained further below).
end time – the end time of a block is the global timestamp when all behaviors in the block have ended.

When a planner sends a <bml> request to a realizer, there will be a slight (hopefully negligible) delay before the behavior actually starts being performed on the embodiment. The transport and routing layer supporting the transmission of a sequence of <bml> blocks will introduce a transmission delay; parsing the request and solving the constraints may introduce another delay.


If a new request is sent before the realization of previous requests has been completed, a composition attribute determines how to combine the behaviors in the new <bml> block with the behaviors from prior <bml> blocks. The values for the composition attribute have the following meaning.

  • merge: (default) The start time of the new <bml> block will be as soon as possible. The behaviors specified in the new <bml> block will be realized together with the behaviors specified in prior <bml> blocks. In case of conflict, behaviors in the newly merged <bml> block cannot modify behaviors defined by prior <bml> blocks.
  • append: The start time of the new block will be as soon as possible after the end time of all prior blocks.
  • replace: The start time of the new block will be as soon as possible. The new block will completely replace all prior <bml> blocks. All behavior specified in earlier blocks will be ended and the ECA will revert to a neutral state before the new block starts.

As an example of a merge conflict, consider two consecutive <bml> blocks that both specify a right-handed gesture, with the timing being such that they should be performed at the same time. When this turns out to be impossible, the gesture in the block that arrived last should be dropped, and an appropriate warning should be issued (see the Feedback section).
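As a sketch of the composition mechanism, suppose block bml1 is still running when bml2 arrives; with composition="append", bml2 will not start before bml1 has ended (ids and behaviors are illustrative):

```xml
<bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" id="bml1" characterId="Alice">
  <speech id="speech1"><text>First, let me finish this sentence.</text></speech>
</bml>

<bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" id="bml2" characterId="Alice"
     composition="append">
  <!-- starts as soon as possible after the end time of bml1 -->
  <head id="nod1" type="NOD"/>
</bml>
```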


It is generally assumed that the behavior realizer will attempt to realize all behaviors in a block, but if some of the behaviors don’t successfully complete for some reason, other behaviors still get carried out (see Feedback and Failure and Fallback).

If there is an all-or-nothing requirement for all or some of the behaviors, they can be enclosed in a <required> block inside the <bml> block.


NAMESPACE: http://www.bml-initiative.org/bml/bml-1.0
ELEMENT: <required>
CONTENTS: behaviors of various types, <constraint> blocks


If behaviors or constraints enclosed in a <required> block cannot be realized, the complete <bml> block of which the <required> block is a part should be aborted, with appropriate feedback.

In the following example, the entire performance in the <bml> block will be aborted if either the gaze or the speech behavior is unsuccessful (and an appropriate feedback message sent back from the behavior realizer, see Feedback section), but if only the head nod is unsuccessful, the rest will be carried out regardless (and an appropriate feedback message sent back from the behavior realizer).

  <bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" id="bml1" characterId="Alice">
    <required>
      <gaze id="gaze1" target="PERSON1"/>
      <speech id="speech1"><text>Welcome to my humble abode</text></speech>
    </required>
    <head id="nod1" type="NOD"/>
  </bml>

Behaviors (common aspects)

A behavior element describes one kind of a behavior to the behavior realizer. In its simplest form, a behavior element is a single XML tag with a few key attributes:

  <bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" id="bml1" characterId="Alice">
    <gaze id="gaze1" target="PERSON1"/>
  </bml>


This document specifies a number of XML elements for specifying various sorts of behavior. Any behavior element has at least the following attributes:

  |*Attribute* |*Type*   |*Use*    |*Default* |*Description* |
  |id          |ID       |required |          |Unique ID that allows referencing a particular behavior. The id "bml" is reserved. |
  |start       |synchref |optional |          |Determines the start time of the behavior, either as an offset relative to the start time of the enclosing <bml> block, or relative to another behavior contained in this block or in another block. If no synchrefs are specified for this behavior, the start time is 0; if start is unspecified but other synchrefs are given for this behavior, start is determined by the other synchrefs (and the possible duration of this behavior). |
  |end         |synchref |optional |          |Local end time of the behavior, either as an offset relative to the start time of the enclosing <bml> block, or relative to another behavior contained in this block or in another block. If unspecified, the end time depends on the start time, other synchrefs specified on this behavior, and the possible duration of the behavior. |

In addition, there may be attributes concerning other default synch points for a specific behavior type.


There are a few aspects concerning the semantics of behaviors that are common to all behavior types.

Timing and synchronisation

Unless synchronization or timing constraints are specified, it is assumed that all behaviors in a <bml> block start at the start time of the <bml> block. In the section on synchronisation, more detail is given concerning how to specify such constraints.

Targets in the world

Some of the behavior types specified in this document require reference to a target in the world (gaze target, point target, ...). A BML Realizer may assume a number of predefined targets, referenced by an attribute value of type worldObjectID.

{{div_start_tag(target, inset)}}
For a next version, we are working out a <target> element that allows more control over the specification and modification of targets in the world. {{div_end_tag}}

Behaviors with residual effect

Some types of behavior have a residual effect. That is, after the end time of the behavior has been reached, the ground state of the ECA will be different than before the behavior started.

An example of a behavior type with a residual effect is <locomotion>: after a <locomotion> behavior has been completed, part of the ground state of the ECA (in this case: location and orientation in the world) will be different than before, and other behaviors will be realized from this new ground state.

An example of a behavior type without a residual effect is <point>: usually, realization of a <point> behavior involves a final retraction phase that returns the ECA back to the ground state in which it was before starting realization of the <point> behavior.

A number of behavior types exist both in a version with and without residual effect. For example, after completion of a <face> behavior, the face of the ECA returns to the state it was in before the <face> behavior started, but a <faceShift> behavior will cause the face of the ECA to have a new ground state.

When both versions of a behavior are active at the same time, the version without residual effect has priority for being displayed, but the ground state is nevertheless changed by the behavior with residual effect.
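A sketch of the contrast (only the generic start/end synch attributes are shown; the form-describing attributes of face behaviors are defined in their own behavior sections and are omitted here):

```xml
<!-- no residual effect: after face1 ends, the face returns to its previous ground state -->
<face id="face1" start="0" end="2"/>

<!-- residual effect: after shift1 ends, the new expression is the ECA's new ground state -->
<faceShift id="shift1" start="0" end="2"/>
```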


List of available behavior types


Synchronisation

For every behavior, its realization may be broken down into phases. Each phase is bounded by a synch point that carries the name of the transition it represents, making it relatively straightforward to align behaviors at meaningful boundaries (see Figure 4 for an example of the synch points for gestures). In the example below, the speech behavior and the pointing gesture are aligned at their start times.

  <bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" id="bml1" characterId="Alice">
    <gesturePointing id="behavior1" target="blueBox" hand="RIGHT_HAND" start="speech1:start"/>
    <speech id="speech1"><text>Look there!</text></speech>
  </bml>

Example: speech and gesture aligned at their start times

Figure 4: Synchronisation points for a gesture


Synchronisation is specified by assigning a synchref value to one or more of the synch attributes of a behavior.

A synchref value takes one of the following two forms:

  • [block_id:]behavior_id:sync_id [+/- offset] – A reference to a synch point of another behavior, optionally with a float offset in seconds. By default, this refers to a behavior in the same <bml> block that the synchref is contained in; if the optional prefix block_id: is present, the synchref specifies a synch point of a behavior in the <bml> block with that ID.
  • offset – A positive float offset in seconds relative to the start time of the surrounding <bml> block.

  <!-- Timing example behaviors -->
  <gaze start="0.3" end="2.14" /><!-- absolute timing in seconds -->
  <gaze stroke="behavior1:stroke" /><!-- relative to another behavior -->
  <gaze ready="behavior1:relax + 1.1" /><!-- relative to another behavior, with offset -->
  <gaze ready="bml3:behavior1:relax" /><!-- relative to a behavior in another block -->

For a next version of the BML Standard, we are working out the <constraint> element, which allows more flexible and fine-grained specification of synchronisation.


The synchronization constraints described above are all bidirectional. That is:

<head id="head1" stroke="gesture1:stroke" ... />

means that the strokes of head1 and gesture1 should be aligned. This synchronization constraint must be interpreted bidirectionally: the exact same time constraint can be expressed by:

<gesture id="gesture1" stroke="head1:stroke" ... />

Default Synch Points

All behaviors have synch points called start and end. Furthermore, for each behavior type a number of additional default synch points may be available. For every default synch point, the corresponding behavior XML element has a synch attribute of the same name.
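For example, the gesture phases of Figure 4 are each available as a synch attribute on gesture behaviors. A sketch (the timing values are arbitrary offsets in seconds; unspecified synch points are resolved by the realizer):

```xml
<!-- default synch points for gestures:
     start, ready, stroke_start, stroke, stroke_end, relax, end -->
<gesturePointing id="point1" target="blueBox" hand="RIGHT_HAND"
                 start="0" stroke="0.8" end="2.0"/>
```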

New Synch Points

New synch points can be introduced for specific behavior types or description extensions. For example, in speech one can use the special <sync> tag to insert additional synch points in speech.

When new synch points are introduced for a behavior, it is assumed that start and end will still refer to the first and last synch point of that behavior.
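For instance, a <sync> point inserted in the spoken text can be referenced by another behavior in the usual way (a sketch; ids are illustrative):

```xml
<bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" id="bml1" characterId="Alice">
  <speech id="speech1"><text>Put it <sync id="there1"/> over there.</text></speech>
  <!-- the nod's stroke coincides with the word "over" -->
  <head id="nod1" type="NOD" stroke="speech1:there1"/>
</bml>
```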

Face behaviors





Gaze behaviors



Gesture behaviors



Head behaviors




Locomotion behavior


Posture behaviors



Speech behaviors


Utterance to be spoken by a character.


NAMESPACE: http://www.bml-initiative.org/bml/bml-1.0
ELEMENT: <speech>
ATTRIBUTES: id, synch attributes
CONTENTS: exactly one <text> child containing the text to be spoken, which in turn may contain one or more <sync> markers.
A <sync> marker has an attribute id of type ID, the value of which is unique within the context of this <speech> element.
<bml xmlns="http://www.bml-initiative.org/bml/bml-1.0"
     characterId="Alice"
     id="bml1">
  <speech id="speech1" start="6"><text>This is a complete <sync id="syncstart1"/>
  BML core speech description.</text></speech>
</bml>

Example: A complete core speech behavior with an embedded <sync> point that other behaviors can reference.


Realization of the <speech> element generates both speech audio (or text) and speech movement, for example using a speech synthesizer and viseme morphing.
The <speech> element requires a sub-element: a <text> element that contains the text to be spoken, with optionally embedded <sync> elements for alignment with other behaviors.

Description levels and other extension mechanisms

The core BML behavior elements are by no means comprehensive, and much of the ongoing work behind BML involves identifying and defining a broad and flexible library of behavior (types). Implementors are encouraged to explore new behavior elements and more detailed ways to specify existing core behaviors. BML allows such extensions in several ways:

  • Additional behaviors should be designed as new XML elements using custom XML namespaces.
  • Specialized attributes can be used to extend core BML behaviors. Such attributes should be identified as non-standard BML by utilizing XML namespaces.
  • Behavior Description Extensions provide a principled way of specifying core BML behaviors in a more detailed manner, typically using existing XML languages for that specific behavior.

Figure 5: Extending BML

The following example utilizes a customized animation behavior and a customized joint-speeds attribute. The latter specifies the core gaze behavior in a more detailed manner. Both extensions are from the SmartBody project.

  <bml xmlns="http://www.bml-initiative.org/bml/bml-1.0" xmlns:sbm="http://www.smartbody-anim.org/sbm">
     <gaze id="gaze1" target="AUDIENCE" sbm:joint-speeds="100 100 100 300 600"/>
     <sbm:animation name="CrossedArms_RArm_beat"/>
  </bml>

Example: Using extensions

If a realizer cannot interpret extended BML, it should deal with it in the way suggested in the Section Failure and Fallback.

Behavior Description Extensions

BML allows for additional behavior descriptions that go beyond the core BML behavior specification in describing the form of a behavior. Additional descriptions are embedded within a behavior element as child elements of type <description>. The type attribute of the <description> element should identify the type of its content, indicating how it should be interpreted. Even if additional descriptions are included in a behavior, the core attributes of the behavior element itself cannot be omitted, since the core specification is always the default fallback.

Description elements in BML can include existing representation languages such as SSML, ToBI, etc., or new languages can be created that make use of advanced realization capabilities. Each description element should be a self-contained description of a behavior, because a behavior realizer may not know how to combine multiple behavior descriptions. It is required that each description provides exactly the same synchronization points as its accompanying core BML. It is, however, allowed to place the synchronization points in the description level at slightly different positions than those in the core BML. This can be used, for example, to provide synchronization at the syllable level rather than the word level in a description extension of a speech behavior.

If a realizer does not know how to interpret the available description types, it should default to the core behavior.

If multiple description elements are given, and a realizer is capable of interpreting more than one, the realizer should use the highest priority description.

Example: use an audio file to play back this speech behavior. If that’s not supported, use SSML. As a last resort, fall back to the core behavior. Note that the descriptions specify the same synch points as the core behavior.

  <speech id="s1">
     <text>This is the proposed BML <sync id="tm1"/> extended speech specification.</text>
     <description priority="1" type="application/ssml+xml">
        <ssml:speak xmlns:ssml="http://www.w3.org/2001/10/synthesis">
        This is the <ssml:emphasis>proposed</ssml:emphasis> BML <ssml:mark name="tm1"/> extended speech specification.
        </ssml:speak>
     </description>
     <description priority="3" type="audio/x-wav">
        <audio:sound xmlns:audio="http://www.ouraudiodesc.com/">
          <audio:file ref="bml.wav"/>
          <audio:sync id="tm1" time="2.3"/>
        </audio:sound>
     </description>
  </speech>

Example: Using description extensions for speech

  <speech id="s1">
     <text>This is the proposed BML <sync id="tm1"/> extended speech specification.</text>
     <description priority="1" type="application/ssml+xml">
        <speak xmlns="http://www.w3.org/2001/10/synthesis">
        This is the <emphasis>proposed</emphasis> BML <mark name="tm1"/> extended speech specification.
        </speak>
     </description>
     <description priority="3" type="audio/x-wav">
        <sound xmlns="http://www.ouraudiodesc.com/">
          <file ref="bml.wav"/>
          <sync id="tm1" time="2.3"/>
        </sound>
     </description>
  </speech>

Example: A slightly less verbose example of the same behavior, using default namespaces for audio and SSML.

Failure and Fallback

When a realizer is unable to interpret or execute part of a <bml> block, it should deal with it in the following ways.

  • if unable to execute <required> block: drop complete <bml> block; send error feedback
  • if unable to execute a behavior child: drop behavior, send warning feedback
  • if unable to adhere to a constraint specified in an attribute in a behavior child: drop behavior, send warning feedback
  • if unable to interpret a description extension: fallback to lower priority description extension, or to core behavior, and send warning feedback
  • if unable to interpret extended behaviors: drop behavior, send warning feedback
  • if unable to interpret extended attributes: drop attribute, send warning feedback


Feedback

A BML realizer should provide a behavior planner with different types of feedback. Progress feedback gives information on the execution status of ongoing behaviors. Solution feedback provides the "scheduling solution" of behaviors, that is, the exact predicted timing of synch points. Error feedback is used to indicate that a BML block as a whole failed. Warning feedback indicates that the execution or scheduling of some behaviors failed, or that some time constraints could not be achieved.
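This document does not fix a wire format for feedback messages. As a purely illustrative sketch of the kind of information each feedback type carries (the element and attribute names below are hypothetical, not part of the core standard):

```xml
<!-- progress: behavior speech1 in block bml1 passed its "end" synch point -->
<syncPointProgress id="bml1:speech1:end" globalTime="12.34"/>

<!-- warning: a behavior had to be dropped during scheduling -->
<warningFeedback id="bml1" characterId="Alice">
  Dropped behavior "nod1": conflicting merge with an earlier block.
</warningFeedback>
```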
