EMBODIED AGENTS

IN AUGMENTED & VIRTUAL REALITIES

Course E6998-004, Dept. of Computer Science, Columbia University, Fall 2002
Prof. Kris Thórisson, Ph.D.
 
 
 

 

LECTURE NOTES

LECTURE 2 PART 1

Multimodal Perception

September 12, 2002

 

 
     










Concepts Covered Today

 
 

Sensation

 
 

Perception

 
 

Sensory fusion

 
 

Hearing & vision

 
 

Perception-Action Loop

 
 

Blackboards

 

Perceptors

 

Prosody

 
 

Broad-stroke hypothesis of perception

 

   













How is Perception Relevant?

 
 

Embodiment -> position in space -> perceptual point of view

 
 
  • By being embodied, you are bound by that embodiment's perceptual point of view

 
     








 






What is Perception For?

 

 

Perception exists solely to enable us to act - to move around and operate on the world

It constantly helps us organize sensory input

 

 

 
 

Choice reaction time - the "survival loop": on the order of 100 ms in humans, much faster in, e.g., cockroaches and dogs

 
 

Deliberate task actions - 1-10 seconds

 
 

Deliberation, 'mulling it over' - 2 seconds to 10 years

 
     











The Perception-Action Loop

 

 

 
 

Choice reaction time: "when the light comes on, press the button"

 

 

 
 

When do we go ballistic?
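
The button-press task above can be written as a small sense-decide-act loop. This is only a sketch: the function name `perception_action_loop`, its parameters, and the fake light sensor are assumptions made for illustration, not part of the course material.

```python
import time

def perception_action_loop(sense, decide, act, max_cycles=100, period_s=0.0):
    """Run sense -> decide -> act until an action is taken or cycles run out.

    Returns the index of the cycle in which the agent acted, or None.
    """
    for cycle in range(max_cycles):
        percept = sense()            # sensation + perception
        action = decide(percept)     # choice: which response, if any
        if action is not None:
            act(action)              # operate on the world
            return cycle
        time.sleep(period_s)         # wait for the next cycle
    return None

# Hypothetical stimulus: the light comes on at the fourth sample.
samples = iter([False, False, False, True])
pressed = []

cycles = perception_action_loop(
    sense=lambda: next(samples),
    decide=lambda light_on: "press" if light_on else None,
    act=pressed.append,
)
```

In this toy version the reaction time is the loop period multiplied by the number of cycles that pass before the agent acts.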

 
     












Sensation - Perception

 

Sensation: Transduction - "how bright is it?"

Perception: Interpretation - "what is that?"

 
 

Sensation/Perception in humans:

              1. Vision

              2. Hearing

              3. Proprioception

              4. Touch

              5. Balance

              6. Pain

              7. Smell

              8. Taste
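
The sensation/perception split at the top of this slide can be made concrete with two tiny functions. The pixel values, the threshold, and the "lamp" labels are invented for this sketch.

```python
def sense_brightness(pixels):
    """Sensation: transduce raw input into a measurement ("how bright is it?")."""
    return sum(pixels) / len(pixels)

def perceive(brightness, threshold=0.5):
    """Perception: interpret the measurement ("what is that?")."""
    return "lamp is on" if brightness > threshold else "lamp is off"

pixels = [0.9, 0.8, 1.0, 0.7]   # made-up intensities in [0, 1]
brightness = sense_brightness(pixels)
label = perceive(brightness)
```

The first function only measures; the second attaches meaning to the measurement.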

 
 

 

 














Vision

 
   
 

2-D retinal image

 
 

Monocular, binocular

 
 

Shape from shading

 
 

Patterns

 
 

Gradients

 
 
 
     













Hearing

 
 
 
 

 

Spatial analysis

 
 

Speech perception - phoneme extraction, word composition

 
 

Sound analysis - "hearing objects", often called "auditory scene analysis"

 
 
 
 
 
 
 
     













Proprioception

 
 

Your sense of your body posture and where your limbs are (you know where your arms are with your eyes closed)

 

Relevant to models of embodied agents that have a realistic model of balance

 

Easy to implement in agents in virtual environments because all object relationships are explicitly represented
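
A sketch of why this is easy in a virtual environment: the agent's "sense" of its own posture is just a read of state the world already stores. The `Joint` and `VirtualBody` classes and the coordinates are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Joint:
    name: str
    position: tuple  # (x, y, z) in world coordinates, metres

class VirtualBody:
    """Hypothetical agent body whose joints are stored explicitly."""

    def __init__(self, joints):
        self.joints = {j.name: j for j in joints}

    def proprioception(self):
        # No sensing problem to solve: read the stored state directly.
        return {name: j.position for name, j in self.joints.items()}

body = VirtualBody([
    Joint("left_hand", (-0.3, 1.0, 0.2)),
    Joint("right_hand", (0.3, 1.0, 0.2)),
])
posture = body.proprioception()
```

A physical robot would instead have to estimate these positions from joint encoders or other sensors.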

 
     











Touch

 
 

Something of a sore thumb in robotics - artificial skin is nowhere to be seen

 

 

Typical case: Honda P3 robot

 
 

In virtual worlds: collision detection - classical computer graphics issue

 
 

In augmented realities: analysis of both the real world and the virtual world
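
For the virtual-world case, "touch" can be approximated by a collision test. The sketch below uses bounding spheres, one of the simplest such tests. The object names and sizes are made up, and `math.dist` requires Python 3.8 or later.

```python
import math

def touching(center_a, radius_a, center_b, radius_b):
    """Return True if two bounding spheres overlap or are in contact."""
    return math.dist(center_a, center_b) <= radius_a + radius_b

# (center, radius) pairs in metres; values are illustrative only.
hand = ((0.0, 1.0, 0.0), 0.05)
cup = ((0.08, 1.0, 0.0), 0.04)
table = ((2.0, 0.7, 0.0), 0.5)

hand_touches_cup = touching(*hand, *cup)       # centers 0.08 apart, radii sum to 0.09
hand_touches_table = touching(*hand, *table)   # centers about 2.02 apart, radii sum to 0.55
```

Real systems typically use tighter bounding volumes or mesh-level tests. The sphere test is only the coarsest first pass.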

 
     












Multimodal Integration

 
 

Early integration (a.k.a. 'sensor fusion') - information from various senses combined early in the data path

 
 

Late integration - integrate data later in the data path, closer to knowledge representation

 
 

The bulk of the research on this issue is in neurophysiology and physiological psychology
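
The difference between early and late integration can be shown with a toy "is this person speaking?" decision. The feature vectors, the thresholds, and the AND rule used for late integration are all assumptions chosen for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def early_integration(audio_features, vision_features, classify):
    """Fuse the feature vectors first, then interpret the combined vector once."""
    return classify(audio_features + vision_features)

def late_integration(audio_features, vision_features, classify_audio, classify_vision):
    """Interpret each modality separately, then combine the two decisions."""
    return classify_audio(audio_features) and classify_vision(vision_features)

# Toy classifiers: each is a simple threshold on the mean feature value.
def above_half(features):
    return mean(features) > 0.5

audio = [0.9, 0.8]    # strong voice energy
vision = [0.2, 0.3]   # little lip motion

early = early_integration(audio, vision, above_half)
late = late_integration(audio, vision, above_half, above_half)
```

With these numbers the two strategies disagree. The fused vector averages to 0.55 and passes the threshold, while the per-modality decisions are "yes" for audio and "no" for vision, so the late-integration result is negative.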

 
 

 

 














Perception in the Real World

 
 

Noisy

 
 

Sensor failure

 
 

Argument for needing robots in A.I.: You need actual physical embodiment to get to a solid understanding of mind (or just to get further)

 
 

Argument for virtual beings: A lot less work, and all salient factors of the real world can be simulated
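
As one example of simulating real-world factors, noise and sensor failure can be added to an otherwise perfect virtual sensor. The noise level, the failure rate, and the function name `noisy_sensor` are assumptions for this sketch.

```python
import random

def noisy_sensor(true_value, noise_std=0.05, failure_rate=0.1, rng=random):
    """Return a noisy reading, or None to model a failed reading."""
    if rng.random() < failure_rate:
        return None                                   # sensor failure
    return true_value + rng.gauss(0.0, noise_std)     # additive Gaussian noise

rng = random.Random(42)   # fixed seed so the run is repeatable
readings = [noisy_sensor(1.0, rng=rng) for _ in range(1000)]
valid = [r for r in readings if r is not None]
failures = len(readings) - len(valid)
mean_reading = sum(valid) / len(valid)
```

An agent that consumes `readings` now has to cope with gaps and jitter, much as a robot would.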

 
 

 

 













Perception in Virtual Worlds

 
 

All the information is there - any data in the world can be piped into the agent's perception

 
 

Very sterile - good because no noise, bad because:

 
 
  • A lot of basic programming is needed

 
 
  • Difficult to do in real time - virtually all VR software was created for non-real-time use

 
 

DIVE: use "auras" to create spatial 'zones of perception'
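
The idea of a spatial zone of perception can be sketched as a sphere around the agent: only entities inside it are passed on to the agent's perception. This is not DIVE's actual API. The `Entity` class, the `perceivable` function, and the radius are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Entity:
    name: str
    position: tuple          # (x, y, z) in world coordinates
    aura_radius: float = 0.0

def perceivable(agent, entities):
    """Names of entities whose position lies inside the agent's aura."""
    return [
        e.name for e in entities
        if e is not agent
        and math.dist(agent.position, e.position) <= agent.aura_radius
    ]

agent = Entity("agent", (0.0, 0.0, 0.0), aura_radius=5.0)
world = [agent, Entity("user", (3.0, 0.0, 3.0)), Entity("door", (10.0, 0.0, 0.0))]
seen = perceivable(agent, world)
```

Filtering by zone keeps the agent from being flooded with every datum in the world, even though all of it is technically available.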

 
     














Perception in Augmented Realities

 

Agent is animated, user is partially tracked

 

User behavior can be piped directly into the agent's perception - makes communication easier between user and agent

 
 

Difficult to bring the outside world into the agent's perceptors

 
 

A mixture of virtual and real requires calibration
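
One part of such calibration is mapping tracker coordinates into the virtual world's coordinate frame. The sketch below assumes the two frames differ only by a rotation about the vertical (y) axis and a translation. The function names and the calibration values are invented for illustration.

```python
import math

def make_calibration(yaw_deg, translation):
    """Build a tracker-to-world mapping: rotate about the y axis, then translate."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    tx, ty, tz = translation

    def tracker_to_world(p):
        x, y, z = p
        return (c * x + s * z + tx, y + ty, -s * x + c * z + tz)

    return tracker_to_world

# Assumed calibration values, as if measured once during setup.
to_world = make_calibration(yaw_deg=90.0, translation=(1.0, 0.0, 2.0))
user_hand_world = to_world((1.0, 1.5, 0.0))
```

Once tracked user positions are expressed in world coordinates, they can be compared directly with the positions of virtual objects.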

 
     









What's worth perceiving?

 
 

- task dependent

 
 

Our focus for perception: communication, which

  • spans many timescales of behavior

  • is a required part of collaborative agents

  • in its embodied form, is in need of serious study

 
 

Basics:

  • 'is this a person?'

  • 'who is it?'

  • 'what is she doing?'

  • 'is she talking?'

  • 'to whom?' - etc.

 
 

More advanced:

  • topic changes

  • turn-taking cues

  • multimodal references

  • content

  • ...and much much more
 
     






© 2002 K.R. Thórisson