Coordination and Fusion in Multimodal Interaction

By Dr. Mark Maybury

When we converse with one another, we utilize an array of media to interact, including spoken language, gestures, and drawings. We exploit multiple sensory systems or modalities of communication including vision, audition, and taction. Providing machines with the ability to interpret multimedia input and generate coordinated multimedia output promises benefits including:

- More efficient interaction: enabling faster task completion with less work.
- More effective interaction: doing the right thing at the right time, tailoring the content and form of interaction to the context of the user, task, and dialogue.
- More natural interaction: supporting fused spoken, written, and gestural interaction, as found in human-human communication.

Our research has focused on intelligent systems that exploit multiple media and modes.