A computer vision integration model for a multi-modal cognitive system

Alen Vrecko, Danijel Skocaj, Nick Hawes, Ales Leonardis

2009 (modified: 03 Nov 2022), IROS 2009, Readers: Everyone

Abstract: We present a general method for integrating visual components into a multi-modal cognitive system. The integration is generic and can work with an arbitrary set of modalities. We illustrate the approach with a specific instantiation of the architecture schema that focuses on integrating vision and language: a cognitive system able to collaborate with a human, learn, and display some understanding of its surroundings. As examples of cross-modal interaction, we describe mechanisms for clarification and visual learning.
