MuCHEx: A Multimodal Conversational Debugging Tool for Interactive Visual Exploration of Hierarchical Object Classification

Published: 2025, Last Modified: 22 Feb 2026, IEEE Computer Graphics and Applications 2025, CC BY-SA 4.0
Abstract: Object recognition is a fundamental challenge in computer vision, particularly for fine-grained object classification, where classes differ in minor features. Improved fine-grained object classification requires a teaching system with numerous classes and instances of data. As the number of hierarchical levels and instances grows, debugging these models becomes increasingly complex. Moreover, different types of debugging tasks require varying approaches, explanations, and levels of detail. We present MuCHEx, a multimodal conversational system that blends natural language and visual interaction for interactive debugging of hierarchical object classification. Natural language allows users to flexibly express high-level questions or debugging goals without needing to navigate complex interfaces, while adaptive explanations surface only the most relevant visual or textual details based on the user’s current task. This multimodal approach combines the expressiveness of language with the precision of direct manipulation, enabling context-aware exploration during model debugging.