Abstract: Automated depression detection can significantly
enhance early intervention for individuals experiencing depression.
Although many approaches to automated depression detection from
recorded clinical interview videos have been proposed, limited
attention has been paid to the hierarchical structure of the
interview questions. In clinical interviews for diagnosing depression,
clinicians use a structured questionnaire that includes routine base-
line questions and follow-up questions to assess the interviewee’s
condition. This paper introduces HiQuE (Hierarchical Question
Embedding network), a novel depression detection framework that
leverages the hierarchical relationship between primary and follow-
up questions in clinical interviews. HiQuE can effectively capture
the importance of each question in diagnosing depression by
learning mutual information across multiple modalities. We conduct
extensive experiments on the widely used clinical interview dataset
DAIC-WOZ, where our model outperforms other state-of-the-art
multimodal depression detection models and emotion recognition
models, demonstrating its clinical utility in depression detection.