Abstract: Highlights
• Referring expressions covered: adjective phrases, proper names, relational phrases, and pronouns.
• A novel multimodal framework for understanding different referring expressions.
• Analyzes the speaker's attention; constructs an object relationship matrix.
• Parses language with ChatGPT; associates each expression with candidate entities.
• Experiments on semi-structured human–robot interaction verify performance.