An analysis of observation length requirements for machine understanding of human behaviors from spoken language

Sandeep Nallan Chakravarthula, Brian R. W. Baucom, Shrikanth Narayanan, Panayiotis G. Georgiou

Comput. Speech Lang. 2021 (modified: 05 Sept 2024) | Readers: Everyone

Highlights:
• Accuracy of human behavior estimation from conversational language depends on length of window used by system to observe language cues
• Proposed framework analyzes (1) extrinsic similarity between system predictions and human judgments and (2) intrinsic consistency between system predictions of related behaviors to determine appropriate observation window lengths for different behaviors
• Behaviors related to negative and positive affect best captured from short, frequently occurring expressions
• Behaviors describing problem-solving traits such as negotiation, responsibility, etc. require longer observations for accurate estimation
• Behaviors related to dysphoria such as sadness, withdrawal, etc. exhibit impulsive characteristics, difficult to capture from language cues alone

Abstract: The task of quantifying human behavior by observing interaction cues is an important and useful one across a range of domains in psychological research and practice. Machine learning-based approaches typically perform this task by first estimating behavior based on cues within an observation window, such as a fixed number of words, and then aggregating the behavior over all the windows in that interaction. The length of this window directly impacts the accuracy of estimation by controlling the amount of information being used. The exact link between window length and accuracy, however, has not been well studied, especially in spoken language. In this paper, we investigate this link and present an analysis framework that determines appropriate window lengths for the task of behavior estimation. Our proposed framework utilizes a two-pronged evaluation approach: (a) extrinsic similarity between machine predictions and human expert annotations, and (b) intrinsic consistency between intra-machine and intra-human behavior relations.
We apply our analysis to real-life conversations that are annotated for a large and diverse set of behavior codes and examine the relation between the nature of a behavior and how long it should be observed. We find that behaviors describing negative and positive affect can be accurately estimated from short to medium-length expressions, whereas behaviors related to problem-solving and dysphoria require much longer observations and are difficult to quantify from language alone. These findings are generally consistent across different behavior modeling approaches.
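
The window-then-aggregate procedure described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy lexicon `TOY_LEXICON`, the helper names, the use of non-overlapping windows, and mean aggregation. A real system would replace `score_window` with a trained behavior model.

```python
from statistics import mean

# Hypothetical toy lexicon (assumption, not from the paper):
# positive-affect words score +1, negative-affect words score -1.
TOY_LEXICON = {"love": 1.0, "thanks": 1.0, "never": -1.0, "hate": -1.0}


def split_into_windows(words, window_len):
    """Split a word list into consecutive, non-overlapping windows.

    The final window may be shorter than window_len.
    """
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    return [words[i:i + window_len] for i in range(0, len(words), window_len)]


def score_window(window):
    """Toy per-window estimate: average lexicon score of the words."""
    return mean(TOY_LEXICON.get(word.lower(), 0.0) for word in window)


def estimate_behavior(words, window_len):
    """Estimate behavior per window, then aggregate by averaging."""
    windows = split_into_windows(words, window_len)
    return mean(score_window(window) for window in windows)


if __name__ == "__main__":
    conversation = "I love you thanks but I never hate".split()
    # The same words can yield different session-level estimates
    # depending on the observation window length.
    for window_len in (3, 4):
        print(window_len, estimate_behavior(conversation, window_len))
```

Because the last window can be shorter than the others, changing `window_len` changes how much each word contributes to the aggregate, which is one simple way in which window length can affect the final estimate.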
