Are Human Conversations Special? A Large Language Model Perspective

ACL ARR 2024 June Submission 1784 Authors

15 Jun 2024 (modified: 19 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: In this paper, we study the changes in the attention behavior of large language models (LLMs) when they are used to understand natural conversations between humans (human-human conversations). By analyzing metrics such as attention distance, dispersion, and interdependency across conversational and other textual domains, we highlight the unique challenges that conversational data poses to LLMs. Our findings reveal that while language models exhibit domain-specific attention behaviors, there is a significant gap in their ability to specialize in human conversations. Through detailed attention entropy analysis and t-SNE visualizations, we demonstrate the need for models trained on diverse, high-quality conversational data to enhance the understanding and generation of human-like dialogue.
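The abstract refers to an attention entropy analysis. As a minimal sketch (not the authors' code), the snippet below computes per-head Shannon entropy of attention distributions from a Hugging Face causal LM; the model name ("gpt2") and the averaging over positions and batch are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: per-head attention entropy for a Hugging Face causal LM.
# "gpt2" is a stand-in model; the paper's actual models may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "Hello, how are you doing today?"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # output_attentions=True returns one (batch, heads, seq, seq) tensor per layer.
    out = model(**inputs, output_attentions=True)

eps = 1e-9  # avoid log(0) for fully masked positions
for layer_idx, attn in enumerate(out.attentions):
    # Shannon entropy of each query position's attention distribution,
    # then averaged over batch and positions to get one value per head.
    entropy = -(attn * (attn + eps).log()).sum(dim=-1)  # (batch, heads, seq)
    per_head = entropy.mean(dim=(0, 2))                 # (heads,)
    print(f"layer {layer_idx}: mean head entropy {per_head.mean():.3f}")
```

Higher entropy indicates a head that spreads attention broadly across the context; lower entropy indicates a head focused on a few tokens, which is one way such analyses contrast conversational and non-conversational inputs.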
Paper Type: Long
Research Area: Generation
Research Area Keywords: domain adaptation, text-to-text generation, model architectures, data influence, generative models
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 1784