Abstract: Sensorial language (the language connected to our senses, including vision, sound, touch, taste, smell, and interoception) plays a fundamental role in how we communicate experiences and perceptions. We explore the relationship between sensorial language and traditional stylistic features, such as those measured by LIWC, using a novel Reduced-Rank Ridge Regression (R4) approach. We demonstrate that low-dimensional latent representations of LIWC features ($r = 24$) capture the stylistic information relevant to sensorial language prediction effectively when compared with the full feature set ($r = 74$). We introduce Stylometrically Lean Interpretable Models (SLIM-LLMs), which model non-linear relationships between these style dimensions. Evaluated across five genres, SLIM-LLMs with low-rank LIWC features match the performance of full-scale language models while reducing parameters by up to 80%.
Paper Type: Long
Research Area: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Research Area Keywords: stylometry, style analysis, LIWC, linguistic style, dimension reduction, sensorial linguistics, sensorial style
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches low compute settings-efficiency, Data analysis
Languages Studied: English
Submission Number: 1217