A Study of Large Language Models for Extraction of Themes from Homeless Shelter Case Notes

Madhumitha Selvaraj; Teale Masrani; Yani Ioannou; Geoffrey Messier

Published: 25 Jul 2025, Last Modified: 12 Oct 2025

Venue: COLM 2025 Workshop SoLaR (Poster)

License: CC BY 4.0

Keywords: LLM, prompt engineering, homelessness, case notes, theme extraction, text classification, text analysis

TL;DR: This study explores how prompt engineering can optimize LLMs to analyze homeless shelter case notes and identify abstract client behavioural themes.

Abstract: Homeless shelters generate large amounts of unstructured text data in the form of case notes, which are challenging to analyze using traditional methods due to their variability and domain-specific language. This study explores the use of Large Language Models (LLMs) to extract abstract themes related to client behaviour and experiences from these notes. We focus on prompt engineering techniques and evaluate the performance of smaller LLMs against human-generated labels. Our results demonstrate that for certain themes requiring contextual understanding, smaller LLMs offer advantages over simpler methods such as keyword search or Naive Bayes. However, discrepancies between model predictions and human labels remain, with models occasionally making broad assumptions that may be undesirable. Overall, our findings highlight the role of prompt design in optimizing model performance and demonstrate the potential of LLMs to effectively understand complex homelessness data.

Submission Number: 15
