Open, Reproducible Morphology Probes for Plains Cree

Published: 14 Dec 2025, Last Modified: 14 Dec 2025, LM4UC@AAAI2026, CC BY 4.0
Keywords: Plains Cree, LLM
Abstract: We present a minimal, fully open baseline for morphology-aware evaluation in Plains Cree (nêhiyawêwin), designed for constrained settings. The goal is a recipe that community partners and researchers can run on a single workstation with a stock PyTorch and Transformers environment and no additional installs, while producing machine-readable artifacts for audit and reuse. We treat a compact open causal language model as a zero-shot probe and evaluate two tasks that reflect real linguistic structure: (1) reinflection, which produces an inflected form from a lemma and a plus-delimited feature bundle, and (2) analysis, which outputs a plus-delimited segmentation with feature tags. Tag conventions follow GiellaLT-style resources to keep outputs interpretable. Decoding is greedy. Metrics are exact-match accuracy and average Levenshtein edit distance (AvgED) for reinflection, and Jaccard overlap over tag sets for analysis. Using a small curated gold set (reinflection $n{=}6$, analysis $n{=}6$), we find limited morphological competence. Reinflection reaches accuracy $0.17$ with AvgED $3.17$. By mood, Ind outperforms Cnj (accuracy $0.33$ vs. $0.00$; AvgED $2.00$ vs. $4.33$). By person, only 3Sg yields any exact matches (accuracy $0.50$, AvgED $1.50$). Analysis averages Jaccard $0.00$ overall and in both Ind and Cnj; the errors are driven by lemma copying, missing conjunct morphology, and prompt echo that replaces structured analyses with meta-text. These results establish a clear, reproducible baseline and pinpoint failure modes to target with format-constrained prompting, a few in-context exemplars, and lightweight FST-assisted checks, all while maintaining an open-only constraint.
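The three metrics named in the abstract can be sketched in plain Python with no installs beyond the standard library. This is a minimal illustration, not the authors' evaluation script: the function names, the whitespace stripping, the choice to count every plus-delimited field (including the lemma) as a member of the tag set, and the placeholder strings are all assumptions.

```python
from typing import Sequence, Set, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance over code points (insert, delete, substitute; cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def reinflection_scores(preds: Sequence[str],
                        golds: Sequence[str]) -> Tuple[float, float]:
    """Return (exact-match accuracy, average edit distance) over paired items."""
    if len(preds) != len(golds) or not golds:
        raise ValueError("preds and golds must be non-empty and equal in length")
    pairs = [(p.strip(), g.strip()) for p, g in zip(preds, golds)]
    n = len(pairs)
    accuracy = sum(p == g for p, g in pairs) / n
    avg_ed = sum(levenshtein(p, g) for p, g in pairs) / n
    return accuracy, avg_ed


def tag_set(analysis: str) -> Set[str]:
    """Split a plus-delimited analysis into a set of fields.

    Assumption: every field, including the lemma, counts as a tag.
    """
    return {t for t in analysis.strip().split("+") if t}


def jaccard(pred: str, gold: str) -> float:
    """Jaccard overlap |P ∩ G| / |P ∪ G| between predicted and gold tag sets."""
    p, g = tag_set(pred), tag_set(gold)
    if not p and not g:
        return 1.0  # assumption: two empty analyses count as a perfect match
    return len(p & g) / len(p | g)


if __name__ == "__main__":
    # Placeholder strings only; these are not items from the gold set.
    print(reinflection_scores(["abc", "abd"], ["abc", "abc"]))
    print(jaccard("LEMMA+V+Ind+3Sg", "LEMMA+V+Cnj+3Sg"))
```

With $n{=}6$ items, a single exact match gives accuracy $1/6 \approx 0.17$, which is consistent with the reinflection figure reported above. Under this sketch's assumptions, an output made entirely of echoed meta-text shares no fields with the gold analysis and therefore scores Jaccard $0.00$.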
Submission Number: 18