Track: Tiny Paper Track (between 2 and 4 pages)
Keywords: humanlikeness, self-similarity exponent, fractal structure of language
TL;DR: We compute a fractal self-similarity exponent for texts generated by various LLMs and compare it against human-written text.
Abstract: Evaluating text generation quality in large language models (LLMs) is critical for their deployment. We investigate the self-similarity exponent S, a fractal-based measure, as a metric for quantifying "humanlikeness." Using texts from a publicly available dataset and Qwen models (with and without instruction tuning), we find that human-written texts exhibit S = 0.57, non-instruct models show higher values, and instruction-tuned models approach human-like patterns. Larger models improve generation quality but benefit more from instruction tuning. Our findings suggest that S is an effective metric for assessing LLM performance.
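The abstract does not state how S is estimated; as one illustrative possibility (an assumption, not the paper's method), a Hurst-type self-similarity exponent can be estimated from a per-token signal such as token surprisal using first-order detrended fluctuation analysis. The function name dfa_exponent and all parameters below are hypothetical.

```python
import numpy as np

def dfa_exponent(series, scales=None):
    """Estimate a self-similarity (Hurst-type) exponent via
    first-order detrended fluctuation analysis (DFA)."""
    x = np.asarray(series, dtype=float)
    # Profile: cumulative sum of the mean-subtracted series.
    profile = np.cumsum(x - x.mean())
    n = len(profile)
    if scales is None:
        # Log-spaced window sizes between 4 tokens and n/4.
        scales = np.unique(
            np.logspace(np.log10(4), np.log10(n // 4), 20).astype(int)
        )
    flucts = []
    for s in scales:
        n_windows = n // s
        segments = profile[: n_windows * s].reshape(n_windows, s)
        t = np.arange(s)
        # Fit and remove a linear trend in each window, keep RMS of residuals.
        coeffs = np.polyfit(t, segments.T, deg=1)
        trends = np.outer(coeffs[0], t) + coeffs[1][:, None]
        flucts.append(np.sqrt(np.mean((segments - trends) ** 2)))
    # Slope of log F(s) versus log s is the scaling exponent.
    slope, _ = np.polyfit(np.log(scales), np.log(flucts), deg=1)
    return slope

# Sanity check: uncorrelated white noise should yield an exponent near 0.5.
rng = np.random.default_rng(0)
print(dfa_exponent(rng.standard_normal(10_000)))
```

In this reading, exponents near 0.5 indicate uncorrelated structure, while values approaching 1 indicate long-range correlations; the abstract's comparison (human S = 0.57 versus higher values for non-instruct models) would then reflect differing degrees of long-range structure in the generated text.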
Submission Number: 66