SEMANTIC RHEOLOGY: THE FLOW OF IDEAS IN LANGUAGE MODELS

19 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: visualization or interpretation of learned representations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: word embeddings, random walk, cosine similarity
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Using tools of microrheology to guide and understand self-similar walks over an embedding space
Abstract: The flow of ideas has been extensively studied by physicists, psychologists, and machine learning engineers. This paper adopts tools from microrheology to investigate the similarity-based flow of ideas. We introduce a random walker into a word-embedding space and study its behaviour. Such similarity-mediated random walks through the embedding space show signatures of anomalous diffusion, commonly observed in complex structured systems such as biological cells and complex fluids. The paper concludes by proposing that popular tools from the study of random walks and the diffusion of particles under Brownian motion be applied to quantitatively assess how diverse ideas are incorporated in a document. Overall, this paper presents a self-referenced method that combines concepts from microrheology and machine learning to explore the meandering tendencies of language models and their potential association with creativity.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1769
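
Illustrative sketch (not from the submission): the abstract describes a similarity-mediated random walk over word embeddings and a test for anomalous diffusion. The Python below shows one way such an analysis could be set up. Everything specific in it is an assumption made for illustration: the embeddings are synthetic Gaussian vectors rather than a trained model, the transition rule is a softmax over cosine similarity with a hypothetical `temperature` parameter, and the function names are invented here. The diffusion exponent alpha is estimated from the time-averaged mean squared displacement, MSD(lag) ~ lag^alpha, where alpha = 1 corresponds to ordinary Brownian diffusion and alpha != 1 to anomalous diffusion.

```python
import numpy as np


def similarity_walk(emb, n_steps, temperature=0.1, seed=0):
    """Random walk over rows of `emb`; the next row is sampled with
    probability proportional to exp(cosine_similarity / temperature).
    The softmax rule and `temperature` are assumptions of this sketch."""
    rng = np.random.default_rng(seed)
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    current = int(rng.integers(len(unit)))
    path = [current]
    for _ in range(n_steps):
        sims = unit @ unit[current]      # cosine similarity to every row
        sims[current] = -np.inf          # forbid staying in place
        logits = sims / temperature
        logits -= logits.max()           # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum()
        current = int(rng.choice(len(unit), p=probs))
        path.append(current)
    return np.array(path)


def mean_squared_displacement(positions, max_lag):
    """Time-averaged MSD of a trajectory (rows = successive positions)."""
    lags = np.arange(1, max_lag + 1)
    msd = np.array([
        np.mean(np.sum((positions[lag:] - positions[:-lag]) ** 2, axis=1))
        for lag in lags
    ])
    return lags, msd


def diffusion_exponent(lags, msd):
    """Slope of log(MSD) against log(lag), i.e. alpha in MSD ~ lag^alpha."""
    slope, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    return float(slope)


if __name__ == "__main__":
    # Synthetic stand-in for word embeddings (500 "words", 16 dimensions).
    emb = np.random.default_rng(1).normal(size=(500, 16))
    path = similarity_walk(emb, n_steps=2000)
    lags, msd = mean_squared_displacement(emb[path], max_lag=50)
    print("estimated diffusion exponent:", diffusion_exponent(lags, msd))
```

With real embeddings, `emb` would be replaced by the model's embedding matrix. The exponent obtained from synthetic vectors says nothing about the submission's results; it only exercises the pipeline.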