The Modular Encyclopedia: LLMs and the Assemblage of Cultural Knowledge

Published: 01 Jun 2026, Last Modified: 01 Jun 2026
Venue: Culture x AI 2026 Poster
License: CC BY 4.0
Keywords: AI, Cultural Heritage, Semiotic Theory, GLAMs, Archival Studies
Abstract: Early paradigms for evaluating cultural aspects of generative AI have taken an ethical stance, focusing on bias mitigation, harm reduction, and value alignment. This framed culture in AI and AI in culture primarily as a source of risk to be managed, a narrative that helped foster more inclusive AI development, but at the cost of delaying a more analytical conversation about how AI actually does culture. While work on the ethical framing of AI remains very much necessary, it leaves underdeveloped a constructive question: what does it mean for an AI system to positively engage culture? This paper responds to this call for a positive vision of cultural AI, reflecting on the history of technologies and infrastructures that have mediated our access to cultural artifacts and information. In particular, I argue that LLMs, when situated within bounded archives or domains of knowledge through architectures such as RAG, MCP, or domain-specialised small models, instantiate a long-anticipated but never-realised technological form: what Umberto Eco called the encyclopedia. I further argue that this instantiation is modular: that is, a scalable system that can be bounded or universal, built from the epistemological combination of many linguistic assemblages, each specific to a cultural domain rather than universal. This modularity is the constitutive feature that distinguishes the new architecture from its predecessors and gives it its interpretive force. The argument proceeds in four main points. The first (section 2) contextualises Eco’s encyclopedia within his semiotic theory and his critique of dictionary-based models of meaning. The second (section 3) traces the history of technological encyclopedic forms, from the print encyclopedia and Vannevar Bush’s Memex through CD-ROM, the Web, Wikipedia, and LOD, and shows that each approximates Eco’s concept along one axis while failing on others. The third (section 4) develops the central claim that LLMs, when bounded by RAG/MCP/small-model architectures, constitute a modular encyclopedia replicable neither by LD nor by “closed” archival systems. The last (section 5) addresses implications for the evaluation of cultural AI and for the methodological commitments of culturally-situated ML. I conclude with open questions concerning the role of these AI-mediated modular encyclopedias in the context of GLAM digital collection architectures and of broader internet and platform culture, and what this role might entail for humanistic inquiry.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 13