Keywords: Algorithmic Probability, Algorithmic Information Theory, Transformation Semigroups, Automata, Random Walks
TL;DR: A Solomonoff-like prior using semigroups instead of UTMs
Abstract: Why do learning systems generalize?
The Solomonoff prior is our current best theoretical answer; it formalizes Occam’s razor into an exponential bias towards simpler hypotheses.
Crucially, its proof requires that both the data source and the learner be Universal Turing Machines (UTMs), idealized computers with infinite memory and time.
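For concreteness, one standard prefix-machine formulation of the prior (conventions vary; Solomonoff's original version uses a monotone UTM and sums over programs whose output begins with $x$) weights each program for $x$ by its length on a fixed UTM $U$:
$$
m(x) \;=\; \sum_{p \,:\, U(p)\,=\,x} 2^{-\ell(p)} \;\geq\; 2^{-K(x)},
$$
so prior mass decays exponentially in the minimal description length $K(x)$.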
And yet, all empirical observations of generalization, in biological neural networks and machine learning algorithms alike, occur in physical, finite systems.
Dispensing with the unphysical requirements of UTMs, we model finite learners with transformation semigroups, an algebraic framework that immediately applies to neural networks and all finite computational systems. Within this framework, we prove finite analogues of the Solomonoff prior and Kolmogorov invariance:
1. An exponential simplicity prior on ideals (absorbing sets, corresponding to domain partitions), and
2. Invariance of this prior across different generator sets of the same semigroup, up to multiplicative slack.
Intuitively: Given a set of computational primitives, certain distinctions in the input domain are simpler to express and therefore exponentially more probable to be computed.
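As a purely illustrative sketch of this intuition, consider a random walk on a toy transformation semigroup: three arbitrarily chosen maps on a four-element domain are composed at random, the kernel partition of the accumulated map can only coarsen, so each "kernel at least this coarse" event is absorbing and serves here as a stand-in for the paper's ideals. The domain size, generators, walk length, and this stand-in are all assumptions of the sketch, not the paper's construction.

```python
# Hypothetical toy sketch, not the paper's construction: a random walk on the
# transformation semigroup generated by three maps on X = {0, 1, 2, 3}.
# Each step composes a fresh generator after the accumulated map, which can
# only coarsen its kernel partition; hence "kernel at least as coarse as Q"
# is an absorbing event, our stand-in for the paper's ideals.  For each
# reachable partition Q we compare the shortest generator word entering that
# absorbing set with the empirical probability that a fixed-length walk has
# entered it.

import random
from collections import Counter, deque

n = 4
identity = tuple(range(n))

# Arbitrary primitives: a 4-cycle, a transposition, and a collapse merging 0 and 1.
GENERATORS = [
    (1, 2, 3, 0),   # cycle
    (1, 0, 2, 3),   # swap 0 and 1
    (0, 0, 2, 3),   # collapse 1 onto 0
]

def compose(g, f):
    """g after f, i.e. x -> g[f[x]]."""
    return tuple(g[f[x]] for x in range(n))

def kernel(f):
    """Partition of the domain recording which inputs f merges."""
    blocks = {}
    for x in range(n):
        blocks.setdefault(f[x], set()).add(x)
    return frozenset(frozenset(b) for b in blocks.values())

def coarsens(P, Q):
    """True if P is at least as coarse as Q (every Q-block lies inside a P-block)."""
    return all(any(q <= p for p in P) for q in Q)

def reachable_kernels(max_len=10):
    """Shortest word length realising each reachable kernel partition (BFS)."""
    best = {kernel(identity): 0}
    seen = {identity}
    frontier = deque([(identity, 0)])
    while frontier:
        f, d = frontier.popleft()
        if d >= max_len:
            continue
        for g in GENERATORS:
            h = compose(g, f)
            if h not in seen:
                seen.add(h)
                best.setdefault(kernel(h), d + 1)
                frontier.append((h, d + 1))
    return best

def final_kernel_counts(steps=8, samples=50_000):
    """Empirical counts of the kernel partition after a fixed-length walk."""
    counts = Counter()
    for _ in range(samples):
        f = identity
        for _ in range(steps):
            f = compose(random.choice(GENERATORS), f)
        counts[kernel(f)] += 1
    return counts, samples

if __name__ == "__main__":
    realise = reachable_kernels()
    counts, samples = final_kernel_counts()
    for Q in sorted(realise, key=realise.get):
        # Shortest word whose kernel is already at least as coarse as Q.
        entry = min(d for K, d in realise.items() if coarsens(K, Q))
        # Coarsening is monotone, so "entered by the last step" can be read
        # off the final kernel alone.
        hit = sum(c for K, c in counts.items() if coarsens(K, Q)) / samples
        blocks = sorted(tuple(sorted(b)) for b in Q)
        print(f"partition {blocks}  entry word length {entry}  hit prob {hit:.4f}")
```

Absorbing events that need longer words to enter should be hit correspondingly less often, which is the qualitative shape of the claimed exponential bias.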
Furthermore, different computational primitives that can emulate each other exhibit an equivalent bias toward simpler computations, within bounds. This implies that learners capable of emulating their target system inherently acquire the appropriate simplicity prior.
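Continuing the same toy setup (again a hypothetical sketch rather than the paper's theorem), the snippet below takes two generator sets of one and the same semigroup, where each set's generators are short words in the other, and compares the probabilities they assign to a few absorbing kernel events; the specific generator sets, walk length, and target partitions are arbitrary choices.

```python
# A second hypothetical sketch: two generator sets for one and the same toy
# semigroup on {0, 1, 2, 3}.  Each B-generator is a product of A-generators,
# and conversely SWAP = CYCLE^3 o (CYCLE o SWAP) and
# COLLAPSE = SWAP o (SWAP o COLLAPSE), so <A> = <B> and either set can
# emulate the other with short words.  We compare the probabilities each
# walk assigns to a few absorbing "kernel at least as coarse as Q" events.

import random
from collections import Counter

n = 4
identity = tuple(range(n))

def compose(g, f):
    """g after f, i.e. x -> g[f[x]]."""
    return tuple(g[f[x]] for x in range(n))

def kernel(f):
    """Partition of the domain recording which inputs f merges."""
    blocks = {}
    for x in range(n):
        blocks.setdefault(f[x], set()).add(x)
    return frozenset(frozenset(b) for b in blocks.values())

def coarsens(P, Q):
    """True if P is at least as coarse as Q."""
    return all(any(q <= p for p in P) for q in Q)

CYCLE, SWAP, COLLAPSE = (1, 2, 3, 0), (1, 0, 2, 3), (0, 0, 2, 3)
GEN_A = [CYCLE, SWAP, COLLAPSE]
GEN_B = [CYCLE, compose(CYCLE, SWAP), compose(SWAP, COLLAPSE)]

def hit_probabilities(gens, targets, steps=8, samples=50_000):
    """P(a fixed-length walk ends with kernel at least as coarse as Q), per Q."""
    counts = Counter()
    for _ in range(samples):
        f = identity
        for _ in range(steps):
            f = compose(random.choice(gens), f)
        counts[kernel(f)] += 1
    return {Q: sum(c for K, c in counts.items() if coarsens(K, Q)) / samples
            for Q in targets}

if __name__ == "__main__":
    targets = [
        kernel(identity),                      # nothing merged
        kernel(COLLAPSE),                      # 0 and 1 merged
        kernel(compose(COLLAPSE, CYCLE)),      # 0 and 3 merged
        frozenset({frozenset(range(n))}),      # everything merged
    ]
    p_a = hit_probabilities(GEN_A, targets)
    p_b = hit_probabilities(GEN_B, targets)
    for Q in targets:
        blocks = sorted(tuple(sorted(b)) for b in Q)
        a, b = p_a[Q], p_b[Q]
        ratio = a / b if b else float("inf")
        print(f"partition {blocks}  P_A {a:.4f}  P_B {b:.4f}  ratio {ratio:.2f}")
```

Because either set can emulate the other with bounded overhead, the two columns should agree up to a modest multiplicative factor, mirroring the invariance statement.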
Serve As Reviewer: ~Matthias_Dellago1
Confirmation: I confirm that I and my co-authors have read the policies and are releasing our work under a CC-BY 4.0 license.
Submission Number: 24