Abstract: We introduce Partition of Unity Neural Networks (PUNNs), a neural architecture that
constructs class probabilities directly as a partition of unity, eliminating the need for a
softmax layer. PUNNs produce nonnegative functions that sum to one via a recursive
product of gate functions, guaranteeing valid probability distributions by design.
Our contributions are threefold. First, we prove that PUNNs are dense in the space of
continuous probability maps on compact domains, establishing a universal approximation
guarantee for probabilistic classification. Second, the recursive gate construction induces
a hierarchical rejection chain that explicitly reveals how predictions are formed: each gate
performs a sequential accept/reject decision, passing remaining probability mass onward.
We demonstrate this on MNIST, where the resulting gate trace localizes model uncertainty
and pinpoints specific gating failures in misclassified examples. Third, we evaluate PUNNs
against multilayer perceptrons and Explainable Boosting Machines across MNIST, UCI
benchmarks, and synthetic datasets. Under matched parameter budgets on MNIST, PUNNs
achieve accuracy within 0.4–1.1 percentage points of MLPs, with performance stable across
random class orderings; on UCI tabular benchmarks, the gap is at most one percentage point.
When geometric priors align with the data structure, shape-informed gate parameterizations
can achieve comparable accuracy with up to 300× fewer parameters.
We relate PUNNs to stick-breaking constructions from Bayesian nonparametrics, clarifying
connections to probabilistic modeling while emphasizing the deterministic, input-dependent
nature of the architecture. Overall, PUNNs provide a principled alternative to softmax-based
classifiers, offering transparent class probability assignments through explicit gate
decompositions, with controlled accuracy trade-offs.
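The recursive gate construction described in the abstract can be illustrated with a minimal sketch. It assumes a stick-breaking-style chain in which each gate takes a fraction of the probability mass that is still unassigned and passes the rest onward, with the final class receiving whatever remains. The function name `gate_chain_probabilities`, the use of sigmoid gates, and the example logits are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function, used here as a hypothetical gate nonlinearity."""
    return 1.0 / (1.0 + math.exp(-z))


def gate_chain_probabilities(gate_values):
    """Map K-1 gate values in [0, 1] to K class probabilities.

    Assumed form (not taken from the paper):
        p_k = g_k * prod_{j<k} (1 - g_j)   for k < K
        p_K = prod_{j<K} (1 - g_j)
    """
    probs = []
    remaining = 1.0  # probability mass not yet assigned to any class
    for g in gate_values:
        probs.append(remaining * g)  # this gate accepts a share of the remaining mass
        remaining *= 1.0 - g         # the rejected mass is passed to the next gate
    probs.append(remaining)          # the last class receives the leftover mass
    return probs


# Hypothetical gate logits for a single input with four classes.
logits = [0.3, -1.2, 2.0]
probs = gate_chain_probabilities([sigmoid(z) for z in logits])
```

Under these assumptions the outputs are nonnegative and sum to one for any gate values in [0, 1], so no softmax normalization is needed. The intermediate values of `remaining` play the role of the gate trace: they show how much mass each gate rejected before a class was accepted.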
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Martin_Mundt1
Submission Number: 8965