Abstract: Sparse Tsetlin Machines (STMs) make it possible to assign a small sample of the literals to each clause and then permanently prune literals during learning through absorbing automaton states. By trimming the STM in this manner, one can achieve a tenfold speed increase. However, when each clause is limited to the literal sample selected at clause creation, accuracy can drop. To reduce this accuracy loss, we here introduce a scheme for streaming unallocated literals through the clauses. That is, the literals left out during clause construction are gradually integrated into the appropriate clauses throughout learning. Each time an absorbing state eliminates a literal from a clause, a new unallocated literal is added to that clause, placed in an insertion state. We study the effectiveness of the scheme at various degrees of sampling, varying the absorbing and insertion states. In particular, we investigate the effect of incorporating unallocated literals during learning. Across several benchmark datasets, we observe a boost in accuracy at sampling rates of 5% and 20%. However, without literal streaming, accuracy drops markedly for sampling rates of 1% to 4%, which confirms the positive effect literal streaming has on STMs. In conclusion, literal streaming makes the Tsetlin Machine more scalable, yielding higher accuracy with fewer resources.
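The streaming mechanism described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the state scale, absorbing threshold, insertion state, and the `StreamingClause` class are all assumed names and values chosen for clarity.

```python
import random

# Illustrative constants (assumptions, not values from the paper):
NUM_STATES = 100    # range of Tsetlin automaton states per literal
ABSORB_STATE = 0    # reaching this exclude-side state permanently prunes the literal
INSERT_STATE = 45   # state assigned to a freshly streamed-in literal

class StreamingClause:
    """One STM clause holding a sampled subset of literals, with streaming."""

    def __init__(self, all_literals, sample_rate, rng):
        self.rng = rng
        k = max(1, int(len(all_literals) * sample_rate))
        sampled = rng.sample(all_literals, k)
        # One automaton state per allocated literal, starting mid-range.
        self.states = {lit: NUM_STATES // 2 for lit in sampled}
        # Literals left out at clause construction, waiting to be streamed in.
        self.unallocated = [l for l in all_literals if l not in self.states]

    def penalize(self, literal):
        """Push a literal toward exclusion; on absorption, prune it and
        stream in a new unallocated literal at the insertion state."""
        self.states[literal] -= 1
        if self.states[literal] <= ABSORB_STATE:
            del self.states[literal]            # permanently pruned
            if self.unallocated:                # literal streaming step
                new = self.unallocated.pop(self.rng.randrange(len(self.unallocated)))
                self.states[new] = INSERT_STATE

    def reward(self, literal):
        """Push a literal toward inclusion."""
        self.states[literal] = min(NUM_STATES, self.states[literal] + 1)
```

In this sketch, repeatedly penalizing a literal drives it to the absorbing state, after which the clause's literal count is restored by drawing from the unallocated pool, so the clause never shrinks while candidates remain.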