Hierarchical Voting Experts: An Unsupervised Algorithm for Segmenting Hierarchically Structured Sequences
Abstract: This paper extends the Voting Experts (VE) algorithm (Cohen, Adams, & Heeringa 2007) to segment hierarchically structured sequences. The original algorithm was tested on text segmentation and relies on two proposed characteristics of chunks: low internal entropy and high boundary entropy. VE looks for these two properties and uses them to segment sequences of tokens. It is surprisingly powerful given its simplicity, which suggests that segmenting on low internal entropy and high boundary entropy is a promising principle. Real-world data often exhibits an inherently hierarchical structure, and it is well known that humans tend to chunk the world hierarchically (Miller 1956). It is therefore interesting to explore the applicability of a modified version of VE to hierarchically structured data. We show that VE can be generalized to work on hierarchical data, and that the higher-order models can be used to improve the accuracy of the segmentation at lower levels.
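To make the boundary-entropy idea concrete, the following is a minimal sketch of one of the two signals the abstract mentions, not the VE algorithm itself and not the authors' implementation. It estimates, for every n-gram in a sequence, the entropy of the distribution of tokens that follow it. The function name `boundary_entropy` and the parameter `max_len` are hypothetical names chosen for illustration, and the sequence is assumed to be a string or tuple so that slices can be used as dictionary keys.

```python
import math
from collections import Counter, defaultdict


def boundary_entropy(seq, max_len):
    """Return {ngram: entropy in bits of the token that follows it}.

    Illustrative sketch only: seq is assumed to be a string or tuple,
    and n-grams of length 1..max_len are considered.
    """
    # For each n-gram, count which tokens were observed right after it.
    followers = defaultdict(Counter)
    for n in range(1, max_len + 1):
        for i in range(len(seq) - n):
            followers[seq[i:i + n]][seq[i + n]] += 1

    # Shannon entropy of each n-gram's follower distribution.
    entropy = {}
    for gram, counts in followers.items():
        total = sum(counts.values())
        entropy[gram] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return entropy


if __name__ == "__main__":
    h = boundary_entropy("abcabdabcabd", 2)
    for gram in ("a", "ab"):
        print(gram, h[gram])
```

In the toy sequence, "a" is always followed by "b", so its boundary entropy is zero, while "ab" is followed equally often by "c" and "d", giving one bit. A high value after an n-gram is the kind of evidence that a chunk may end there; a full segmenter would combine this with the internal-entropy signal and a voting scheme, which this sketch leaves out.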