Exponential Reservoir Sampling for Streaming Language Models

2014 (modified: 16 Jul 2019), ACL (2) 2014, Readers: Everyone
Abstract: We show how rapidly changing textual streams such as Twitter can be modelled in fixed space. Our approach is based upon a randomised algorithm called Exponential Reservoir Sampling, unexplored by this community until now. Using language models over Twitter and Newswire as a testbed, our experimental results based on perplexity support the intuition that recently observed data generally outweighs that seen in the past, but that at times, the past can have valuable signals enabling better modelling of the present.
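The fixed-space idea in the abstract can be illustrated with a minimal sketch. It assumes the simple variant of exponentially biased reservoir sampling in which, once the reservoir is full, every incoming item overwrites a uniformly chosen slot, so an item's chance of surviving decays geometrically with its age. The class name `ExponentialReservoir` and its parameters are hypothetical, chosen for illustration, and this is not necessarily the exact procedure used in the paper.

```python
import random


class ExponentialReservoir:
    """Fixed-size sample of a stream, biased towards recent items.

    Assumption: once full, every new item replaces a uniformly random
    slot, so an item seen t steps ago survives with probability
    (1 - 1/capacity) ** t.
    """

    def __init__(self, capacity, seed=None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items = []
        self._rng = random.Random(seed)

    def add(self, item):
        if len(self.items) < self.capacity:
            # Still filling up: keep everything seen so far.
            self.items.append(item)
        else:
            # Full: the new item always enters, evicting a random slot.
            slot = self._rng.randrange(self.capacity)
            self.items[slot] = item


if __name__ == "__main__":
    reservoir = ExponentialReservoir(capacity=100, seed=0)
    for timestamp in range(10_000):
        reservoir.add(timestamp)
    # The retained timestamps cluster near the end of the stream.
    print(min(reservoir.items), max(reservoir.items))
```

A language model would then be re-estimated from `reservoir.items` (for example, tweets rather than integers) whenever it is needed, so memory stays bounded by `capacity` while a few older items may still persist by chance.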