Abstract: How often do individuals perform a given communication activity in the Web, such as posting comments on blogs or news? Could we have a generative model to create communication events with realistic inter-event time distributions (IEDs)? Which properties should we strive to match? Current literature has seemingly contradictory results for IED: some studies claim good fits with power laws; others with non-homogeneous Poisson processes. Given these two approaches, we ask: which is the correct one? Can we reconcile them all? We show here that, surprisingly, both approaches are correct, being corner cases of the proposed Self-Feeding Process (SFP). We show that the SFP (a) exhibits a unifying power, which generates power law tails (including the so-called "top-concavity" that real data exhibits), as well as short-term Poisson behavior; (b) avoids the "i.i.d. fallacy", which none of the prevailing models have studied before; and (c) is extremely parsimonious, requiring usually only one, and in general, at most two parameters. Experiments conducted on eight large, diverse real datasets (e.g., Youtube and blog comments, e-mails, SMSs, etc) reveal that the SFP mimics their properties very well.
0 Replies
Loading