Move as many frequency calculations to the constructor as possible to avoid duplicate calculations over the same corpus. The algorithm itself should remain semantically identical.