
Alternate distributions for StochasticMux #148

Closed
bmcfee opened this issue Jul 16, 2019 · 14 comments

Comments

@bmcfee
Collaborator

bmcfee commented Jul 16, 2019

I've been doing a bit of analysis of pescador's sampling behavior, and the Poisson distribution is a bit unwieldy. However, for a specific choice of parameters, drawing n_samples_to_stream from a binomial distribution leads to a clean solution in terms of total replacement rate of streamers.

More generally, I think it would be useful to extend StochasticMux to have a dist= parameter that works as follows:

  • constant: each streamer gets exactly (at most) rate samples.
  • binomial (proposed new default): each streamer gets a binomial draw from Binomial(rate, 1-1/n_active)
  • poisson: current behavior, n_samples ~ 1 + Poisson(rate)
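A minimal sketch of what these modes could look like, assuming uniform weights (the function name and rng handling here are hypothetical, not pescador's actual API):

```python
import numpy as np

rng = np.random.default_rng(0)

def n_samples_to_stream(rate, n_active, dist="binomial"):
    """Hypothetical sketch of the proposed dist= modes (not pescador's API)."""
    if dist == "constant":
        # Exactly (at most) `rate` samples per streamer.
        return rate
    if dist == "binomial":
        # Proposed new default: Binomial(rate, 1 - 1/n_active).
        return rng.binomial(rate, 1.0 - 1.0 / n_active)
    if dist == "poisson":
        # Current behavior: n_samples ~ 1 + Poisson(rate).
        return 1 + rng.poisson(rate)
    raise ValueError(f"Unknown dist: {dist}")
```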
@bmcfee bmcfee modified the milestones: 2.1.0, 3.0.0 Jul 16, 2019
@bmcfee
Collaborator Author

bmcfee commented Jul 17, 2019

Quick summary of the analysis: say we have uniform weights over n_active active streamers. Each streamer has a rate limit that's sampled from a binomial distribution

n_samples ~ Binomial(rate * n_active / (n_active - 1), (n_active - 1) / n_active).

  • The expectation of n_samples is rate
  • Pick any active streamer, and let M denote the (random) number of samples we need to draw from the mux before that streamer gets replaced. This has expected value E[M] = rate * n_active.
  • The variance is Var[M] = rate * n_active**2.

These expectations match exactly the case where the number of samples is deterministic: n_samples = rate. The benefit of randomizing n_samples is that you don't get a mass extinction event every rate + K samples; the replacements are less concentrated.
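These claims are easy to sanity-check by simulation. In the sketch below (illustrative only, not pescador code), the parameters are chosen so that the binomial count parameter comes out to an integer:

```python
import numpy as np

rng = np.random.default_rng(0)
rate, n_active = 9, 4
# Binomial(rate * n_active / (n_active - 1), (n_active - 1) / n_active):
# with rate=9, n_active=4 this is Binomial(12, 3/4), so E[n_samples] = 9 = rate.
n = rate * n_active // (n_active - 1)
p = (n_active - 1) / n_active
quota = rng.binomial(n, p, size=200_000)

# M = number of mux draws until a given streamer exhausts its quota; under
# uniform weights, each draw hits that streamer with probability 1/n_active.
# numpy's negative_binomial counts failures, so add the quota back in.
q1 = np.maximum(quota, 1)  # ignore the (measure-zero) quota=0 case
M = q1 + rng.negative_binomial(q1, 1.0 / n_active)

print(quota.mean())  # close to rate = 9
print(M.mean())      # close to rate * n_active = 36
print(M.var())       # close to rate * n_active**2 = 144
```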

In exhaustive mode, we want to know how long until we hit all the streamers. We can think of this as each streamer slot being replaced n_streams/n_active - 1 times, and the expected time cooks down to rate * (n_streams - n_active) / n_active (exactly what you'd hope for).

Finally, we'd want to know what the chance of a streamer not being replaced is. This can be calculated numerically using the negative binomial survival function (which is ugly AF), but we could also be lazy and use some concentration inequalities. If we think of rate * n_active as the average time of an 'episode', then we could ask what the chance of a streamer lasting some b>=1 episodes longer than average. This probability is upper-bounded by 1/(b**2 * rate); so with high probability, every streamer will be replaced quickly and the distribution should not become unbalanced. (Corollary: the approximation for time to exhaustion above should be pretty good.)

@lostanlen
Contributor

Cool!

So is pescador going to be renamed to pascalador then?*

*(https://en.wikipedia.org/wiki/Binomial_distribution#History)

@bmcfee
Collaborator Author

bmcfee commented Jul 17, 2019

So is pescador going to be renamed to pascalador then?*

Negative.

@cjacoby
Collaborator

cjacoby commented Jul 18, 2019

👍 💯

@bmcfee
Collaborator Author

bmcfee commented Jul 19, 2019

Quick update: I implemented this and compared poisson/binomial/constant using the notebook linked from #104. This basically computes the rate at which the muxed distribution approaches the full distribution. The idealized case with n_active = n_total is given in the right-most columns for context.

TL;DR: there's not much difference across the different schemes, but the scalloping effects (#132) are obscured by error regions in these plots. They'll be more pronounced for binomial and constant, and less so for poisson. The randomized offset idea mentioned at the end of #132 could be a nice solution to that, though it is not tested here.

Results:

[Three plots comparing convergence for the poisson, binomial, and constant schemes; the idealized n_active = n_total case appears in the right-most columns.]

@bmcfee
Collaborator Author

bmcfee commented Jan 8, 2020

Would anyone object to me implementing this and pushing out a 2.2 release, rather than waiting for 3.x?

@cjacoby
Collaborator

cjacoby commented Jan 8, 2020 via email

@bmcfee
Collaborator Author

bmcfee commented Jan 8, 2020

One irritating snag here: the binomial distribution explodes on us if n_active=1, because the rate parameter gets scaled up by n_active / (n_active - 1). This is a weird edge case that we never had to worry about before, since the poisson model didn't mix in the active set size.

For n_active > 1, the binomial model has the invariant that the expected number of samples per streamer is always rate, and the variance scales like rate / n_active. So large active sets mean low variance in the number of samples per streamer; small active sets should have high variance in the number of samples per streamer. When n_active=1, maintaining this would require a random variable with mean = variance = rate, eg a poisson variable.
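A quick numerical check of that variance scaling (parameters chosen so the binomial count stays an integer; this is illustrative, not pescador code):

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 12
for n_active in (2, 3, 4):
    # Binomial(rate * n_active / (n_active - 1), (n_active - 1) / n_active)
    n = rate * n_active // (n_active - 1)   # 24, 18, 16: integers by construction
    p = (n_active - 1) / n_active
    draws = rng.binomial(n, p, size=100_000)
    print(n_active, draws.var(), rate / n_active)  # empirical vs rate / n_active
# At n_active = 1, n = rate * 1 / 0 diverges: the binomial parameters are
# undefined, hence the need for a fallback (e.g. poisson) in that case.
```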

Does it make sense to fall back on poisson for this case? Or should we simply require n_active >= 2 for dist='binomial'? The latter side-steps the issue, but makes the API somewhat inelegant.

@cjacoby
Collaborator

cjacoby commented Jan 9, 2020 via email

@bmcfee
Collaborator Author

bmcfee commented Jan 9, 2020

My gut is that restricting n_active >= 2 is a worse user experience than falling back to poisson, but I'm not confident in that feeling right now.

I agree with that. I went and redid the analysis allowing for non-uniform streamer weights, and it came out a bit more nuanced. The end story is the same, though: convergence to poisson makes sense when we only have one active streamer and we still want a randomized refresh.

In implementing this, I ran into some snags with the unit tests. In the poisson rate model, we hacked a bias of +1 into the sampler to avoid streamers that start and stop without generating any data. This is mostly a desirable property in practice, but it messes with the analysis pretty badly, so I'd like to avoid it in the binomial model. However, this means that the binomial model does need to handle the n=0 case gracefully. Since our current implementation never had to handle that, it was never explicitly tested; unsurprisingly, it doesn't work. It will probably take a good amount of hacking to sort out.

@bmcfee
Collaborator Author

bmcfee commented Jan 14, 2020

Quick update: the problem had to do with pruning empty streams.

@bmcfee
Collaborator Author

bmcfee commented Jan 15, 2020

To clarify my earlier comment: the issue is that if we set n_samples_per_stream = 0 (randomly selected from binomial or poisson), then a stream can be activated, never produce samples, and then be deactivated and pruned from the mux, even though it could still produce new data.

What do you (all) think about removing the stream pruning behavior? Or at least, changing the default behavior to not prune. I think it's ultimately inconsistent with our design principles that streams should be restartable (that being a distinguishing point compared to iterables).

@cjacoby
Collaborator

cjacoby commented Jan 17, 2020 via email

@bmcfee
Collaborator Author

bmcfee commented Jan 31, 2024


Since I'm finally getting around to doing some cleanup here for a 3.0 release, I'll quickly summarize how I think this should all go:

  1. Modify the poisson mode so that we do 1 + poisson(λ-1) instead of 1 + poisson(λ). This way, the expected value is λ but we never terminate a streamer without trying to pull from it at least once.
  2. Add a constant mode that always returns (up to) λ samples. Streamers can self-terminate of course, but there would be no randomness from the mux side in this mode.
  3. Add a binomial mode where the number of samples is 1 + Bin((λ-1)/(1-q), 1-q) with q as the (normalized) weight attached to that streamer. This avoids the problem identified earlier (pruning a stream prematurely) while preserving the expected value of λ.
  4. With the above, it won't matter so much if we retain the default (True) for pruning empty streams.
  5. We still need to handle the q=1 case, i.e. where there is only one active streamer, in binomial mode. Under the definition above, the binomial would be undefined, but we can appeal to the Poisson limit theorem and replace it by a draw from Poisson(λ-1), the limiting distribution as 1-q→0 (and the number of trials →∞). (Hurray, binomial is still a little fishy, so the name isn't misleading!) I'm happy with this solution, as it behaves consistently and still provides the right kind of randomness in the edge cases.
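Putting the pieces together, a rough sketch of the proposed 3.0 sampling logic might look like the following (the function name, the rounding of the binomial count, and the q >= 1 guard are my assumptions, not the actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def n_samples_to_stream(lam, q, dist="binomial"):
    """Sketch of the proposed 3.0 modes. lam is the rate; q is the
    streamer's normalized weight. Every mode returns at least 1 for lam >= 1,
    so a streamer is never activated and pruned without being pulled."""
    if dist == "constant":
        return lam                       # up to lam samples, no mux randomness
    if dist == "poisson":
        return 1 + rng.poisson(lam - 1)  # mean lam, never zero
    if dist == "binomial":
        if q >= 1.0:
            # Single active streamer: the binomial is undefined, so fall
            # back to the Poisson limit 1 + Poisson(lam - 1).
            return 1 + rng.poisson(lam - 1)
        p = 1.0 - q
        n = int(round((lam - 1) / p))    # rounding is an assumption here
        return 1 + rng.binomial(n, p)    # mean ~ 1 + (lam - 1) = lam
    raise ValueError(f"Unknown dist: {dist}")
```

For example, with lam = 8 and q = 0.3 the binomial branch draws 1 + Binomial(10, 0.7), which has mean exactly 8; and because no mode can return 0, the premature-pruning problem above doesn't arise.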

bmcfee added a commit that referenced this issue Feb 1, 2024
@bmcfee bmcfee closed this as completed Mar 12, 2024