Add more transforms for simplex #42

Merged
merged 23 commits into from
Aug 12, 2022

sethaxen
Collaborator

This adds the following two transforms for the simplex, described in #41:

  • hyperspherical coordinates
  • logistic product (angles of hyperspherical mapped first to (0, 1) then to unconstrained via logistic)
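As a rough illustration of the second item, here is a minimal sketch of a hyperspherical map from unconstrained reals to the simplex. It is not the implementation in this PR: the ordering of the angles, the convention of squaring the coordinates of a unit vector to land on the simplex, and the function names are all assumptions made for the example. The angle scaling follows the relation $y_i = \operatorname{logit}(\phi_i \frac{2}{\pi})$ quoted later in this thread.

```python
import math


def inv_logit(y):
    """Numerically stable logistic function."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


def hyperspherical_simplex(y):
    """Map y in R^(N-1) to a point x on the simplex with N components.

    Assumed convention (not taken from the PR): phi_i = (pi/2) * inv_logit(y_i),
    and x is the elementwise square of the unit vector with those angles.
    """
    x = []
    sin2_prod = 1.0  # running product of sin^2(phi_j) for j < i
    for y_i in y:
        phi = 0.5 * math.pi * inv_logit(y_i)
        x.append(sin2_prod * math.cos(phi) ** 2)
        sin2_prod *= math.sin(phi) ** 2
    x.append(sin2_prod)  # the last component takes what is left
    return x


x = hyperspherical_simplex([0.0, 0.0])
print(x, sum(x))
```

Under this convention the components sum to one because $\cos^2\phi + \sin^2\phi = 1$ telescopes through the running product.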

adamhaber and others added 22 commits July 14, 2022 11:24
maintenance (#32)

more maintenance

vectorize

misc

misc

-

 -

make paper for simplex and add randomseed

fix indent

indent ugh

syntax

haha syntax again rip

scripts for slurm
* syncing stuff

* whitespace adjustment

* plot file fix

* stuff

* indent shit

* stuff

* indent shit

* indent

* add figure

* i hate indents

* whitespace

* -

* more changes

* fixmylife

* minor things

* fix bugs again

* add plots

* slight changes

* --

* add time

* replot

* plotmylife

* Revert "plotmylife"

This reverts commit bde7b39.
@sethaxen
Collaborator Author

sethaxen commented Jul 23, 2022

For large $N$, e.g. $N=1000$, both of these transforms seem to have problems with initialization. My understanding is that Stan initializes uniformly on [-2, 2] in unconstrained space. For $N=500$, the plots below show the 99% marginal posterior intervals, in unconstrained space, for the uniform distribution, compared with this initialization range. In general they don't overlap well. Maybe this isn't an issue, but the high-index parameters have different posterior scales than the low-index ones, and perhaps this is a challenge for initialization.

Hypersphere (where $y_i = \operatorname{logit}(\phi_i \frac{2}{\pi})$ ):

[Figure: tmp_hypersphere]

Logistic product:

[Figure: tmp_logistic]
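The comparison described in this comment can be approximated with a small Monte Carlo sketch. It assumes the target is the uniform distribution on the simplex (Dirichlet with all concentrations equal to 1), that the unconstraining map is the inverse of a squared-unit-vector hyperspherical transform with $y_i = \operatorname{logit}(\phi_i \frac{2}{\pi})$, and that the Jacobian is already accounted for by sampling on the simplex and mapping back. The function names and the small default `n` are hypothetical; this is not the script that produced the figures.

```python
import math
import random


def logit(p, eps=1e-300):
    """Logit with clamping so boundary values do not raise."""
    p = min(max(p, eps), 1.0 - 1e-15)
    return math.log(p) - math.log1p(-p)


def unconstrain(x):
    """Inverse hyperspherical map: simplex point x -> y in R^(N-1)."""
    n = len(x)
    # tail[i] = x[i] + x[i+1] + ... + x[n-1]
    tail = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + x[i]
    y = []
    for i in range(n - 1):
        # tan(phi_i) = sqrt(tail[i+1] / x[i]), with phi_i in [0, pi/2]
        phi = math.atan2(math.sqrt(tail[i + 1]), math.sqrt(x[i]))
        y.append(logit(2.0 * phi / math.pi))
    return y


def marginal_intervals(n, draws=2000, seed=0, mass=0.99):
    """Central intervals of each y_i under a uniform simplex target."""
    rng = random.Random(seed)
    columns = [[] for _ in range(n - 1)]
    for _ in range(draws):
        # Normalized Exponential(1) draws are uniform on the simplex.
        g = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(g)
        for col, y_i in zip(columns, unconstrain([v / total for v in g])):
            col.append(y_i)
    lo_idx = int((1.0 - mass) / 2.0 * draws)
    hi_idx = min(draws - 1, int((1.0 + mass) / 2.0 * draws))
    out = []
    for col in columns:
        col.sort()
        out.append((col[lo_idx], col[hi_idx]))
    return out


for i, (lo, hi) in enumerate(marginal_intervals(n=10), start=1):
    # Length of the part of the interval that lies inside [-2, 2].
    overlap = max(0.0, min(hi, 2.0) - max(lo, -2.0))
    print(f"y_{i}: [{lo:7.2f}, {hi:7.2f}]  overlap with [-2, 2]: {overlap:.2f}")
```

Raising `n` toward the values discussed here makes the loop slower but does not change the logic.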

@sethaxen
Collaborator Author

For comparison, stick-breaking and softmax both align much better with Stan's initialization and have more uniform scales in unconstrained space:

stick-breaking:
[Figure: tmp_stick]

softmax:
[Figure: tmp_softmax]

Perhaps there's a simple reparameterization that improves the geometry here.

@bob-carpenter
Collaborator

bob-carpenter commented Jul 24, 2022

That's right. Stan uses uniform(-2, 2) inits in the unconstrained space. You can specify that bound. One of the things I've wanted to do is evaluate tail numerical stability. What if we move that to +/- 10 or +/- 100 or even 1000?

I shifted Stan's stick breaking prior so that a vector of zeros would initialize to the uniform distribution. Is there a way of doing that for the other parameterizations?
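To make the shift concrete, here is a minimal sketch of a stick-breaking transform with the offset $-\log(K - k)$ inside the logistic, which is the form given in the Stan reference manual for the unit simplex. The function names are hypothetical, and the sketch only checks that a vector of zeros lands on the center of the simplex; it says nothing about whether an analogous shift exists for the other parameterizations.

```python
import math


def inv_logit(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def stick_breaking_simplex(y):
    """Map y in R^(K-1) to a K-simplex using shifted stick-breaking."""
    k = len(y) + 1
    x = []
    remaining = 1.0  # length of the stick not yet assigned
    for i, y_i in enumerate(y, start=1):
        # The offset -log(k - i) makes y_i = 0 break off an equal share.
        z = inv_logit(y_i - math.log(k - i))
        x.append(remaining * z)
        remaining -= x[-1]
    x.append(remaining)
    return x


print(stick_breaking_simplex([0.0] * 4))
```

With the offset, the first break fraction at $y = 0$ is $1/K$, the second is $1/(K-1)$ of what remains, and so on, so every component comes out equal.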

When you're talking about coverage, is that for the uniform distribution over simplexes? What about other simple dirichlet like dirichlet(0.1) or dirichlet(10)? I really like the idea of measuring tail coverage like this. It will complement measuring leapfrog steps to bulk of distribution, which is very sampler and implementation-dependent. In retrospect, I really wish we'd just chosen normal(0, 1) initializations in Stan version 1.0---those would line up perfectly with standardized posteriors.

Edit: We can emphasize the stability of transforms like this in the write-up. It's not even so much about Stan's initialization as having something that's roughly standardized in unconstrained space for a uniform distribution. I don't know how to translate that into unbounded things like covariance matrices.

@mjhajharia mjhajharia merged commit aa0c73d into main Aug 12, 2022
@sethaxen sethaxen deleted the simplex_hyperspherical branch August 29, 2022 15:01
mjhajharia pushed a commit that referenced this pull request Sep 11, 2022
@mjhajharia
Owner

mjhajharia commented Oct 22, 2022

> That's right. Stan uses uniform(-2, 2) inits in the unconstrained space. You can specify that bound. One of the things I've wanted to do is evaluate tail numerical stability. What if we move that to +/- 10 or +/- 100 or even 1000?

ok even -10,10 fails and only things very close to 0 seem to work.

@sethaxen
Collaborator Author

With the "logistic product" implementation in this PR, for large N, it failed for me often. But not after the update to "hyperspherical logit" in #55. Are you using the version in this PR or in #55?

@mjhajharia
Owner

mjhajharia commented Oct 22, 2022

> With the "logistic product" implementation in this PR, for large N, it failed for me often. But not after the update to "hyperspherical logit" in #55. Are you using the version in this PR or in #55?

yeah HypersphericalLogit.stan does work, i was just trying different inits on the previous one before discarding it. and yeah you were right about large N = 1000, all the other 6 combinations of parametrizations do end up sampling 1000 times without failing
