Migrate pass_to_descendant and redistribute_block_pairs shrinker passes #3929
if next_node.ir_type == "integer" and bits_to_bytes(
    node.value.bit_length()
I didn't want to keep this condition, but otherwise the test for this fails: the ir tree for @given(st.integers(), st.integers()) looks like [integer {"max_value": 128}, integer, integer {"max_value": 128}, integer], where the 128 draws are for choosing a forced endpoint. Without checking for equal sizes, we try to redistribute the endpoint draw to the actual draw, which of course does nothing.
As noted in the inline comment, there are plenty of other ways this pass can get tripped up! But we can look at improving or removing it later.
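As a rough illustration of the "equal sizes" check described above, here is a minimal sketch. The condition in the diff is truncated, so the exact comparison is an assumption; `bits_to_bytes` below is a local stand-in for the helper of the same name, and `same_byte_width` is a hypothetical name used only for this example.

```python
def bits_to_bytes(n: int) -> int:
    # Local stand-in (assumption): number of bytes needed to hold an n-bit number.
    return (n + 7) // 8


def same_byte_width(a: int, b: int) -> bool:
    # Hypothetical predicate: do two integer values occupy the same number of bytes?
    # A pass could use a check like this to skip pairing a small endpoint-selector
    # draw with a much larger "real" integer draw.
    return bits_to_bytes(a.bit_length()) == bits_to_bytes(b.bit_length())


if __name__ == "__main__":
    print(same_byte_width(3, 200))    # both fit in one byte
    print(same_byte_width(3, 70000))  # one byte versus three bytes
```

Note that a check like this is only a heuristic: two unrelated draws can easily share a byte width, which is one of the ways the pass can still get tripped up.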
Hmm. I'm tempted to modify the forced-endpoint logic so that it doesn't do this, e.g. by using the weights= parameter instead of making two draws. Thoughts?
ah, perfect! Let's do that. Having an ir-strategy correspond to two ir nodes felt bad.
ah, problem... we quickly run into our self-imposed width limit of 255 for weights. I imagine this was imposed because Sampler performance blows up otherwise. But we also definitely don't want to be constructing the 2**32 element list just to say "uniform everywhere except the endpoints", for memory reasons, among others (potentially float loss).
Maybe we need a better structure for expressing weights, which allows us to specify "upweight just a few elements from this very large range"? Potentially specifying start and end indices where the weighting should apply uniformly to that range, instead of one weight per element.
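One way to picture the range-based idea is the sketch below. It is not the Hypothesis implementation: `WeightedRange` and `draw_with_ranges` are hypothetical names, and it uses the standard-library `random` module rather than Sampler. Each range carries a total probability mass that is spread uniformly over its inclusive bounds, and whatever mass is left over falls back to a uniform draw over the full range, so no per-element list is ever built.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedRange:
    start: int   # inclusive lower bound of the upweighted range
    end: int     # inclusive upper bound of the upweighted range
    mass: float  # total probability assigned to the whole range


def draw_with_ranges(rng, min_value, max_value, ranges):
    total = sum(r.mass for r in ranges)
    if not 0 <= total <= 1:
        raise ValueError("range masses must sum to a value in [0, 1]")
    u = rng.random()
    for r in ranges:
        if u < r.mass:
            # Landed in this range's share: draw uniformly inside it.
            return rng.randint(r.start, r.end)
        u -= r.mass
    # Remaining mass: ordinary uniform draw over the full range.
    return rng.randint(min_value, max_value)


if __name__ == "__main__":
    rng = random.Random(0)
    top = 2**32 - 1
    endpoints = [WeightedRange(0, 0, 0.1), WeightedRange(top, top, 0.1)]
    print([draw_with_ranges(rng, 0, top, endpoints) for _ in range(5)])
```

Memory use here is proportional to the number of ranges, not to the size of the integer interval.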
I like that - new interface idea: weights= takes a mapping of {value: weight}, such that len <= 255 and 0 <= sum-of-weights <= 1. Then we use a Sampler to pick either a value, or the remaining probability mass which means "pick according to unweighted".
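The proposed mapping interface could look roughly like the following sketch. This is an illustration under assumptions, not the eventual API: `draw_with_value_weights` and `_UNWEIGHTED` are hypothetical names, and `random.Random.choices` stands in for Sampler. The leftover probability mass is modelled as one extra option meaning "pick according to unweighted".

```python
import random

# Hypothetical sentinel standing for the remaining, unweighted probability mass.
_UNWEIGHTED = object()


def draw_with_value_weights(rng, min_value, max_value, weights):
    if len(weights) > 255:
        raise ValueError("at most 255 weighted values are allowed")
    total = sum(weights.values())
    if not 0 <= total <= 1:
        raise ValueError("weights must sum to a value in [0, 1]")
    options = [*weights, _UNWEIGHTED]
    # Clamp at zero so float rounding cannot produce a negative remainder.
    masses = [*weights.values(), max(0.0, 1.0 - total)]
    choice = rng.choices(options, weights=masses)[0]
    if choice is _UNWEIGHTED:
        # Fall back to an ordinary uniform draw over the whole range.
        return rng.randint(min_value, max_value)
    return choice


if __name__ == "__main__":
    rng = random.Random(0)
    top = 2**32 - 1
    # Upweight both endpoints; everything else stays uniform.
    print([draw_with_value_weights(rng, 0, top, {0: 0.25, top: 0.25}) for _ in range(5)])
```

A single draw decides between a weighted value and the uniform fallback, which is the property that would let one strategy correspond to one node rather than two.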
I think this is the right path forward. But it's fiddly to do so while maintaining correct invariants about forced and children count. Would you object to leaving it for a future PR? I suspect we may have some back and forth on this change.
Yeah, just write a note in #3921 and we can get to it later 👍
Interesting failures! CI succeeds again as a pseudo fuzzer 🙂. This failure can be addressed by making sure the second node we choose isn't forced. Will have to look at the other failure in more detail tomorrow.
I can't reproduce the above
Looks good - let's merge this now, and revisit the integers refactoring later 👍
I think these are the last remaining directly translatable passes. The remaining two groups are "block programs" (roughly corresponding to new "node programs" with a single instruction X, no -) and "everything else".