When is a start function's post-return option called? #100
Ooh, great question! So one thing that partially addresses the question is that, as currently written in Logically, it feels like Definitely open to other ideas, though. |
If the order of lowering is worrisome and it's fixed by consuming everything at once, how would that affect a start function producing 2 values which are then the arguments to 2 separate other start functions? Additionally for consuming everything at once all the results of one start function could go into the parameters of a different start function, but in a different order, so I think the order would be observable there? For the precise semantics of One example question I think that still lingers: if internally within a component there are two start functions which produce two values which are exported from the final root component, in what order are those post-return closures invoked? Is it the order that the embedder consumes the values? Or is there like a queue of pending post-return which gets drained in-order at some point? |
Yep, good point; "all at once" doesn't make sense. So another idea, which perhaps lines up with what you're hinting at too, was that the value index space should perhaps work like a value stack instead of like an index space, with And you're right, it'd be useful to cover this in the Python pseudo-code too; what's tricky (and why I didn't dig into it already) is that it requires taking on full module-linking (ultimately defining a little instantiation-time "linking interpreter" that executes component-level definitions as its instruction set). @pl-semiotics is actively working on a reference interpreter (that embeds the core wasm reference interpreter and is meant to correspond directly to the formal spec) which would include all this and execute proper .wast tests, so that should cover this gap before too long. |
Oh I wasn't thinking of a stack myself, I'm not actually sure what to do about this so I'm just asking questions! If there's only a stack, though, that would prevent consuming values in different orders, right? That would mean that if a start function produced 2 values you couldn't use them in the reverse order as arguments to a different start function?
The idea is that the stack would ensure the order that the values are consumed, but one remaining question from what I described above is how this ordering is preserved when passing values between components through imports (and |
So I thought some more about what adopting a fully stack-based approach for values would look like over the weekend and I realized it would break our ability to freely union and intersect world files which is what would allow, e.g., a host to run "any component that targets any of these (unioned) world files". In particular, if two world files had different sets of ordered value imports, unioning them would not in general be able to produce a merged order that could run components targeting either world. That seems unfortunate, especially since I think this is something hosts would want to do and we otherwise would mostly get it for free from unordered+named imports/exports. I think this is also analogous to the classic "fragile base class". So, setting aside the idea of fixing the order of value lowering, I was able to think of a modest tweak to the existing proposal that (hopefully) gives us a clear semantics for when to run post-return. The idea is to say that every value in the value index space is a pair (value, post-lower), where post-lower is a closure that is semantically called immediately after the value is lowered. Then, all post-lower closures created from a single
I think that should work, yeah. I'm also considering the case where a component exports a value to the embedding itself where we still don't want a sort of refcounted counter but I believe the solution there is to have some sort of step where the embedder reads all the values from the component and then, afterwards, has a list of post-return functions to run (e.g. instance flags "may enter" to reset). The dynamic-ness there can probably be papered over with a new JIT function/trampoline/etc, though if necessary. While I'm here and thinking about this, I think this might be a good place to discuss a somewhat tangential but still related question. One thought I had on this is "what to do with an unused value"? This naively isn't possible due to the linearity rules but there's a few cases I think that aren't covered by the current
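To make the "read everything, then run the post-return list" step concrete, here is a minimal Python sketch. It is only an illustration under assumptions: `ValueExport` and `read_all_values` are hypothetical names, each export is modeled as a plain `(value, post_return)` pair, and copying a value to the host is reduced to a dictionary assignment.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical shape: export name -> (value, post-return closure).
ValueExport = Tuple[Any, Callable[[], None]]


def read_all_values(exports: Dict[str, ValueExport]) -> Dict[str, Any]:
    """Read every exported value first, then drain the post-return list in order."""
    pending: List[Callable[[], None]] = []
    host_values: Dict[str, Any] = {}
    for name, (value, post_return) in exports.items():
        host_values[name] = value    # stand-in for copying out of linear memory
        pending.append(post_return)  # deferred until all values have been read
    for post_return in pending:
        post_return()                # e.g. where a "may enter" flag could be reset
    return host_values
```

In this sketch no post-return closure runs until every value has been read, so the embedder never re-enters the component while a value is still sitting in linear memory.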
In general I was under the impression that values were intended to be a small addition to the existing component model that largely fit within the rest of the system seamlessly. I don't think the linearity rule plays well with this, though, and I feel that the linearity on values makes them different enough that it may warrant an integration into components via a different means than the otherwise-standard import/export/alias integration. For example instantiation may want to produce values directly onto the value stack to avoid dealing with aliases. The initial values on the value stack at the start of instantiation could be defined via a new "value section" or something similar. I think things can have a similar flavor where items are named and therefore can be reordered and added to/removed as necessary but I'm becoming more convinced that due to linearity values shouldn't be "just another kind of item" |
Those are also good questions not adequately described by the explainers. The "linearity" requirement means that both of those two cases (in bullets) should be validation errors. Practically speaking, I think the way this would be implemented is that both the value index space and the exports of instances in the instance index space would attach a "used" flag to the value type, used to catch double-use and check use before the end of a component scope. But I agree that these extra complications raise the question of whether it's a bad idea to try to shoehorn values into index spaces or whether values should work like a stack instead (just like core wasm function bodies). I spent some time sketching out how this might look in the limit: e.g., components could have top-level But I suppose another point on the design spectrum would be to keep the order of value imports/exports out of the public component interface, but have the component's internal validation push its |
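The "used" flag check described in this comment could look roughly like the following Python sketch. It models only the strict reading (double use and non-use are both validation errors); `ValueScope`, `ValidationError`, and the string names for values are assumptions made for illustration, not part of any spec or existing validator.

```python
from typing import Dict


class ValidationError(Exception):
    """Raised when a value is not used exactly once (hypothetical error type)."""


class ValueScope:
    """Tracks a 'used' flag for each value defined in one component scope."""

    def __init__(self) -> None:
        self._used: Dict[str, bool] = {}

    def define(self, name: str) -> None:
        if name in self._used:
            raise ValidationError(f"value {name!r} defined twice")
        self._used[name] = False

    def use(self, name: str) -> None:
        if name not in self._used:
            raise ValidationError(f"unknown value {name!r}")
        if self._used[name]:
            raise ValidationError(f"value {name!r} used twice")
        self._used[name] = True

    def end(self) -> None:
        # Checked at the end of the component scope: every value must be consumed.
        unused = sorted(n for n, used in self._used.items() if not used)
        if unused:
            raise ValidationError(f"unused values: {unused}")
```

A scope that defines `v`, uses it once, and then ends validates; using `v` a second time, or ending the scope without using it, raises `ValidationError`.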
Sorry I think I may not be being very clear. I'm not trying to say we should have value index spaces or value stacks, I'm instead trying to come at this from a different angle of use cases that seem reasonable given the structure of components and ensuring that the design fits the use cases. Some use cases I'm thinking of are:
I have no horse in the race for index-space-vs-stack or how exactly to represent all this. I think as long as everything lines up in a way that is satisfactory to everyone then I don't really mind how the precise design all ends up. I believe the initial motivation for a stack as opposed to the index space that exists today was to solve the issue related to when to call post-return, but the given conclusion now I believe is that values are actually a (value, post-return) pair where the post-return has a shared mutable counter which is decremented when called. Given that adjustment I'm not sure that a stack is even necessary any more over the flexibility of an index space which allows consuming things in whatever order is necessary as opposed to maintaining a stack discipline. To me the major remaining design question is how you get a value out of a component instance. If it's with an |
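As a rough illustration of the "(value, post-return) pair with a shared mutable counter" idea, here is a small Python sketch. All names (`SharedPostReturn`, `StartValue`, `run_start`, `lower`) are hypothetical, and calling post-return immediately for a start function with no results is an assumption of this sketch rather than something settled in the thread.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


class SharedPostReturn:
    """Counter shared by all values produced by one start function call."""

    def __init__(self, remaining: int, post_return: Callable[[], None]) -> None:
        self.remaining = remaining
        self.post_return = post_return

    def value_lowered(self) -> None:
        assert self.remaining > 0, "value lowered more than once"
        self.remaining -= 1
        if self.remaining == 0:
            self.post_return()  # the last value was consumed


@dataclass
class StartValue:
    value: Any
    post_lower: Callable[[], None]  # called right after the value is lowered


def run_start(results: List[Any], post_return: Callable[[], None]) -> List[StartValue]:
    """Wrap each result of a start function with the shared post-lower closure."""
    if not results:
        post_return()  # assumption: nothing to wait for, so run it right away
        return []
    shared = SharedPostReturn(len(results), post_return)
    return [StartValue(r, shared.value_lowered) for r in results]


def lower(v: StartValue) -> Any:
    """Consume a value, then notify the shared counter."""
    out = v.value
    v.post_lower()
    return out
```

Because the counter is shared rather than positional, the values can be lowered in any order (as an index space allows), and post-return still fires exactly once, after the last one.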
Ah, that's a great point and indeed independent of stack-vs-index-space: if subtyping says I can compatibly ignore values, then not using an imported/exported value can't be a validation error. If we want to attach an observable effect to when the last value is consumed, I think this means that we need to associate an explicit point when unused values are implicitly "dropped" (which possibly calls the final post-return), keeping the values "linear". Independently, I had been wondering if, as an optimization for unused values, we should have a For bullet 4, if we stick with the index-space approach, then the way I was imagining this working as a purely static validation constraint is that, when you alias a value export of an instance, the instance type (used to validate all subsequent definitions, including an Lastly, I agree with your analysis that the index-space is probably a win over the stack approach; I was just opening up the possibility for discussion.
Yeah I think that I did not think of mutating the type but I'm wary of doing that since AFAIK this would be the first time that a type could be mutated after being introduced. One question that initially comes to mind is how to handle the case where a component is instantiated and then that component instance is passed to two separate instantiations. Does the first instantiation "erase" all value exports in the instance type, leaving the rest of the instance type available for the next instantiation? In some sense a change like this would require instance types to become linear themselves, but only until they have no value exports I guess?
Yeah, it's definitely a new thing and worth considering carefully. In programming languages, this sort of "type of a thing changing in response to that thing being used" shows up with, e.g., typestate analysis and session types, so you could think of what we're doing here as in that spirit. And again, stack-machine validation is doing this too, with each instruction transforming the "stack type", so this seems to be in the nature of linear types. The parallel work on formalizing the component model will be useful for precisely defining the validation rules and proving that they ensure what we think they ensure. To your question: after an instance |
At a conceptual level this seems like it could work, but I do want to emphasize again that the implementation will have no similar prior art to draw upon in wasm validators which want to implement this. While sure the stack can be manipulated in wasm functions I don't think that's the same thing because this is the type of a referenceable object. This means that a reference to "instance 1" will have a different type depending on when that reference is queried. I know at least historically in I don't think it's unimplementable by any means but I don't want to accidentally trivialize the difficulty of implementation. Already values were I think originally billed as "should be easy to implement given everything else" but as this thread has shown there's a lot of tricky and subtle things and I don't believe they're going to be easy to implement by any stretch.
Yes, I think that's fair. It may be a good idea to wait to implement this until after some more pressing features are done and reevaluate then.
In thinking about start functions recently one question I wasn't sure how to answer is when the returned values are "invalidated" or otherwise when the component can be reentered. A better way to phrase this I found was to ask when is the post-return canonical abi option for a start function invoked? Unlike normal functions which have a clearly defined time when the results have been processed (e.g. lifted and then lowered back into the destination) start functions are different where their values are consumed possibly much later during a component's full instantiation process.
Another related but slightly different scenario: if a component exports both a function and a value, if the embedder supports invoking the function before reading the value then this forces the value to be removed from linear memory and copied to the host. Otherwise ideally the embedder would like to leave the value in linear memory but would require that the exported function is not invoked until the value export is read. This seemed like a pretty different model than the current "you get a bag of exports" model that core wasm has and I wanted to make sure this was intentional.