Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Limited caller AuthZone access and AuthZone stack #297

Closed
wants to merge 9 commits into from

Conversation

devmannic
Copy link
Contributor

This is an implementation of the ideas I put forward in #285.

The high level summary: Caller's can optionally use an AuthZone which will be visible to Callee's. The callee's can optionally indicate the caller proofs they want access to by simply using the AccessRules as they already do. When both sides agree, authorization and follow-on business logic (ie. based on a Proof of NonFungableData) works seamlessly. The caller can specify their "pass by intent" by using an AuthZone to group their Proofs instead of seperate args. This is kind of like varargs for Proofs.

There are 2 complementary features that work together.

  1. AuthZones now live in a stack so actors can push the current one back putting a new empty AuthZone in it's place, or undo that. (instructions StartAuthZone, EndAuthZone and API ComponentAuthZone::start() and ComponentAuthZone::end())
  2. Components can access the caller's AuthZone proofs with the same AuthZone API (not push/pop) using CallerAuthZone::create_proof* as they would have used ComponentAuthZone

These features are joined up forming a nice usable enhancement but with safety restrictions:

  1. The CallerAuthZone is at most a "shadow" of only the top AuthZone in the stack, so it's fully controlled by the caller.
  2. Which proofs are allowed in to the shadow must be explicitly defined by the callee. This is done by using the same AccessRules they already use to protect method calls. When those HardAuthRules are checked, the shadow AuthZone is built up based on the matches which were already being visited. The only overhead is cloning the Proofs. So nothing happens on "allow_all", only explicit rule matches.
  3. And finally, if the caller is only using the default AuthZone, ie they've never done a single StartAuthZone instruction or AuthZone::start() to push a new AuthZone on the stack, then the shadow calculation does not happen at all, leaving the CallerAuthZone empty. So everything is completely opt-in by the caller and protects their tx signing virtual badges. Even the performance hit for the clones only happens when needed.

Would love feedback on the idea now that you can see it more concretely @russellharvey . I think the implementation is pretty straight forward, though there are a few places I still had questions. Also I'm not sure I love the vocabulary I used everywhere, start/end, caller, shadow, etc. A second opinion on if that all makes sense would be nice. Maybe @iamyulong or @talekhinezh can take a look at this PR when they have time (I realize there are other priorities). (See the "FIXMEs for some specific questions"). I also think adding try_create_proofs* APIs to return Option instead of panic would be really helpful to enable more use cases. I'd be willing to add those.

I don't expect this to be merged as is, but it is feature complete with tests and in decent shape. This is mostly meant to move that discussion forward and was a good way for me to dig into some of the Scrypto internals. If it's never accepted that's fine but I think there's room to iterate on this idea for sure at leaset. It was a good learning experience for me either way. I realize some of the SNode and wasm stuff is changing, and at minimum this PR should probably be rebased and fixed up for the develop branch so it certainly needs a little care before it's ready too.

My use case for wanting this is honestly not fully flushed out. But the friction for me came when I was trying to build up a more complex authorization model and interacting with multiple components. In this case I wanted some components to return badges to the caller and then have the caller use them against other components without having to specify them (because they might not know the resource address or ids if another component is in charge of authentication/authorization for a group of components). And of course they would be NFT badges. So this type of thing isn't possible without this PR because NFT badges have to be passed by intent right now so the callee can access the data.

@iamyulong
Copy link
Member

Thank you @devmannic for putting up this PR which introduces the idea of stack of auth zones and exposes top auth zone to the callee.

Before I start reviewing the code, could you help me understand the motivation? I want to make sure the changes are well justified because they adds non-trivial complexity to the system while providing more flexibility.

I was trying to build up a more complex authorization model and interacting with multiple components. In this case I wanted some components to return badges to the caller and then have the caller use them against other components without having to specify them (because they might not know the resource address or ids if another component is in charge of authentication/authorization for a group of components).

Are the badges returned as Bucket or Proof? Either way, the callers should be able to tell the resource address or non-fungible ids. They can also blindly put the proofs into the component auth zone, which can be used as authorisation when accessing other components.

NFT badges have to be passed by intent right now so the callee can access the data.

Yes, but it also depends on how the method logic is going to use the NFT data.

  1. If the method requires some NFT with specific data for access control, I think we should extend AccessRule to support such use cases;
  2. If the method logic depends on the NFT data, I think this is beyond the scope of authorisation and passing by intent is more appropriate (badge is part of the method input).

@devmannic
Copy link
Contributor Author

devmannic commented May 9, 2022

@iamyulong thanks for taking a look

Before I start reviewing the code, could you help me understand the motivation? I want to make sure the changes are well justified because they adds non-trivial complexity to the system while providing more flexibility.

Yes, of course. And let me just say again that think there's more than one way to go about solving the problem I'll describe. I'm open to alternatives. This was what I came up with as the most pleasing to me in terms of both the amount of complexity added and new mental model a developer would need to understand, compared against the features it provides and the implementation cost/complexity.

Are the badges returned as Bucket or Proof?

I was explicitly talking about Proofs, hence all the work around the AuthZone. (But I see the confusion, "badge" can really mean either, which I actually find a little annoying about how our community has created the vocabulary)

Either way, the callers should be able to tell the resource address or non-fungible ids.

And here's where I didn't give you enough information, sorry. That's true of course for a Component caller, but it is unfortunately not true for the transaction manifest "caller". My motivation here is for the tx signer to make a method call within the manifest which would return a bunch of Proofs, and then 1 or more call instructions which require those Proofs. And the tx signer does not need to know the resource or ids of the proofs they get because they are managed by the component for use in authorization to yet other components.

Some of this issue stems from the lack of symmetry between the Worktop and the AuthZone. We have this relation: Bucket is to Worktop as Proof is to AuthZone. But Proofs are really for more than method authorization of course. And we have call_method_for_all_resources to get all the Buckets on the Worktop but no analogous method such as call_method_with_all_proofs for the AuthZone. Having something like this could also be a potential solution. But then see my next paragraph about limiting what ends up in "all_resources" or "all_proofs". (As an aside, I like "with" better than "for" in these names).

Actually I think the AuthZone stack is a concept that is beneficial even if the rest of this PR (ie. accessing the caller's AuthZone) is not accepted. Likewise I would say having a Worktop stack in a similar way as I've created for the AuthZone here would be useful for composing different usages of call_method_for_all_resources. If composability is the killer feature of the manifest then having a way to limit/bound/group the Buckets in the current worktop for calls to call_method_for_all_resources and ways to limit the Proofs that are used for auth on a method call (or in my hypothetical call_method_with_all_proofs) seems to me to be necessary, but currently missing.

They can also blindly put the proofs into the component auth zone, which can be used as authorisation when accessing other components.

This is exactly the usage I'm trying to both utilize and improve upon. But exactly as you said it depends on following usage.

Yes, but it also depends on how the method logic is going to use the NFT data. ...

  1. If the method requires some NFT with specific data for access control, I think we should extend AccessRule to support such use cases;

This is an interesting idea. I think my need is maybe closer to your point "2" so this probably wouldn't solve it for me. Ultimately what's missing for me is that the callee needs access to the NFT data as in your "2" so just extending the AccessRule probably isn't enough. Though for some simpler designs it might be. I think the biggest issue I would have with this as the only solution is that upon failure the only option is panic, whereas more nuanced business logic based on how the NFT data matched or didn't match the AccessRule might be needed.

So on to probably your main point that might squash this PR:

  1. If the method logic depends on the NFT data, I think this is beyond the scope of authorisation and passing by intent is more appropriate (badge is part of the method input).

I don't disagree with this position at all for the general case. The problem is really about the limitations of the current pass-by intent-mechanism as it relates to the transaction manifest as the caller. Think about my use case as wanting to "pass by intent" the result of a previous method call. And critically, do that within the transaction manifest (since it can already be easily done by a component). The only other way to to implement this kind of thing right now is to just have each user instantiate their own Component to do this, but that too is even problematic because they would be required to send the initial badge bucket (the one that allows the first method to return a set of badge Proofs) to this Component, but then because of the proof movement restrictions this wouldn't work either.

If this still isn't quite making sense, or you think it's possible another way (maybe I've missed something) let me know. I can also try to add some more pseudo code, or more flushed out concrete example test case. As of now all the test code is more to prove the feature works but not exactly justify the full need for it.

Thanks for your time discussing this!

@iamyulong
Copy link
Member

Thanks for clarifying. I now get the main problem that you're trying to address.

We have some application logic that returns "dynamic" proofs (address generated at runtime) and we want to use the proofs in transaction manifest.

  1. With system-level auth, there is no way to retrieve the NFT data when the proof is non-fungible;
  2. With pass-by-intent, in manifest specifically, there is no way to pass the dynamic proof since the address can't be referenced.

The proposed solution is to have a stack of authzones (so users can divide proofs into different groups, for security reason I assume) and have the top one implicitly passed to the callee.

This works! My only concern is about the complexity of manipulating the stack in transaction manifest (essentially stack of stack of proofs) for the intended Scrypto developers.

Off the top of my head, do you think a simple create_proof_from_auth_zone <NO_ARGS> instruction would be sufficient for the use case? This will generate a "max" proof if there is only a single type of proofs on the auth zone, and panic if not. May be applicable to transaction worktop "take" as well.

Wait, can't we use pop_from_auth_zone to grab the last proof, although it will be tricky to deal with variable number of proofs.? (When designing the proof system, we also considered generating proofs from user defined proof list; but scrapped it worrying about complexity).

@devmannic
Copy link
Contributor Author

Thanks for clarifying. I now get the main problem that you're trying to address.

No problem. Yup, you've got it right.

This works! My only concern is about the complexity of manipulating the stack in transaction manifest (essentially stack of stack of proofs) for the intended Scrypto developers.

I wanted to avoid adding too much complexity too which is why I chose a stack of AuthZones instead of any more complicated way to reference groups of proofs in the manifest itself. The reason it isn't too complex for a developer IMHO is because they still only ever can access the top one using the same existing API and there's no movement of Proofs between zones either. They simply think of it as another way to get a clean AuthZone but without losing the Proofs they had. Which for composing method calls at the transaction level I think is really useful. It reuses the existing features so no new mental models for ways to group Proofs to impact developers' cognitive load too much. And they only need to know about it if they need it. Once they do want it, the only new instructions are "start" and "end" with no arguments even. When "end" is called any existing proofs are simply dropped.

Something more complex where Proofs "percolate up" to the previous AuthZone when a zone is ending could be done and maybe that could be a parameter to the "end" instruction. But that all seemed not quite worth it to me. At least I didn't want to worry about implement that kind of thing before I got this working well enough to explain clearly. I kind of like the simplicity as it is now, though maybe it's an opportunity to think of more use cases with the "dynamic" proofs and have a more complex transition when the zone ends. Similarly, a more complex "start" could be envisioned which pulls in some existing proofs into the new zone. But defining this list get's tricky as you alluded to.

Also, because I set things up to only pass the topmost AuthZone to the callee when it is not the default zone it makes everything completely opt-in which I really like both for simplicity and for security. (and the implementation for that is very straight forward so it's a nice win.)

Currently, the "end" instruction is even optional (though adding a validation step to enforce that it isn't is certainly possible but i'm not sure it's worth doing, nor did I know exactly how to go about it).

Off the top of my head, do you think a simple create_proof_from_auth_zone <NO_ARGS> instruction would be sufficient for the use case?

I like this idea some for when the caller actually knows/expects a single "dynamic" resource. It lets them enforce that only one Proof (or group that can correctly "max") is there, as a sort of safety check. Maybe a new type of "assert" instruction is better for that though, because on it's own, it is not sufficient. This is because I was expecting to need to handle a list of Proofs where they would potentially be different resources. And once you make a function such as create_proofs_from_auth_zone (plural, to make a Vec< Proof >) I would want a way to avoid sending the signer's virtual badge or indeed any other Proofs that happen to be in the AuthZone when trying to compose multiple method calls in a transaction manifest.

Wait, can't we use pop_from_auth_zone to grab the last proof, although it will be tricky to deal with variable number of proofs.? (When designing the proof system, we also considered generating proofs from user defined proof list; but scrapped it worrying about complexity).

Yeah, variable number is the problem otherwise that seems like it might work. I want to support the variable number because my scenario is having the first callee which returns the proofs be managing a access via set of proofs for possibly different components. So it seems likely they could end up with multiple different resources. But I agree having a user defined list sounds confusing. This is why I like just reusing the AuthZone as it already exists to effectively be that user-defined list. It's just user-defined based on where they start and end the new zone and what happens in between instead of something more explicit. I also wanted to avoid any new "proof container" in the manifest since it seemed a big change both conceptually and possibly at the implementation level.

I dont' have any tests with manifests since I wasn't exactly sure the best way to integrate them in the codebase, but you can see the tests that use the transaction builder as basic examples.

(Also if you haven't looked at the diff yet, don't get put off by the line count. I decided to cargo fmt first. The commits are pretty well separated to review individually as a first pass.)

@devmannic
Copy link
Contributor Author

@iamyulong what do you think are the next steps for this PR?

@iamyulong
Copy link
Member

iamyulong commented May 25, 2022

Hi @devmannic, sorry for the late reply.

I've brought this up the our weekly team meeting. Unfortunately, the team believed that changing from "a single auth zone" to "a stack of auth zones" would complicate the system modelling and only solve a rare use case. While it can be an "opt-in" feature (as developers can continue to use a single AuthZone), it would be hard to explain the system (and all additional operations associated with stack) to less experienced developers.

It is a high-quality PR, so my suggestion would be to re-base it on develop branch (where formatting is enforced) to have an easier-to-read diff. We keep this PR open and see if more developers run into similar issue. We can definitely revisit it with more data points.

What do you think?

@devmannic
Copy link
Contributor Author

Hi @iamyulong

Thanks for explaining the team's decisions. It is quite disappointing. I'm most disappointed by the reasoning given behind not accepting it. Being "too complicated" just sounds like the team does not understand what I've implemented, why it's needed, or what I'm suggesting. With everything else that goes on in the Radix Engine suggesting that this particular feature -- which is needed to solve a real problem, and does so rather elegantly, is completely optional, and builds upon existing features without adding new ideas -- is "too complicated" just seems to be a very naive response. Maybe there was more to it than that?

Here's some further thoughts expanding on that: I fail to see how adding a stack which has no operations on it other than, push (which doesn't push a specific AuthZone, just creates a new one) and pop (which simply destroys one and drops all proofs) is complex by any definition. To me this is not complex in the implementation nor in the design. In fact it is one of the most simple data structures in computer science. If the team thinks the developers they want to learn Scrypto wont understand it then I really have no idea how they could think these same developers are going to build secure DeFi logic or make use of the AuthZone as it is right now anyway. It just does not compute. I understand wanting to keep the bar low for lots of good reasons, but that's one of the reasons this is optional. There is plenty in the RE implementation that is not well explained but doesn't effect less experienced developers but gives more experienced developers tools they might use. How many developers know how to create their own data types with SBOR encoding/decoding? How many devs even understand how the blueprint macro works and the implications for their Rust functions. This is "advanced scrypto" in my eyes and it's well hidden (which is good) but it's there and not "too complex" to include.

"Modeling" the system? I don't know what exactly you mean by that. There is no (public) code nor have I ever seen discussion of any type of formal modeling that this might refer to. Maybe there's something I don't yet know. This is simply a way to group proofs together in a manifest and elsewhere in a very natural and actually probably the least complex way possible. Maybe you're referring to wallets describing a manifest with this feature to users? The old REv1 had blocks for groups of instructions. This is not so different to that, like a (contiguous) "block" of instructions that share an AuthZone.

Regarding the state of this PR then:

There's no motivation for me to update this PR unless someone is going to review it with real potential to integrate it. I'd rather focus my time on useful things. I'd consider updating it just so the team has a better understanding of it's (lack of) complexity though. But you'd have to tell me if anyone would actually look at it. I get the feeling it wont be reviewed. At the moment it's easy enough to review the diff in git skipping over the formatting to see the actual changes. Migrating to develop is non-trivial with the ongoing changes. I think after the next Scrypto release I could try to rebase it to the new main branch as a better target if you think there's still room for more discussion.

End note:

@russellharvey This is more high level feedback. This PR is motivated by a real problem introduced by the asymmetry between fungible and nonfungible tokens, a solution has been presented, no other solutions have been presented, and the team's critique for this is rather lacking in it's response. To say again, I don't really mind that this was given a "no". But the reasoning for this decision as it's been described really makes me concerned about the decision making processes involved, and the design priorities for features and processes for accepting changes. To me this PR is an example of how RDX handles non-trivial contributions. The discussion with @iamyulong have been great and quite open and inviting. I don't mean to point any negativity there at all. However, the actual result from the team leads me to believe that while contributions may be wanted in theory, the organization may not yet be in a place to really receive them. Maybe this sort of change is "too big" and needs more processes to support a real change of this nature. Or maybe other's just don't like this PR's design/implementation. But the reasoning is underwhelming and it seems there's no room for discussion while I've put a lot of effort into explaining the problem and a potential solution after making sure to ask whether this sort of change would really be considered. The "too complicated" answer just seems like it's not really being considered. This is just my opinion based on quite limited information, but thought it worth mentioning that this is how it seems to an outsider even if it is not reflective of the reality. As someone that would like to continue contributing to Scrypto/Radix development this is seriously demotivating. Building a development community is hard. Harder still if it's difficult for developers to meaningfully contribute. There's no silver bullet here, but I wanted to explain this viewpoint to hopefully give RDX Works some motivation to improve in the future. Thanks.

@iamyulong
Copy link
Member

Hi @devmannic , sorry for the disappointment.

Liked I mentioned above the team is mainly concerned with the complexity associated with stack of auth zones which only addresses a rare use case. We need to educate developers with right mental model (stack vs singleton), introduce all the instructions and system apis to support reading/writing the stack, update system auth to support the new model, and make wallet be aware of and support this feature.

The team is definitely happy to revisit this if more people run into similar problem. We just need more data points to decide whether/when to introduce such a change and now is not a good time yet.

For anyone else who is interested, I merged the changes with develop to a temporary branch. You can see the diffs here. More work is required to make it functional, but hopefully it provides some insight into what the scope of change is.

@iamyulong
Copy link
Member

Closing for now as this diverges from the main development.

@iamyulong iamyulong closed this Oct 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants