Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Functions Send + Sync #55

Merged
merged 2 commits into from
Jun 26, 2024
Merged

Conversation

lucperkins
Copy link
Contributor

@lucperkins lucperkins commented Jun 25, 2024

This makes implementations of Function usable in multi-threaded contexts.

I'm happy to put this behind a feature flag if desired.

@clarkmcc
Copy link
Owner

Hm, what's an example of something that doesn't work without this PR? In this example, is the goal to share a Context between two threads?

@lucperkins
Copy link
Contributor Author

@clarkmcc Yes. Specifically, in my case I want to use a Context with Axum, and if you want to bind something to your server as state, it needs to implement Send and Sync. I think it's a pretty straightforward use case.

@clarkmcc
Copy link
Owner

Why would you want to bind Context to Axum state? Context is not intended to be thread-safe. There's really two concepts in this CEL implementation:

  • Program - This is the CEL program that has been parsed and is ready for execution. It is thread-safe, allowing you to use the same program over and over and across the thread boundary. It doesn't require mutable shared access because nothing in a program is mutable.
  • Context - This is where functions and variables are stored. It is not thread-safe because the CEL spec doesn't define the behavior of changing a variable used by a program midway through executing that program.

In the context of an Axum server, let's say that you were implementing some sort of auth validation or something with CEL. The program could be shared across Axum handlers (this would be the CEL logic for validating), but on each Axum request, you'd create a new Context and add variables like the request path, the client IP, etc which could then be reference in the thread-safe program.

This is sort of the mental model that I have for the difference between a Program (thread-safe) and a Context (not thread-safe).

Can you elaborate more on how your use-case differs from the design here?

@lucperkins
Copy link
Contributor Author

lucperkins commented Jun 25, 2024

Basically I want to be able to make authorization decisions for HTTP calls. Policy conditions, expressed as CEL, are supplied by admins and stored somewhere (database, in-memory, etc.). When users want to do something, a Program is created on the fly and evaluated using a Context. That Context would consist of a corpus of built-in functions and variables but would be enriched with specific values for specific evaluations. So let's say admin Big Boss has created the CEL policy username != "bad-actor-420" for some action. Then, one of the admin's employees wants to perform that action. The server determines that the employee works for Big Boss and fetches Big Boss's CEL condition. The Context is enriched with a value for that user's username and then the Big Boss's CEL policy is evaluated, with the result of the evaluation varying based on the user (so good-guy-steve is allowed to perform the action but bad-actor-420 is not). So in this case, the Program varies and, preferably, the Context stays mostly the same.

Now, I'm not wedded to this model and if you think there's a better way to do things given those goals, I'm all ears. The only alternative that really comes to mind is building a brand new context for every request, which I would prefer not to do.

@clarkmcc
Copy link
Owner

clarkmcc commented Jun 26, 2024

@lucperkins thanks for the explanation of the use-case. It makes perfect sense. In addition to making the functions Send + Sync, we'd also need to make the Context itself Send and Sync, correct? What would you propose there? Wrapping everything in a Mutex?

You know what could be interesting... the context can either be a root context, or a child context (see blog post about it) where the child context just has a reference to it's parent which could be another child context, or it could be the root context. This is how we support variable shadowing in CEL macros like .map. When a variable is not found in a child context, we traverse upwards until we get to the root.

We could look at implementing copy-on-write semantics in order to make Context Sync, where you can initialize your functions, your global variables that never change, etc. But then as soon as you add a variable, under the hood you get back a new child context that references the original parent context, allow you to share all the functions and globals, but be able to get a new context with just the additional variables you need for the request.

The only other question is whether we'd run into lifetime issues where the compiler won't be able to prove that the root context outlives the child contexts passed to the Axum handlers.

Anyways, that's sort of a brain-dump. Thoughts?

@lucperkins
Copy link
Contributor Author

lucperkins commented Jun 26, 2024

@clarkmcc First off, thanks a ton in turn for entertaining my suggestion and thinking it through so thoroughly. @cole-h has a maybe-solution to this where he has a struct that wraps Context:

struct PolicyDecider(Context<'static>);

impl PolicyDecider {
    fn new() -> Self { Self(Context::default()) }

    fn evaluate(&self, condition: &str) -> Result<Value, ExecutionError> {
        let mut inner_context = self.0.new_inner_scope();
        inner_context.add_variable_from_value("someVar", Value::from("someVal"));
        let program = Program::compile(condition)?;
        program.execute(inner_context)
    }
}

So basically using the mechanism that you're currently using for map and others. Seems to be working fine in testing and at least seems to avoid any potential mutability-related issues. Does this seem reasonable to you? Anything stand out? If it does seem reasonable, I'd be happy to add some docs around this because I suspect others will have similar goals.

@clarkmcc
Copy link
Owner

@lucperkins yes, this is exactly what I had in mind.

If this meets your needs, I think I'd also like to clean up the API a bit so that it's more obvious what functions to call to do exactly this. I'm also trying to think through whether we can just make this implicit behind the scenes -- i.e. context is always immutable and when you add a variable, it creates a child context on the spot and gives it back. I would need to figure out the best way to do that with support for batching variables into the context, and I would also need to figure out if doing this implicitly presents lifetime challenges for the user.

@lucperkins
Copy link
Contributor Author

@clarkmcc I'd say this indeed meets our needs for now. More than happy to help out re: API design, so please don't hesitate to ping us. I'm closing this specific PR for now as I definitely agree about Send + Sync. I continue to be impressed by this project!

@lucperkins lucperkins closed this Jun 26, 2024
@lucperkins lucperkins deleted the function-send-sync branch June 26, 2024 20:32
@lucperkins lucperkins restored the function-send-sync branch June 26, 2024 20:38
@lucperkins lucperkins reopened this Jun 26, 2024
@lucperkins
Copy link
Contributor Author

@clarkmcc I take it back. I conferred with Cole again, and actually we want to re-propose this specific change. Even when carefully handling mutability as in the pattern I laid out above and using Arc<PolicyDecider>, not having Function be Send + Sync breaks a lot of stuff in Axum (like error handling). We'll continue to explore other solutions here but for now, this would be a helpful unblocker for us.

@clarkmcc
Copy link
Owner

clarkmcc commented Jun 26, 2024

Would this PR be sufficient though? Context is still not Sync even with these changes I believe.

@clarkmcc
Copy link
Owner

And if I could throw out something else for consideration: there's probably a good argument for registering functions with a Program not with a Context. The functions that are required are going to be coupled with the Program that you're running, and there's not a good reason that I can think of that you'd want to take the same program and execute it multiple times with different functions in the context.

If we were to move functions to the Program, and then you initialized a new Context for each Axum request, putting into it only the required variables, would that meet your needs, or are there still benefits to sharing large portions of the Context outside of functions?

@lucperkins
Copy link
Contributor Author

@clarkmcc In our case, it's okay for Context to not be Sync.

The issue with your suggestion, I think, is that we don't know what the Program will be until the request arrives. We ascertain who the user is and what they want to do, then we figure out which condition (CEL string) applies to them, then we update the context with values required for the evaluation, and then we evaluate. So we basically have a big chonky initial Context that we enrich with a few variables on demand and the Program is just a simple string that we pull from a database.

@clarkmcc clarkmcc merged commit 5ecee90 into clarkmcc:master Jun 26, 2024
2 checks passed
@lucperkins lucperkins deleted the function-send-sync branch June 26, 2024 22:02
@clarkmcc
Copy link
Owner

@lucperkins no big deal, but if you have a spare moment, it would be great if you could add a minimum reproducible example of your use-case to the examples directory. Since we're pre 1.0, I'd like to capture as many use-cases as possible so that any breaking changes going forward still support the legitimate use-cases that exist in the wild.

If you don't have the time, not a big deal at all.

@lucperkins
Copy link
Contributor Author

@clarkmcc More than happy to!

@lucperkins lucperkins mentioned this pull request Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants