Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CEL expressions #112

Closed
eguzki opened this issue Oct 16, 2024 · 19 comments · Fixed by #113
Closed

CEL expressions #112

eguzki opened this issue Oct 16, 2024 · 19 comments · Fixed by #113
Assignees
Labels
enhancement New feature or request help wanted Extra attention is needed size/medium

Comments

@eguzki
Copy link
Contributor

eguzki commented Oct 16, 2024

Currently, conditions, matches (or whatever you want to call them), are expressed as a PatternExpressions

pub struct PatternExpression {                           
    pub selector: String,                                
    pub operator: WhenConditionOperator,                 
    pub value: String,                                   
                                                         
    #[serde(skip_deserializing)]                         
    path: OnceCell<Path>,                                
    #[serde(skip_deserializing)]                         
    compiled: OnceCell<CelExpression>,                   
}                                                        

Which in configuration looks like:

matches:
    - selector: request.url_path
      operator: startswith
      value: /get

The ask here is to leverage the Common Expression Language to implement predicates that evaluate to bool.

Something like

predicates:
    - "request.url_path.startsWith('/get')"

Note that data type on the configuration: CEL expressions will be strings that need to be parsed.

@eguzki eguzki added enhancement New feature or request help wanted Extra attention is needed size/medium labels Oct 16, 2024
@eguzki
Copy link
Contributor Author

eguzki commented Oct 16, 2024

The envoy attriubutes are available through a WASM-ABI, which in rust, and for Envoy, looks like

pub fn get_property(path: Vec<&str>) -> Result<Option<Bytes>, Status>

There is no way to add all the attributes to the CEL context and make then available to expressions directly. So the question is about how to make envoy attributes (and I underline Envoy here) on the CEL expressions.

Approach 1: CEL functions to the rescue

predicates:
    - "get_property(request.url_path).startsWith('/get')"

Here. we have defined a custom function get_property, that will bind input param(s), i.e. envoy attribute path, to the WASM-ABI actual pub fn get_property(path: Vec<&str>) -> Result<Option<Bytes>, Status> method.

PROS: we can do this now
CONS: expressions are "uglier", IMO though.

Approach 2: CEL Lazy values Or CEL dynamic values

I came across these two RFE on the cel-rust repo:

I would say both would enable us to implement CEL expressions in a clean way, i.e.

predicates:
    - "request.url_path.startsWith('/get')"

PROS: Clean and straightforward
CONS: Not available today from cel-rust

The Envoy locking here is undesirable, but that's what it is in the current state and context.

@alexsnaps
Copy link
Member

There is a third option, which is what actually happens today already:

parse the expression and replace member navigation in the AST directly. Today, when you have a condition that says request.url_path, it gets flatten to a single binding: attribute.

Also, there is more work that I think should happen as part of this: align authorino & limitador's lingo (or at least at the Kuadrant level). I was thinking having the so-called context: AttributeContext struct be reflected through all our CEL environments. wdyt?

@alexsnaps
Copy link
Member

Also, I think having option 1, probably more like get_property("request", "url_path") tho - possibly type safe too, is something I'd do in any case. It's real straight forward, as it is exposing a sugar coated function from the abi to CEL, but mostly gives us a good fallback in case we need it. Finally, tho still be discusses how the Kuadrant Operator will inject these expressions in the config, that replacement could also be done prior configuration the wasm-shim...
Which would then mean very different looking config too possibly tho

@maleck13
Copy link

I like the function option. I don't believe adding this precludes us from allowing request.url_path.startsWith('/get') in a future release as we would maintain the function option but enhance our CEL support with the second option. I wont pretend to know everything about the innards of this, but I don't see any particular issue with the function as the first way to achieve this.

@alexsnaps
Copy link
Member

And this is NOT public API here. We can absolutely replace request.url_path as it'd be expressed in a user facing Policy with get_property("request", "url_path") in the wasm's configuration as part of the validation happening in the Kuadrant controller.

@alexsnaps
Copy link
Member

I think I have a contingency plan here: just have everything become cel expressions (but the routeRuleConditions (which will become routeRulePredicates then? that probably should use the get_property([]) function directly...). On parsing the expressions, we should calculate an "execution plan" for member navigation, i.e. what are all the props accessed and populate a Map with them... request.url_path can navigate such a map just as well as a Struct, so all that needs to be done is:

  • Parse expression
  • Get all the Ident & member navigation from the AST
  • Store that along side the Expression and
  • Populate the Map (of Maps, of... potentially) and inject it in the Context prior evaluation

@eguzki
Copy link
Contributor Author

eguzki commented Oct 16, 2024

Something like this? From https://docs.rs/cel-parser/0.7.1/cel_parser/ast/struct.ExpressionReferences.html

let expression = parse("foo.bar == true").unwrap();
let references = expression.references();
assert_eq!(vec!["foo"], references.variables());

Interesting idea... could work. But we would need the entire variable instead of the first level. As in the example "foo.bar == true", the references.variables() method returns vector with one element: foo. We cannot infer the full path from it. Can we?

@eguzki
Copy link
Contributor Author

eguzki commented Oct 16, 2024

Also, there is more work that I think should happen as part of this: align authorino & limitador's lingo (or at least at the Kuadrant level). I was thinking having the so-called context: AttributeContext struct be reflected through all our CEL environments. wdyt?

Interesting about AttributeContext. I guess the Wasm ABI would allow that with some API methods like get_http_request_headers and get_property. It would imply building a full request context for only needing few attributes, though. But why not :) benchmarking should tell.

EDIT: we are already doing that for auth service https://github.com/Kuadrant/wasm-shim/blob/main/src/service/auth.rs#L55

Makes sense to me

@alexsnaps
Copy link
Member

alexsnaps commented Oct 17, 2024

So here's how this looks like:

let exp = cel_parser::parse(
   "request.headers.all(n, n in auth.allowedHeaders) && auth.identity.secure" // our predicate
).unwrap();

let mut all = Vec::default();
properties(&exp, &mut all, &mut Vec::default()); // gets all member navigation from the AST

assert_eq!(all[0], vec!["request", "headers"]);
assert_eq!(all[1], vec!["auth", "allowedHeaders"]);
assert_eq!(all[2], vec!["auth", "identity", "secure"]);

assert_eq!(all.len(), 3);

So the idea is to use that all: Vec (calculated once at parse time and stored along side the compiled expression), to populate a ValueType::Map that would contain the values of all the paths navigated.

This is from an actual passing test locally. Am now working on the second part, which is lazily populating all that data.

@alexsnaps
Copy link
Member

@chirino I think this covers what you were trying to address with this PR... Anything you see that wouldn't be covered?

@alexsnaps
Copy link
Member

Also, on that front, I wasn't up to date:

I was thinking having the so-called context: AttributeContext struct be reflected through all our CEL environments. wdyt?

We are NOT having context as root binding. I deleted it from the authorino CEL support. It's straight to metadata, request, source, destination & auth.

@chirino
Copy link

chirino commented Oct 17, 2024

@chirino I think this covers what you were trying to address with this PR... Anything you see that wouldn't be covered?

Yeah. I think this would work. If understand correctly, you would use this to minimize the values you add into the cel context right?

@alexsnaps
Copy link
Member

Yeah. I think this would work. If understand correctly, you would use this to minimize the values you add into the cel context right?

Yes, using a Map instead of a "lazy Struct, as the . dot member navigation works for both (i.e. map.key is the same as struct.field accessor, so we pretend the Map is the Struct ...)

@eguzki
Copy link
Contributor Author

eguzki commented Oct 17, 2024

@alexsnaps I was thinking on start working on this issue. It is being unassigned, but I feel you already started. What's the current state?

@alexsnaps alexsnaps self-assigned this Oct 17, 2024
@alexsnaps
Copy link
Member

Yeah, I'm on it, should be done by EOD

@eguzki
Copy link
Contributor Author

eguzki commented Oct 18, 2024

Looking forward to seeing the implementation of

properties(&exp, &mut all, &mut Vec::default());

@alexsnaps
Copy link
Member

In rebase hell, but... I won't have you wait any longer ;)

fn properties<'e>(
    exp: &'e Expression,
    all: &mut Vec<Vec<&'e str>>,
    path: &mut Vec<&'e str>,
) {
    match exp {
        Expression::Arithmetic(e1, _, e2)
        | Expression::Relation(e1, _, e2)
        | Expression::Ternary(e1, _, e2)
        | Expression::Or(e1, e2)
        | Expression::And(e1, e2) => {
            properties(e1, all, path);
            properties(e2, all, path);
        }
        Expression::Unary(_, e) => {
            properties(e, all, path);
        }
        Expression::Member(e, a) => {
            if let Member::Attribute(attr) = &**a {
                path.insert(0, attr.as_str())
            }
            properties(e, all, path);
        }
        Expression::FunctionCall(name, target, args) => {
            if let Some(target) = target {
                properties(target, all, path);
            }
            for e in args {
                properties(e, all, path);
            }
        }
        Expression::List(e) => {
            for e in e {
                properties(e, all, path);
            }
        }
        Expression::Map(v) => {
            for (e1, e2) in v {
                properties(e1, all, path);
                properties(e2, all, path);
            }
        }
        Expression::Atom(_) => {}
        Expression::Ident(v) => {
            if path.len() > 0 {
                path.insert(0, v.as_str());
                all.push(path.clone());
                path.clear();
            }
        }
    }
}

@eguzki
Copy link
Contributor Author

eguzki commented Oct 18, 2024

From https://github.com/clarkmcc/cel-rust/blob/master/parser/src/ast.rs#L155-L203, I see.

Looks like a good RFE for the Expression public API in the https://github.com/clarkmcc/cel-rust/ so we do not need to maintain this ;)

@alexsnaps
Copy link
Member

Yeah... that's been an on-going discussion again lately... tho, really unsure how the "laziness problem" is best tackled from the interpreter's perspective. Plus there is bunch of work that I think should happen before that, e.g. proper protobuf interop. Which, for one would be a huge benefit to us, but would probably also impact how to best go about this... anyways... problem for future us, let's focus on our release for now 😛

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed size/medium
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants