Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gather use cases for service accounts as selectors #274

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tssurya
Copy link
Contributor

@tssurya tssurya commented Nov 19, 2024

This came from discussions at KubeCon, I want to make sure they are captured here correctly.
Closes: #173

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 19, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tssurya
Once this PR has been reviewed and has the lgtm label, please assign aojea for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from Dyanngg November 19, 2024 10:01
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 19, 2024
Copy link

netlify bot commented Nov 19, 2024

Deploy Preview for kubernetes-sigs-network-policy-api ready!

Name Link
🔨 Latest commit c53903f
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-network-policy-api/deploys/674575a1f736e20008a0987a
😎 Deploy Preview https://deploy-preview-274--kubernetes-sigs-network-policy-api.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.


## Non-Goals

* There might be other identity management constructs like SPIFFE IDs which are outside the scope of this enhancement. We can only provide a way to select constructs known to core Kubernetes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementor bottlenecks - since implementations might not want to consume the whole CRD...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was a conclusion at kubecon that it makes more sense for it to be a new CRD?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about L7 layer CRD instead of the L3/L4 matches since if its in the policy its usually for L4/L3 IP based matches

Copy link

@mikemorris mikemorris Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The typical service mesh pattern is required/expected identity-based authorization policy at L4 with:

  • some provision for configuring a default deny policy
    • deny all connections except those coming from an allowlisted set of identities
    • only applied logically by meshes after the bits have been allowed to flow at L3
      • some mesh implementations with a CNI component may collapse this logic to enforce at L3
      • potentially some use case to restrict L3 based on ServiceAccount in NetPol to drop connections before mTLS can even be established, but writing this in two separate places for defense in depth feels like a burdensome configuration for cluster operators - curious if a single vendor-neutral policy in a separate CRD can/should be applied by multiple implementations in a complementary manner
  • ideally an "audit" mode to preview the impact of default deny after some granular "allow" rules have been configured
  • allowing TCP connection based on identity only if a lower level like CNI did not already drop/block the attempt at L3
  • support for additional constraints at L7 (such as only allowing requests from a given identity on specific paths or with specific HTTP/gRPC methods)
    • many (most?) meshes support this now, but historically this capability was often added years later than initial L4 AuthZ policy implementations
    • ideally this is an extra optional field on an existing AuthZ policy, not a separate CRD from the L4 identity-based policy

Copy link
Contributor Author

@tssurya tssurya Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @mikemorris for the insights here! very valuable, I have tried to capture some parts of this point and the point @danwinship mentioned above under "unresolved discussions" so that its written down somewhere..

some mesh implementations with a CNI component may collapse this logic to enforce at L3

would you know some sample implementations for mesh that consolidates this with the CNI level? I am guessing Cilium is one?

@@ -0,0 +1,440 @@
# NPEP-133: FQDN Selector for Egress Traffic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this related?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it looks like this is the rendered website source for a prior NPEP, probably need to just regenerate the current website content in a new PR and rebase this after that merges.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah its not related at all, running make docs generated all this extra stuff, I can split it into a new PR

Ability to express subjects and peers in (Baseline)AdminNetworkPolicy API
using [Kubernetes service accounts](https://kubernetes.io/docs/concepts/security/service-accounts/).

## Goals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the main goal is to provide network policies based on identities, and service accounts are a good primitive in Kubernetes to do that.
I also think that we want to look for this new model to be more scalable to avoid implementation having to track all the existing Pod IPs

@thockin your thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the interest in describing this as being "based on identities", particularly if you say that arbitrary CIDRs can be "identities". How is calling this "based on identities" better than saying "you can select CIDRs or service accounts or pod labels"?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

particularly if you say that arbitrary CIDRs can be "identities"

I don't think arbitrary CIDRs should be considered as identities - there's a bit of nuance in that some implementations of identity-based authorization policies may ultimately resolve those identities to IPs for enforcement, but an end-user would not be interleaving IP-based policy rules with identity-based policy rules even if the dataplane may collapse them together. https://www.solo.io/blog/could-network-cache-based-identity-be-mistaken and https://cilium.io/blog/2024/03/20/improving-mutual-auth-security/ have more specific details on the potential lossiness of this approach and possible mitigations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea I have added the scale and consistency user story to the next section under user stories, maybe I can change the goals to reflect "identity" or refactor the way I am writing "service account as a way to select pods"

Depending on whether we feel this is a new API CRD or part of ANP -> in which case we need to decide also the points Dan is raising: https://github.com/kubernetes-sigs/network-policy-api/pull/274/files#r1850516386 we might need to tweak the TLDR/Goals sections here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to change the TLDR and Goals to reflect the high level abstract ones instead of the specifics I had in mind in putting them into ANP/BANP based on the discussion we had in the sig-network-policy-api meeting on tuesday (19th) as its not really clear if this is going to be part of netpol API or a new API

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second mike here. The main benefits I can see are;

  1. Users dont interact with CIDRs/IPs.
  2. Some implementations of the API dont need to track IPs to provide this functionality (when using identities).
  3. Serviceaccounts are more secure than labels. e.g they dont span namespaces, can not be changed at runtime.

We dont have to be opinionated on how the API would be implemented, and service account as a way to select pods can just be an implementation detail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments were based on Cilium's definitions, where (AIUI) everything you can apply a network policy toward is considered an "identity". (e.g., this blog post talks about how it turns podSelectors into "identities").

I had initially assumed people were talking about the same thing here. In particular, I had assumed that people were still talking about having a mix of "traditional" NetworkPolicy rules and new service-account-based NetworkPolicy rules, but calling the whole thing "identity-based", which seemed pointless.

But it sounds like what most of the people here actually want is a separate parallel network policy stack, which only uses "identity-based" rules based on service-account/mTLS/SPIFFE/whatever.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments were based on Cilium's definitions, where (AIUI) everything you can apply a network policy toward is considered an "identity". (e.g., this blog post talks about how it turns podSelectors into "identities").

I

cilium identities are clear a no goal, and now I realize that makes the terminology confusing.

If we consider identity as "the set of characteristics that uniquely identify an application (running in a Pod)".
IPs can uniquely identify a Pod, but if a deployment is rolled out the IPs will change (pods don't restart are recreated), so the rules applied directly to IPs will not be identity based.

We can use the labels to identify the pods of the deployments/daemonsets but this identity system seems mostly based on the implementation details of kubernetes, and labels can be added arbitrarily.

Service Accounts by the kubernetes docs definitions are "a type of non-human account that, in Kubernetes, provides a distinct identity in a Kubernetes cluster. Application Pods, system components, and entities inside and outside the cluster can use a specific ServiceAccount's credentials to identify as that ServiceAccount. "

I think that ServiceAccounts can work independently on how we translate the API to the implementation, map ServiceAccount to IPs, use ServiceAccount Tokens or Certificates , .... and also allow us to define cross-cluster or cross-services policies

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm my understanding was the same as Dan's which is why I called out the non-goal here but seems like that is one of the main goals looks like :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we discussed this during yesterday's netpol api meeting and agreed that SPIFFEEID can be a non-goal for V0 maybe but maybe good to track the user stories for all of this first


## User-Stories/Use-Cases

1. **As a** cluster admin **I want** to select pods using their service accounts instead of labels **so that** I can avoid having to setup webhooks and validation admission policies to prevent users from changing labels on their namespaces and pods that will make my policy start/stop matching the traffic which is undesired
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to "select pods" as using the service accounts as an aggregation of pods, I think we want to provide identities to the applications and then implement policies based on those identities, how to implement it is a different thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm during the kubecon meeting I thought for the purpose of this working group if we were to put it into (admin) network policies then it has to be pods because we don't really have any other identity in core kubernetes that the API recognises and do we define abstract things in the API? Plus the L3/L4 is only where we implement policies not at L7,
however if we plan to go with new CRD for expressing this then we could call it an identity but even then service account today is only for pods right? what are the other identites we plan to cover use cases for? I called our SPIFFE use case as non goal since we don't plan to address that in this API right? I can move it out of the non-goals if the word "identity" is better suited than "serviceaccounts of pods"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think we should have a separate API for this that is strictly focused on identities. The starting point has to be identities based on Service Accounts that both mesh and CNI implementations would have a path to implement.

however if we plan to go with new CRD for expressing this then we could call it an identity but even then service account today is only for pods right? what are the other identites we plan to cover use cases for? I called our SPIFFE use case as non goal since we don't plan to address that in this API right? I can move it out of the non-goals if the word "identity" is better suited than "serviceaccounts of pods"

I'd personally rather focus on identity, with ServiceAccount as our starting point. Even though that's roughly the same thing, it leaves room for future expansion into other identities, potentially following a SPIFFE format. That seems like one of the key benefits of an identity-based policy, so I'd like to ensure we're leaving room in the API to eventually support identities outside of the local cluster.

## User-Stories/Use-Cases

1. **As a** cluster admin **I want** to select pods using their service accounts instead of labels **so that** I can avoid having to setup webhooks and validation admission policies to prevent users from changing labels on their namespaces and pods that will make my policy start/stop matching the traffic which is undesired
2. **As a** cluster admin **I want** to select pods using their service accounts instead of labels **so that** I can avoid the scale impact caused due to mutation of pod/namespace labels that will cause a churn which makes my CNI implementation react to that every time user changes a label.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this ends on translatiing service accounts to Pods, the scale problem is going to be worse, because CNI/Netwokr plugins will need to watch both service accounts and pod objects, today they just watch one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, the idea is that they'd just match on pod.spec.serviceAccountName. They don't have to look at the actual ServiceAccounts at all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea I think we all agreed on that day "using labels to select serviceAccounts" should be dropped cause then it just re-introduces the problems and instead directly match on the name of the account, but yea that is an implementation detail

## Goals

* Implement service account as a way to select pods in the policy's subject
* Implement service account as a way to select pods in the policy's peers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were two completely independent sets of goals in the discussion:

  1. Use service accounts / "identities" as a more scalable / less accidentally-explodable way of describing relevant policies about pods.
  2. Provide a standard way to configure/describe the service mesh behavior of intercepting traffic and deciding whether to allow it based on information in the TLS packets.

The latter part was much more complicated and problematic, and I'm not sure we came up with any good answers. In particular, there seemed to be no good way to say "deny all, except allow particular serviceAccount-to-serviceAccount connections" if the "deny all" was going to be implemented by the network plugin but the "allow based on serviceAccount" was going to be implemented by the service mesh. It seemed like the best we could do was to have the network plugin implement that as "deny all except allow particular pod-to-pod connections by matching serviceAccounts up to pod IPs" and then having the service mesh implement "allow TLS connections by matching serviceAccounts up to TLS certificates", which would, in the normal case, be entirely redundant because the service mesh is already ensuring that all pod-to-pod traffic uses exactly the expected TLS certificates, so the TLS checking rule would end up allowing everything since the IP-checking rule would already have filtered out any disallowed connections.

But I feel like, as the meeting ended, we had not yet given up on that idea? (Partly because the meeting ended abruptly because we ran out of time...)

Copy link

@mikemorris mikemorris Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the "deny all" was going to be implemented by the network plugin but the "allow based on serviceAccount" was going to be implemented by the service mesh

Yea I think that should not be expected to function seamlessly - the "deny all" behavior needs to be implemented by the mesh layer, and is only applied if the IP-based connection has been allowed by the CNI layer. Some cluster operators with a particularly "defense in depth" posture may stack policies like "deny all except port 443" or even want a ServiceAccount selector on L3 NetPol (for use case 1) in addition to cryptographic authorization provided by a mesh, although that would be particularly burdensome to operate as it would require coordination between networking and app dev teams (who typically own AuthZ policies for their own services) to update authorization policies and network policies (which can still be a familiar pattern in more traditional organizations).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but it's not necessarily "deny all". You could have an AdminNetworkPolicy that says:

egress:
  - action: Deny
    to:
      - pods:
          podSelector: ...
  - action: Allow
    to:
      - serviceAccount: foo
  - action: Deny
    to:
      - pods:
          podSelector: ...

where the identity-based part of the policy is sandwiched between two non-identity-based parts.

There does not seem to be any plausible way to have the podSelector parts of that rule be implemented by the network plugin but have the serviceAccount part be implemented by the mesh (without implementation-specific coordination between them, or, obviously, if the network plugin is the mesh).

More generally, it feels like all network policy (in the allow/deny sense) needs to be implemented by a single component. Or at least, for every pair of destinations A and B, there must be only a single component implementing network policy between them. (eg, you could potentially have a service mesh doing TLS-based policy for pod-to-pod traffic, with another network plugin doing IP-based policy for pod-to-external and external-to-pod traffic).

Copy link

@mikemorris mikemorris Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There does not seem to be any plausible way to have the podSelector parts of that rule be implemented by the network plugin but have the serviceAccount part be implemented by the mesh

Yea, absolutely agreed - that was a significant motivation for the conclusion we somewhat reached at the end of the discussion that a new CRD specifically for ServiceAccount/identity-based authorization policy only would likely be preferable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good "UNRESOLVED NEED TO THINK ABOUT" discussion that I should probably capture in one of the subsections, will add a unresolved discussions section and start putting these there

@robscott
Copy link
Member

Thanks for writing this up @tssurya!

/cc @robscott @LiorLieberman

@thockin
Copy link

thockin commented Nov 25, 2024

I will take a look at this ASAP, too (rough with US holiday this week).

@tssurya tssurya force-pushed the serviceAccountMatch branch from 23d6732 to c53903f Compare November 26, 2024 07:15
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 26, 2024
Comment on lines +33 to +37
1. **As a** cluster admin **I want** to select pods using their service accounts
instead of labels **so that** I can avoid having to setup webhooks and
validation admission policies to prevent users from changing labels on their
namespaces and pods that will make my policy start/stop matching the traffic
which is undesired
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand this user story: the only difference between labels vs serviceAccount here is that labels can be changed after pod creation, and serviceAccount can't. But the user can still create a new serviceAccount and a new pod using this account, that will not "match" a policy.
This user story kinda suggests that admin sets up ANPs using pod labels, and also creates or labels all pods in the cluster to match created ANPs, and doesn't want the users to be able to change those pod labels?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a huge different IMO.

  1. This is as much a human issue rather than a technical one, but users are very used to using serviceAccount as a security boundary -- it serves essentially no purpose other than security, so its heavily scrutinized. Both in human reviews ("why are you changing the SA") and in validation tooling
  2. Its far more likely to have 'update pod labels' permission on a workload than 'create a pod'. I'd wager >50% of clusters have a DaemonSet with that permission (which implies that if ANY node is compromised, any pod-label-based network policy can be bypassed).
  3. The cross-namespace label selection aspect is easy to get wrong. For example "allow: role=admin" seems fine, except I gave my devs access to a namepace dev-blah and they have arbitrary label access. Oops! This can be avoided by using a namespace selector as well, but the API doesn't enforce it. With SA it would (presumably) be impossible to not specify the namespace

Some more discussion in https://blog.howardjohn.info/posts/netpol-api/ and https://blog.howardjohn.info/posts/node-zerotrust/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only difference between labels vs serviceAccount here is that labels can be changed after pod creation, and serviceAccount can't. But the user can still create a new serviceAccount and a new pod using this account, that will not "match" a policy.

yes and that is exactly the use case I am trying to cover which is the fact that labels are mutable on pods and service accounts are not - create and delete of pods for sure is going to cause the same churn for both selectors, but updates of pods needs that extra step/precaution for labels.
happy to reword this based on suggestions if something is not clear

Comment on lines +54 to +57
between the CNI plugin that implementations the network policy at L3/L4 and
service mesh implementation that implements the policy at L7?
* One way is to probably split the implementation responsibility:
1. CNI plugin can take of implementing denyAll connections and
Copy link
Contributor

@aojea aojea Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is assuming that is either CNI plugin or service mesh, and can perfectly be another component implementing it, see kube-network-policies per example ... I think we need to be agnostic of the implementation and focus on the behaviors we want to implement ... "we want to express that a group of pods using a specific service account can or can not communicate with another group identified by another service account" ... at what point happens the policy enforcement should not be in the scope and left to the implementation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how a non-CNI can implement it then. If both the CNI and the mesh are trying to enforce the same rule, it is either redundant (they both apply the same logic) or broken (one accepts, one denies).

For instance, I may have kube-network-policies and Istio, and say "allow SA foo". I get a TLS request from out-of-cluster with a TLS SAN mapping to foo. Istio would allow it but kube-network-policies would deny it.

IMO we need something like networkPolicyClass to specifically select which layer handles it

with other types of selectors like namespaces, pods, nodes, networks, fqdns and
expecting mesh implementations to implement the full CRD just to get the identity
based selection might not be practical
* Making it part of new API: We fall back to compatibility problem of the different
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why compatibility issues, the different policies applies should be an AND, in which sense is this incompatible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see compatibility issues but I can see terrible UX if we end up with another API which sits on top of netpol. This means that as a user, to allow point A to talk with point B on a cluster with default deny I need to:

  1. add a netpol to allow the traffic
  2. add an identity-based policy to allow it
  3. potentially add an L7 policy if I want L7 attributes based policy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are a non-trivial set of implementations that would like to support an identity-based version of network policy, but would not support the additional set of APIs based on label selectors. To me, that suggests that it would be cleaner to have a new API focused on identity-based APIs.

I agree that @LiorLieberman's example is not a great UX, but I'd hope that most users would be fully served by only one of these APIs and would not need to layer like that. Could we express default deny within this new policy to avoid layering?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW in Istio we would ideally have a policy that consists only of SA/identity and not pod label selectors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why compatibility issues, the different policies applies should be an AND, in which sense is this incompatible?

probably compatibility is not the right word here, but I meant the layering problem, but if we agree "AND" of the APIs is what we are after wherein L3/L4 netpol API is denying a connection but the new identity API one wants to allow that (for podsSAs) then we deny cause we need it to be allow on all APIs and there is one API denying the connection even if the other API explicitly allows it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we express default deny within this new policy to avoid layering?
@aojea interested in your thoughts regarding default deny with this

CNI component in them that collapses the logic of enforcement - so maybe
that might become a requirement that the implementation be able to handle
end to end enforcement of the full policy
* We might also want to express additional constraints at L7 (such as only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should put this in scope, L7 is a totally different layer with much more complexity and infinite number of protocols with different semantics

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I think we should not care about it and leave it to Gateway/mesh implementations. Most probably will be a different API.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree. FWIW Istio literally splits these layers into two components (so there are really 3 layers in the system: L3/L4 CNI, L4 TLS, L7 TLS + HTTP)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack; if we shouldn't worry about additional L7 requests I can remove this point, I think I was trying to capture some talking points from #274 (comment) but I can remove this given we probably don't want to club that here

Copy link
Member

@LiorLieberman LiorLieberman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for capturing the notes @tssurya! I left a few comments

Ability to express subjects and peers in (Baseline)AdminNetworkPolicy API
using [Kubernetes service accounts](https://kubernetes.io/docs/concepts/security/service-accounts/).

## Goals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second mike here. The main benefits I can see are;

  1. Users dont interact with CIDRs/IPs.
  2. Some implementations of the API dont need to track IPs to provide this functionality (when using identities).
  3. Serviceaccounts are more secure than labels. e.g they dont span namespaces, can not be changed at runtime.

We dont have to be opinionated on how the API would be implemented, and service account as a way to select pods can just be an implementation detail.

This NPEP tries to capture the design details for service accounts as selectors
for policies.

## User-Stories/Use-Cases
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one of the goals/motivation is also to decrease the implementation complexity and allow larger scale. I think we want to cover this here or in the goals section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiorLieberman: check user story2 does that cover what you had in mind? If its not clear and requires clarity I can try to reword it based on suggestions

CNI component in them that collapses the logic of enforcement - so maybe
that might become a requirement that the implementation be able to handle
end to end enforcement of the full policy
* We might also want to express additional constraints at L7 (such as only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I think we should not care about it and leave it to Gateway/mesh implementations. Most probably will be a different API.

Comment on lines +51 to +60
* NetworkPolicies apply on L3 and L4 while Meshes operate at L7 mostly. So when
trying to express "denyAll connections except the allowed serviceAccount to
serviceAccount connections"; how do we split the enforcement in this scenario
between the CNI plugin that implementations the network policy at L3/L4 and
service mesh implementation that implements the policy at L7?
* One way is to probably split the implementation responsibility:
1. CNI plugin can take of implementing denyAll connections and
allowed serviceAccount to serviceAccount connections upto L3/L4
2. service mesh implementation can implement the allowed serviceAccount to
serviceAccount connections on L7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This concerns me the most. Cant think of a good way to divide the responsibility between mesh and cni in the same API without creating an extremely complex UX (and perhaps less portable experience)

with other types of selectors like namespaces, pods, nodes, networks, fqdns and
expecting mesh implementations to implement the full CRD just to get the identity
based selection might not be practical
* Making it part of new API: We fall back to compatibility problem of the different
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see compatibility issues but I can see terrible UX if we end up with another API which sits on top of netpol. This means that as a user, to allow point A to talk with point B on a cluster with default deny I need to:

  1. add a netpol to allow the traffic
  2. add an identity-based policy to allow it
  3. potentially add an L7 policy if I want L7 attributes based policy.

service mesh implementation that implements the policy at L7?
* One way is to probably split the implementation responsibility:
1. CNI plugin can take of implementing denyAll connections and
allowed serviceAccount to serviceAccount connections upto L3/L4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this, as I was saying in #274 (comment), is that it seems pointless. If this is going to work correctly, without accidentally allowing any connections that aren't supposed to be allowed, then the ANP/NP implementation needs to let through exactly the set of connections that the mesh implementation is going to inspect, which implies it needs to understand "serviceAccount selectors" and how they correspond to pod IP addresses, but at that point, it can just validate the serviceAccount selectors itself based entirely on IPs, and the semantics of the service mesh guarantee that this is exactly equivalent to validating based on the TLS certificates, so the mesh doesn't actually have to do anything.


So we've been thinking about the TLS-based validation as being something roughly parallel to the "NetworkPolicy" step of our policy evaluation chain:

          +--------------------+    
          | AdminNetworkPolicy |    
          +--------------------+    
                    /\              
                  /    \                        
                /        \                      
  +---------------+    +------------------+
  | NetworkPolicy |    | TLS-based policy |
  +---------------+    +------------------+                                          
                \        /          
                  \    /            
                    \/              
      +----------------------------+
      | BaselineAdminNetworkPolicy |
      +----------------------------+                                    

but what if instead it was parallel to the BANP step?

          +--------------------+    
          | AdminNetworkPolicy |    
          +--------------------+    
                     |              
                     |              
                     |              
             +---------------+                                         
             | NetworkPolicy |                                         
             +---------------+                                          
                    / \             
                  /     \           
                /         \         
           +------+    +-----------+
           | BANP |    | TLS-based |
           +------+    +-----------+

We'd say, if you want to use TLS-based policy, then that becomes the baseline (at least for pod-to-pod traffic). Then we don't have to worry about interleaving IP-based and TLS-based policy, because you'd only reach the TLS-based policy step if you had already exhausted the IP-based steps. (This doesn't mean you can't have "deny all unless accepted by serviceAccount match", it just means that the "deny all" part has to be implemented by the TLS-based policy implementation rather than by the core ANP/NP implementation in this case.)

This may go with the idea I had vaguely tossed out about replacing BaselineAdminNetworkPolicy with "ClusterDefaultPolicy" as part of trying to make tenancy make sense...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at that point, it can just validate the serviceAccount selectors itself based entirely on IPs, and the semantics of the service mesh guarantee that this is exactly equivalent to validating based on the TLS certificates, so the mesh doesn't actually have to do anything.

Isn't there some eventual consistency at play here? With a TLS certificate, I know the identity of the Pod with confidence. With an IP address, I'm assuming it matches the Pod that had that IP based on the most recent data I have, but if my Pod -> IP mapping is out of date, this isn't as strong of a guarantee as a TLS cert is.

This doesn't mean you can't have "deny all unless accepted by serviceAccount match", it just means that the "deny all" part has to be implemented by the TLS-based policy implementation rather than by the core ANP/NP implementation in this case.

+1, I think I was trying to get to a similar point in another comment. "Deny all" could also be implemented at the identity-based policy layer.

This may go with the idea I had vaguely tossed out about replacing BaselineAdminNetworkPolicy with "#178 (comment)" as part of trying to make tenancy make sense...

It would definitely be worth exploring a way to express "default deny" in a sufficiently generic way that any category of implementation could support it. For example, that seemed to be focused on namespace granularity, which could be some common ground in terms of implementability.

Comment on lines +33 to +37
1. **As a** cluster admin **I want** to select pods using their service accounts
instead of labels **so that** I can avoid having to setup webhooks and
validation admission policies to prevent users from changing labels on their
namespaces and pods that will make my policy start/stop matching the traffic
which is undesired

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a huge different IMO.

  1. This is as much a human issue rather than a technical one, but users are very used to using serviceAccount as a security boundary -- it serves essentially no purpose other than security, so its heavily scrutinized. Both in human reviews ("why are you changing the SA") and in validation tooling
  2. Its far more likely to have 'update pod labels' permission on a workload than 'create a pod'. I'd wager >50% of clusters have a DaemonSet with that permission (which implies that if ANY node is compromised, any pod-label-based network policy can be bypassed).
  3. The cross-namespace label selection aspect is easy to get wrong. For example "allow: role=admin" seems fine, except I gave my devs access to a namepace dev-blah and they have arbitrary label access. Oops! This can be avoided by using a namespace selector as well, but the API doesn't enforce it. With SA it would (presumably) be impossible to not specify the namespace

Some more discussion in https://blog.howardjohn.info/posts/netpol-api/ and https://blog.howardjohn.info/posts/node-zerotrust/

3. **As a** cluster admin my workloads have immutable identities and **I want**
to apply policies per workloads using their service accounts instead of labels
**since** I want to have eventual consistency of that policy in the cluster.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plausibly: As a cluster admin, I want to assert policies against workloads outside of the cluster

Whether this is to other k8s clusters or non-k8s, having SA enables this if you make the implementation choice that SA maps to a transport identity (i.e. in Istio it maps to TLS SAN)

Copy link
Contributor Author

@tssurya tssurya Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm so this would be the non-goal I had listed out but we can revisit this and I can try to add this user story as well as a goal - given maybe the entity like you said is on another k8s cluster, let me learn more about the istio use case here and see if for each of my above use cases I can come up with concrete examples for the user story

serviceAccount connections"; how do we split the enforcement in this scenario
between the CNI plugin that implementations the network policy at L3/L4 and
service mesh implementation that implements the policy at L7?
* One way is to probably split the implementation responsibility:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative option is to have a networkPolicyClass (or similar).

Comment on lines +54 to +57
between the CNI plugin that implementations the network policy at L3/L4 and
service mesh implementation that implements the policy at L7?
* One way is to probably split the implementation responsibility:
1. CNI plugin can take of implementing denyAll connections and

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how a non-CNI can implement it then. If both the CNI and the mesh are trying to enforce the same rule, it is either redundant (they both apply the same logic) or broken (one accepts, one denies).

For instance, I may have kube-network-policies and Istio, and say "allow SA foo". I get a TLS request from out-of-cluster with a TLS SAN mapping to foo. Istio would allow it but kube-network-policies would deny it.

IMO we need something like networkPolicyClass to specifically select which layer handles it

CNI component in them that collapses the logic of enforcement - so maybe
that might become a requirement that the implementation be able to handle
end to end enforcement of the full policy
* We might also want to express additional constraints at L7 (such as only

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree. FWIW Istio literally splits these layers into two components (so there are really 3 layers in the system: L3/L4 CNI, L4 TLS, L7 TLS + HTTP)

with other types of selectors like namespaces, pods, nodes, networks, fqdns and
expecting mesh implementations to implement the full CRD just to get the identity
based selection might not be practical
* Making it part of new API: We fall back to compatibility problem of the different

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW in Istio we would ideally have a policy that consists only of SA/identity and not pod label selectors.

@tssurya
Copy link
Contributor Author

tssurya commented Dec 3, 2024

lots of comments here! :) will look through them and try to cover/reflect the discussions or at least write it down in the next update I push

Comment on lines +56 to +61
* One way is to probably split the implementation responsibility:
1. CNI plugin can take of implementing denyAll connections and
allowed serviceAccount to serviceAccount connections upto L3/L4
2. service mesh implementation can implement the allowed serviceAccount to
serviceAccount connections on L7
* NOTE: There are some service mesh implementations which have a
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The basic allow/deny on service account interops fairly well:

  • In dataplane, you match on IPs associated with pods that have the given service account.
  • At L4-7 you actually check the TLS credentials match the service account.

Calico has had this capability for a long time.

Comment on lines +65 to +68
* We might also want to express additional constraints at L7 (such as only
allowing requests from a given identity on specific paths or with specific
HTTP/gRPC methods) - Ideally we would want these extra fields to be coexisting
with the L3/L4 Authorization policy
Copy link

@fasaxc fasaxc Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where it gets much harder to reason about. Assuming the network layer gets the first look at the packet, it has to let through anything that might match the L7 rule. Given that rules are ordered, that means that

allow spiffe://qwerty to path /x/y/z 
allow from selector x == "y"
deny all

is problematic; at the network layer, we'd have to interpret that as "allow all" on the assumption that L7 will kick in and implement the first rule.

Calico does this too (not with spiffe IDs, but with HTTP path/method) with our L7 policy but it's not pleasant; it means that we have to maintain a user-space policy engine that can handle, not only the L7 rule at the start but also the second and third rules. The user doesn't know if a packet will be blocked early in BPF/iptables or if it'll go all the way to the L7 engine and be dropped in user space (at presumably higher cost).

Even though we tried out this idea(!), I'm inclined to say that, with the exception of service account selector (which is meaningful to both), it makes sense to push the spiffe/HTTP/L7-only stuff out into a separate policy. Seems likely that a user would write something like this:

allow from <LAN subnet where I expect the external SPIFFE client to live>
allow from selector x == "y"
deny all

and then a separate (hypothetical) L7 policy

allow spiffe://qwerty to path /x/y/z 
deny all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] Support for ServiceAccount Match