Implement inline CIDR block egress peer #185

Conversation


@tssurya tssurya commented Dec 30, 2023

This PR adds support for implementing inline CIDR
peer blocks.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 30, 2023

netlify bot commented Dec 30, 2023

Deploy Preview for kubernetes-sigs-network-policy-api ready!

🔨 Latest commit: 6ff2bac
🔍 Latest deploy log: https://app.netlify.com/sites/kubernetes-sigs-network-policy-api/deploys/65e720e34ff0e500086fd515
😎 Deploy Preview: https://deploy-preview-185--kubernetes-sigs-network-policy-api.netlify.app

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 30, 2023
// +optional
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=100
Networks []string `json:"networks,omitempty" validate:"omitempty,dive,cidr"`
Contributor Author

Need to come back to how we want the validation of the CIDRs to be done.
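(Side note: the `validate:"omitempty,dive,cidr"` tag above matches go-playground/validator syntax, which the API server never evaluates for CRDs; it would only help Go callers. A minimal sketch, assuming that library, of what the tag-based check does:)

```go
package main

import (
	"fmt"

	"github.com/go-playground/validator/v10"
)

func main() {
	validate := validator.New()

	// "dive" descends into each slice element; "cidr" checks that the
	// element parses as CIDR notation (either address family).
	networks := []string{"10.0.0.0/8", "fd00::/64", "not-a-cidr"}
	err := validate.Var(networks, "omitempty,dive,cidr")
	fmt.Println(err) // reports the third element as failing the "cidr" tag
}
```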



tssurya commented Dec 31, 2023

/test pull-network-policy-api-verify


astoycos commented Jan 3, 2024

@tssurya You just need to run gofmt here

Verifying gofmt
diff ./apis/v1alpha1/shared_types.go.orig ./apis/v1alpha1/shared_types.go
--- ./apis/v1alpha1/shared_types.go.orig
+++ ./apis/v1alpha1/shared_types.go
@@ -176,11 +176,11 @@
 	// +optional
 	Nodes *metav1.LabelSelector `json:"nodes,omitempty"`
 	// Networks defines a way to select IPs and CIDR blocks that represent
-    // entities that live outside the cluster as a peer.
+	// entities that live outside the cluster as a peer.
 	// It is the list of NetworkCIDR (both v4 & v6) that can be used to define
 	// external destinations.
-    // You could use this to select internal cluster networks like podCIDR blocks, but it is
-    // recommended to use namespaces and pods peers instead in those cases.
+	// You could use this to select internal cluster networks like podCIDR blocks, but it is
+	// recommended to use namespaces and pods peers instead in those cases.
 	//
 	// Support: Extended
 	//
Please run make fmt to fix the issue(s)
Test FAILED: hack/../hack/verify-gofmt.sh


astoycos commented Jan 8, 2024

/retest


tssurya commented Jan 18, 2024

/hold
@astoycos: yeah, this is actually only a WIP PR, adding a hold here; it's a PoC for the NPEP PR #144, which needs to get merged first.

Need your help in getting PR 144 merged asap.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 18, 2024
@tssurya tssurya changed the title Implement inline CIDR block egress peer WIP: Implement inline CIDR block egress peer Feb 9, 2024
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 9, 2024
@tssurya tssurya force-pushed the implement-egress-traffic-semantics-inline-cidrs branch 5 times, most recently from f84e679 to 8bfeda6 on February 13, 2024 11:23
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 13, 2024
@tssurya tssurya force-pushed the implement-egress-traffic-semantics-inline-cidrs branch from 68943e0 to 96a7cc3 on February 13, 2024 20:38
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 18, 2024
@tssurya tssurya force-pushed the implement-egress-traffic-semantics-inline-cidrs branch 2 times, most recently from 6e9cb97 to 68ec2c3 on February 20, 2024 13:51
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 20, 2024
// <network-policy-api:experimental>
// +optional
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=45
Contributor Author

so the moment I set this to 50, I exceed the cost quota for the CRD.
I think it's because each CRD can have 100 egress rules, each rule can have 100 egress peers, and each peer can have 50 or 100 networkCIDRs, so the CEL validation cost blows out of proportion (e.g. 100 × 100 × 50 = 500,000 CIDR strings to validate in the worst case). Thus the max we can do here is 45, for a max string length of 48.

Contributor Author

Either way, this can only be done from Kubernetes 1.31 onwards; we need to go with a regex until then.

@JoelSpeed JoelSpeed left a comment

In general this is pretty simple, but I think you need to work out whether IPv4-in-IPv6 addresses are supported or not. I would suggest either simplifying the regex to prevent this, or preventing the presence of `.` and `:` in the same string by adding a CEL validation:

+kubebuilder:validation:XValidation:rule="self.contains(':') != self.contains('.')",message="CIDR must be either an IPv4 or IPv6 address. IPv4 address embedded in IPv6 addresses are not supported"
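(A tiny Go sketch mirroring that CEL predicate, to show which inputs it admits; the function name is hypothetical:)

```go
package main

import (
	"fmt"
	"strings"
)

// mirrorsCELRule reproduces self.contains(':') != self.contains('.'):
// the string must contain exactly one of ':' or '.', so plain IPv4 and
// plain IPv6 CIDRs pass, while 4-in-6 forms mixing both are rejected.
func mirrorsCELRule(cidr string) bool {
	return strings.Contains(cidr, ":") != strings.Contains(cidr, ".")
}

func main() {
	fmt.Println(mirrorsCELRule("10.0.0.0/8"))        // true
	fmt.Println(mirrorsCELRule("fd00::/64"))         // true
	fmt.Println(mirrorsCELRule("::ffff:1.2.3.4/96")) // false (4-in-6)
}
```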


tssurya commented Feb 28, 2024

ack, thanks @JoelSpeed: I will double-check whether we want v4-in-v6 support and then change things accordingly. What about the isCIDR() implementation: does that support v4-in-v6, or will it consider that invalid?
Found this one: golang/go#51906 (comment), which was interesting.

@JoelSpeed

JoelSpeed commented Feb 28, 2024

I will double-check, but I'm pretty sure when implementing it we excluded 4-in-6 support.

Edit: 4-in-6 is explicitly disallowed in CEL.

@tssurya tssurya force-pushed the implement-egress-traffic-semantics-inline-cidrs branch 4 times, most recently from 00a5c80 to 5658b95 on February 28, 2024 13:51
@danwinship
Contributor

> I will double check if we want to do v4 in v6 support

Oh, yeah, we don't. (isCIDR is exactly correct for what we want.)
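(A minimal Go sketch of that semantics using the standard library's net/netip; this mirrors the intended "no 4-in-6" behavior rather than the actual CEL implementation:)

```go
package main

import (
	"fmt"
	"net/netip"
)

// validCIDR accepts IPv4 or IPv6 prefixes and rejects IPv4-in-IPv6
// forms, matching the "no 4-in-6" behavior described above.
func validCIDR(s string) bool {
	p, err := netip.ParsePrefix(s)
	if err != nil {
		return false
	}
	return !p.Addr().Is4In6()
}

func main() {
	fmt.Println(validCIDR("192.168.0.0/24"))    // true
	fmt.Println(validCIDR("fd00::/64"))         // true
	fmt.Println(validCIDR("::ffff:1.2.3.4/96")) // false
}
```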

// has no impact on the behavior of the policy applied to the peer
// trying to talk to the Service. It will work in the same way as if the
// Service didn't exist since policy is applied after ServiceVIP (clusterIP,
// externalIP, loadBalancerIngressVIP) is rewritten to the backendIPs.
@danwinship danwinship Feb 28, 2024

Unfortunately, this isn't quite right; ClusterIPs and ExternalIPs are always rewritten before NP enforcement, but load balancer IPs sometimes aren't; in some cases, pod-to-LoadBalancer traffic must be sent to the load balancer rather than being processed locally. (This would happen either if the CloudProvider sets hostname rather than ip in the service's LoadBalancerStatus (like AWS), or if it sets ip but also uses the new ipMode: Proxy.)

And once a packet leaves the node and goes to an external load balancer, then all bets are off. When the packet comes back into the cluster, it's not even theoretically possible for the ANP implementation to reliably figure out its original source IP (since the LB may have transformed the connection in arbitrary ways).

(And of course, if the Service has endpoint IPs that are cluster-external, then a pod-to-LoadBalancer connection that goes to the LB would presumably then go directly to the cluster-external IPs, without coming back into the cluster at all, so there's no way the ANP implementation could enforce a Networks selector that blocked cluster-external IPs at that point.)

So I can think of four options:

  1. Require LoadBalancer implementations to participate in ANP enforcement, so that we can guarantee that pod-to-LB connections will have the same semantics as pod-to-ClusterIP. This seems really really unlikely to happen, but I wanted to list it just for completeness.
  2. Require ANP implementations to "pre-approve" pod-to-LB traffic by figuring out "if I were to rewrite this traffic myself, would it then pass ANP?", and if not, then drop it rather than forwarding it to the LB. Note that in the AWS-like case this requires using FQDN rules to recognize that traffic is headed for an LB. And then there are theoretical edge cases like "what if one of the service's endpoints is allowed but another one isn't?". So this also doesn't seem great.
  3. Declare that it's undefined whether you can match post-LB traffic (IOW, a Pods selector may or may not catch pod-to-LB-to-pod traffic, and a Networks selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a Networks selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.). If we were going to do this, we would probably want to say that you can block pod-to-externalIPs traffic too, for consistency, even though we don't run into the same "we don't know what the cloud infra is doing" problem there. (In fact, we could potentially say that this applies to clusterIPs too... we'd have to think that through. This is potentially the simplest solution though, because then the rule just becomes "if the packet got DNATted, then check both the original and the new dest IP against all policies, rather than only checking the new dest IP".)
  4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters. (This is easiest for implementors, but worst for users. And of course, this is what NP does because we hadn't noticed the problem until it was too late and there were already conflicting implementations.)

FTR, note that in the cluster-ingress case (which is even worse), I feel like there's a chance the answer will end up being "delegate policy enforcement to the Gateway rather than trying to do it in ANP". I'm not sure if that has any implications for trying to decide what to do here.

(People may also want to consider how they expect ANPs to interact with other implementation-specific egress functionality (eg, EgressIPs), and whether that implies anything about the best choice for LB behavior.)

Contributor Author

> Unfortunately, this isn't quite right; ClusterIPs and ExternalIPs are always rewritten before NP enforcement, but load balancer IPs sometimes aren't; in some cases, pod-to-LoadBalancer traffic must be sent to the load balancer rather than being processed locally. (This would happen either if the CloudProvider sets hostname rather than ip in the service's LoadBalancerStatus (like AWS), or if it sets ip but also uses the new ipMode: Proxy.)
>
> And once a packet leaves the node and goes to an external load balancer, then all bets are off. When the packet comes back into the cluster, it's not even theoretically possible for the ANP implementation to reliably figure out its original source IP (since the LB may have transformed the connection in arbitrary ways).

OH, I completely forgot about the AWS LB use case... :/ great callout.

> (And of course, if the Service has endpoint IPs that are cluster-external, then a pod-to-LoadBalancer connection that goes to the LB would presumably then go directly to the cluster-external IPs, without coming back into the cluster at all, so there's no way the ANP implementation could enforce a Networks selector that blocked cluster-external IPs at that point.)

Yeah...!

> So I can think of four options:
>
> 1. Require LoadBalancer implementations to participate in ANP enforcement, so that we can guarantee that pod-to-LB connections will have the same semantics as pod-to-ClusterIP. This seems really really unlikely to happen, but I wanted to list it just for completeness.
> 2. Require ANP implementations to "pre-approve" pod-to-LB traffic by figuring out "if I were to rewrite this traffic myself, would it then pass ANP?", and if not, then drop it rather than forwarding it to the LB. Note that in the AWS-like case this requires using FQDN rules to recognize that traffic is headed for an LB. And then there are theoretical edge cases like "what if one of the service's endpoints is allowed but another one isn't?". So this also doesn't seem great.

yeah, except for the AWS case, other ingress VIP types will work well with this solution. Even with the way "Services" are implemented, there is inconsistency with LBs, because some are hairpinned within the cluster, some are sent out, etc., but I guess even in that case we didn't really define what is correct for implementations to follow...

> 3. Declare that it's undefined whether you can match post-LB traffic (IOW, a Pods selector may or may not catch pod-to-LB-to-pod traffic, and a Networks selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a Networks selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)

yeah... at least in ovnk we won't block pre-LB traffic; it's gonna be applied after the DNAT, except in the AWS case where we will :D

> If we were going to do this, we would probably want to say that you can block pod-to-externalIPs traffic too, for consistency, even though we don't run into the same "we don't know what the cloud infra is doing" problem there. (In fact, we could potentially say that this applies to clusterIPs too... we'd have to think that through. This is potentially the simplest solution though, because then the rule just becomes "if the packet got DNATted, then check both the original and the new dest IP against all policies, rather than only checking the new dest IP".)
>
> 4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters. (This is easiest for implementors, but worst for users. And of course, this is what NP does because we hadn't noticed the problem until it was too late and there were already conflicting implementations.)

I like this the BEST. :D But perhaps we are then following in the footsteps of something NetPol did badly, and we don't necessarily want to do that to users?

> FTR, note that in the cluster-ingress case (which is even worse), I feel like there's a chance the answer will end up being "delegate policy enforcement to the Gateway rather than trying to do it in ANP". I'm not sure if that has any implications for trying to decide what to do here.
>
> (People may also want to consider how they expect ANPs to interact with other implementation-specific egress functionality (eg, EgressIPs), and whether that implies anything about the best choice for LB behavior.)

yeah, good point for cross-feature functionality user experience.

@danwinship danwinship Feb 28, 2024

> 3. Declare that it's undefined whether you can match post-LB traffic (IOW, a Pods selector may or may not catch pod-to-LB-to-pod traffic, and a Networks selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a Networks selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)
>
> yeah... at least in ovnk we won't block pre-LB traffic; it's gonna be applied after the DNAT, except in the AWS case where we will :D

ugh...

So what I was trying to say with "check both the original and the new dest IP against all policies, rather than only checking the new dest IP" was that, if you have a rule saying "Deny to 192.168.0.0/24", then instead of translating that into a single rule that only checks the real/current dest IP, you'd translate it into two rules, e.g.:

# iptables
... -d 192.168.0.0/24 ... -j DROP
... -m conntrack --ctstate DNAT --ctorigdst 192.168.0.0/24 ... -j DROP

# OVS
... nw_dst=192.168.0.0/24, actions=drop
... ct_state=+dnat,ct_nw_dst=192.168.0.0/24, actions=drop

(except more optimized)

But it looks like OVN ACL expressions don't expose OVS ct_nw_dst... (ct.dnat is the equivalent of ct_state=+dnat, but there's no equivalent of ct_nw_dst 🙁 )

Contributor

> 4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters....
>
> I like this the BEST. :D But perhaps we are then following in the footsteps of something NetPol did badly, and we don't necessarily want to do that to users?

That's how I feel. I mean, think of the user stories that would have had to be there to get you to choose this behavior on purpose. "As an administrator, I need to be able to block certain users from accessing X. But if the user creates a LoadBalancer Service pointing to it, then whatever, I guess I don't care in that case."

(Admittedly, with the current default RBAC permissions, users cannot create Services that point to cluster-external IPs or to another user's pods... So... maybe we can get away with that?)


My initial inclination is that wherever possible we should try to avoid CIDR-based matches for rich k8s constructs, because it means that every time something relocates elsewhere on the network, the user must update their network policies. The beauty of policies for Pods is that the Pods can churn as much as they like, move IPs, redeploy elsewhere, etc, and this doesn't impact the user's specification of the policy. It's just up to the implementation to learn the right information and apply the policy correctly. (And sure, we can always paper over this with controllers that react to stuff in the network and write the ANPs, but that seems like the implementation detail rather than the user-facing API; implementations should have the freedom to choose if that's the way they want it to work).

Whether this "avoid CIDR applying to services" idea means aiming for the node to do a best-effort translation and apply policy based on that, or whether to have an explicit selector that is specifically for selecting traffic towards services by the service's labels, or one of the other options... I'm not sure.

> declare that you can block pre-LB traffic with a Networks selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)

If we're talking "how to apply the policy in cases where the CNI is not responsible for service translation", then I think this is doable in Cilium without a huge amount of effort, but if this is intended to apply to all services, including ones that the CNI is responsible for translating, then this seems to equate to building out a whole extra policy calculation + enforcement stage. So yes, a pain.

Member

Can't we say that CIDR-based policy enforcement should happen on the original packet sent by the pod (pod-selector rules can still be applied after svc IP resolution)? Then the only thing that matters is what the pod sends/receives as dst/src IP. Considering the subject is always a pod, it kinda makes sense to think about IPs in terms of what the pod sees, and not how the packet will be tweaked/routed/etc. This could also be enriched later with extra LB/gateway/etc. policies?

@tssurya tssurya force-pushed the implement-egress-traffic-semantics-inline-cidrs branch 3 times, most recently from 1c00da1 to 7c08fb6 on February 28, 2024 19:04
Comment on lines 202 to 203
// such a VIP. For example, a Networks selector that denies traffic to the entire
// service CIDR will not actually block any service traffic.
@joestringer joestringer Feb 29, 2024

What if the VIP has no backends/endpoints? The implementation can't translate the traffic, so at that point is the traffic subject to CIDR policy or not? I believe that the behaviour in Cilium today is to drop this traffic with an explicit error code, but not motivated by policy at all (before the policy check).

(Other than that, the "translate first, apply policy after" pattern is how Cilium works with respect to policy applying to service traffic.)

@joestringer joestringer Feb 29, 2024

Ah! I think this question is also related to the LoadBalancer behaviour question above 😅 . If we consider that the LoadBalancer "doesn't have any backends", because actually the implementation of the LB is external to the current node, then how would we apply that policy. Interesting. #185 (comment)

Contributor Author

yes, right, there is a discrepancy with LBs; will open a new issue to track that.


tssurya commented Feb 29, 2024

FYI for reviewers, this PR's scope is to get the CIDR peer in and I will add a TODO linking to #203 to clarify the issues around load balancers...

// https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/
// The resulting regex is an OR of both regexes. IPv4 address embedded in IPv6 addresses are not supported.
// TODO: Change the CIDR's validation regex to use CEL isCIDR() in Kube 1.31 when it is available.
// +kubebuilder:validation:Pattern=`(^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(3[0-2]|[1-2][0-9]|[0-9]))$)|(^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$)`

Isn't this regex completely wrong? I think all backslash characters got dropped on that random blog page it was taken from 😬 Look e.g. at the IPv6 address regex there: it starts with ^s* instead of ^\s*, which would make much more sense... And in this particular one, look at e.g. 1dd; I guess it should have been 1\d\d instead?

Contributor Author

Hi @mkow! Thanks for the review:

(^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$)

is the v6 CIDR regex that I have here. The blog (https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/) has:

^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$

so do you mean the regex on that blog is also wrong and we need backslashes? Almost everywhere I looked online I didn't see backslashes; having said that, I am no expert here. Could you please provide me with your alternative source?
cc @JoelSpeed: we do the same in OCP here: https://github.com/openshift/api/blob/ce10821dc9997694d8058564308819011d27383e/config/v1/types_infrastructure.go#L1862, which is where I got it from. I plan to write some tests on this for sure, but I'd think it can't be completely wrong if we are using it there, unless I am missing something very obvious here? cc @mkowalski


Hey all, this is the regex we use in OCP, and of course there may be a dozen corner cases it is not catching. Given how tricky the syntax of IPv6 addresses is, I am not sure it is possible to have an entirely correct regex (but proof would be needed).

@mkow can you please provide an example of an IPv6 address that is correct and is not caught by this regex? Or can you provide an example of a non-IPv6 string that passes this regex? That would definitely help.


> so do you mean the regex on that blog is also wrong and we need backslashes?

I would rather say: don't blindly copy-paste things without understanding them ;) Yes, the blog is completely wrong.

> Could you please provide me with your alternative source?

I don't have any, sorry. I guess, just read the IPv6 CIDR specification and build the regex carefully yourself. Or don't use a regex for this purpose.

> there may be a dozen corner cases it is not catching

It's not about corner cases; this regex is just some garbage which (I guess) got mangled when the author of that blog tried to add it to the HTML of their blog. Most of it doesn't make any sense, just read it yourself.

> can you provide an example of a non-IPv6 string that passes this regex?

For example, the regex from this PR accepts this as a valid IPv6 CIDR: sssss::1ddd1ddd1ddd1dd%!@#$%^&*()/0


If it helps, here's the regex + tests we have in Cilium for the equivalent validation pattern: https://github.com/cilium/cilium/blob/25d946d3d88e0be2ccbcec7bb12a99905729ba16/pkg/policy/api/cidr_test.go#L156

Contributor Author

thanks @mkow, yeah that one looks like a nice tester! Let me see if this works: https://www.perplexity.ai/search/what-is-the-ILs5tiG7RHWJo4ZzyL7t8A :)

heh, clearly it didn't; it's copying from whatever sources are out there, which all seem wrong... :/


May be worth comparing with Go unit tests too for some corner cases: https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/net/netip/netip_test.go

@asauber asauber Mar 2, 2024

While searching for prior attempts at an IPv6 regex, I came across this list of invalid and valid test cases, coming from this repo. Possibly useful for validation.

@jspaleta jspaleta Mar 2, 2024

I also came across a regex that might be easier to understand, reading like a switch statement over the different allowable address patterns when viewed in regex-vis.com

ref: https://ihateregex.io/expr/ipv6/

I'm not saying this has full coverage for all allowed cases, but I can comprehend what the cases are using regex-vis.com

(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))

If only we could embed a comment string for each regex group that tools like regex-vis.com could render, to provide context for the intent of each grouping. The lack of 'intent' is what makes complicated regexes so brittle and hard to maintain.

Contributor Author

thanks all, I'll discuss with the repo maintainers and k8s API reviewers to see how to move forward on this one. Considering the API is not GA'd, we might skip the regex for now (I find it tricky to get an absolutely correct regex and don't want to risk going with a wrong one and causing breakages) until 1.31, where we introduce isCIDR(). This might mean implementations need to do their own validation using net.ParseCIDR(), but I think we can live with that. If anyone has strong concerns or suggestions about this, please let me know here on the thread.

PS: This also stems from the fact that I don't have cycles to go off and build a v6 regex currently, so if someone is willing to do that I'm happy to give credit there as well.
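(A minimal sketch of that implementation-side fallback using net.ParseCIDR; the host-bits check at the end is an extra assumption about what an implementation might want, not something required here:)

```go
package main

import (
	"fmt"
	"net"
)

// validateNetworks is what an implementation-side check could look like
// until CEL isCIDR() is available in Kubernetes 1.31.
func validateNetworks(networks []string) error {
	for _, n := range networks {
		ip, ipNet, err := net.ParseCIDR(n)
		if err != nil {
			return fmt.Errorf("invalid CIDR %q: %w", n, err)
		}
		// Extra (assumed) strictness: require the base address,
		// i.e. accept 10.0.0.0/8 but reject 10.1.2.3/8.
		if !ip.Equal(ipNet.IP) {
			return fmt.Errorf("CIDR %q has host bits set", n)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateNetworks([]string{"10.0.0.0/8", "fd00::/64"})) // <nil>
	fmt.Println(validateNetworks([]string{"10.1.2.3/8"}))              // host bits error
}
```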

This PR adds support for implementing inline CIDR
peer blocks.

Signed-off-by: Surya Seetharaman <[email protected]>
@tssurya tssurya force-pushed the implement-egress-traffic-semantics-inline-cidrs branch from ec3fc38 to 6ff2bac on March 5, 2024 13:40
@Dyanngg Dyanngg left a comment

LGTM

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 5, 2024
@astoycos astoycos left a comment

/lgtm
/approve

Thanks for all the work @tssurya, and for all the great conversations / bug findings from the group; all conversations are being resolved in follow-up issues (i.e. regex, services, and the external CIDR object).

see #183

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 5, 2024
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: astoycos, Dyanngg, tssurya


@k8s-ci-robot k8s-ci-robot merged commit 5889d65 into kubernetes-sigs:main Mar 5, 2024
8 checks passed