3rd API proposal #17

mvachhar · 2025-03-04T23:50:15Z

API proposal combining ideas from proposals 1 and 2 and simplifying everything to a single Peering object between VPCs.

Co-Authored-By: Sergei Lukianov <[email protected]> Signed-off-by: Manish Vachharajani <[email protected]> Signed-off-by: Sergei Lukianov <[email protected]>

Fredi-raspall · 2025-03-05T12:36:24Z

docs/proposed-api.md

+
+Here, there are no duplicate IP restrictions, if there is multipath you just get
+ECMP. We can warn the user.  The policy is based on whatever route we pick.  However,
+there are route metrics to prefer one path to the other.


I don't understand this statement. In the general case where address spaces
overlap (and the general case is the common one here), you need to guarantee uniqueness. This is
not just about routing; it's about identity. Assume three VPCs, vpc-1, vpc-2, and vpc-3,
where vpc-2 and vpc-3 have overlapping address spaces and vpc-1 peers with both. Each of
them exposes a DISTINCT service that vpc-1 needs to consume. The subnets of vpc-2
and vpc-3 have to be disambiguated by exposing distinct "external" identities.
Otherwise the gateway will not know where to send a packet received from vpc-1.
Of course, if vpc-2 and vpc-3 expose the same service and vpc-1 does not care which one is picked,
then fine. However, we would then have to decide which of them
receives the traffic, which, without state, would require some form of hash
over the packet. So:
- the uniqueness restriction is there (even if I believe it could be relaxed in some cases)
- I would consider this ECMP kind of case a more advanced one for later.
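
To make the ambiguity concrete, here is a minimal Python sketch of the three-VPC scenario. The prefixes are invented for illustration (the thread gives none for this case), and `candidates` is a hypothetical helper, not part of any proposed API.

```python
import ipaddress

# Hypothetical prefixes: vpc-2 and vpc-3 use the same private range.
exposed = {
    "vpc-2": ipaddress.ip_network("10.0.0.0/24"),
    "vpc-3": ipaddress.ip_network("10.0.0.0/24"),
}

def candidates(dst, table):
    """Return, sorted by name, every VPC whose prefix contains dst."""
    addr = ipaddress.ip_address(dst)
    return sorted(name for name, net in table.items() if addr in net)

# With overlapping prefixes, the destination address alone does not
# identify the target VPC.
ambiguous = candidates("10.0.0.5", exposed)

# Assumed fix: each VPC is exposed under a distinct "external" prefix
# (for example through static NAT), which restores uniqueness.
external = {
    "vpc-2": ipaddress.ip_network("192.168.2.0/24"),
    "vpc-3": ipaddress.ip_network("192.168.3.0/24"),
}
unique = candidates("192.168.3.5", external)

print(ambiguous, unique)
```

With the overlapping table, a packet from vpc-1 matches both peers. With distinct external prefixes it matches exactly one.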

Your analysis is exactly correct: if two VPCs expose different services using the same address, they'll get ECMP routing between the two and very confusing behavior. But in order to support multiple externals exposing the same routes, and to treat externals just like VPCs, this should be the default behavior, IMHO.

We can later warn the user about the issue, but how do we really know if it is intentional or not? I'm open to suggestions to allow the multi-home multi-external case and to address your concern, but I thought we could deal with the restriction later since we have meaningful and useful behavior in the overlap case.
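
For reference, the stateless hash-based selection mentioned in this thread could look roughly like the sketch below. It is an illustration under assumptions: `pick_next_hop` is a hypothetical name, the flow tuple is made up, and SHA-256 is used only to get a deterministic hash in Python. A real dataplane would typically use a much cheaper hash.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Stateless ECMP-style choice: hash the flow tuple, index into next_hops."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Two peers advertising the same prefix (the overlap case discussed above).
next_hops = ["vpc-2", "vpc-3"]

# Hypothetical flow: src, dst, protocol, source port, destination port.
flow = ("10.1.0.7", "10.0.0.5", 6, 40000, 443)

first = pick_next_hop(flow, next_hops)
second = pick_next_hop(flow, next_hops)
```

Packets of the same flow always land on the same peer, but different flows to the same address may land on different peers. That is harmless when both peers expose the same service and confusing when they do not.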

Fredi-raspall · 2025-03-05T12:37:39Z

docs/proposed-api.md

+
+This helps with the multiple external cases where one VPC is routing to 2 externals
+and we want to use route metrics advertised via BGP to choose routes.
+


If you have externals, the same applies. If the externals use private addressing,
you need to know their identity to distinguish them. If they are public,
agreed, they could overlap. Here we have a new problem that stems from trying to model externals as
VPCs. In the fabric case, a VPC represents a set of unique, distinguishable destinations.
By modelling multiple externals as VPCs, you're trying to model something different with
the same abstraction: multiple "means to get to"
unique, distinguishable destinations, not the destinations themselves. I'm unsure whether an
alternative would be to model this as a distinct entity that you can peer with or have a virtual link to.
In that particular case, agreed, ECMP is an option and you could tweak routing to
prefer one over the other.

So is your suggestion to treat overlap from externals differently than overlap from normal VPCs? We can do that, and it may be a good middle ground between the two solutions, though, as you mentioned, it is sometimes useful to allow overlap even between private VPCs.
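
As a rough illustration of preferring one external over another by metric, with ECMP only on ties, consider the sketch below. It collapses route selection to a single integer metric, which is a simplification: real BGP best-path selection compares several attributes. The names `vpc-e1` and `vpc-e2` come from the example in the proposal; `vpc-e3`, the default-route prefix, the metric values, and `best_paths` are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    via: str      # name of the external advertising the route
    metric: int   # lower is preferred in this sketch

def best_paths(routes, prefix):
    """Return the externals with the lowest metric for prefix (ECMP set on ties)."""
    matching = [r for r in routes if r.prefix == prefix]
    if not matching:
        return []
    lowest = min(r.metric for r in matching)
    return sorted(r.via for r in matching if r.metric == lowest)

routes = [
    Route("0.0.0.0/0", "vpc-e1", 100),
    Route("0.0.0.0/0", "vpc-e2", 200),
]

# Different metrics: a single preferred external.
preferred = best_paths(routes, "0.0.0.0/0")

# Equal lowest metrics: both are kept, i.e. ECMP between them.
equal = best_paths(routes + [Route("0.0.0.0/0", "vpc-e3", 100)], "0.0.0.0/0")
```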


Fredi-raspall · 2025-03-05T12:39:05Z

docs/proposed-api.md

+```
+
+```yaml
+# vpc-e1 is external 1 and vpc-e2 is external 2


I don't understand the next example. Are we trying to provide transit
between the externals? ... or to the externals?
In the first case, this is just a plain IP router.

Yes, this is to configure a plain IP router for the case where the gateway is used without a fabric.

Fredi-raspall · 2025-03-05T12:40:29Z

docs/proposed-api.md

+        # or should this be on vpc-1
+      - 192.168.1.0/30
+```
+


The major issue I see with the above is that it mixes everything together and breaks multi-tenancy.

You're specifying which IPs can use the peering, and how, on the peering itself.

If you have to peer with many other VPCs, you have to repeat that for each peering you create.

A model with a pif (don't call it that; see it just as 1) an indirection and 2) an anchor) allows VPCs to independently choose what they expose to the outside and how, and which
internal entities can reach the outside and how, regardless of what other VPCs choose to do, and to change that at will.

To me, each VPC should be able to specify:
- what sets of internal addresses can go outside
- which of them can be reachable from the outside, in general

If a VPC needs exceptions to the above, it can just create more "pifs" (reachable entities).

Suppose a VPC hosted MS365. The MS folks would specify a single pif or the like (and not 10K) and then specify which remote parties are able to access it,
regardless of anything else.

Each side owns its thing.

Then, if you want to allow somebody to reach that thing, you create a peering / virtual-link / peering-policy. That just tells the gateway, upon reception of a packet,
whether the packet is allowed to be sent to the destination (MS365) and how to get there, and it enables the return path.

The same applies to a "consuming-services" VPC. It may consume services from 10 distinct other VPCs. You'd:
- specify who in the VPC can consume the service, once
- then add links to the 10 services. Each link would allow each of the 10 service VPCs to reach you back.

The plan here was to allow references to external objects for the parts that get repeated, which addresses this concern. In the simple no-reuse case, I think it is not friendly to make people configure lots of objects just to link two VPCs, which is why we simplified to this.
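
To make the trade-off in this exchange easier to follow, the sketch below models the "each side owns its thing" variant: every VPC declares its exposure once, and a link only states that two VPCs may talk. All class, field, and VPC names here are hypothetical and are not taken from either proposal. The point is only to show where the repetition goes away and how many objects are needed.

```python
from dataclasses import dataclass, field

@dataclass
class Vpc:
    name: str
    exposes: list = field(default_factory=list)      # prefixes reachable from outside
    allowed_out: list = field(default_factory=list)  # internal prefixes that may go outside

@dataclass(frozen=True)
class Link:
    a: str
    b: str

def reachable(vpcs, links, src, dst):
    """Prefixes of dst that src may reach: dst's own exposure, if a link exists."""
    linked = Link(src, dst) in links or Link(dst, src) in links
    return list(vpcs[dst].exposes) if linked else []

# One consumer declares who may go outside, once.
consumer = Vpc("consumer", allowed_out=["10.1.0.0/24"])

# Ten service VPCs each declare their own exposure, once.
services = [Vpc(f"svc-{i}", exposes=[f"192.168.{i}.0/24"]) for i in range(10)]

vpcs = {v.name: v for v in [consumer, *services]}
links = {Link("consumer", s.name) for s in services}
```

In this shape, linking the consumer to ten services takes ten small link objects plus one declaration per VPC. The single-Peering proposal instead puts the address lists on each peering, which means fewer objects in the no-reuse case and repeated lists when the same exposure is shared.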

Fredi-raspall

I think we're running in circles. The more proposals I see, the more I like the original one. True, it had rough edges and stuff that was not defined. But it had good properties, and I don't think the rough edges justify a complete amendment. I wonder whether we should just stick to it in the parts that were clear. I think we should focus on that and move on. It's not just about the time. A complete rewrite of the model is going to confuse people a lot. I'd propose enumerating the defects you believe it had and discussing them.

mvachhar · 2025-03-05T15:44:47Z

> I think we're running in circles. The more proposals I see, the more I like the original one. True, it had rough edges and stuff that was not defined. But it had good properties, and I don't think the rough edges justify a complete amendment. I wonder whether we should just stick to it in the parts that were clear. I think we should focus on that and move on. It's not just about the time. A complete rewrite of the model is going to confuse people a lot. I'd propose enumerating the defects you believe it had and discussing them.

The original PIF proposal had two main defects. First, apart from static NAT, it is not clear how to add firewall rules and the like so that everything makes sense semantically. Second, its IP address restrictions meant that you could not use the same IPs to talk to an external and to expose services. Without those restrictions, you could no longer clearly define which PIF or peering policy would be in effect for a given packet.

I also found the whole producer/consumer thing very confusing, and I like that your proposal 2 got rid of that stuff.

Once I realized that my proposal (proposal-1) had the same implied/explicit route overlap issue that the original proposal had, and that I objected to in your proposal, it made sense to move to something more like what you proposed, but we wanted to centralize the configuration so there is no confusion about what policy is applied when (apart from the ECMP routing, which we can discuss). So we dropped the PIFs.

I don't see the value in PIFs apart from config reuse, and we can achieve that config reuse in other ways without creating the impression that there is something more going on where we are "selecting" interfaces, etc. Moreover, we force users to configure 3 objects instead of 1, which is less convenient when there is no policy reuse. Let's discuss more today.

mvachhar · 2025-03-05T18:03:05Z

How much cleanup do we want before merge? Or do we merge and then have another PR with more concrete examples and work on clarifying exact syntax, etc.?

Signed-off-by: Manish Vachharajani <[email protected]>

Frostman · 2025-03-06T16:14:10Z

> How much cleanup do we want before merge? Or do we merge and then have another PR with more concrete examples and work on clarifying exact syntax, etc.?

@mvachhar I think we can probably mainly clean up right here, but it doesn't matter to me


mvachhar requested review from Frostman, daniel-noland, Fredi-raspall, qmonnet and sergeymatov March 4, 2025 23:50

mvachhar force-pushed the pr/mvachhar/proposed-api-3 branch from 52f8742 to 051dbe0 Compare March 4, 2025 23:51

Frostman force-pushed the pr/mvachhar/proposed-api-3 branch 3 times, most recently from 6fd7265 to 1c5ed39 Compare March 5, 2025 00:09

docs(api): Add proposed gateway api

1f648cc

Co-Authored-By: Sergei Lukianov <[email protected]> Signed-off-by: Manish Vachharajani <[email protected]> Signed-off-by: Sergei Lukianov <[email protected]>

Frostman force-pushed the pr/mvachhar/proposed-api-3 branch from 1c5ed39 to 1f648cc Compare March 5, 2025 00:12

Fredi-raspall reviewed Mar 5, 2025


Fredi-raspall reviewed Mar 5, 2025


Fredi-raspall requested changes Mar 5, 2025


mvachhar force-pushed the pr/mvachhar/proposed-api-3 branch from 8e74d3e to 980f46a Compare March 5, 2025 18:18

docs(api): Fix vpc2 ip config mismatch vs. comment

6062977

Signed-off-by: Manish Vachharajani <[email protected]>

mvachhar force-pushed the pr/mvachhar/proposed-api-3 branch from 980f46a to 6062977 Compare March 5, 2025 22:17

Frostman and others added 3 commits March 6, 2025 08:22

docs(api): fix internet access for vpc example

d750a2a

Signed-off-by: Sergei Lukianov <[email protected]>

docs(api): Add some examples with gateway setup for each

794da88

Co-Authored-By: Sergei Lukianov <[email protected]> Signed-off-by: Manish Vachharajani <[email protected]>

docs(api): clean up gw api syntax proposal

2e6b577

Signed-off-by: Sergei Lukianov <[email protected]>

Frostman marked this pull request as ready for review March 6, 2025 23:05

Frostman approved these changes Mar 6, 2025


Frostman merged commit 70010a5 into master Mar 6, 2025
6 checks passed

Frostman deleted the pr/mvachhar/proposed-api-3 branch March 6, 2025 23:06
