-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Network Operator APIs #6016
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might think about generating the SDK and CLI as you evolve this API. In particular, I'd love to see this list of CLI commands that require the --json-body
parameter to shrink: https://github.com/oxidecomputer/oxide.rs/blob/main/cli/tests/data/json-body-required.txt (note about half of them are networking-related)
openapi/nexus.json
Outdated
@@ -9727,7 +9816,7 @@ | |||
"last_address" | |||
] | |||
}, | |||
"AddressLotBlockCreate": { | |||
"AddressLotBlock2": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just looking through this quickly, but I think we should avoid adding more of these name conflicting types that result in the 2
suffix.
@ahl do we have a preferred pattern that we want to use to avoid this? |
There are some structures that are challenging to generate, in particular deeply nested structs. Perhaps put another way: you might keep in mind how you'd like the CLI to work for a particular API concept and then organize the API in a way where you can imagine the direct translation e.g. to CLI arguments. There's more that progenitor could and can do, but some structures--such as |
@ahl I generated a local CLI with this PR's openapi spec and it seems to avoid the need for
|
well that's good news! |
Local testing against a running control plane: $ ./target/debug/oxide system networking address-lot list
[
{
"description": "initial infrastructure ip address lot",
"id": "4d2fc8ca-8759-46e4-9c36-7e9bbd14ecc1",
"kind": "infra",
"name": "initial-infra",
"time_created": "2024-07-17T21:39:29.384680Z",
"time_modified": "2024-07-17T21:39:29.384680Z"
}
]
$ ./target/debug/oxide system networking address-lot block list --address-lot initial-infra
[
{
"first_address": "10.85.0.30",
"id": "60efc85a-34d0-4e07-b3e2-c11f203e054d",
"last_address": "10.85.0.30"
}
]
$ ./target/debug/oxide system networking address-lot block add \
--address-lot initial-infra \
--first-address 10.100.0.0 \
--last-address 10.100.0.255
{
"first_address": "10.100.0.0",
"id": "5eff580c-9e96-4c01-a580-b9c782bc4196",
"last_address": "10.100.0.255"
}
$ ./target/debug/oxide system networking address-lot block list --address-lot initial-infra
[
{
"first_address": "10.100.0.0",
"id": "5eff580c-9e96-4c01-a580-b9c782bc4196",
"last_address": "10.100.0.255"
}, {
"first_address": "10.85.0.30",
"id": "60efc85a-34d0-4e07-b3e2-c11f203e054d",
"last_address": "10.85.0.30"
}
]
$ ./target/debug/oxide system networking address-lot block remove \
--address-lot initial-infra \
--first-address 10.100.0.0 \
--last-address 10.100.0.255
$ ./target/debug/oxide system networking address-lot block list --address-lot initial-infra
[
{
"first_address": "10.85.0.30",
"id": "60efc85a-34d0-4e07-b3e2-c11f203e054d",
"last_address": "10.85.0.30"
}
] |
I think we will need to un-nest the communities and import / export policies if we want to be able to add peers without using |
While you're messing with these APIs... #6497 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the work here Levon! I spent some time reading over this; I really like the more granular updates we can have over switch port configuration. I've left some comments throughout.
let port_settings_id = | ||
switch_port_configuration_id(&conn, identity).await?; | ||
|
||
let port_config = SwitchPortConfig { | ||
port_settings_id, | ||
geometry: new_geometry, | ||
}; | ||
|
||
// create or update geometry | ||
diesel::insert_into(dsl::switch_port_settings_port_config) | ||
.values(port_config.clone()) | ||
.on_conflict(dsl::port_settings_id) | ||
.do_update() | ||
.set(dsl::geometry.eq(new_geometry)) | ||
.execute_async(&conn) | ||
.await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this ID check -> use pattern is safe, given that CRDB works with transations set as SERIALIZABLE
. I'd have to defer to someone with more knowledge there though!
NameOrId::Id(id) => { | ||
// verify id is valid | ||
bgp_config::table | ||
.filter(bgp_config::time_deleted.is_null()) | ||
.filter(bgp_config::id.eq(id)) | ||
.select(bgp_config::id) | ||
.limit(1) | ||
.first_async::<Uuid>(&conn) | ||
.await | ||
.map_err(|e: diesel::result::Error| { | ||
match e { | ||
diesel::result::Error::NotFound => { | ||
err.bail(Error::not_found_by_id(ResourceType::BgpConfig, &id)) | ||
}, | ||
_ => { | ||
let message = "error while looking up bgp config for bgp peer"; | ||
error!(opctx.log, "{message}"; "error" => ?e); | ||
err.bail(Error::internal_error(message)) | ||
}, | ||
} | ||
}) | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should most of the linked config elements here have a lookup_resource!
definition? That doesn't give us LookupPath
on a conn
, but...
I'm also starting to wonder about the NameOrId resolution within each transaction -- I think that resolving to an ID should still occur only once outside the transaction, and then we {use, fetch, check liveness of} the target resource by ID as part of the transaction. I don't see a way that these transactions can cause a retry today, but if that changes I imagine you could hypothetically be working on renamed resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we allow renaming of resources today? It is my understanding that Name and ID are not changeable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ID is immutable, yeah -- I had meant to select the NameOrId::Name
match branch at the time.
My thinking is in case an update endpoint does crop up in future -- e.g., not all project scoped resources have update endpoints, but the resource name is almost always mutable when such an endpoint exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments. I did a quick review for consistency, but did not go deep into the DB transactions.
keepalive: new_settings.keepalive.into(), | ||
remote_asn: new_settings.remote_asn.map(Into::into), | ||
min_ttl: new_settings.min_ttl.map(Into::into), | ||
md5_auth_key: new_settings.md5_auth_key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a clone required?
Also check line 638.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, I am unsure of what you are referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I should have noticed/clarified that line 638 was from a different file that has the same name but different directory: nexus/db-model/src/switch_port.rs:638
For example, nexus/db-queries/src/db/datastore/switch_port.rs:590 has code that looks like:
md5_auth_key: p.md5_auth_key.clone(),
I was raising the question of, for any line of code that has a style of md5_auth_key: xyz.md5_auth_key
(includes this line here for this comment, and above other file line 638), whether to consider if it should actually be md5_auth_key: xyz.md5_auth_key.clone()
, because I've seen usage of clone()
for lines of code related to md5_auth_key
in both the omicron
and oxide.rs
repositories in the past.
See:
- https://github.com/oxidecomputer/oxide.rs/blob/e3e972bdc0d8d49c21b6b12c84b90c4d7a30eafc/cli/src/cmd_net.rs#L499
- https://github.com/oxidecomputer/oxide.rs/blob/e3e972bdc0d8d49c21b6b12c84b90c4d7a30eafc/cli/src/cmd_net.rs#L1097
This was part of my quick checking for consistency throughout the code that I looked at.
Many thanks for the reviews, going to chip away at these fixes! |
Summary
Our currently deployed APIs for operator networking aren't as user friendly as we'd like them to be. After landing these changes we hope to be able to start breaking ground on Terraform provider updates for rack / operator networking, which should increase user friendliness as well as make testing more streamlined in some scenarios. This will also allow us to migrate from using
--json-object
with Oxide cli commands, for example:Changes
The following changes should mostly preserve backwards compatibility with the existing API, so we should be able to continue to use all of our existing automation / tooling / scripts for the time being.
Change namespace for api endpoints / commands to
switch-port-configuration
(switch-port-settings is not plural-friendly)Break AddressLot and AddressLotBlock into separate APIs
Add endpoint for viewing which configuration has been applied to a switch port (i.e.
configuration_view
)Rename
settings_apply
andsettings_clear
toconfiguration_apply
andconfiguration_clear
Create separate API endpoints for each child object under
SwitchPortSettings
Create integration tests for new endpoints
update CLI client library: WIP: refactor network operator APIs oxide.rs#817
Additional Acceptance Criteria
name_or_id
path param #6497