-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement dynamic selection of parent prefix from a set of custom fields #90
base: main
Are you sure you want to change the base?
Conversation
335e04b
to
97c512d
Compare
We agree that the The |
Improved CRD validation: According to [2], the Reference: |
787fecd
to
1ad0f81
Compare
For some reason, when we are querying for custom fields, for example using this URL Reproduction step:
The demo SQL files have adhered to this "observation", but I am not able to wrap my head around this, since for the restoration hash, we have no issue with it |
To maintain the backward compatibility of the restoration hash and extend it to support parentPrefixSelection, we have made the following changes to the hash:
With the aforementioned change, the hash can by design distinguish the prefix claimed by ParentPrefix or parentPrefixSelection, thus, making the restoration process relying on the hash achievable. |
1ad0f81
to
f65550b
Compare
Done |
Please add me as reviewer. |
5c97a94
to
592375a
Compare
So as @alexandernorth explained, on browsers, they might modify the URLs. But for the API endpoints what we send is directly being processed. So in the case, this is not an issue for us when querying using API endpoints! But when we are doing debugging using URLs, we might run into issues. |
592375a
to
cac0ce6
Compare
305bbbb
to
0f2498c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @henrybear327
Thanks for the work. I did some black box testing without digging too much into the code so far. I've left some comments regarding wording and documentation. Please remove "custom field" where it's not explicitly custom fields to avoid confusion for users.
Manual testing on a local kind cluster didn't work for me:
make create-kind && make deploy-kind
kubectl apply -f config/samples/netbox_v1_prefixclaim_customfields.yaml
kubectl describe pxc prefixclaim-customfields-sample
Name: prefixclaim-customfields-sample
Namespace: default
Labels: app.kubernetes.io/managed-by=kustomize
app.kubernetes.io/name=netbox-operator
Annotations: <none>
API Version: netbox.dev/v1
Kind: PrefixClaim
Metadata:
Creation Timestamp: 2024-11-12T08:52:02Z
Generation: 1
Resource Version: 1495
UID: ab6c76d1-08b4-4a2d-894c-da04f3961b4e
Spec:
Comments: your comments
Description: some description
Parent Prefix Selector:
Environment: Production
Pool Name: Pool 1
Prefix Length: /31
Preserve In Netbox: true
Site: DM-Akron
Tenant: MY_TENANT
Events: <none>
And the operator logs are rather short:
2024-11-12T09:02:57Z INFO prefixClaim reconcile loop started {"controller": "prefixclaim", "controllerGroup": "netbox.dev", "controllerKind": "PrefixClaim", "PrefixClaim": {"name":"prefixclaim-customfields-sample","namespace":"default"}, "namespace": "default", "name": "prefixclaim-customfields-sample", "reconcileID": "2b0e902f-aae0-42a4-860e-f96f0aba8ea3"}
2024-11-12T09:02:57Z ERROR Reconciler error {"controller": "prefixclaim", "controllerGroup": "netbox.dev", "controllerKind": "PrefixClaim", "PrefixClaim": {"name":"prefixclaim-customfields-sample","namespace":"default"}, "namespace": "default", "name": "prefixclaim-customfields-sample", "reconcileID": "2b0e902f-aae0-42a4-860e-f96f0aba8ea3", "error": "resource name may not be empty"}
Will try again later with running it locally against demo.netbox.dev.
config/samples/netbox_v1_prefixclaim_customfields_bool_int.yaml
Outdated
Show resolved
Hide resolved
config/samples/netbox_v1_prefixclaim_customfields_bool_int.yaml
Outdated
Show resolved
Hide resolved
prefixLength: "/31" | ||
parentPrefixSelector: # The keys and values are case-sensitive | ||
environment: "Production" | ||
poolName: "Pool 1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit confused here. The custom field name is poolName
. Shouldn't the filter in here then be cf_poolName
? Or how do you distinguish custom fields from built in fields? I think it would make sense to be consistent with the netbox filtering https://demo.netbox.dev/static/docs/rest-api/filtering/#filtering-by-custom-field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I manually filter out tenant
and site
, so we don't force the user to build the netbox API query string in the CRs.
Should we have the user enter the prefix cf_
when adding the entries to save us some implementation hassle, or, we handle this for the user, which looks nicer on the UX standpoint. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine both ways. If we handle cf_
for the user it's one one side nice but on the other side also confusing. Depends a bit whether we want to stick to the UI (handle it for the user) or API (the user has to specify cf_
himself). wdyt @bruelea ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For spec.customFields the cf_
is handled for the use, so we could do the same for parentPrefixSelector.
If you use a PrefixClaim using parentPrefixSelector with an invalid Tenant (a Tenant that doesn't exist in NetBox), the Event will show:
Spec example: apiVersion: netbox.dev/v1
kind: PrefixClaim
metadata:
name: prefixclaim-customfields-sample
spec:
tenant: "MY_TENANT that doesn't exist"
site: "DM-Akron"
preserveInNetbox: true
prefixLength: "/31"
parentPrefixSelector: # The keys and values are case-sensitive
environment: "PostProduction"
poolName: "Pool 1" This is quite misleading. Do you think there is an easy way to enhance this? |
Under some circumstances the parent prefix will be computed but it's already exhausted. Even if you add more prefixes, the system will not recover and compute/chose a new parent prefix. Example describe:
Could we fix this by removing |
@jstudler please take another pass when you have time! :) Thanks |
We actually throw an error indicating this situation, please see the code snippet below: func (r *NetboxClient) GetTenantDetails(name string) (*models.Tenant, error) {
request := tenancy.NewTenancyTenantsListParams().WithName(&name)
response, err := r.Tenancy.TenancyTenantsList(request, nil)
if err != nil {
return nil, utils.NetboxError("failed to fetch Tenant details", err)
}
if len(response.Payload.Results) == 0 {
return nil, utils.NetboxNotFoundError("tenant '" + name + "'")
}
return &models.Tenant{
Id: response.Payload.Results[0].ID,
Slug: *response.Payload.Results[0].Slug,
Name: *response.Payload.Results[0].Name,
}, nil
} I think this is related to the discussion this morning - we need to surface the underlying errors properly to the user. As in this case, we did throw an error clearly specifying the problem, but it's not visible to the user. |
Quick question, what's the desired behavior? IMO, prefix exhaustion might be a temporary issue, so we should requeue and keep trying. WDYT? |
The desired behavior is that the Operator can recover from this case. Yes, we should assume that prefix exhaustion is a temporary issue only. The behaviour I've seen above was deadlock: The parent prefix was chosen but it had no space left, but there were other parent prefix candidates that weren't chosen. So I think removing |
But I think the event is wrong. The event indicates "The parent prefix was not able to be computed" when in fact the problem is that the user set an invalid tenant. Could we create an event in |
Co-authored-by: Sergio <[email protected]>
5ae7b50
to
1b9b95e
Compare
h := generatePrefixRestorationHash(prefixClaim) | ||
canBeRestored, err := r.NetboxClient.RestoreExistingPrefixByHash(h) | ||
if err != nil { | ||
return ctrl.Result{Requeue: true}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the log message/condition update is missing in case of an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added as requested! Can you see if this is what you are thinking about :)
Thanks
Tracking under bug #132 |
Tracking under bug #131 |
Design
We introduce a new
parentPrefixSelector
field in the CR. It's a list of custom field key-value mapping, where both the key and value are of the typestring
. For exampleWe now have 3 cases:
parentPrefix
is set, we continue on as-isparentPrefixSelector
is set, we would take all Prefixes that exactly match all custom field's key-value specified inparentPrefixSelector
. We would then pick the first prefix that is able to satisfy theprefixLength
requirement as the parent prefix.parentPrefix
andparentPrefixSelector
present in the CR, the CRD validation will reject this yaml fileThe parentPrefix will be set in the
status
fieldParentPrefix
, and this will be used as the source of truth for the rest of the reconcile function.Known issue
TODO
status
field to store the computedParentPrefix
value is desiredparentPrefixSelector
would require some documentation, as there might be the case there the picked parent prefix is different for the sameparentPrefixSelector
(e.g. due to prefix space exhaustion), and thus, the restoration might not work as user anticipated.Notes