-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Downtime set by origin and cache #1996
base: main
Are you sure you want to change the base?
Conversation
c7714da
to
4fc17d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two small items and one big one that might result in some refactoring.
It is unwieldy to work with one off endpoints like this that only are there to mark a boolean. Is there any way that we can design this more RESTfully so that we are instead POSTing a body like {downtime: true/false} to /director/servers/.
This way if we ever want to adjust a servers state we can just tack it on the same POST request rather then creating a whole new endpoint.
I realize this is not easy as the ServerName in this case is a url which is a unwieldy url key. Is there anyway that we can get the ServerAd from the Servername?
There are other endpoints in Pelican that follow this model that I hope to change someday as well: https://github.com/orgs/PelicanPlatform/discussions/1688
return | ||
} | ||
ctx.JSON(http.StatusOK, server_structs.SimpleApiResp{ | ||
Status: server_structs.RespFailed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Status: server_structs.RespFailed, | |
Status: server_structs.RespOK, |
// Construct the request body | ||
downtimeRequest := server_structs.DowntimeRequest{ | ||
ServerUrl: serverUrl, | ||
Hostname: hostname, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why pass in Hostname if you don't use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this approach will work:
- The information is sent once, meaning the director has to actually be up and functional or the downtime is lost. That conflicts with the whole idea of periodically sending the service's state to the director.
- Instead of adding the new API, make the service state part of the ad and send out a new ad. Basically, adding a new field to what we tell the director about ourselves.
- If you restart the origin/cache, it doesn't know what state it is in. That's fundamental concept: the service is an independent entity, it can't use the director as its "memory".
- This information needs to be persisted at the service level. Make sure to talk to @CannonLock about what information should go into a downtime (it's not a boolean - e.g., there should be a start time, end time, who created the downtime, a description. We should model loosely on the downtimes we create in topology)..
Also - there's a new API and no corresponding documentation in the OpenAPI schema.
Additionally - when creating commit messages, please follow the advice starting in this paragraph:
Not every commit message is going to be perfect but we should always strive to better. |
Proposed system design based on @bbockelm's comment and conversation with @CannonLock. Rules between OSDF Federation admin and Origin/Cache admin: (checked with @williamnswanson)
Some design thoughts:
|
Now origin and cache server can set their own downtime in the director by simply sending a request with payload
{"enableDowntime": true/false}
to API/api/v1.0/downtime
. Then this PR will take care of the setting in the backend.This is a backend change. Frontend UI is not included.
Closes #1251