Origin and cache should be able to control downtime #1251

Open
bbockelm opened this issue May 8, 2024 · 6 comments · May be fixed by #1996

Comments

@bbockelm
Collaborator

bbockelm commented May 8, 2024

The director administrator can force a cache or origin into downtime centrally.

However, I don't want this power to only exist in the hands of the director administrator. The origin/cache administrator should have a web UI (and a corresponding CLI) where they can put their own service in and out of downtime.

@bbockelm bbockelm added enhancement New feature or request origin Issue relating to the origin component labels May 8, 2024
@bbockelm bbockelm added this to the v7.9.0 milestone May 8, 2024
@haoming29 haoming29 modified the milestones: v7.9.0, v7.10.0 Jun 12, 2024
@jhiemstrawisc jhiemstrawisc modified the milestones: v7.10.0, v7.13.0 Dec 11, 2024
@CannonLock CannonLock assigned h2zh and CannonLock and unassigned CannonLock Dec 30, 2024
@CannonLock
Contributor

@howard we are going to have to split this ticket into an origin/cache API endpoint and the corresponding UI. Typically Haoming would build the API and I would build on top of it. Does that sound good to you?

@howard

howard commented Dec 30, 2024

@CannonLock I'm not sufficiently familiar with this project to give any advice.

@h2zh
Collaborator

h2zh commented Dec 30, 2024

Sounds good, @CannonLock. Could you send me a few links to your previous collaborations so that I can learn more?

@CannonLock
Contributor

> @CannonLock I'm not sufficiently familiar with this project to give any advice.

@howard TIL all Howards are not interchangeable. :) My mistake, I will ask our Howard (@h2zh) to give it a shot.

@h2zh
Collaborator

h2zh commented Jan 15, 2025

This ticket is 80% done, but it is blocked by an existing problem. The Director's serverAds is a dictionary of key-value pairs of the form {serverURL: serverAd}, where a serverURL looks like https://r20abdfaecf7f:8443. The namespaces registered in the registry's SQLite db follow the same pattern. However, the two store different ports: the director stores the XRootD data port (e.g. 8843), while the registry stores the web port (e.g. 8847).
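To make the mismatch concrete, here is a minimal sketch in Python. It is not Pelican code: `director_server_ads`, `registry_server_urls`, and `same_host` are hypothetical names, and the ports are just the example values quoted above. It only illustrates why an exact URL match between the two stores fails; it is not a proposed fix.

```python
from urllib.parse import urlsplit

# Hypothetical stand-ins for the two stores described above; the real
# Director and registry data structures are not shown in this thread.
director_server_ads = {
    # Keyed by server URL with the XRootD data port (example value 8843).
    "https://r20abdfaecf7f:8843": {"namespace": "/origins/test-server-name"},
}
# The registry records the same server under its web port (example value 8847).
registry_server_urls = {"https://r20abdfaecf7f:8847"}


def same_host(url_a: str, url_b: str) -> bool:
    """Compare two server URLs by scheme and hostname, ignoring the port."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.hostname) == (b.scheme, b.hostname)


registry_url = next(iter(registry_server_urls))

# An exact-match lookup misses because only the port differs.
exact_hit = registry_url in director_server_ads

# Comparing by host alone shows both entries describe the same server.
host_hits = [url for url in director_server_ads if same_host(url, registry_url)]

print(exact_hit, host_hits)
```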

When an origin wants to put itself into downtime, it sends a request to the director. The director first checks whether the sender's server URL is in its cached serverAds. If so, it retrieves that advertisement and uses the server namespace it carries (e.g. /origins/test-server-name) to ask the registry whether the namespace is registered. The port mismatch breaks this workflow.
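The check sequence described here can be sketched roughly as follows, in Python. Everything in it is hypothetical: `ServerAd`, `handle_downtime_request`, and the in-memory `server_ads` and `registered` collections stand in for the director cache and the registry lookup, and none of it is the actual Pelican implementation. The second call assumes, purely for illustration, that the sender identifies itself by its web-port URL, which is one way the mismatch could surface.

```python
from dataclasses import dataclass


@dataclass
class ServerAd:
    # Hypothetical minimal advertisement; the real one carries more fields.
    namespace: str


def handle_downtime_request(sender_url, server_ads, registered_namespaces):
    """Return an (ok, reason) pair for a self-service downtime request."""
    # Step 1: the sender must be present in the director's cached serverAds.
    ad = server_ads.get(sender_url)
    if ad is None:
        return False, "sender URL not found in serverAds"
    # Step 2: the namespace from the ad must be registered in the registry.
    if ad.namespace not in registered_namespaces:
        return False, "namespace not registered"
    return True, "downtime accepted"


# Example values taken from the comment above.
server_ads = {"https://r20abdfaecf7f:8843": ServerAd("/origins/test-server-name")}
registered = {"/origins/test-server-name"}

# Same URL (data port) on both sides: both checks pass.
matching = handle_downtime_request("https://r20abdfaecf7f:8843", server_ads, registered)

# Assumed case: the sender identifies itself with the web port, so step 1 fails.
mismatched = handle_downtime_request("https://r20abdfaecf7f:8847", server_ads, registered)

print(matching, mismatched)
```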

I talked to Justin about this, and he mentioned another ticket, #1241, that could resolve this blocker. I will put this ticket on hold and come back to it once I have completed that one.

@h2zh
Collaborator

h2zh commented Jan 17, 2025

I should've tagged you here for the update! @CannonLock

@h2zh h2zh modified the milestones: v7.13.0, v7.14 Jan 18, 2025
@h2zh h2zh linked a pull request Feb 6, 2025 that will close this issue
@CannonLock CannonLock modified the milestones: v7.14, v7.15 Feb 11, 2025