Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zebra: separate zebra ZAPI server open and accept #17313

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mjstapp
Copy link
Contributor

@mjstapp mjstapp commented Oct 30, 2024

It sounds from #16747 as if it's possible for zebra to get privs wrong when it sets up its zapi server socket when running in privs-per-process mode. The current zserv code opens and starts listening on that server socket pretty late during startup, since zebra needs to be ready to do what clients start asking it to do.
This PR separates zebra's ZAPI server socket handling into two phases: an early phase that opens the socket, and a later phase that starts listening for client connections. The early 'open' phase is called quite early, before other zebra subsystems are started. The 'start' phase is still called later on.

@KanjiMonster
Copy link
Contributor

Currently giving this a spin with the shell script from #16747, looks good so far.

@KanjiMonster
Copy link
Contributor

KanjiMonster commented Nov 1, 2024

At least a quick smoke-test with our internal testing didn't show any new failures (with this patch and #16749 applied on top of 9.1.2).

@ton31337 ton31337 added this to the 10.2 milestone Nov 4, 2024
@ton31337
Copy link
Member

ton31337 commented Nov 4, 2024

@Mergifyio backport dev/10.2 stable/10.1 stable/10.0 stable/9.1 stable/9.0

Copy link

mergify bot commented Nov 4, 2024

backport dev/10.2 stable/10.1 stable/10.0 stable/9.1 stable/9.0

🟠 Waiting for conditions to match

  • merged [📌 backport requirement]

@mjstapp
Copy link
Contributor Author

mjstapp commented Nov 6, 2024

rebased to latest upstream

Separate zebra's ZAPI server socket handling into two phases:
an early phase that opens the socket, and a later phase that
starts listening for client connections.

Signed-off-by: Mark Stapp <[email protected]>
@mjstapp
Copy link
Contributor Author

mjstapp commented Nov 6, 2024

Fixed a path that's called once per-netns, not once per startup.

@mwinter-osr
Copy link
Member

CI:rerun Check ANVL results again

@KanjiMonster
Copy link
Contributor

Might be worth to point out that this does not fix the core issue, but only works around it.

The core issue is that frr_with_privs(NULL) does nothing meaningful, especially does not ensure that code is run without privileges in the per-process privileges case.

While zebra's zserv is the only user of frr_with_privs(NULL), I wouldn't be surprised if there are other places that also assume that things are done without privileges, but don't use frr_with_privs(NULL).

@mjstapp
Copy link
Contributor Author

mjstapp commented Nov 12, 2024

Sure - I'm not claiming to have fixed everything, just wanted to try to help with the immediate symptom, I guess.
I've opened a more general issue about this more general ... issue: #17420

Might be worth to point out that this does not fix the core issue, but only works around it.

The core issue is that frr_with_privs(NULL) does nothing meaningful, especially does not ensure that code is run without privileges in the per-process privileges case.

While zebra's zserv is the only user of frr_with_privs(NULL), I wouldn't be surprised if there are other places that also assume that things are done without privileges, but don't use frr_with_privs(NULL).

@KanjiMonster
Copy link
Contributor

Sure - I'm not claiming to have fixed everything, just wanted to try to help with the immediate symptom, I guess. I've opened a more general issue about this more general ... issue: #17420

Might be worth to point out that this does not fix the core issue, but only works around it.
The core issue is that frr_with_privs(NULL) does nothing meaningful, especially does not ensure that code is run without privileges in the per-process privileges case.
While zebra's zserv is the only user of frr_with_privs(NULL), I wouldn't be surprised if there are other places that also assume that things are done without privileges, but don't use frr_with_privs(NULL).

Just to be clear this wasn't meant as a critique to the approach; I think the solution is right for now and should be merged, as it does fix a race condition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants