Refactor how we build unikernels for testing #2316
base: main
Unfortunately git didn't understand that
7c4958f to 0f49331
Tests are still passing.
Thanks, I think this is a good step forward. I'm approving this on the condition that you update the unsandbox_list with all the network tests that require additional capabilities, with a comment next to each item explaining why it is not sandboxed. If you don't know the answer, just describe the error message you get from nix so that others know what to expect if they try to run the same test sandboxed.
Also, I'd like to hear @MagnusS's thoughts on this. We have discussed alternative ways of running tests in the past that would not require any privileges at all, but I think that's a bigger discussion.
test.sh (outdated)

    "websocket" # Linking fails, undefined ref to http_parser_parse_url, http_parser_execute
    )

    sandboxed=false
It's a little confusing that you have two mechanisms for adding to and removing from the sandbox: setting / unsetting this, and then also the unsandbox list. I would prefer a single mechanism, and that we explicitly add the tests that can't run sandboxed to the unsandbox list, with an explanation of why.
For the kernel/integration/term test I think it's because it's trying to open a telnet connection directly. For the others I thought they didn't require capabilities themselves to run, but that creating the bridge needed to happen ahead of time, and that requires additional network / admin capabs. Assuming a CI machine was set up with that bridge by default, would the networking tests then also require extra capabs? This is just a question, I don't know the answer.
> It's a little confusing that you have two mechanisms for adding and removing from sandbox

Agreed. Let's just explicitly list them. It's better to know why than to just glob it. I didn't list them before because there were quite a few in the networking list.
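
For illustration, the single-mechanism approach discussed here could look roughly like the sketch below. This is not the actual contents of test.sh: the helper names `needs_unsandbox` and `run_mode` are hypothetical, and the second list entry is a made-up placeholder. Only the `kernel/integration/term` entry and its reason come from this thread.

```shell
#!/usr/bin/env bash
# Sketch only: one explicit list is the sole mechanism that decides which
# tests run outside the nix sandbox, with the reason next to each entry.
unsandbox_list=(
  "kernel/integration/term"   # opens a telnet connection directly
  "net/integration/example"   # hypothetical entry: vmrunner needs permission to attach to the bridge
)

# Succeeds (exit status 0) when the given test is in unsandbox_list.
needs_unsandbox() {
  local candidate="$1" entry
  for entry in "${unsandbox_list[@]}"; do
    [[ "$entry" == "$candidate" ]] && return 0
  done
  return 1
}

# Prints how a test would be run: "unsandboxed" or "sandboxed".
run_mode() {
  if needs_unsandbox "$1"; then
    echo "unsandboxed"
  else
    echo "sandboxed"
  fi
}
```

With this shape, a test is sandboxed unless it is named in the list, so there is no separate `sandboxed=false` toggle to keep in sync.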
> I thought they didn't require capabilities themselves to run,

vmrunner requires permission to attach to the bridge. I'm pretty sure it's vmrunner's startup that requires the capability, not the unikernel or messaging.

> Assuming a CI machine was set up with that bridge by default, would the networking tests then also require extra capabs? This is just a question, I don't know the answer.

I think so. I don't know either.
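
One way to see what a given machine grants is to inspect the file capabilities on `qemu-bridge-helper` with `getcap`. The helper below is a hypothetical sketch (the name `has_cap` is not part of test.sh or vmrunner). It only parses a line of `getcap`-style output, and the exact output format is assumed to vary between libcap versions.

```shell
#!/usr/bin/env bash
# Sketch only: has_cap OUTPUT CAP succeeds if the getcap-style OUTPUT line
# lists capability CAP. It accepts both "path cap_a,cap_b=ep" and the older
# "path = cap_a,cap_b+ep" layouts (an assumption about libcap's formats).
has_cap() {
  local output="$1" cap="$2"
  printf '%s\n' "$output" | grep -Eq "(^|[ ,=])${cap}([,=+]|\$)"
}
```

A possible use, assuming the helper's path is known on the machine (the path here is an assumption and differs between distributions): `has_cap "$(getcap /usr/lib/qemu/qemu-bridge-helper)" cap_net_admin`.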
> For the kernel/integration/term test I think it's because it's trying to open a telnet connection directly.

I just checked the test. It seems to connect to LiveUpdate (I still haven't understood what that is) over telnet, as you mentioned. Maybe we should rename the test to avoid confusion?
0f49331 to e8e7224
I think I've addressed the issues mentioned (tests are still passing, hopefully this time without skipping any). Did I miss something?
All good from my side, but I requested a review from @MagnusS as well, as mentioned.
In order to fix #2310, I have migrated the building of the unikernels to nix instead of manually calling cmake in a shellHook.
This changes how we run tests. nix has a `checkPhase` specifically designed to test builds. We now use this whenever we can. Because we're now using `nix-build` for all the building, this implicitly becomes pure.

Another change with this is that we will no longer need to rebuild and recheck tests we already know are passing. We could enable `--check` to force rechecking unchanged derivations. Do we want that?

Furthermore, we should now be able to pass an architecture target for testing.

Annoyingly, not all tests are able to run through the checking phase because capabilities (specifically, we need `cap_net_admin` and `cap_net_raw` on `qemu-bridge-helper`) are dropped upon entering this phase. To work around this, we run the few tests that actually need to be run without a proper reproducible sandbox in a shellHook (this is partially what we were doing before). Ideally, we'd get rid of this. I have seen that security.wrappers.<name>.capabilities exists: maybe we can handle the capabilities through this instead, but I suppose that is part of includeos/vmrunner.

I have also renamed `example.nix` to `unikernel.nix` because it's not really specific to `example/` at all. I considered moving it to `tests/default.nix` instead (it would make `./test.sh` slightly cleaner), but wasn't sure how we feel about "hiding" these entry points. Personally I have a preference towards keeping the root directory uncluttered.

`shell.nix` is not involved in testing at all anymore, but it does provide some useful PATHs upon entering the shell. Maybe we can merge `develop.nix` back into `shell.nix` after this...

In summary:

- `checkPhase` over `shellHook` when possible
- `shell.nix` is now mostly a stub
- `example.nix` to `unikernel.nix`
- `unikernel.nix` through nix artifacts
- `cap_net_admin+ep qemu-bridge-helper` was missing!)

All tests are passing.
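
As a rough illustration of the `--check` question raised above, a wrapper could make rechecking opt-in. The function name `nix_build_cmd` and its second argument are hypothetical and not part of test.sh. The sketch only prints the command line rather than running nix, and it leaves out any arguments that `unikernel.nix` may need.

```shell
#!/usr/bin/env bash
# Sketch only: print the nix-build invocation for a nix file, adding
# --check when rechecking of already-built derivations is requested.
nix_build_cmd() {
  local nix_file="$1" recheck="${2:-false}"
  local cmd=(nix-build "$nix_file")
  if [[ "$recheck" == "true" ]]; then
    cmd+=(--check)   # ask nix to rebuild a derivation it has already built
  fi
  printf '%s\n' "${cmd[*]}"
}
```

For example, `nix_build_cmd unikernel.nix true` prints `nix-build unikernel.nix --check`, while omitting the second argument keeps the default behaviour of reusing existing results.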