phd: lspci_lifecycle_test passes incorrectly #792

Open
2 tasks
iximeow opened this issue Oct 16, 2024 · 1 comment
Labels
testing Related to testing and/or the PHD test framework.

Comments

@iximeow
Member

iximeow commented Oct 16, 2024

disclaimer: this is because of a framework bug, not a Propolis bug. i noticed that lspci_lifecycle_test fails on an Ubuntu 22.04 guest image i'd put together, and at first thought it was wrong, but the truth is stranger than fiction...

in lspci_lifecycle_test we run both lspci and lshw:

let lshw = vm.run_shell_command(LSHW).await?;
ctx.lifecycle_test(vm, &[Action::StopAndStart], move |vm| {
    let lspci = lspci.clone();
    let lshw = lshw.clone();
    Box::pin(async move {
        let new_lspci = vm.run_shell_command(LSPCI).await.unwrap();
        assert_eq!(new_lspci, lspci);
        let new_lshw = vm.run_shell_command(LSHW).await.unwrap();
        assert_eq!(new_lshw, lshw);

on an Ubuntu guest, the lshw assert fails because the before and after outputs don't match. the only difference in the (rather large) output strings is that the machine's serial changes after the VM is stopped and started. i double-checked on a real instance: a Debian 11 guest's observed value for serial is in fact the instance's ID, and that ID is stable across a stop and start. again, test bug not real bug.

the immediate issue in the test framework is that in the test we validate that lshw and lspci agree across a StopAndStart, but that action involves spawning a successor VM which makes a new TestVm and in turn gets a new id.

  • we probably should just keep the same TestVm if all we're doing is a StopAndStart. seems like this is the only test using StopAndStart, so that's simple enough.
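to make the identity problem concrete, here is a minimal sketch with entirely hypothetical names (FakeVm, spawn_successor, stop_and_start and the counter-based id are illustrations, not PHD's real TestVm API). it assumes the guest-visible serial is derived from the VM's id, which is what the lshw difference suggests:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical id source; a real framework would more likely use UUIDs.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

/// Hypothetical stand-in for a test VM. Only its identity matters here.
#[derive(Debug)]
struct FakeVm {
    id: u64,
    running: bool,
}

impl FakeVm {
    /// Every newly constructed VM gets a fresh id.
    fn new() -> Self {
        FakeVm { id: NEXT_ID.fetch_add(1, Ordering::Relaxed), running: true }
    }

    /// What a guest would report as its serial in this model (assumption:
    /// the serial is derived from the VM id).
    fn serial(&self) -> String {
        format!("serial-{}", self.id)
    }

    /// Models the behavior described above: the "restarted" VM is really a
    /// brand-new object, so it gets a new id and therefore a new serial.
    fn spawn_successor(&self) -> FakeVm {
        FakeVm::new()
    }

    /// Models the proposal: stop and start the same object, keeping its id.
    fn stop_and_start(&mut self) {
        self.running = false;
        self.running = true;
    }
}
```

in this model a before/after comparison of serial() can only hold when the same FakeVm is reused; going through spawn_successor changes the serial even though nothing is wrong with the guest.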

why in the world did this pass with Alpine or Debian images though? i'm glad you've asked!

# on Alpine:
localhost:~# lshw
-ash: lshw: not found
localhost:~#

# or on a different Alpine:
localhost:~# sudo lspci -vvx
-sh: sudo: not found
localhost:~#

# on Debian:
root@debian:~# lshw
-bash: lshw: command not found
root@debian:~#

assert_eq!("-ash: lshw: not found", "-ash: lshw: not found") or equivalent error from bash will pass every time :) i only have an lshw out-of-the-box on this Ubuntu image, which seems to be why it only fails there.

  • run_shell_command ought to check that the shell command that was run returned 0, and force test authors to deal with unexpected test command failures
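a rough sketch of what "check the status and make the caller deal with it" could look like. this is not the PHD implementation: run_checked and ShellError are hypothetical names, and the sketch spawns a local sh process through std::process::Command purely for illustration. the real run_shell_command drives a guest rather than a local process, so how it would obtain the exit status is not something this sketch settles.

```rust
use std::process::Command;

/// Hypothetical error type: either the shell could not be spawned, or the
/// command ran and reported failure.
#[derive(Debug)]
enum ShellError {
    Spawn(std::io::Error),
    NonZeroExit { code: Option<i32>, output: String },
}

/// Runs `cmd` through `sh -c` and returns its stdout only when the exit
/// status is zero. A missing binary such as `lshw` therefore becomes an
/// `Err` instead of an error message that happens to compare equal to itself.
fn run_checked(cmd: &str) -> Result<String, ShellError> {
    let out = Command::new("sh")
        .arg("-c")
        .arg(cmd)
        .output()
        .map_err(ShellError::Spawn)?;

    if !out.status.success() {
        // Keep both streams so the test failure shows what the shell said.
        let mut output = String::from_utf8_lossy(&out.stdout).into_owned();
        output.push_str(&String::from_utf8_lossy(&out.stderr));
        return Err(ShellError::NonZeroExit { code: out.status.code(), output });
    }

    Ok(String::from_utf8_lossy(&out.stdout).trim_end().to_string())
}
```

with something shaped like this, the lshw call on an image without lshw would surface as an Err at the first call, before any assert_eq! gets a chance to compare two identical "not found" strings.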
@iximeow iximeow added the testing Related to testing and/or the PHD test framework. label Oct 16, 2024
@iximeow
Member Author

iximeow commented Oct 16, 2024

... realized this morning that the immediate failure i'd observed above was because my Ubuntu guest adapter included some changes i was also going to propose: use passwordless root instead of the ubuntu user there. with an image configured more like the in-tree PHD adapter expects, the test just hangs at [sudo] password for ubuntu: and fails with a timeout instead :(
