random-ish I/O errors when using unionfs with nfs #96

Open
mrvn opened this issue May 14, 2021 · 2 comments

Comments

@mrvn

mrvn commented May 14, 2021

I have a kind of convoluted setup for building boot images for a high performance computing cluster consisting of 4 layers:

  1. a unionfs-fuse mount over three branches: a pristine chroot, a dir with the files for the image, and a dir with test cases that are only used during the build. This layer also has a plugin that writes every accessed file into a log.
  2. the unioned filesystem is then exported via the NFS kernel server
  3. a KVM instance with a mini initramfs that sets up step 4 and does a pivot_root into that unionfs
  4. a unionfs over a tmpfs and the NFS mount
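
To narrow down which branch actually serves a failing file, the union lookup can be approximated outside the mount. The following is only a rough sketch, assuming the usual first-match-wins branch order; `resolve_in_branches` is a hypothetical helper, and it ignores whiteouts and any other unionfs metadata.

```python
import os


def resolve_in_branches(branches, relpath):
    """Return the path in the first branch that contains relpath.

    branches is ordered highest priority first. Returns None when no
    branch has the entry. Whiteouts and hidden-file metadata are NOT
    handled, so this only approximates what the union mount shows.
    """
    rel = relpath.lstrip("/")
    for branch in branches:
        candidate = os.path.join(branch, rel)
        # lexists() also reports dangling symlinks as present.
        if os.path.lexists(candidate):
            return candidate
    return None
```

Calling it with the branch directories of step 1 (or the tmpfs and NFS mount points of step 4) and the path from the failing test case shows which underlying filesystem to look at first.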

The KVM instance boots up and runs a bunch of test cases for all the tools that belong in the boot image, and every accessed file is logged by step 1. This gives us the list of files needed in the boot image, which lets us create minimal boot images.
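
Turning the access log into that file list could look roughly like this. It is a sketch under assumptions: the log format (one absolute path per line) and the name `needed_files` are made up for illustration and are not what the logging plugin necessarily emits.

```python
def needed_files(log_lines):
    """Return the sorted, de-duplicated paths from an access log.

    Assumes one absolute path per line (hypothetical format). Blank
    lines and anything not starting with "/" are skipped.
    """
    seen = set()
    for line in log_lines:
        path = line.strip()
        if path.startswith("/"):
            seen.add(path)
    return sorted(seen)
```

Passing an open log file (for example `needed_files(open(log_path))`, with `log_path` standing in for wherever the plugin writes) yields the unique paths to copy into the image.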

Now the problem is that the test cases randomly get an I/O error. This either causes a Bus Error in an application itself or makes reading some file fail. At the moment this is fatal to ~80% of build attempts for one specific image and one user, as it hits an essential systemd service file. It works fine for another user. It works better when the build server is freshly booted and seems to get slightly worse over time. Something fishy is going on there.

Are there any known random failures with either tmpfs or nfs as branches? Or do you have tips for debugging this without getting a billion lines of strace output?

@rpodgorny
Owner

hi!

...unfortunately, i don't know of any bugs that would be specific to nfs or tmpfs. also, your setup seems too complicated to draw any conclusion from or to give better advice than "try to remove some of the layers" (just for testing purposes)... :-(
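
one way to test the layers one at a time without strace might be to read every file under a given mount point and note which reads fail. a minimal sketch, assuming plain reads are enough to trigger the error; `find_read_errors` is a hypothetical helper, not part of unionfs:

```python
import errno
import os
import stat


def _errno_name(exc):
    # Map e.g. errno 5 to "EIO"; fall back to the raw number as text.
    return errno.errorcode.get(exc.errno, str(exc.errno))


def find_read_errors(root, chunk_size=1 << 20):
    """Read every regular file under root and collect the failures.

    Returns a list of (path, errno_name) tuples for directories that
    could not be listed and files that could not be stat'ed or read.
    """
    failures = []

    def on_walk_error(exc):
        # os.walk() reports directory listing errors through this hook.
        failures.append((exc.filename, _errno_name(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks, sockets, devices and FIFOs.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as f:
                    while f.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, _errno_name(exc)))
    return failures
```

running it once against each layer's mount point (the first unionfs, the nfs mount, the final unionfs) should show at which layer entries like `EIO` first appear. reads served from the page cache may hide a problem, so a clean pass is not proof that a layer is fine.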

anyway, if you manage to find the problem and it's really caused by unionfs, i'd love to hear back from you, thanks!

@rpodgorny
Owner

@mrvn hi! i've just released v3.2 with some nfs fixes (among others) -> could you please check whether this fixes your problems? thanks!
