Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use at-family of syscalls instead of building paths #71

Open
mrvn opened this issue Jun 22, 2017 · 4 comments
Open

Use at-family of syscalls instead of building paths #71

mrvn opened this issue Jun 22, 2017 · 4 comments

Comments

@mrvn
Copy link

mrvn commented Jun 22, 2017

We use unionfs to overlay a ramdisk and NFS server to create the / filesystem of network booted nodes. This makes it kind of hard to have the individual branches visible and reachable by path. Any access to attempts to access /unionfs/ramdisk/ or /unionfs/nfs/. Which first needs to access /, so you have a loop and deadlock. To work around there is the chroot option. Which keeps the unionfs process running with a different /, usualy one where the branches are mounted, from the rest of the system.

But there is a better option. The *at family of syscalls allows accessing files relative to a file descriptor of a directory. Unionfs can open the directory for each branch and store the FD. Then when trying to access a file on a branch instead of building a new, branch specific, path the respective *at function can be used with the stored FD. E.g. readlink(branch_path, buf, bufsize) becomes readlink(branch_fd, path, bug, bufsize).

This has 2 benefits:

  1. No need to create a individual path for each branch. Which means this is faster and avoids the problem of hitting ENAMETOOLONG because the combined path is too long.
  2. avoids the looping and the need to chroot

We've been using a patched unionfs 0.24 with *at() support for years and are now looking into updating to 2.0. So I'm wondering if there is interest for an updated patch and how it should look like.

The current patch for 0.24 adds a compile time option and (if enabled) replaces all filesystem syscalls with vardiac macros that take a branch and paths, e.g.

+int branch_symlink(const char *oldpath, int branch, ...);
+#define SYMLINK(oldpath, branch, ...) branch_symlink(oldpath, branch, VA_ARGS, NULL)

  •   if (symlink(link, cow->to_path)) {
    
  •   if (SYMLINK(link, cow->to_branch, cow->to_path)) {
    

The reason for the vardiac macro is that in places paths are created inside unionfs, e.g. to add _HIDDEN~ to the filename in filedir_hidden(). The macro avoid constructing one path to add the _HIDDEN~ and then another to add the branch path in front (and fix the bug that ENAMETOOLONG is not checked here).

@rpodgorny
Copy link
Owner

yes, i'll be interested in that! ...do you have the 0.24 patch somewhere?

@mrvn
Copy link
Author

mrvn commented Jun 22, 2017

@rpodgorny
Copy link
Owner

oh, boy, that's huge... :-( ...any special reason you picked macros and any special reason you picked varargs? (why not just add ifdefs?)

@mrvn
Copy link
Author

mrvn commented Jun 22, 2017

In some places unionfs adds things to the path, like the _HIDDEN~. With the varargs the code can just pass all parts of the path as argument and the final build_path call will build a complete path once. The idea was to only copy all the path parts and check for ENAMETOOLONG once. And iirc the macros are needed so the varargs work.

Ideally the path shouldn't be copied at all for most operations when using *at functions, only a few places need it. But without them it needs to copy. The macros and varargs allowed to use the same API in all cases at the cost of copying always once. Something worth reconsidering.

Note: The original code did not check for ENAMETOOLONG in several places.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants