Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS] Fix dentry of open file handles related to a rename (revised version) #1979

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

g2flyer
Copy link
Contributor

@g2flyer g2flyer commented Aug 22, 2024

Description of the changes

During rename, the dentry of the corresponding handle is not updated. Keeping that consistent is necessary to resolve issue #1835 in Draft PR #1856.

This is a revised (simplified and cleaned-up) version of PR #1874.

How to test this PR?

Usual regression tests. Note, that PR #1874 added additional regression tests to cover the cases which lead to discovering this issue but where then incorporated in (already merged) PR #1875.


This change is Reviewable

@g2flyer g2flyer marked this pull request as ready for review September 3, 2024 19:31
Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 15 of 15 files at r1, all commit messages.
Reviewable status: all files reviewed, 20 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @g2flyer)


-- commits line 9 at r1:
To be honest, I would just merge these two commits into one. They both pertain to the same problem, and tracking which fixups go to which commit is a pain.


libos/include/libos_handle.h line 168 at r1 (raw file):

    struct libos_fs* fs;
    /* dentry can change due to rename, so to derefence or update requires holding `lock`. */

I think this is simpler:

/* dentry field can change due to rename, so all accesses should be protected by `handle->lock` */

libos/src/bookkeep/libos_handle.c line 328 at r1 (raw file):

static int clear_posix_locks(struct libos_handle* handle) {
    int ret = 0;
    if (handle) {
  • This double-if looks ugly.
  • Not having a clear error code path on ret < 0 is super error prone (what if in the future we'll add another func call? we'll get lost in the control paths)

Can we refactor to:

    if (!handle)
        return 0;

    int ret;
    lock(&handle->lock);
    if (handle->dentry) {
        ... as it is in the old code ...
        ret = file_lock_set(handle->dentry, &file_lock, /*block=*/false);
        if (ret < 0) {
            log_warning("error releasing locks: %s", unix_strerror(ret));
            goto out;
        }
    }
    ret = 0;
out:
    unlock(&handle->lock);
    return ret;

libos/src/bookkeep/libos_handle.c line 519 at r1 (raw file):

    /* Clear flock (BSD) locks for a file. We are required to do that when the handle is closed. */
    int ret = 0;
    if (hdl) {

ditto (refactoring)


libos/src/fs/libos_fs_pseudo.c line 271 at r1 (raw file):

    /* Note: derefence handle->dentry in general has to be protected by handle->lock as it could
     * change due to rename.  However, as pseudo-fs does not support rename we can safely omit it
     * here (or push it on the numerous callers of fs_op->hstat). */

I'm confused by this comment. In libos_stat.c file you do exactly the second option: you push acquiring/releasing the lock to the caller of fs_ops->hstat. So then why do you need this comment here, as pseudo_hstat() will be called anyway under the required lock?

I looked at all usages of hstat(), please see my other comments. I think you missed one place.


libos/src/fs/libos_namei.c line 370 at r1 (raw file):

    assert(dent->inode);

    hdl->dentry = dent; /* not-yet-shared handle, so no look needed. */

look -> lock


libos/src/fs/libos_namei.c line 384 at r1 (raw file):

    assert(locked(&g_dcache_lock));
    assert(dent->inode);
    assert(!hdl->dentry); /* not-yet-shared handle, so no look needed. */

look -> lock


libos/src/fs/libos_namei.c line 434 at r1 (raw file):

    if (hdl)
        assert(!hdl->dentry); /* not-yet-shared handle, so no look needed. */

look -> lock


libos/src/fs/libos_namei.c line 746 at r1 (raw file):

    lock(&hdl->lock); /* while hdl->is_dir is immutable, hdl->dentry can change due to rename */
    get_dentry(hdl->dentry);
    *dir = hdl->dentry;

But if rename happens in parallel, we return some old dentry for this dir. Shouldn't we look at the callers of this function and somehow rearrange both the callers and this callee?

Or your analysis of the callers shows that it's ok for this func to return old dentry?


libos/src/ipc/libos_ipc_process_info.c line 106 at r1 (raw file):

                   if (g_process.exec) {
                       lock(&g_process.exec->lock);
                       dent = g_process.exec->dentry;

This is clearly wrong. We could have a rename in parallel, and dent could point to a dangling pointer (if g_process.exec->dentry has no other references and is garbage-collected in the meantime). You can either get_dentry(dent) and then put_dentry() after it was used, or move the locking to a wider scope.


libos/src/sys/libos_fcntl.c line 357 at r1 (raw file):

    lock(&hdl->lock);
    ret = file_lock_set(hdl->dentry, &file_lock, !(cmd & LOCK_NB));

This feels wrong to me. This is a long operation, and also file_lock_set() is not really about the hdl object, it's about the underlying dentry object. So I think it's more logical to refactor like this:

    libos_dentry dent = NULL;
    lock(&hdl->lock);
    if (hdl->dentry) {
        dent = hdl->dentry;
        get_dentry(dent);
    }
    unlock(&hdl->lock);

    ret = file_lock_set(dent, &file_lock, !(cmd & LOCK_NB));
    if (dent)
        put_dentry(dent);

libos/src/sys/libos_file.c line 365 at r1 (raw file):

        struct libos_handle* handle = fd_handle->handle;
        /* see comment in libos_handle.h on loocking strategy protecting handle->lock */
        assert(locked(&g_dcache_lock));

I don't think we need this assert, we already have it at the top of the func. And the comment here is enough to figure which locks are needed.


libos/src/sys/libos_file.c line 373 at r1 (raw file):

        }
        unlock(&handle->lock);
    }

Please add an empty line after this one


libos/src/sys/libos_getcwd.c line 112 at r1 (raw file):

out:
    put_handle(hdl);

This put_handle() should happen after all unlock operations.


libos/src/sys/libos_ioctl.c line 133 at r1 (raw file):

            if (fs->fs_ops->hstat) {
                struct stat stat;
                ret = fs->fs_ops->hstat(hdl, &stat);

Aren't you supposed to surround this with locking too?


libos/src/sys/libos_pipe.c line 249 at r1 (raw file):

    hdl1->acc_mode = MAY_READ;
    get_dentry(dent);
    hdl1->dentry = dent; /* new not-yet-exported handle, so skip unnecessary handle locking */

This is actually obvious, and in our codebase we don't add comments that we skip locking on currently-being-created objects. I would just remove these two comments (so, revert this file to its initial state).


libos/src/sys/libos_pipe.c line 256 at r1 (raw file):

    hdl2->acc_mode = MAY_WRITE;
    get_dentry(dent);
    hdl2->dentry = dent; /* new not-yet-exported handle, so skip unnecessary handle locking */

ditto


libos/src/sys/libos_stat.c line 42 at r1 (raw file):

        return -EACCES;

    lock(&hdl->lock);

This lock actually protects both hstat() call and hdl->dentry accesses, right? Could you add this comment?


libos/src/sys/libos_stat.c line 44 at r1 (raw file):

    lock(&hdl->lock);
    int ret = fs->fs_ops->hstat(hdl, stat);
    if (ret >= 0 && hdl->dentry) {

This looks unconventional, but the code snippet is really small, so I let it pass.


libos/src/utils/log.c line 39 at r1 (raw file):

        lock(&g_process.exec->lock);
        if (g_process.exec->dentry) {
            exec_name = g_process.exec->dentry->name;

This looks wrong. What if the g_process.exec->dentry object is garbage collected in parallel, and then exec_name is a dangling pointer? Looks like we need to do strdup() here, and free() at the end of the func.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants