
[BUG] - Custom Keycloak's groups not populated into Shared Volume level #2476

viniciusdc opened this issue May 21, 2024 · 3 comments

Describe the bug

Since Nebari version 2024.3.3, custom groups created in Keycloak no longer show up in the user's JupyterLab pod after re-spawning the instance. This appears to have been caused by the upgraded group-fetching logic introduced in #2447.

Expected behavior

At least until the new permission model update is in place, we expect that a new group created in Keycloak will automatically generate a subfolder in the shared directory of the user's home.

OS and architecture in which you are running Nebari

Linux

How to Reproduce the problem?

  • Spin up the latest release of Nebari, or any version >= 2024.3.3
  • Create a custom group in Keycloak and assign a user to it
  • Logged in as that user, spin up a JupyterLab instance

Command output

No response

Versions and dependencies used.

No response

Compute environment

None

Integrations

Keycloak

Anything else?

This is most likely a parsing error in our current base_profile_shared_mounts function.

Adam-D-Lewis removed the needs: triage 🚦 label May 21, 2024
viniciusdc (Contributor, Author) commented

Before I go further into my findings, here is an overview of what is happening under the hood, at least for the components involved.

After updating to the latest OAuthenticator and the changes introduced in #2447 for Keycloak group fetching, we started seeing issues with the group EFS shared subfolders: they don't appear in the user's JupyterLab after the instance is launched when the user belongs to a custom Keycloak group.

After checking our source code, I found that base_profile_shared_mounts, referenced below:

extra_container_config = {
    "volumeMounts": [
        {
            "mountPath": pod_shared_mount_path.format(group=group),
            "name": "shared" if home_pvc_name != shared_pvc_name else "home",
            "subPath": pvc_shared_mount_path.format(group=group),
        }
        for group in groups
    ]
}

MKDIR_OWN_DIRECTORY = "mkdir -p /mnt/{path} && chmod 777 /mnt/{path}"
command = " && ".join(
    [
        MKDIR_OWN_DIRECTORY.format(path=pvc_shared_mount_path.format(group=group))
        for group in groups
    ]
)
init_containers = [
    {
        "name": "initialize-shared-mounts",
        "image": "busybox:1.31",
        "command": ["sh", "-c", command],
        "securityContext": {"runAsUser": 0},
        "volumeMounts": [
            {
                "mountPath": f"/mnt/{pvc_shared_mount_path.format(group=group)}",
                "name": "shared" if home_pvc_name != shared_pvc_name else "home",
                "subPath": pvc_shared_mount_path.format(group=group),
            }
            for group in groups
        ],
    }
]
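To make that expansion concrete, here is a minimal sketch using hypothetical template values ("/shared/{group}" and "shared/{group}"); the real pod/pvc mount path templates come from the Nebari configuration, not from this example:

# Minimal sketch of the expansion above, with hypothetical templates.
pvc_shared_mount_path = "shared/{group}"
pod_shared_mount_path = "/shared/{group}"
groups = ["developers", "my-custom-group"]

for group in groups:
    print(
        pod_shared_mount_path.format(group=group),  # container mountPath
        pvc_shared_mount_path.format(group=group),  # PVC subPath
    )
# /shared/developers shared/developers
# /shared/my-custom-group shared/my-custom-group

command = " && ".join(
    "mkdir -p /mnt/{path} && chmod 777 /mnt/{path}".format(
        path=pvc_shared_mount_path.format(group=group)
    )
    for group in groups
)
# mkdir -p /mnt/shared/developers && chmod 777 /mnt/shared/developers && ...

So each group must resolve to a clean relative subPath that matches the directory the init container creates.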

It expects the groups object to come already filtered from the OAuth class's oauth_user auth state, here:

# only return the lowest level group name
# e.g. /projects/myproj -> myproj
# and /developers -> developers
groups = [Path(group).name for group in auth_state["oauth_user"]["groups"]]
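As a quick illustration of that filtering (and of what the raw values would look like if it were skipped):

from pathlib import Path

# Keycloak typically reports group membership as full group paths.
raw_groups = ["/projects/myproj", "/developers", "/admin"]

print([Path(g).name for g in raw_groups])
# ['myproj', 'developers', 'admin']

# Without this step, the raw paths would be formatted straight into the
# mount templates, e.g. producing a subPath like "shared//developers"
# (assuming a hypothetical "shared/{group}" template) instead of
# "shared/developers" -- the kind of mismatch suspected here.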

That groups information is populated upstream by the generic OAuthenticator class, here:

    async def update_auth_model(self, auth_model):
        """
        Sets admin status to True or False if `admin_groups` is configured and
        the user isn't part of `admin_users` or `admin_groups`. Note that
        leaving it at None makes users able to retain an admin status while
        setting it to False makes it be revoked.

        Also populates groups if `manage_groups` is set.
        """
        if self.manage_groups or self.admin_groups:
            user_info = auth_model["auth_state"][self.user_auth_state_key]
            user_groups = self.get_user_groups(user_info)

        if self.manage_groups:
            auth_model["groups"] = sorted(user_groups)

        if auth_model["admin"]:
            # auth_model["admin"] being True means the user was in admin_users
            return auth_model

        if self.admin_groups:
            # admin status should in this case be True or False, not None
            auth_model["admin"] = bool(user_groups & self.admin_groups)

        return auth_model
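For reference, here is a rough illustration (placeholder data, not captured from a real deployment) of the auth model shape this method works with when manage_groups is enabled:

# Placeholder data only -- not captured from a real Nebari deployment.
auth_model = {
    "name": "jdoe",
    "admin": False,
    "auth_state": {
        # user_auth_state_key defaults to "oauth_user" in OAuthenticator,
        # which is why Nebari reads auth_state["oauth_user"]["groups"] above.
        "oauth_user": {
            "preferred_username": "jdoe",
            # Keycloak group membership, typically as full group paths:
            "groups": ["/developers", "/projects/myproj"],
        }
    },
}

# With manage_groups enabled, update_auth_model copies the groups returned by
# get_user_groups(user_info) into auth_model["groups"] (sorted); the
# auth_state entry itself is what later reaches base_profile_shared_mounts.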

This is later customized by our KeycloakOauthenticator class.

I initially assumed this was an issue with the upstream authenticator code, since we were passing the required settings but the groups did not seem to receive the correct values, even though I was able to confirm they were being sent by Keycloak to the authenticator (you can verify this by inspecting the client user response body in the admin realm):

[Screenshot: Keycloak admin console showing the groups returned in the client user response body]

After talking with @krassowski, I am confident the issue is within our code, most likely either in the base mounts function outlined above (since the contents of groups could differ from before) or in the custom Keycloak OAuth class logic.

I am currently inspecting the response object passed to both the Keycloak class and the base mounts function to determine which part of the codebase is incorrect.
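In the meantime, one low-friction way to see what the authenticator is actually handing over is to log it from JupyterHub's post_auth_hook. This is only a debugging sketch (an assumption about a convenient place to hook in, not part of any fix), intended for jupyterhub_config.py or an equivalent extraConfig snippet, where c is the JupyterHub config object:

# Debugging sketch only: log the groups right after authentication so they
# can be compared with what base_profile_shared_mounts receives later.
def log_groups_post_auth(authenticator, handler, auth_model):
    auth_state = auth_model.get("auth_state") or {}
    oauth_user = auth_state.get("oauth_user") or {}
    authenticator.log.info(
        "post_auth groups: auth_model=%s oauth_user=%s",
        auth_model.get("groups"),
        oauth_user.get("groups"),
    )
    return auth_model

c.Authenticator.post_auth_hook = log_groups_post_auth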

viniciusdc self-assigned this May 22, 2024
Adam-D-Lewis (Member) commented May 23, 2024

BTW, the shared directory randomly showed up for me after a while in JupyterLab. I had previously ssh'd into the NFS pod and manually created the directory, so I'm not sure if that affected it. I'm also not sure why it took so long; maybe a redeployment of Nebari was necessary.

viniciusdc (Contributor, Author) commented

I re-attempted this yesterday from develop, but the error didn't show up (locally) this time. I will test again on GCP (the same setup as when I first found this) to confirm whether it is still present.

I am trying to think of a reason why this appeared. I inspected the Keycloak authenticator, and the objects there seemed okay based on what was available.

One thing that was different in my last attempt was that I directly deployed the latest version, whereas the previous attempt was an upgrade from an older version to a more recent release. It could have been a silent issue somewhere within the EFS that I didn't catch.
