Fix current version bug #366

Open
wants to merge 8 commits into
base: release-2025.0.1

EstelleDa
Member

@EstelleDa EstelleDa commented Dec 5, 2024

Only show an unpublished superseding score set to users who have permission to view it. For other users, the current version will only be the published one.

Fix #370

…missions to view it. The current version to other users will only be the published one.
Collaborator

@bencap bencap left a comment

Thanks for looking at this, Estelle. Is there a reason why all of the function definitions got an extra indent? It might also be nice to add a test that verifies this new behavior for superseding score sets.

Comment on lines 107 to 116
if (
    item
    and item.superseding_score_set
    and not owner_or_contributor
    and (
        urn_re.MAVEDB_OLD_TMP_URN_RE.fullmatch(item.superseding_score_set.urn)
        or urn_re.MAVEDB_TMP_URN_RE.fullmatch(item.superseding_score_set.urn)
    )
):
    item.superseding_score_set = None
Collaborator

What if instead of checking like this we check the permissions of the superseding score set, something like:

Suggested change
from mavedb.lib.permissions import Action, assert_permission, has_permission

if item.superseding_score_set:
    superseding_score_set = db.scalars(
        select(ScoreSet).where(ScoreSet.urn == item.superseding_score_set)
    ).one()
    if not has_permission(user, superseding_score_set, Action.READ).permitted:
        item.superseding_score_set = None

This way, we don't have to maintain any extra permission logic and can guarantee the item within the superseding score set property is only returned if the user has access to it.
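
A minimal, self-contained sketch of that idea, using in-memory stand-ins rather than the real MaveDB models. `ScoreSet`, `PermissionResponse`, `hide_unreadable_superseding`, and this `has_permission` are simplified, hypothetical versions; the real `has_permission` lives in `mavedb.lib.permissions`, and the "published or contributor" rule below is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Action(Enum):
    READ = "read"


@dataclass
class PermissionResponse:
    permitted: bool


@dataclass
class ScoreSet:
    urn: str
    published: bool
    contributors: set = field(default_factory=set)
    superseding_score_set: Optional["ScoreSet"] = None


def has_permission(user: Optional[str], item: ScoreSet, action: Action) -> PermissionResponse:
    # Assumed rule for this sketch: published items are readable by anyone,
    # unpublished items only by their contributors.
    return PermissionResponse(item.published or user in item.contributors)


def hide_unreadable_superseding(user: Optional[str], item: ScoreSet) -> ScoreSet:
    # Drop the superseding score set from the response if the user cannot read it.
    superseding = item.superseding_score_set
    if superseding is not None and not has_permission(user, superseding, Action.READ).permitted:
        item.superseding_score_set = None
    return item
```

Because the check is delegated to `has_permission`, the router never needs to inspect the URN format to decide whether a score set is temporary.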

Member Author

I modified it. I check item.superseding_score_set directly because it's an object.

@bencap bencap changed the base branch from release-2024.4.3 to release-2024.4.4 December 10, 2024 22:36
Collaborator

@bencap bencap left a comment

Would you also be able to fix the extra indentations before merging? The logic looks good!

@EstelleDa
Member Author

Would you also be able to fix the extra indentations before merging? The logic looks good!

@bencap Do you mean the one we discussed in Slack, #371?

…score sets are unpublished yet. Haven't fixed the search score set codes.
@bencap
Collaborator

bencap commented Jan 7, 2025

Would you also be able to fix the extra indentations before merging? The logic looks good!

@bencap Do you mean the one we discussed in Slack, #371?

No, it seems like there is just an extra level of indentation that got added to each of the function definitions in the src/mavedb/routers/score_sets.py file for whatever reason.

Comment on lines 172 to 201
superseding_score_sets = (
    db.query(ScoreSet)
    .filter(ScoreSet.experiment_id == experiment.id)
    .filter(ScoreSet.superseding_score_set.has())
    .all()
)

updated_score_set_result = []
for s in score_set_result:
    current_version = s
    while current_version:
        if current_version.superseded_score_set:
            if not has_permission(user_data, current_version, Action.READ).permitted:
                next_version: Optional[ScoreSet] = next(
                    (sup for sup in superseding_score_sets if sup.urn == current_version.superseded_score_set.urn),
                    None
                )
                # handle poetry run mypy src/ error so that add next_version
                if next_version:
                    current_version = next_version
                else:
                    break
            else:
                break
        else:
            break
    if current_version:
        updated_score_set_result.append(current_version)
    else:
        updated_score_set_result.append(s)
Collaborator

My thoughts on this are that it is something we may do often enough that we will want a separate utility function. What I would suggest is:

  • Create a new function, find_superseded_score_set_tail. We pass it a ScoreSet database object, plus an optional user_data object and an optional action. If we call the function with action=<some_action>, it returns the permitted tail of the superseded score set chain rather than the 'actual' tail. So if we have a chain like urn1 -> urn2 -> urn3 -> tmp1 -> tmp2 and the user has permission for all the urn objects but not the tmp objects, we return the urn3 object rather than tmp2. If we call the function with action=None, we ignore user permissions and just return the end of the chain (tmp2 in this case). Note that we presume the user already has permission on the passed score set; that permission should be checked by the caller.
def find_superseded_score_set_tail(score_set: ScoreSet, action: Optional[Action] = None, user_data: Optional[UserData] = None):
    ...
  • I like the iterative approach you took above. We're basically just traversing a singly linked list, but with the added caveat that we have to look out for permissions and keep track of the previous score set so we can retain and return it. I think it'd look something like:
def find_superseded_score_set_tail(score_set: ScoreSet, action: Optional[Action] = None, user_data: Optional[UserData] = None):
    while score_set.superseding_score_set is not None:
        next_score_set_in_chain = score_set.superseding_score_set

        # If we were given an action to check and the user is not permitted to perform it on the next score set
        # in the chain, pretend like we have reached the end of the chain. Otherwise, continue to the next score set.
        if action is not None and not has_permission(user_data, next_score_set_in_chain, action).permitted:
            return score_set

        score_set = next_score_set_in_chain

    return score_set
  • Now, we can call our new helper function on each of the score sets in the score set result and then guarantee the permissions of the base results:
...

score_set_result = (
    db.query(ScoreSet)
    .filter(ScoreSet.experiment_id == experiment.id)
    .filter(~ScoreSet.superseding_score_set.has())
    .all()
)

# First, guarantee the permissions of the base score set results. We should verify permissions can only get narrower for superseding score sets. If they can widen, that would lead to bugs with this method.
score_set_result = [
    score_set for score_set in score_set_result if has_permission(
         user_data, 
         score_set, 
         Action.READ
    ).permitted
]

# If a score set has no superseding score set, we have already guaranteed its permissions. This function guarantees the permission of the superseding score sets.
superseded_score_set_tails = [
    find_superseded_score_set_tail(
        score_set, 
        Action.READ, 
        user_data
    ) for score_set in score_set_result
]

...
  • superseded_score_set_tails now contains every score set that has no superseding score set and that the user has permission to read, plus, for each score set that does have a superseding score set, the deepest superseding score set the user has permission to read.

Honestly, this is a super tricky little problem, so I'm happy to talk more about it if any of that doesn't make sense. It's also possible that I made a mistake somewhere in that, so lmk if you see one. Given the complexity, this might even deserve its own change request. But I like that this solution lets us reuse the find_superseded_score_set_tail function and write some test cases for it. Its complexity means it isn't something we want to maintain in multiple spots, and we will probably want to reuse it in other places where we need to serve the end of a superseding score set chain to a user.
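
As a rough illustration of the chain example above, here is a self-contained sketch that runs the helper against the urn1 -> urn2 -> urn3 -> tmp1 -> tmp2 chain. The `ScoreSet`, `PermissionResponse`, `has_permission`, and `build_chain` definitions are hypothetical in-memory stand-ins, and the permission rule (only an "owner" can read tmp score sets) is an assumption for the demo. `has_permission` is called as `(user_data, item, action)` with a `.permitted` result, matching the other calls in this PR.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    READ = "read"


@dataclass
class PermissionResponse:
    permitted: bool


@dataclass
class ScoreSet:
    urn: str
    superseding_score_set: Optional["ScoreSet"] = None


def has_permission(user_data: Optional[str], item: ScoreSet, action: Action) -> PermissionResponse:
    # Assumed rule: "tmp" score sets are unpublished and readable only by "owner".
    return PermissionResponse(not item.urn.startswith("tmp") or user_data == "owner")


def find_superseded_score_set_tail(score_set, action=None, user_data=None):
    while score_set.superseding_score_set is not None:
        next_score_set_in_chain = score_set.superseding_score_set
        # Stop at the last score set the user is allowed to see.
        if action is not None and not has_permission(user_data, next_score_set_in_chain, action).permitted:
            return score_set
        score_set = next_score_set_in_chain
    return score_set


def build_chain(urns):
    # Link the URNs so each score set is superseded by the next; return the head.
    head = None
    for urn in reversed(urns):
        head = ScoreSet(urn, superseding_score_set=head)
    return head


head = build_chain(["urn1", "urn2", "urn3", "tmp1", "tmp2"])
print(find_superseded_score_set_tail(head, Action.READ, "reader").urn)  # urn3
print(find_superseded_score_set_tail(head, Action.READ, "owner").urn)   # tmp2
print(find_superseded_score_set_tail(head).urn)                         # tmp2
```

The three calls cover the cases described in the first bullet: a user without access to the tmp score sets, a user with access, and a call that ignores permissions entirely.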

Member Author

There is one case this cannot handle: if the superseding score set is temporary, it will be filtered out of the first score_set_result. An example is experiment urn:mavedb:00000049-a, whose score set list will be empty. I agree with creating a separate function to find the current version of a score set, as you suggested. That will make it easier to handle in the future.
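
To make that case concrete, here is a small sketch with hypothetical in-memory data (not the real records for urn:mavedb:00000049-a): one published score set superseded by a temporary one, viewed by an anonymous user. Selecting only score sets without a superseding score set and then filtering by permission leaves nothing. Starting from the score sets that do not supersede anything and walking forward, shown here as just one possible alternative, keeps the published version. `can_read`, `tails_then_permissions`, and `heads_then_walk` are illustrative names, not functions from the codebase.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class ScoreSet:
    urn: str
    published: bool
    superseding_score_set: Optional["ScoreSet"] = None
    superseded_score_set: Optional["ScoreSet"] = None


def can_read(score_set: ScoreSet) -> bool:
    # Assumed rule for an anonymous user: only published score sets are readable.
    return score_set.published


def tails_then_permissions(score_sets: List[ScoreSet]) -> List[ScoreSet]:
    # Keep only chain tails, then filter those by permission.
    tails = [s for s in score_sets if s.superseding_score_set is None]
    return [s for s in tails if can_read(s)]


def heads_then_walk(score_sets: List[ScoreSet]) -> List[ScoreSet]:
    # Start at readable chain heads and walk forward while the next one is readable.
    result = []
    for s in score_sets:
        if s.superseded_score_set is not None or not can_read(s):
            continue
        while s.superseding_score_set is not None and can_read(s.superseding_score_set):
            s = s.superseding_score_set
        result.append(s)
    return result


published = ScoreSet("urn:example:1", published=True)
temporary = ScoreSet("tmp:example", published=False, superseded_score_set=published)
published.superseding_score_set = temporary

print([s.urn for s in tails_then_permissions([published, temporary])])  # []
print([s.urn for s in heads_then_walk([published, temporary])])         # ['urn:example:1']
```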

@EstelleDa
Member Author

#371

I see. I think the cause is that I clicked the tidy file button. I reverted those changes before pushing the code, but I must have missed this file.

@bencap bencap changed the base branch from release-2024.4.4 to release-2025.0.1 January 25, 2025 02:33