Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple variant tags filtering #3502

Merged
merged 31 commits into from
Sep 7, 2023
Merged

Multiple variant tags filtering #3502

merged 31 commits into from
Sep 7, 2023

Conversation

ShifaSZ
Copy link
Contributor

@ShifaSZ ShifaSZ commented Jul 20, 2023

Now, it updates the saved-variant filtering with multiple tags. No behavior changes to the project saved-variants.

@ShifaSZ ShifaSZ marked this pull request as ready for review July 27, 2023 13:26
Comment on lines 91 to 93
saved_variant_models = SavedVariant.objects.filter(
varianttag__variant_tag_type__name__in=tags,
varianttag__variant_tag_type__project__isnull=True,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns more saved variants than we need. But the front end will filter out the proper variants to display.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be doing all the filtering on the backend for the summary data page. Why not keep the 2 distinct queries we had before so that we actually only retunr the correct variants:

tag_types = VariantTagType.objects.filter(name__in=tags, project__isnull=True)
saved_variant_models = SavedVariant.objects.filter(varianttag__variant_tag_type__in=tag_types)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tag_types = VariantTagType.objects.filter(name__in=tags, project__isnull=True)
saved_variant_models = SavedVariant.objects.filter(varianttag__variant_tag_type__in=tag_types)

I did some tests and found the result is the same. I can't find a Django filter that means our requirements. What I could do is add a Python filter afterward.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't you just chain the filters?

saved_variant_models = SavedVariant.objects
for tag_type in tag_types:
    saved_variant_models = saved_variant_models.filter(varianttag__variant_tag_type=tag_type)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works. Can it be a concern about having many DB queries?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

django converts that to 1 db query. Chaining filter statements does not execute a query

Comment on lines 68 to 71
} else if (gene) {
dispatch({
type: RECEIVE_SAVED_VARIANT_TAGS,
updates: SUMMARY_PAGE_SAVED_VARIANT_TAGS.reduce((acc, t) => ({ ...acc, [t]: false }), {}),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clear loaded tags flags. It is better to detect gene changes. But I wonder if it's worth doing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we clear previously loaded data because we loaded more data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the data loaded for the tags must be reloaded if the gene changes because the data differs for the genes with the same tags. I don't detect gene changes (I probably should). So clear tags loaded flag if the gene is not null to ensure correct results.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if we previously loaded like all the variants for a different tag, why would clear that? We should only clear for the tag(s) that were loaded

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the keys from tags to tag and gene pairs? It is cleaner and has better performance.

@ShifaSZ ShifaSZ requested a review from hanars July 27, 2023 15:40
seqr/views/apis/summary_data_api.py Show resolved Hide resolved
Comment on lines 91 to 93
saved_variant_models = SavedVariant.objects.filter(
varianttag__variant_tag_type__name__in=tags,
varianttag__variant_tag_type__project__isnull=True,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be doing all the filtering on the backend for the summary data page. Why not keep the 2 distinct queries we had before so that we actually only retunr the correct variants:

tag_types = VariantTagType.objects.filter(name__in=tags, project__isnull=True)
saved_variant_models = SavedVariant.objects.filter(varianttag__variant_tag_type__in=tag_types)

@@ -1041,6 +1041,29 @@ export const REVIEW_TAG_NAME = 'Review'
export const KNOWN_GENE_FOR_PHENOTYPE_TAG_NAME = 'Known gene for phenotype'
export const DISCOVERY_CATEGORY_NAME = 'CMG Discovery Tags'
export const MME_TAG_NAME = 'MME Submission'
export const SUMMARY_PAGE_SAVED_VARIANT_TAGS = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is confusing to have a page-specific constant in the global constants list, exspecially sicnce its used in other forms elsewhere. It is better to have a single VARIANT_TAGS constant and then if there need to be different tag options on different pages you can filter out the ones you don;t want for that specific page

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dropdown tag options for the summary page and project page are different.
On the summary page, they are as the following figure:

On the project pages, they are:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should not be a page specific constant in the shared constants file. If things are different on different pages, there should be page specific constants in the page specific constants files. If this constant is used in multiple places, then it should not be named SUMMARY_PAGE_SAVED_VARIANT_TAGS it should just be SAVED_VARIANT_TAGS. If it is just used in one place for one page, then it should be in a different file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the constant into the SummaryData folder.

(responseJson) => {
if (tag && !gene) {
if (tags[0] === SHOW_ALL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this case is impossible - you can't return all variants with no gene filter applied

Comment on lines 68 to 71
} else if (gene) {
dispatch({
type: RECEIVE_SAVED_VARIANT_TAGS,
updates: SUMMARY_PAGE_SAVED_VARIANT_TAGS.reduce((acc, t) => ({ ...acc, [t]: false }), {}),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we clear previously loaded data because we loaded more data?

Comment on lines 1 to 2
/* eslint-disable import/prefer-default-export */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a warning that the eslint-disable doesn't change anything because there are no import/prefer-default-export violations.

Comment on lines 145 to 150
const isCategory = categoryOptions.includes(newTag)
updateTableField('categoryFilter')(isCategory ? newTag : null)
const lastNewTag = newTag.length > 0 ? newTag[newTag.length - 1] : null
const isCategory = categoryOptions.includes(lastNewTag)
const [firstTag, ...otherTag] = newTag
const updatedTag = firstTag === ALL_FILTER || categoryOptions.includes(firstTag) ? otherTag : newTag
updateTableField('categoryFilter')(isCategory ? lastNewTag : null)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behaviors include: if the latest selected option is a category, remove all selected options; otherwise if the previous (or the first) selected option is a category, remove the category option.
The issue is that when a category is selected, the category disappears from the option list.
No category is selected:

After a category (Collaboration) is selected:

Copy link
Collaborator

@hanars hanars Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a category is selected, all the tags in the category should be shown as selected and should be added to the list of selected tag breadcrumbs, the category itself should remain in the dropdown and clicking it again will have no effect

Also, selected tags need to be the color of the tag, similar to how they behave in the UI for adding tags to a variant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the category itself should remain in the dropdown and clicking it again will have no effect

How about the category also disappear from the dropdown list? It is much easier to implement and it doesn't make sense to click on a selected category.

Comment on lines 46 to 48
const stateKey = tag && `${tag.split(TAG_URL_DELIMITER).sort().join(TAG_URL_DELIMITER)}${gene}`
if (tag) {
if (getState().savedVariantTags[tag]) {
if (getState().savedVariantTags[stateKey]) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the gene to the key for the state so that no states have to be removed when gene changed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be adding a lof of complexity that will be hard to maintain and have almost no performance improvement - we don't anticipate users toggling back and forth to the same gene, so refetching a gene is not an issue. Please explain why this was not needed before but is now needed because we are adding support for multiple tags

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a bug before when the gene changed. The UI still displayed the variant data for the previous gene.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then aither a) make a separate PR to fix the bug and branch from that. or b) make a ticket to fix that bug and fix it later and not as part of this PR. This PR is really hard to understand because there is a lot in here that is unrelated to the functionality change requested in the ticket, and I can't review it effectively

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can leave the reducer unchanged for this PR and create a bug ticket.

@@ -29,29 +29,7 @@ const FILTER_FIELDS = [
VARIANT_PER_PAGE_FIELD,
]

const TAG_OPTIONS = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was the advantage of moving this? The constant is only ever used here so it is better to keep it in the file. In general, PRs should really focus on changing the behavior needed for the actual functional change, adding unrelated changes makes reviewing the actual functional changes harder

Copy link
Contributor Author

@ShifaSZ ShifaSZ Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you are right. I used the tag options for the reducer to update the tag-loaded state before my latest update. We don't need it now.

@ShifaSZ ShifaSZ requested a review from hanars August 4, 2023 14:34
Copy link
Collaborator

@hanars hanars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR has gotten unwieldy, in large part because the scope has been really blown up. The initial ticket requested only that behavior for the summary data page change. In an attempt to simplify the change we decided to also change the functionality on the project page, but this has gotten way too complicated in how categories will be handled etc. I am restricting the scope of this back to only the summary page, ad the project page will continue to be a single select

@ShifaSZ ShifaSZ requested a review from hanars August 7, 2023 16:34
tag_type = VariantTagType.objects.get(name=tag, project__isnull=True)
saved_variant_models = SavedVariant.objects.filter(varianttag__variant_tag_type=tag_type)
tags = tag.split(';')
tag_type = VariantTagType.objects.filter(name__in=tags, project__isnull=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be tag_types plural, as it is a collection of models not a single model

seqr/views/apis/summary_data_api_tests.py Show resolved Hide resolved
@@ -231,14 +236,18 @@ class LocalSummaryDataAPITest(AuthenticationTestCase, SummaryDataAPITest):
MANAGER_VARIANT_GUID = 'SV0000006_1248367227_r0004_non'


def assert_has_expected_calls(self, users, skip_group_call_idxs=None):
def assert_has_expected_calls(self, users, skip_group_call_idxs=None, has_ws_access_level_call=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is only used twice so theres no reused behavior for mock_get_ws_access_level at all anymore - it wold be better to take that check out of the helper and explicitly add the correct check for it after calling this helper

const tags = tag ? tag.split(TAG_URL_DELIMITER) : tag
const tagCategoryFilters = tags || (variantGuid ? [] : [(tableState.categoryFilter || ALL_FILTER)])
const firstAppliedTagCategoryFilter = tagCategoryFilters.length > 0 ? tagCategoryFilters[0] : null
const appliedTagCategoryFilter = multiple ? tagCategoryFilters : firstAppliedTagCategoryFilter
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this functionality should not be shared, its really page specific and hard to understand whats happening here. Having a dropdownValue prop explicitly passed through here would make the page specific logic much clearer


let shownFilters = (discoveryFilters && appliedTagCategoryFilter === DISCOVERY_CATEGORY_NAME) ?
let shownFilters = (discoveryFilters && firstAppliedTagCategoryFilter === DISCOVERY_CATEGORY_NAME) ?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to above, discoveryFilters is page specific so its confusing to have this logic here, it would be better to move this logic into the page-specific component and conditionally pass the correct filters prop

pairedVariants = matchingVariants(
pairedVariants, ({ tagGuids }) => tagGuids.some(tagGuid => tagsByGuid[tagGuid].name === tag),
)
const tags = tag.split(TAG_URL_DELIMITER)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't doing the split here means all the above logic for MME/notes would not work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MME Submission/Has Notes won't work for multiple-tag filtering. They are not required since they are not on the summary data options list. But we should have a more systemic solution for handling tag/tags here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we filter the tags in the backend and don't support the project saved-variant multiple-tag filtering, these changes are unnecessary.


const appliedTagCategoryFilter = tag || (variantGuid ? null : (tableState.categoryFilter || ALL_FILTER))
const { familyGuid, variantGuid } = match.params
const optionValue = getSelectedTag(variantGuid ? null : (tableState.categoryFilter || ALL_FILTER))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a function prop to get the selected option value because the tableState is only available here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSelectedTag should then take tableState as an argument and have page specific handling for that, as this is still page specific logic. Moreover, variantGuid is only a valid prop on the project page, not the summary page, so that logic also belongs in the project page

@ShifaSZ ShifaSZ requested a review from hanars August 9, 2023 17:59
@@ -67,11 +68,17 @@ TAG_OPTIONS.push({

const PAGE_URL = '/summary_data/saved_variants'

const getUpdateTagUrl =
(selectedTag, match) => `${PAGE_URL}/${selectedTag}${match.params.gene ? `/${match.params.gene}` : ''}`
const getUpdateTagUrl = (selectedTag, match) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A multiple dropdown should have a list as its value, and it should therefore handle adding or removing elements to that list smoothly, and you should not be writing this sort of custom logic

Copy link
Contributor Author

@ShifaSZ ShifaSZ Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the custom logic is about the SHOW_ALL option. We need to remove the existing SHOW_ALL when adding a new option and remove all current options if the last option is SHOW_ALL.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show all should not act like a multi-option - there should be no way to select "show all" and also select other tags. Is there a reason users should even have the show all option anymore on the summary page? Presumably, if they send a request with no tags we will return results for all tags, so just have "All" be something thats shown by default when the tag list is empty should be suufficient


const appliedTagCategoryFilter = tag || (variantGuid ? null : (tableState.categoryFilter || ALL_FILTER))
const { familyGuid, variantGuid } = match.params
const optionValue = getSelectedTag(variantGuid ? null : (tableState.categoryFilter || ALL_FILTER))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSelectedTag should then take tableState as an argument and have page specific handling for that, as this is still page specific logic. Moreover, variantGuid is only a valid prop on the project page, not the summary page, so that logic also belongs in the project page

Comment on lines 135 to 139
if (tags.length > 1) {
pairedVariants = matchingVariants(
pairedVariants, ({ tagGuids }) => {
const tagNames = tagGuids.map(tagGuid => tagsByGuid[tagGuid].name)
return tags.every(tagName => tagNames.includes(tagName))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The combined DB filter you had implemented in the last round of review worked?

Comment on lines 57 to 58
dispatch({ type: RECEIVE_DATA, updatesById: responseJson })
dispatch({ type: RECEIVE_SAVED_VARIANT_TAGS, updates: { [tag]: responseJson.savedVariantsByGuid } })
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works but is risky because the first dispatch stops the loading state, and the second dispatch update the data state later. But it won't work if we do it in another order because dispatching RECEIVE_SAVED_VARIANT_TAGS triggers the selector running, which needs the data from the other dispatch.
Are there ways to do these two dispatches together?

Copy link
Collaborator

@hanars hanars Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there ways to do these two dispatches together?

yes, instead of making the new state field have its own action type the way you do here, have it use RECEIVE_DATA as well. However you should not be storing the exact same data twice. Just save the guids associated with the compound tag and then get the full variants in the selector
Note: edited because my first suggestion was unneccessarily cmplex

dispatch({ type: RECEIVE_DATA, updatesById: { ...responseJson,  multiTagVariants: { [tag]: Object.keys(responseJson.savedVariantsByGuid) } })
...
// In reducer definition
savedVariantsByTag: createObjectsByIdReducer(RECEIVE_DATA, 'multiTagVariants'),

@@ -67,7 +79,7 @@ test('getPairedFilteredSavedVariants', () => {

test('getVisibleSortedSavedVariants', () => {
const savedVariants = getVisibleSortedSavedVariants(
STATE_WITH_2_FAMILIES, { match: { params: {} } }
STATE_WITH_2_FAMILIES, { project: { projectGuid: 'R0237_1000_genomes_demo' }, match: { params: {} } },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also be testing without the project and with the new state field to check that that works correctly too

@ShifaSZ ShifaSZ requested a review from hanars August 11, 2023 19:12
dispatch({ type: RECEIVE_DATA, updatesById: responseJson })
dispatch({
type: RECEIVE_DATA,
updatesById: { ...responseJson, multiTagVariants: { [tag]: Object.keys(responseJson.savedVariantsByGuid) } },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (tag && !gene) still seems like an important conditional, why was it removed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed naming this multiTagVariants because I thought this would be special-cased only for multi variant tags, but this implementation is nicer. However, I would rename the key savedVariantsByTag

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (tag && !gene) still seems like an important conditional, why was it removed?

Not do caching when there is a gene? It makes sense. I could add the condition to the value of the savedVariantsByTag generation. But it still has corner cases for switching from no gene to having a gene.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is that a corner case? The filtering in the selector will work fine for that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When switching from no gene to having a gene, we should ignore the cache and reload data for the summary page. We can add a !gene condition to the data loaded check in the reducer. But the data loaded check will fail when it switches back to no gene.
Another case is that if the cached data is loaded for a particular gene, the selector may mistakenly use it for the project page for any gene.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to add the gene to the key to solve the issues. So the savedVariantsByTag will become savedVariantsByTagGene.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the amount we are caching is overkill. This is too complex and it will save basically no time for the user. Do not add this additional level of caching. Just do this like search results, where theres a single state key with a list (for search its searchedVariants, here it will be something like summaryDataVariants) that just gets overwritten every time new summary variants are loaded, and then the selector checks that first and uses that if its there


const appliedTagCategoryFilter = tag || (variantGuid ? null : (tableState.categoryFilter || ALL_FILTER))
const { familyGuid, variantGuid } = match.params
const optionValue = getSelectedTag(tableState.categoryFilter)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notpick, but theres no need to assign this to a variable, it would be more readable if you directly assigned it to the prop

@hanars hanars requested review from bpblanken and hanars and removed request for hanars September 6, 2023 16:27
@hanars hanars merged commit 48143e0 into dev Sep 7, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants