Saved variant timeout #3510
for family_guid in var['familyGuids']:
    family_genes[family_guid].update(var.get('transcripts', {}).keys())
Do we need to include the families who have no transcripts?
Yes overall, as this is used for the list of families when filtering which projects to return, etc. However, I can add an extra filter before we get the family list for RNA data, as there's no reason to include RNA data for families with no genes.
added this below
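A minimal sketch of the extra filter discussed in this thread, assuming `family_genes` is a `defaultdict(set)` keyed by family GUID as the snippet above suggests. The helper names `build_family_genes` and `families_with_genes` and the sample variants are hypothetical and are not the code in this PR.

```python
from collections import defaultdict


def build_family_genes(variants):
    """Collect, per family GUID, the gene IDs of the variants saved for that family."""
    family_genes = defaultdict(set)
    for var in variants:
        for family_guid in var['familyGuids']:
            # Families whose variants have no transcripts still get an (empty) entry,
            # so they remain available for project filtering.
            family_genes[family_guid].update(var.get('transcripts', {}).keys())
    return family_genes


def families_with_genes(family_genes):
    """Keep only the families that have at least one gene (hypothetical RNA pre-filter)."""
    return sorted(guid for guid, genes in family_genes.items() if genes)


# Hypothetical input: F2 has a saved variant but no transcripts.
variants = [
    {'familyGuids': ['F1'], 'transcripts': {'ENSG1': []}},
    {'familyGuids': ['F2']},
]
family_genes = build_family_genes(variants)
rna_family_guids = families_with_genes(family_genes)
```

Here the full key set of `family_genes` would still drive project filtering, while only `rna_family_guids` would be passed to the RNA data lookup.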
seqr/views/utils/variant_utils.py
return {
    agg['sample__individual__family__guid']: {'tpmGenes': [
        gene for gene in agg['genes'] if gene in family_genes[agg['sample__individual__family__guid']]
    ]} for agg in tpm_family_genes
Should we filter out the families with no genes in family_genes?
The issue appears to be that the number of TPM models has grown now that we load TPM==0, which has increased the size of the response, and we are actually losing most of the time serializing and deserializing the response. This updates the behavior so we only return TPM genes for a family if that family has a variant in that gene, instead of if any family has a variant in that gene. For RGP this reduces the response size by over 100Mb.
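The difference between the two behaviors described above can be sketched as follows. This is an illustration under assumptions, not the seqr implementation: the function names and sample data are hypothetical, and only the field names `sample__individual__family__guid`, `genes`, and `tpmGenes` come from the snippet in this PR.

```python
import json

FAMILY_KEY = 'sample__individual__family__guid'


def tpm_genes_any_family(tpm_family_genes, family_genes):
    """Previous behavior as described: keep a gene if ANY family has a variant in it."""
    all_genes = set().union(*family_genes.values())
    return {
        agg[FAMILY_KEY]: {'tpmGenes': [g for g in agg['genes'] if g in all_genes]}
        for agg in tpm_family_genes
    }


def tpm_genes_per_family(tpm_family_genes, family_genes):
    """Updated behavior: keep a gene only if THIS family has a variant in it."""
    return {
        agg[FAMILY_KEY]: {
            'tpmGenes': [g for g in agg['genes'] if g in family_genes.get(agg[FAMILY_KEY], set())]
        }
        for agg in tpm_family_genes
    }


# Hypothetical data: each family has a saved variant in a different gene.
family_genes = {'F1': {'G1'}, 'F2': {'G2'}}
tpm_family_genes = [
    {FAMILY_KEY: 'F1', 'genes': ['G1', 'G2', 'G3']},
    {FAMILY_KEY: 'F2', 'genes': ['G1', 'G2']},
]

old_response = tpm_genes_any_family(tpm_family_genes, family_genes)
new_response = tpm_genes_per_family(tpm_family_genes, family_genes)

# The serialized payload shrinks because each family lists only its own genes.
old_size = len(json.dumps(old_response))
new_size = len(json.dumps(new_response))
```

With many families and many genes, the any-family filter lets each family's gene list grow toward the union of all families' genes, which is consistent with the large response described above.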