-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hail backend - SV WES #3575
Hail backend - SV WES #3575
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly, it looks good to me.
QUERY_CLASS_MAP = {cls.DATA_TYPE: cls for cls in [VariantHailTableQuery, SvHailTableQuery]} | ||
class GcnvHailTableQuery(SvHailTableQuery): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the file is getting bigger, how about splitting them into per-class files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, Ben suggested the same thing on the other PR. I agree, but to keep the diff sane I will do a follow up PR to move everything into its own file
…te/seqr into hail-backend-sv-wes
…l-backend-sv-wes
|
||
def get_allowed_sv_type_ids(self, sv_types): | ||
return super().get_allowed_sv_type_ids([ | ||
type.replace('gCNV_', '') for type in sv_types if type.startswith('gCNV_') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a case for doing this replacement in the pipeline? So that instead of passing the enum ids for gCNV_DEL
and gCNV_DUP
we instead pass the ids for DEL
and DUP
? Or are gCNV_DEL
and gCNV_DUP
used elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I had thought we already did, because gCNV_DUP
and gCNV_DEL
are in the enum, but then after I saw that they weren't I though this approach was easier. We need to return them to the client without the prefix so it would be doing an additional annotation to format them and remove the prefix after the query, and doing a one time string replace in python before doing the query seemed more efficient than having to modify every row in the table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree the string replace in python is better. I guess, if we'd like to remove gCNV_DUP
and gCNV_DEL
from the enum it's no problem on my end.
@@ -8,6 +8,8 @@ | |||
def _hl_json_default(o): | |||
if isinstance(o, hl.Struct) or isinstance(o, hl.utils.frozendict): | |||
return dict(o) | |||
elif isinstance(o, set): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be hl.set
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it was an issue with a regular python set, sets are not actually json serializable and this solved the particular issue. But I can also add a case for a hail set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, since we are only json serializing collected results we do not run into this case, as hail will have already converted all hail setexpressions to python sets. The only cases where we end up with hail-specific data types are those that hail provides which can exist as raw dat atypes and not expressions, which is just Interval, Struct, and frozendict - https://hail.is/docs/0.2/utils/index.html#utils
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, I think that's right.
Getting this up early for review, still needs unit tests