Replies: 4 comments 4 replies
-
seqr does not support having multiple genome versions within the same project, so option 2 is not really going to work for you. If you want to try to modify the code to allow projects to work with multiple genome versions you can, but I strongly recommend against that. You also need all family members to exist within the same index for inheritance search to work. So in option 1, you would have to make a GRCh37 index with the parents AND their children in it. Option 3 is the approach we have used when we started getting new samples in aligned to GRCh38, and in general we find theres better disk space usage with one index anyways. The TL;DR is you are going to need to create one index with parents and children together, regardless of which genome version you use. In theory, the pipeline can take a comma separated list of VCFs, so you can theoretically do that with the 2 existing VCFs without needing to joint call again. However, the callset AF will be more accurate if you joint call them all together, so up to you how much you rely on that data. In terms of which genome version to make your new index in, thats really up to you. For us, all our new data is being delivered as GRCh38 so once that started happening we decided it was easier to switch the project genome versions over. If you don't anticipate getting any new samples for this project, or you anticipate getting them aligned to GRCh37, I don't see a real good reason to switch. But if you do anticipate getting new data in on GRCh38, especially if its data that you would want to add incrementally (i.e. some new probands/ trios that are unrelated to your current data set that you won't joint call with the existing data) then I would recommend switching your project to GRCh38 now, as Notes on
|
Beta Was this translation helpful? Give feedback.
-
Thanks for your elaborate and helpful response @hanars, I'm sure we'll refer to this issue all the time whenever we hit tricky loading scenarios. I've updated Option 1 to be more clear that parents and proband must belong to one index. Regarding callset AF, our projects are typically small so we have our default filters configured callset AF to 1 (i.e. ignored). I take your point though, if we joint call everything in the project then some projects will have somewhat useful callset data to refer to. I will further consult our researchers and decide on what to do. We may have more questions down the track. Either way, we'll update here on what we end up doing. |
Beta Was this translation helpful? Give feedback.
-
Hello, I have a question about reloading patients with new family members.
I suppose the error is not in Seqr, but in the way I have merged the VCF files. But, if you have successed to merge VCF files and imported in Seqf then, do you know how I can do ? Thank you in advance for your help. |
Beta Was this translation helpful? Give feedback.
-
@hanars - I've not tried this before myself, how does Hail do the merging/combining? Any links/docs I could read? |
Beta Was this translation helpful? Give feedback.
-
Strategies for reloading probands as trios
In our MCRI Seqr instance, we have the following scenario.
Researchers now have data for some parents of these individuals and would like to analyse some of them as trios, around
We would like to consult your team on best strategy to load these in?
Re-associating individuals with new individuals and grouping as family should just be a matter of uploading a new
pedigree file using Seqr's bulk edit individual functionality. However, how should we load the new VCFs in? Using
which genome version? Creating new index just for the parents and associating to same project? Joint call and reload
again as trios? What about existing saved variant details?
Here's what we have in mind.
Option 1 - keep as GRCh37
Option 2 - reload trios as GRCh38
whilst other probands will still be part of old index.
lift_project_to_hg38.py
to migrate saved variants to GRCh38. May need to modify this to only include applicable probands that are forming new
trios.
Option 3 - re-create and reload whole project as GRCh38
will now be in one index.
lift_project_to_hg38.py
to liftover all saved variants.We were thinking of Option 2. Any comments and/or suggestions?
Questions regarding
lift_project_to_hg38.py
script:--es-index
argument is expecting the new ES index in GRCh38 and NOT the old index right?Beta Was this translation helpful? Give feedback.
All reactions