gap regions and blacklist regions #11

Open
dewshr opened this issue Dec 6, 2019 · 6 comments

Comments

dewshr commented Dec 6, 2019

Can I call SVs without providing gap regions and blacklist regions? I am using the mm10 reference genome. Do you know where I can download the SV blacklist regions? Thank you.

Collaborator

fangli80 commented Dec 6, 2019

You can generate an empty gap region file and an empty blacklist file, and the program should be able to run. However, you may get many false-positive calls in blacklist regions, because some complex regions have noisy barcode signals. It's hard to tell whether the signals are real without a control data set. I don't have a mouse WGS data set, so I cannot provide the blacklist files. In this case, it would be good to have a "normal control" sample and a case sample: detect SVs in both samples separately, then remove from the case sample's calls any SVs that also appear in the normal sample.
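The case-versus-control subtraction described above could be sketched roughly as follows. This is only an illustration, not part of LinkedSV: the `SV` record, the `same_event` matching rule, and the 10 kb breakpoint tolerance are all assumptions made for the example, and the calls are toy values rather than real output.

```python
from typing import List, NamedTuple


class SV(NamedTuple):
    """Hypothetical in-memory record for one SV call (not LinkedSV's own format)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    sv_type: str


def same_event(a: SV, b: SV, tolerance: int) -> bool:
    """Treat two calls as the same SV if type and chromosomes match and
    both breakpoints lie within `tolerance` bp of each other."""
    return (
        a.sv_type == b.sv_type
        and a.chrom1 == b.chrom1
        and a.chrom2 == b.chrom2
        and abs(a.pos1 - b.pos1) <= tolerance
        and abs(a.pos2 - b.pos2) <= tolerance
    )


def remove_control_svs(case: List[SV], control: List[SV], tolerance: int = 10_000) -> List[SV]:
    """Keep only case calls that have no matching call in the normal control.
    The default tolerance is an arbitrary placeholder, not a recommended value."""
    return [sv for sv in case if not any(same_event(sv, c, tolerance) for c in control)]


# Toy calls for illustration only.
case_calls = [
    SV("chr1", 1_000_000, "chr1", 1_050_000, "DEL"),
    SV("chr2", 5_000_000, "chr2", 5_200_000, "INV"),
]
control_calls = [SV("chr1", 1_002_000, "chr1", 1_049_000, "DEL")]

filtered_calls = remove_control_svs(case_calls, control_calls)
```

Here the chr1 deletion is dropped because the control sample has a deletion with both breakpoints within the tolerance, while the chr2 inversion is kept. The pairwise comparison is quadratic, which is fine for small call sets; a sorted or interval-indexed lookup would be a better fit for large ones.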

By the way, how many samples do you have, and do you want to call germline SVs or somatic SVs?

Author

dewshr commented Dec 6, 2019

For now I am only trying one sample, and I am calling germline SVs. Is it possible to call SVs on multiple samples together?

Collaborator

fangli80 commented Dec 6, 2019

OK. Please run in germline mode.
The gap region for GRCm38 is here:
GRCm38.p5.genome.gap.zip

For now, it cannot call multiple samples simultaneously.
Do you want to find a disease SV or do you simply want to get all germline SV calls of the sample?

If you want to find a disease SV, LinkedSV will output some plots and you can manually check the candidate SVs to see if they are real.
If you want to generate a high-confidence germline SV call set, you may need to remove SV calls that overlap with MHC regions, telomeres, or centromeres.

Best,
Li
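As a rough sketch of the filtering step suggested above, the snippet below drops any call with a breakpoint inside a set of excluded regions given in BED-style text. It is a minimal illustration under stated assumptions: the helper names are hypothetical, the calls are plain `(chrom1, pos1, chrom2, pos2)` tuples rather than LinkedSV output, the coordinates are made-up placeholders (not real MHC, telomere, or centromere positions), and only the two breakpoints are tested rather than the whole SV span.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]      # chrom, start, end (0-based, half-open, BED style)
Call = Tuple[str, int, str, int]   # chrom1, pos1, chrom2, pos2 (hypothetical layout)


def parse_bed(text: str) -> List[Region]:
    """Read the first three columns of BED-style text, skipping header lines."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def in_any_region(chrom: str, pos: int, regions: List[Region]) -> bool:
    """True if the position falls inside at least one region on the same chromosome."""
    return any(c == chrom and s <= pos < e for c, s, e in regions)


def drop_calls_in_regions(calls: List[Call], regions: List[Region]) -> List[Call]:
    """Remove calls that have either breakpoint inside an excluded region."""
    return [
        call for call in calls
        if not (in_any_region(call[0], call[1], regions)
                or in_any_region(call[2], call[3], regions))
    ]


# Placeholder regions and calls, for illustration only.
excluded = parse_bed("chrA\t0\t3000000\nchrB\t33000000\t38000000\n")
calls = [
    ("chrA", 1_500_000, "chrA", 4_000_000),    # first breakpoint is in an excluded region
    ("chrB", 50_000_000, "chrB", 50_100_000),  # both breakpoints are outside
]
kept = drop_calls_in_regions(calls, excluded)
```

Testing breakpoints only is a simplification; if calls that merely span an excluded region should also be removed, an interval-overlap test on the full SV span would be needed instead.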

Author

dewshr commented Dec 6, 2019

I want to call all the SVs. Thank you for your reply.

Author

dewshr commented Dec 9, 2019

Do I need to provide 2D_blacklist_file? I am getting a "no such file or directory" error. I looked at the code in arguments.py, and that argument is passed as an empty string.

@fangli80
Collaborator

Sorry for the late reply. The 2D_blacklist_file is also required if you are not using one of the human reference genomes hg19, hg38, or b37. You can provide an empty file if you are working with mouse genomes and don't have a blacklist file.

Best,
Li
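For anyone in the same situation, one way to create the empty placeholder files mentioned in this thread is sketched below. The file names and the temporary directory are assumptions chosen for the example; how each file is passed to LinkedSV should be checked against the project's README or arguments.py.

```python
import tempfile
from pathlib import Path
from typing import List


def make_empty_files(directory: str, names: List[str]) -> List[Path]:
    """Create each named file in `directory` if it does not exist yet.
    Path.touch() does not truncate, so an existing non-empty file stays as it is."""
    paths = []
    for name in names:
        path = Path(directory) / name
        path.touch()
        paths.append(path)
    return paths


# Hypothetical file names; any names work as long as the paths given to the tool match.
placeholder_names = [
    "empty_gap_region.bed",
    "empty_blacklist.bed",
    "empty_2D_blacklist.txt",
]

workdir = tempfile.mkdtemp()  # a scratch directory for this example
created = make_empty_files(workdir, placeholder_names)
```

As noted earlier in the thread, running with empty blacklist files may produce many false-positive calls in noisy regions, so the resulting call set would likely need extra filtering or manual review.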
