
bed file #14

Open
xiucz opened this issue Dec 4, 2018 · 2 comments

Comments

@xiucz

xiucz commented Dec 4, 2018

Hi,
In this part, the tutorial runs:

bedtools intersect -wa -wb -b /workspace/inputs/references/transcriptome/gene_annotation.bed -a WGS_Tumor_merged_sorted_mrkdup_bqsr.2.cns > WGS_Tumor_merged_sorted_mrkdup_bqsr.2.annotated.cns

I know that the BED file is 0-based, and that the .2.cns file is also 0-based (its start was reduced by 1). But it seems we should add 1 to the start of every record in the resulting cns file, because the CNS format is 1-based?

Thanks for your reply.

@zlskidmore
Member

Hi @xiucz, thanks for this report!

According to the documentation here, cnvkit outputs a 1-based copy number segment format:
https://cnvkit.readthedocs.io/en/stable/fileformats.html

On the page you linked, we run this to convert the 1-based coordinates from cnvkit to 0-based to match the BED file:

tail -n +2 WGS_Tumor_merged_sorted_mrkdup_bqsr.cns | awk '{print $1"\t"$2-1"\t"$3"\t"$4"\t"$5"\t"$6"\t"$7"\t"$8}' > WGS_Tumor_merged_sorted_mrkdup_bqsr.2.cns

So at this point WGS_Tumor_merged_sorted_mrkdup_bqsr.cns remains 1-based, but WGS_Tumor_merged_sorted_mrkdup_bqsr.2.cns is now 0-based.
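
As a small illustration of the conversion above, the sketch below applies the same start-minus-one shift to a made-up segment table. The function name `cns_to_zero_based` and the toy values are assumptions for illustration only. Unlike the command above, it sets the tab separator once and passes every remaining column through unchanged.

```shell
# Hypothetical helper: skip the header line and shift column 2 (start) down by one.
cns_to_zero_based() {
    awk 'BEGIN { FS = OFS = "\t" } NR > 1 { $2 = $2 - 1; print }'
}

# Toy input (made-up values): a header line plus one segment covering
# bases 101..200 in 1-based coordinates.
printf 'chromosome\tstart\tend\tgene\tlog2\nchr1\t101\t200\tGENE_A\t0.25\n' \
    | cns_to_zero_based
# prints (tab-separated): chr1  100  200  GENE_A  0.25
```

After the shift, end minus start (200 - 100) equals the 100 bases the segment covers, which is the property 0-based, end-exclusive intervals are meant to have.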

I often refer to this Biostars post when doing these coordinate conversions:
https://www.biostars.org/p/84686/

We then run bedtools intersect on the 0-based BED file and the 0-based segment file:
bedtools intersect -wa -wb -b /workspace/inputs/references/transcriptome/gene_annotation.bed -a WGS_Tumor_merged_sorted_mrkdup_bqsr.2.cns > WGS_Tumor_merged_sorted_mrkdup_bqsr.2.annotated.cns

So at this point bedtools intersect is working on two 0-based files, so I think everything should be fine.

Let me know if you disagree or if I've misunderstood the issue you've presented.
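
To illustrate why both inputs need the same convention, here is a minimal sketch of the overlap test that 0-based, end-exclusive intervals imply. The `overlaps` function and the coordinates are made up for illustration; this is not bedtools' implementation, only the arithmetic behind the idea.

```shell
# Hypothetical helper: succeed (exit status 0) if the half-open intervals
# [a_start, a_end) and [b_start, b_end) share at least one base.
# Arguments: a_start a_end b_start b_end
overlaps() {
    [ "$1" -lt "$4" ] && [ "$3" -lt "$2" ]
}

# Made-up gene: the single base 101 (1-based), i.e. the BED interval [100, 101).
# Made-up segment: bases 101..200 (1-based).
overlaps 100 200 100 101 && echo "converted start (100): overlap found"
overlaps 101 200 100 101 || echo "unconverted start (101): overlap missed"
```

With the segment start left at its 1-based value, the segment's first base falls outside the interval, so a feature touching only that base would go unreported.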

@xiucz
Author

xiucz commented Dec 5, 2018

Hi,

we then run bedtools intersect on the 0-based bed file and the 0-based segment file.

I agree with you on this step, and the resulting file 2.annotated.cns is still 0-based. So if we want to use the result file for further analysis, would it be better to convert it to 1-based?
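
If a downstream tool does expect 1-based starts, one possible way to shift the annotated output back is sketched below. It assumes the layout that `-wa -wb` would produce here: the eight segment columns first, followed by the BED columns, so the segment start is column 2 and the gene start is column 10. The function name `to_one_based` and the sample record are hypothetical.

```shell
# Hypothetical helper: add 1 to each start column listed in $1 (comma-separated
# column numbers); all other columns are passed through unchanged.
to_one_based() {
    awk -v cols="$1" 'BEGIN { FS = OFS = "\t"; n = split(cols, c, ",") }
        { for (i = 1; i <= n; i++) $(c[i]) = $(c[i]) + 1; print }'
}

# Made-up annotated record: 8 segment columns, then 4 BED columns.
printf 'chr1\t100\t200\tGENE_A\t0.25\t10\t5\t1.0\tchr1\t100\t101\tGENE_A\n' \
    | to_one_based 2,10
# columns 2 and 10 become 101; everything else is unchanged
```

Passing only `2` would shift the segment start and leave the appended BED columns in their original 0-based form.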

I have one more suggestion: rename ".2.annotated.cns" to ".annotated.bed". This would make the file's coordinate system clearer to newcomers.

Thank you.
