Different behaviour scatter shell shorthand #923

Open
lvree opened this issue Oct 18, 2024 · 2 comments

lvree commented Oct 18, 2024

I'm very new to CNV analysis and have run across something I don't understand. As stated in the manual for the scatter command, there are two options:
image

I assumed they would produce the same output, since one is only a shorthand for the other.

But the first command produces this plot:
image

And the second produces this:
image

What explains this difference in behaviour, or am I misunderstanding something? And can someone explain what the difference between the plots means? For example, why does one show what look like lines while the other shows a lot of points?
Also, why is the lowest point in the first plot around -2.5, while in the second it is below -3? I haven't done any manual scaling of the y-axis yet, so this is just the standard output.

Another thing: when I zoomed in on one chromosome, the lowest visible point is at -4:
image

In the standard plot of the whole genome that point isn't shown, but when I lower the minimum of the y-axis it is indeed there:
image

Why does it cut the graph off when there are still points there? Is it because there are no orange points there?

Thank you very much!


lvree commented Oct 18, 2024

Ah, I'm sorry, I see I accidentally ran it on the call.cns file instead of the .cns file. But the output I got when using call.cns was the same as what I got when I ran this part on my dataset:
cnvkit.py batch *Tumor.bam --normal *Normal.bam \
    --targets my_baits.bed --annotate refFlat.txt \
    --fasta hg19.fasta --access data/access-5kb-mappable.hg19.bed \
    --output-reference my_reference.cnn --output-dir results/ \
    --diagram --scatter

Further it says this:
image
So why does the batch command use call.cns instead of .cns, as is stated there?


etal commented Nov 13, 2024

The file .call.cns is based on the first .cns, with some additional processing:

  • Absolute copy number is inferred, and certain segments are marked as having neutral copy number. These segments are plotted in gray instead of orange, so that the non-neutral segments are more distinct.
  • The confidence interval of each segment's mean is checked for overlap with neutral copy number (log2 ratio 0.0); if it overlaps, the segment is merged with any neighboring segments that also have neutral copy number. (Something like that.) So the .call.cns may also have fewer segment breakpoints, and the remaining segments may have slightly different mean log2 ratio values.
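
To make the second point concrete, here is a toy Python sketch of the idea. It is not CNVkit's actual code: the `Segment` class, the `is_neutral` and `merge_neutral` helpers, the length-weighted mean, and the example numbers are all assumptions made up for illustration. It only shows why merging neighbouring neutral segments leaves fewer breakpoints and slightly shifted mean log2 values.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one row of a segment file."""
    start: int
    end: int
    log2: float   # mean log2 ratio of the segment
    ci_lo: float  # lower bound of the confidence interval of the mean
    ci_hi: float  # upper bound of the confidence interval of the mean


def is_neutral(seg):
    """A segment counts as neutral here if its CI overlaps log2 ratio 0.0."""
    return seg.ci_lo <= 0.0 <= seg.ci_hi


def merge_neutral(segments):
    """Merge runs of adjacent neutral segments into single segments.

    Assumption: the merged mean is weighted by segment length, and the
    merged CI simply spans both inputs. The real tool may do this differently.
    """
    merged = []
    for seg in segments:
        prev = merged[-1] if merged else None
        if prev is not None and is_neutral(prev) and is_neutral(seg):
            w_prev = prev.end - prev.start
            w_seg = seg.end - seg.start
            mean = (prev.log2 * w_prev + seg.log2 * w_seg) / (w_prev + w_seg)
            merged[-1] = Segment(prev.start, seg.end, mean,
                                 min(prev.ci_lo, seg.ci_lo),
                                 max(prev.ci_hi, seg.ci_hi))
        else:
            merged.append(seg)
    return merged


# Made-up example: two neutral segments, one loss, one more neutral segment.
segments = [
    Segment(0, 100, 0.05, -0.10, 0.20),
    Segment(100, 300, -0.02, -0.15, 0.10),
    Segment(300, 400, -1.00, -1.20, -0.80),
    Segment(400, 500, 0.10, -0.05, 0.25),
]
merged = merge_neutral(segments)
for seg in merged:
    label = "neutral (gray)" if is_neutral(seg) else "non-neutral (orange)"
    print(seg.start, seg.end, round(seg.log2, 4), label)
```

In this toy input the first two segments collapse into one, so four segments become three, and the merged segment's mean differs slightly from either of the originals. The loss segment in the middle is untouched.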

etal added the question label Nov 13, 2024