-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LT_3 > LT_10? #2
Comments
Updating this to report what I think may be a bug, at least based on given variable definitions for LT_3 and LT_10. Line 317 of the makefile currently reads
so I believe that lt10 is only being incremented if coverage at a given position in the reference is < 10 but not <3, which I don't think is the desired behavior. (I presume lt10 should be incremented whenever coverage is <10.) Does this seem correct? Thanks! Best, |
Hey David! Thanks for putting up with this bug for this long and going beyond to isolate the source. You are correct that there is an extraneous CURRENT: CORRECTED: Will add this patch in a future update before closing the issue. Best, |
Going to add another point to this in case anyone else looking might be interested. It might still be handy to include this bug as its own separate metric displaying the difference of Contrived example:
If one suspects generally low coverage for a plasmid, it would be expected to look more like sample B01 where 50% of the plasmid has less than 10X coverage with the occasional dip to below 3X coverage, perhaps 5% of the B01 reference. However, if 50% of the plasmid has < 10X coverage but all of that is also < 3X coverage as suggested in sample A01, it may very well be due to the fact that there were zero reads in that region due to the plasmid having a deletion there but strong coverage everywhere else. Depending on one's preferences, the diff metric could perhaps be something like the percentage of (LT_3 / LT_10) instead of just the difference. Of course, one could also just glance at the However, one should also keep in mind confounding sources of steep LT_diff. For example, in the cases of non-uniform coverage such as due to highly polymeric regions, this can manifest in sudden low coverage and thus relying on a diff metric alone would be misleading. Though, true deletions would be corroborated by other metrics, such as the existence of non-zero |
Hi,
Thanks for this very useful set of tools! I'm running the octopus3 pipeline (just with run-octopus-analysis.sh), and the pipeline is apparently completing successfully, but when I look at LT_3 and LT_10 in aggregated-stats.tsv, some samples have LT_10s that are smaller than their LT_3s. This seems like it should be impossible, since any part of my input reference that has <3x coverage certainly should have <10x coverage.
Unfortunately, since I'm working with proprietary data, it's difficult for me to provide a reproducible example, but if you have any insight into why this LT_10/LT_3 pattern might be cropping up, I'd really appreciate it!
Thanks so much!
Best,
David
The text was updated successfully, but these errors were encountered: