Check option #3

Open
drchriscole opened this issue Mar 9, 2015 · 4 comments

Comments

@drchriscole
Contributor

As with the standard md5sum command, it would be really handy to have a 'check' option (-c) to check a fastq/bam against its bamhash and programmatically verify its consistency.

Also, it's confusing that fastq read pairs are only counted once whereas they're counted twice in the bam. It would be clearer if the numbers agreed as well as the hashes.
Thanks!

@dpryan79
Contributor

I have a branch in my fork that implements essentially this in a program called bamhash_checksum_all. It accepts multiple BAM/fastq files and will print the checksum and count for each and then say whether they differ (it'll also set the exit status accordingly).

@drchriscole
Contributor Author

Thanks, that's neat. However, that wasn't (quite) the behaviour I was after.

With 'md5sum -c', you provide it with a file containing md5 hashes and filenames, and it goes through each file checking whether the hashes match. It reports OK or FAILED for each file as it goes along, e.g.

~/tmp> cat md5sums
0bb17fbf22eeb5ff8bc1e1e5401214ef rand1.csv
af5d00fd5c1a7e66dfa770c229eb9bac rand2.csv
74346ab1470c74e38d5c91db0b57ea23 rand3.csv
~/tmp> md5sum -c md5sums
rand1.csv: OK
rand2.csv: OK
rand3.csv: FAILED
md5sum: WARNING: 1 of 3 computed checksums did NOT match

I wonder if something similar could be achieved here, simply for the BAM files? Once the BAMs are confirmed to be consistent with the fastqs, you don't need the fastqs anymore. The checksum file would then just list BAM files, e.g.
0bb17fbf22eeb5ff8bc1e1e5401214ef aligned.bam
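
To make the request concrete, here is a rough Python sketch of the 'md5sum -c' style loop I have in mind. It is only an illustration, not BamHash code: `check_listing` and `md5_of_file` are hypothetical names, and plain MD5 of the file bytes is used as a stand-in for the real BamHash checksum, which would be plugged in through the `hash_file` argument. Unlike md5sum, this sketch resolves filenames relative to the listing file and does not handle the `*` binary-mode prefix.

```python
import hashlib
import sys
from pathlib import Path
from typing import Callable, List, Tuple


def md5_of_file(path: Path) -> str:
    """Stand-in hash: MD5 of the raw file bytes (NOT the BamHash algorithm)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large BAM files are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_listing(
    listing: Path,
    hash_file: Callable[[Path], str] = md5_of_file,
) -> Tuple[List[str], int]:
    """Verify every '<hash> <filename>' line in `listing`.

    Returns the per-file report lines and the number of failures.
    """
    report: List[str] = []
    failures = 0
    for raw in listing.read_text().splitlines():
        if not raw.strip():
            continue  # skip blank lines
        expected, name = raw.split(None, 1)
        name = name.strip()
        # Assumption: names are resolved relative to the listing's directory.
        target = listing.parent / name
        try:
            actual = hash_file(target)
        except OSError:
            report.append(f"{name}: FAILED open or read")
            failures += 1
            continue
        if actual.lower() == expected.lower():
            report.append(f"{name}: OK")
        else:
            report.append(f"{name}: FAILED")
            failures += 1
    return report, failures


if __name__ == "__main__":
    lines, failed = check_listing(Path(sys.argv[1]))
    print("\n".join(lines))
    if failed:
        print(f"WARNING: {failed} computed checksum(s) did NOT match or could not be read",
              file=sys.stderr)
    # Non-zero exit status on any failure, so scripts can act on the result.
    sys.exit(1 if failed else 0)
```

The exit status is the important part for scripting: a pipeline could delete the fastqs only when the check exits with 0.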

@dpryan79
Contributor

Ah, I misunderstood. That would indeed be useful.

@drchriscole
Contributor Author

That is not to say that this branch is not a useful addition. It is. I'd like to see this branch merged with the trunk.
