Check option #3
Comments
I have a branch in my fork that implements essentially this in a program called bamhash_checksum_all. It accepts multiple BAM/fastq files and will print the checksum and count for each and then say whether they differ (it'll also set the exit status accordingly).
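For readers following along, the behaviour described in this comment can be sketched roughly as below. This is only an illustration and not the code from the branch: the function name `compare_checksums`, the input shape (a mapping of filename to checksum and read count), and the choice to compare checksums only (not counts) are all assumptions.

```python
from typing import Dict, List, Tuple


def compare_checksums(results: Dict[str, Tuple[str, int]]) -> Tuple[List[str], int]:
    """Build a report and an exit status from per-file (checksum, count) pairs.

    `results` is a hypothetical mapping: filename -> (checksum, read_count).
    Only the checksums are compared; counts are printed but not compared,
    which is an assumption of this sketch.
    """
    # One tab-separated report line per input file.
    report = [f"{name}\t{checksum}\t{count}"
              for name, (checksum, count) in results.items()]
    # The inputs differ if more than one distinct checksum was seen.
    differ = len({checksum for checksum, _ in results.values()}) > 1
    report.append("Checksums differ" if differ else "Checksums match")
    # Exit status 0 when everything matches, 1 otherwise.
    return report, (1 if differ else 0)
```

A caller would print each report line and pass the returned status to `sys.exit`.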
Thanks, that's neat. However, that wasn't (quite) the behaviour I was after. With 'md5sum -c', you provide a file containing filenames and md5 hashes, and it goes through each file checking whether the hashes match, reporting OK or FAILED for each as it goes along. e.g. ~/tmp> cat md5sums I wonder if something similar could be achieved here, simply for the BAM files? Because once the BAMs are confirmed to be consistent with the fastqs, you don't need the fastqs anymore.
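A minimal sketch of the 'md5sum -c' style check described above, for illustration only. It assumes a manifest with one `HASH  FILENAME` pair per line. The hash function is pluggable, and the default `md5_of_file` is a plain MD5 of the file bytes used as a stand-in, not the BamHash algorithm. All function names here are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List, Tuple


def md5_of_file(path: str) -> str:
    """Stand-in hash: plain MD5 of the file bytes (NOT the BamHash checksum)."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (expected_hash, filename) from 'HASH  FILENAME' lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        expected, name = line.split(None, 1)
        # md5sum marks binary mode with a leading '*' on the filename.
        yield expected.lower(), name.lstrip("*").strip()


def check_manifest(
    lines: Iterable[str],
    hash_func: Callable[[str], str] = md5_of_file,
) -> List[Tuple[str, str]]:
    """Return (filename, status) for every manifest entry, in order."""
    results = []
    for expected, name in parse_manifest(lines):
        try:
            status = "OK" if hash_func(name).lower() == expected else "FAILED"
        except OSError:
            status = "FAILED open or read"
        results.append((name, status))
    return results


def main(manifest_path: str) -> int:
    """Print one status line per file; return 0 only if every file is OK."""
    with open(manifest_path) as fh:
        results = check_manifest(fh)
    for name, status in results:
        print(f"{name}: {status}")
    return 0 if all(status == "OK" for _, status in results) else 1
```

To check BAM files against stored bamhash values, `hash_func` would be replaced by a wrapper that runs the real tool and returns its checksum string; that wrapper is left out because its exact interface is not given here.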
Ah, I misunderstood. That would indeed be useful.
That is not to say that this branch is not a useful addition. It is. I'd like to see this branch merged with the trunk.
Like for the std md5sum command, it would be really handy to have a 'check' option (-c) to check a fastq/bam against its bamhash and programmatically verify its consistency.
Also, it's confusing that fastq read pairs are only counted once whereas they're counted twice in the bam. It would be clearer if the numbers agreed as well as the hashes.
Thanks!