Binary format differs from expectation #73
Hi, You're rather close, though the distances are float32, not float64, and unlike the packed upper-triangular distance matrix, the asymmetric comparison has no bookkeeping in the file. I realize now that usage for asymmetric comparisons (the The You would typically only want to use both -Q and -F for asymmetric distances like containment, where Does this help? Thanks! Daniel
Okay, thanks! I knew the actual data was in np.float32; I updated my comment to make that clear. I ended up using -Q and -F because of this line in the README: "To generate a full, asymmetric distance matrix, provide the same path to -F and -Q." I tried it both ways and when I ran with
Running with
I see. That makes sense -- in fact, while investigating the problem today, I ran into the same error (Unknown error -1), fixed it, and incorporated the fix into a new release, which just finished building. Want to give it a try?
Yes, your new release, v0.5.7, fixed the issue. Thanks.
I used Dashing v0.5.6 s128 on a Linux machine to compare pre-hashed genomes. The command was:
From the specification here, I was expecting a half-matrix output with 1 byte specifying full or half matrix, 8 bytes specifying the length in np.float64, and (n*(n-1)/2)*4 bytes of data in np.float32. Note that supplying -Q only for the file path did not work. Instead, I get a file of exactly (n**2)*4 bytes, so I'm assuming I just got a square matrix of 4-byte float32 values.
The file is 422,393,406,724 bytes for n = 324,959.
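As a sanity check on the sizes above, here is a minimal sketch of the arithmetic. The 9-byte header (1 flag byte plus 8 length bytes) is only the layout expected from the specification as described in this post, not a confirmed description of the format, and the function names are hypothetical.

```python
FLOAT32_SIZE = 4  # bytes per float32 value


def packed_upper_triangular_bytes(n: int, header_bytes: int = 9) -> int:
    """Size of the expected half matrix: assumed header + n*(n-1)/2 float32 values."""
    return header_bytes + (n * (n - 1) // 2) * FLOAT32_SIZE


def full_square_bytes(n: int) -> int:
    """Size of a headerless n x n float32 matrix."""
    return n * n * FLOAT32_SIZE


n = 324_959
observed = 422_393_406_724  # file size reported above

# The observed size matches the headerless square layout exactly,
# and is roughly twice the expected packed size.
print("square matches:", full_square_bytes(n) == observed)
print("packed would be:", packed_upper_triangular_bytes(n))
```

The exact match with n*n*4, with no bytes left over, is what suggests a full square matrix with no header.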
I can import the data as a NumPy memory map like this:
I just wanted to know if this import was correct and also make you aware that the output was not what I expected. I saw in the previous issues you are working on documenting the binary format so I thought I'd pass this along. Overall, Dashing is fantastic and I really appreciate your team's hard work.
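For reference, a standard-library sketch of reading a single entry from such a file, assuming the layout discussed in this thread: a headerless n x n block of float32 values. Row-major order and little-endian byte order are assumptions not confirmed here, and `read_entry` is a hypothetical helper, not part of Dashing.

```python
import os
import struct
import tempfile

FLOAT32_SIZE = 4  # bytes per float32 value


def read_entry(path: str, n: int, i: int, j: int) -> float:
    """Read entry (i, j) from a headerless n x n float32 file.

    Assumes row-major order and little-endian values (both are assumptions).
    """
    expected = n * n * FLOAT32_SIZE
    actual = os.path.getsize(path)
    if actual != expected:
        raise ValueError(f"expected {expected} bytes for n={n}, found {actual}")
    if not (0 <= i < n and 0 <= j < n):
        raise IndexError(f"({i}, {j}) is outside a {n} x {n} matrix")
    with open(path, "rb") as fh:
        # Seek straight to the value, so the whole file is never loaded.
        fh.seek((i * n + j) * FLOAT32_SIZE)
        (value,) = struct.unpack("<f", fh.read(FLOAT32_SIZE))
    return value


# Tiny synthetic file standing in for real output (not Dashing data).
n = 3
values = [k / 8 for k in range(n * n)]  # exactly representable in float32
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(struct.pack(f"<{n * n}f", *values))
    demo_path = tmp.name

entry_1_2 = read_entry(demo_path, n, 1, 2)  # sixth value in row-major order
os.remove(demo_path)
```

Under the same assumptions, the NumPy equivalent is a memory map with dtype np.float32 and shape (n, n). The size check up front catches a file that carries a header or uses the packed layout instead.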